
I don't understand all the hype around generating SVG with LLMs. The task is not really useful, and it doesn't seem that interesting as a single shot, since it's really hard and no human could do it (it would be more useful if the model had visual feedback and could correct the result).

Also, since it has become a popular task, companies will add examples to their training sets, so you're just benchmarking who has the better text-to-SVG training set, not the overall quality of the model.



My take is that no one really cares about generating SVG, but it's a structured "code" format with very direct visual results. I can't look at 3 piles of code and instantly tell which is best (assuming minimum competence), but I can judge the SVG outputs very easily. As a quick shot it gets a point across faster and makes comparison easier. As a technical comparison it's not so strong, but that's harder to do and judge, and less fun to read.


One of my co-founders lost the SVG of our startup logo, and the designer who helped us was away on vacation. I really wanted to experiment with some logo animations for an upcoming demo, so I decided to take matters into my own hands.

I grabbed a high-quality PNG, gave it to ChatGPT, and managed to recreate the SVG from the image, after quite a bit of prompting and tweaking. But it worked out great!


But isn't this something Inkscape has been able to do forever?


It goes back to Sparks of AGI [0], unless I am mistaken. I can recommend the talk; it has stayed in the back of my mind since I first saw it two years ago. Personally, I still have major reservations about throwing claims of intelligence or understanding around, but I do agree that SVG code generation can be a very effective way to get a quick, easy-to-present picture of a model's ability to output code from a rather open-ended prompt that needs a high degree of coherence and where a lot of layers depend and build on each other.

It helps that these are eye-catching (literally, as the output is visual) and easy to grasp. It's the same reason a lot of hype is created around the web desktops.

[0] https://youtu.be/qbIk7-JPB2c?si=_TNRrxN-_5FOlfy5&t=1342


It's obviously a pointless benchmark, but it's fun, so people like doing it.



