
Imagine being asked to draw what the OP said, but you couldn’t see what you’d drawn, only a description that said “a man and a woman in a Honda”.

Then you’re asked to draw a new picture, given this history:

Draw a picture of a man in the driver’s seat and a woman in the passenger seat.

(Picture of a man and a woman in a car)

No, the man in the driver’s seat!

——

How well do you think a very intelligent model could draw the next picture? It failed the first time, and because it only gets descriptions, it has no idea what it even drew before.



Coding agents have had good success doing this. Giving them the errors lets them potentially figure out how to fix the problem. They can do more with this iterative approach than without it.


But fundamentally this requires that the model can actually see the thing it’s trying to fix. Many of these models can barely see.
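
To make the contrast concrete, here is a minimal sketch of the iterative loop described above. Everything in it is an assumption for illustration: `toy_agent` and `blind_agent` are hypothetical stand-ins for a model, and `run_candidate` is a made-up checker, not any real agent framework. The point is only that the loop converges when the feedback is passed back in, and stalls when the "model" cannot see it.

```python
from typing import Callable, List, Optional, Tuple

BUGGY = "def add(a, b):\n    return a - b\n"
FIXED = "def add(a, b):\n    return a + b\n"


def run_candidate(source: str) -> Optional[str]:
    """Run the candidate code; return None on success, else an error description."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:  # syntax errors, missing names, runtime errors
        return f"{type(exc).__name__}: {exc}"
    if result != 5:
        return f"wrong result: add(2, 3) returned {result}, expected 5"
    return None


def repair_loop(
    propose: Callable[[Optional[str]], str], max_attempts: int = 3
) -> Tuple[Optional[str], List[Tuple[str, Optional[str]]]]:
    """Ask for a candidate, check it, and feed the error into the next attempt."""
    feedback: Optional[str] = None
    history: List[Tuple[str, Optional[str]]] = []
    for _ in range(max_attempts):
        source = propose(feedback)
        feedback = run_candidate(source)
        history.append((source, feedback))
        if feedback is None:
            return source, history
    return None, history


def toy_agent(feedback: Optional[str]) -> str:
    """Hypothetical model that can 'see' the error and reacts to it."""
    if feedback is not None and "wrong result" in feedback:
        return FIXED
    return BUGGY


def blind_agent(feedback: Optional[str]) -> str:
    """Hypothetical model that never sees the feedback, so it repeats itself."""
    return BUGGY


if __name__ == "__main__":
    fixed, seen_history = repair_loop(toy_agent)
    print("sighted agent attempts:", len(seen_history), "fixed:", fixed is not None)
    stuck, blind_history = repair_loop(blind_agent)
    print("blind agent attempts:", len(blind_history), "fixed:", stuck is not None)
```

The drawing case in the parent comment is closer to `blind_agent`: the model gets a lossy text description rather than the actual output, so there is nothing specific for it to correct.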



