Imagine being asked to draw what the op said, but you couldn’t see what you’d drawn - only a description that said “a man and a woman in a Honda “
Asked to draw a new picture with the history of :
Draw a picture of a man in the driver seat and a woman in the passenger seat.
(Picture of a man and a woman in a car)
No, the man in the drivers seat!
——
How well do you think a very intelligent model could draw the next picture? It failed the first time and the descriptions mean it has no idea what it even drew before.
Coding agents have had good success doing this. Providing the errors allows it to potentially figure out how to fix it. It's able to do more with this iterative approach than without.
Asked to draw a new picture with the history of :
Draw a picture of a man in the driver seat and a woman in the passenger seat.
(Picture of a man and a woman in a car)
No, the man in the drivers seat!
——
How well do you think a very intelligent model could draw the next picture? It failed the first time and the descriptions mean it has no idea what it even drew before.