Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It would be great to have it as a backup, but it will always be the heaviest in computation and responsiveness solution so it should be the last one used.


Have you played around with the current vision features? I am pretty sure even gpt-4.1 can give you pretty good descriptions of e.g. screen captures, including being able to "read" and reproduce text.


yes, there are multiple addons giving screen readers the ability to prompt ai-s for image recognition. they work rather well, btw, though the value is often situational. agentic behavior might help further, though it will need some polishing.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: