It would be great to have it as a backup, but it will always be the heaviest in ...

fho · 2025-08-12T04:40:14 1754973614

Have you played around with the current vision features? I am pretty sure even GPT-4.1 can give you fairly good descriptions of screen captures, for example, and can "read" and reproduce the text in them.

gostsamo · 2025-08-12T16:27:21 1755016041

Yes, there are multiple add-ons that let screen readers prompt AI models for image recognition. They work rather well, btw, though the value is often situational. Agentic behavior might help further, though it will need some polishing.