Screen readers doing OCR is a big no-no; there is no way that approach will have the effect you want. Text-to-speech is the absolute bare minimum of what a screen reader actually does. For example, OCR would break horribly in the situation you describe, where some piece of text gets clipped.
Just IMO, it's unrealistic to expect an accessible GUI that stores no state, does none of that kind of bookkeeping, has no OOP model, and only outputs pixels. The point of these types of assistive technologies is that the user can't work with that kind of visual data; they need more state presented to them. Sure, everything becomes easier to develop when you cut all of that out and only use immediate mode, but that's exactly the problem: everything else has been cut out, on purpose.
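To make that concrete, here's a rough sketch of the kind of retained bookkeeping a screen reader depends on. It's in Go since Gio is a Go library, and every name here is hypothetical, not taken from Gio or any real platform accessibility API:

    package sketch

    // Role tells the screen reader what kind of thing a node is,
    // so it can announce "button" or "text field" and offer the
    // right interactions.
    type Role int

    const (
        RoleButton Role = iota
        RoleTextField
        RoleScrollArea
    )

    // Node is one element of an accessibility tree. Crucially, it
    // persists between frames with a stable identity, so the reader
    // can announce changes ("focus moved to Send") instead of
    // re-describing pixels every frame.
    type Node struct {
        ID       uint64 // stable across frames
        Role     Role
        Label    string // spoken name, e.g. "Send"
        Value    string // current state, e.g. a text field's contents
        Focused  bool
        Children []*Node
    }

None of that state is recoverable from the framebuffer, which is exactly why OCR can't substitute for it.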
> For example, OCR would break horribly in the situation you describe, where some piece of text gets clipped.
Forgive my ignorance, but what should happen here? Let's say I have a scrollable text pane with an entire novel in it.
I agree there should still be a mechanism to pass more accessibility data than OCR could extract. What is the bare minimum information that is required? Pretend there is no UX model -- arbitrary things could be presented (like in a video game).
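To make the question concrete, here's the kind of bare-minimum record I could imagine an immediate-mode frame function emitting alongside its draw ops. This is pure speculation on my part; all of the names are hypothetical:

    package sketch

    import "image"

    // SemanticOp is a guess at the minimum a frame would have to
    // emit per element: what it is, what to call it, where it is,
    // and what can be done to it.
    type SemanticOp struct {
        Role    string          // "button", "slider", "document", ...
        Label   string          // spoken name to announce
        Value   string          // current state; for text, the whole
                                // document, not just the visible part
        Bounds  image.Rectangle // where it is on screen
        Actions []string        // "activate", "scroll", ...
    }

    // frame shows the idea: draw as usual, but also emit semantics.
    func frame(ops *[]SemanticOp) {
        // ... normal immediate-mode drawing would happen here ...
        *ops = append(*ops, SemanticOp{
            Role:    "document",
            Label:   "Novel",
            Value:   "Call me Ishmael. ...", // the entire text
            Bounds:  image.Rect(0, 0, 800, 600),
            Actions: []string{"scroll"},
        })
    }

If Value carries the whole text rather than just the visible lines, that would presumably also answer my own clipping question: the reader speaks from the model, not the viewport.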
In particular, with Gio there isn't necessarily a single set of widgets or UX. Gio has a small set of Material Design-compliant widgets, but is mostly a library for composing immediate mode graphics. For example, I have several custom widgets: some that only interact with the keyboard, some that don't have any text at all (just graphics or animations), and one akin to a 2D-scrollable map, click-to-drag, Google Maps style. I'm not really sure where I'd begin in making these accessible. How should something like Google Earth be made accessible, ideally?
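My best guess for the map case is that you'd stop exposing pixels entirely and expose position, zoom, and discrete actions instead, something like this (again, entirely hypothetical names):

    package sketch

    import "fmt"

    // MapWidget is a text-free, 2D-scrollable widget. Its accessible
    // surface is a description plus discrete actions, not its
    // rendered pixels.
    type MapWidget struct {
        CenterX, CenterY float64 // world coordinates of the view
        Zoom             float64
    }

    // Describe returns what a screen reader announces for the current
    // view; a real map would name landmarks under a virtual cursor
    // rather than raw coordinates.
    func (m *MapWidget) Describe() string {
        return fmt.Sprintf("map centered at %.1f, %.1f, zoom %.1f",
            m.CenterX, m.CenterY, m.Zoom)
    }

    // Pan moves the view in discrete steps so a keyboard or
    // switch-access user can explore without a pointer, and returns
    // the new description to announce.
    func (m *MapWidget) Pan(dx, dy float64) string {
        m.CenterX += dx
        m.CenterY += dy
        return m.Describe()
    }

That is, the accessible interface becomes "what is here and what can I do," independent of how the widget happens to be drawn.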
Have you heard of the Screen Recognition feature in iOS 14? I haven't had a chance to use it myself yet. From what I've heard, it's impressive, but not a complete solution, at least not yet. Apple published a paper about it here:
That's quite interesting, thank you. I hadn't heard of that. My only concern is that some app developers might see it as an excuse to avoid using the native accessibility APIs and adding the necessary properties to their controls.