
"How would this impact people who rely on screen readers" was exactly my first thought. Unfortunately, it seems there is no middle-ground. Screen-reader-friendly means computer-friendly.


Worse: scrapers that care enough will probably just take a screenshot with a headless browser and OCR it.


When building a mini corporate filings digest generator, I very quickly switched to running Tesseract instead of reading the selection layer in the PDF.

Unfortunately it is the most reliable way to get readable text out...

It also guards against prompt injection via white text, eh?


Or they'll just strip those Unicode characters out of the text. Automation is trivial.
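
To give a rough idea of how little work that is, here is a minimal sketch. It assumes the characters in question are invisible "format" characters (Unicode general category Cf, e.g. zero-width spaces and joiners); the thread doesn't say exactly which ones, so treat that as a guess. `strip_invisible` is a made-up name for illustration.

```python
import unicodedata


def strip_invisible(text: str) -> str:
    """Drop every character whose Unicode general category is Cf (format).

    Assumption: the obfuscation relies on invisible format characters such as
    U+200B (zero-width space), U+200D (zero-width joiner) or U+FEFF.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


if __name__ == "__main__":
    obfuscated = "he\u200bllo\u200d wor\ufeffld"
    print(strip_invisible(obfuscated))
```

Note that this is a blunt filter: it would also remove joiners that are legitimately needed in emoji sequences and in some scripts, so a real scraper might use a narrower list.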



