
Congrats on the launch! Does it search within pdf files?


Thanks! Yes, it does do PDFs. We don't do anything fancy with them though, like Optical Character Recognition (OCR), so pictures of text, as well as images and graphs, will be lost. This is something we will work on though.

Is this something that you would find a lot of value in or is simple text processing of PDFs sufficient?


Not OP, but I would definitely find a lot of value in processing PDFs in such a way that it could, e.g., understand tables and images. I work in mining, and having it digest a 43-101 technical report with images and tables would be supremely valuable.

I know that might be a niche case tho.

Absolutely incredible work you’re doing tho wow, I’m very impressed by what you’re doing and the way you’re doing it. Even if you stopped now this is a masterpiece, so while yes I would definitely find a lot of value from being able to process images and graphs/tables, simply being able to process the text and cite it is already a superpower. Thank you for your amazing work!!!


I'd benefit from OCR too. Not just PDFs, but OCR on images could be super useful too.

For a personal use case, I'm thinking things like receipts. For work, I'm thinking OCR on architecture diagrams/etc.



