
I believe I ran into this issue a few years ago and discovered the patent case when trying to work around it. The XML file format allowed arbitrary properties to be added (as XML does), and we were trying to embed metadata in Word files. But when MS Word opened a file with anything extra in it, it gave a warning like "this file has extra stuff in it" and automatically removed anything that wasn't explicitly expected.


Not sure why this is downvoted, it’s absolutely correct. I tried this myself; it would have -greatly- simplified scraping Word docs because the custom tags would have been available for XPath querying. Alas, Word strips it all on open.
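For anyone wondering what that kind of querying looks like, here is a rough sketch, not what we actually ran. It assumes the .docx-style packaging (a zip with word/document.xml inside); the `urn:example:metadata` namespace, the `invoice` tag, and both helper functions are made up for illustration. It builds a tiny in-memory package, not a file Word would accept, and pulls out the text wrapped by the custom tag using ElementTree's limited XPath support.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace used by .docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Hypothetical custom namespace, invented for this example.
META_NS = "urn:example:metadata"

# Minimal document part with one custom element wrapping a paragraph.
DOCUMENT_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<w:document xmlns:w="{W_NS}" xmlns:m="{META_NS}">
  <w:body>
    <m:invoice id="42">
      <w:p><w:r><w:t>Total due: 100</w:t></w:r></w:p>
    </m:invoice>
  </w:body>
</w:document>"""


def build_package() -> bytes:
    """Build an in-memory zip holding only word/document.xml (not a valid Word file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", DOCUMENT_XML)
    return buf.getvalue()


def find_tagged_text(package: bytes, tag: str) -> list[tuple[str, str]]:
    """Return (id attribute, concatenated text) for every custom <m:tag> element."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    ns = {"w": W_NS, "m": META_NS}
    results = []
    for el in root.findall(f".//m:{tag}", ns):
        # Collect every text run nested under the custom element.
        text = "".join(t.text or "" for t in el.findall(".//w:t", ns))
        results.append((el.get("id"), text))
    return results


if __name__ == "__main__":
    print(find_tagged_text(build_package(), "invoice"))
```

This only works as long as the custom elements survive in the file, which is exactly what Word's strip-on-open behavior prevents.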


Yep, we had a similar use-case. I remember the error message pointed to a help page which pointed to an article about this patent.



