
I know in PHP it's possible to load an HTML document and parse the DOM tree using XPath expressions. Presumably that capability exists in Python too.
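For what it's worth, here is a minimal sketch of the Python side using only the standard library. It assumes the markup is well formed enough to parse as XML, and `xml.etree.ElementTree` only supports a limited subset of XPath. The sample page and the `extract_text` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from any real site).
PAGE = (
    "<html><body><table><tr>"
    "<td id='username'>alice</td><td id='karma'>42</td>"
    "</tr></table></body></html>"
)


def extract_text(markup, xpath):
    """Return the stripped text of the first node matching xpath, or None."""
    root = ET.fromstring(markup)  # raises ET.ParseError on malformed markup
    node = root.find(xpath)       # ElementTree's limited XPath subset
    if node is None:
        return None
    return (node.text or "").strip()


username = extract_text(PAGE, ".//td[@id='username']")
print(username)
```

Real-world HTML is rarely valid XML, so in practice you would probably want a more forgiving third-party parser; this just shows the XPath lookup itself.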

So I guess in theory you could write a frontend (a Firefox extension?) where you highlight / select a screen area (webdeveloper already does this), then pass its DOM information (e.g. #body table tr td#username) to your backend, which would then scrape that field (or fields) from any applicable site pages.

This of course assumes that the website(s) are 1) well formed enough for your parser and 2) well programmed enough that the same info is in the same place in the DOM tree, and preferably ID'd. Those are pretty HUGE assumptions, but they could be worked around if you were determined enough.
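To make those two assumptions concrete, here is a rough sketch of what the backend might do, again with the standard library only. The page snippets, the `scrape_field` helper, and both XPath strings are hypothetical; the point is just that a positional path breaks when the layout changes, an ID-based lookup survives it, and a malformed page defeats a strict parser either way.

```python
import xml.etree.ElementTree as ET


def scrape_field(pages, xpath):
    """Apply one XPath to every page; map page name to text, or None on failure."""
    results = {}
    for name, markup in pages.items():
        try:
            root = ET.fromstring(markup)
        except ET.ParseError:
            # Assumption 1 fails: markup is not well formed enough to parse.
            results[name] = None
            continue
        node = root.find(xpath)
        # Assumption 2 fails when the node is not where the path expects it.
        results[name] = None if node is None else (node.text or "").strip()
    return results


# Hypothetical pages: same field, different layouts, one with broken markup.
PAGES = {
    "user1": "<html><body><table><tr><td id='username'>alice</td></tr></table></body></html>",
    "user2": "<html><body><div><span id='username'>bob</span></div></body></html>",
    "broken": "<html><body><td id='username'>carol</body></html>",
}

by_position = scrape_field(PAGES, "./body/table/tr/td")
by_id = scrape_field(PAGES, ".//*[@id='username']")
print(by_position)
print(by_id)
```

One possible workaround for the positional case is to try the ID-based path first and fall back to the positional one only when it returns nothing.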

Not sure if this is what you're looking for, and it seems a bit circuitous, but it's a plausible idea anyway.


