
Nice, how does the WebScraper tool work? Will it work on, e.g., a React site?

Would be a nice use-case to plop a website into a chatbot easily, for example.



The WebScraper tool uses Trafilatura [1] to scrape and parse HTML, nothing too fancy. "Scraping" a client-rendered React site would require a totally different approach, since the content only exists after the JavaScript runs; probably something more akin to Adept's ACT-1 [2].
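To make the React point concrete, here is a rough sketch using only Python's standard library. It is a stand-in for any static HTML extractor, not Trafilatura's actual algorithm, and the two sample pages and the `visible_text` helper are made up for illustration. The idea is that a client-rendered page ships an empty shell, so there is nothing to extract from the raw HTML.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Hypothetical helper: return the visible text of a static HTML string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up examples: a server-rendered page and a typical React shell.
STATIC_PAGE = (
    "<html><body><article><h1>Pricing</h1>"
    "<p>Plans start at $5.</p></article></body></html>"
)
REACT_SHELL = (
    '<html><body><div id="root"></div>'
    '<script src="/static/js/main.js"></script></body></html>'
)

print(repr(visible_text(STATIC_PAGE)))
print(repr(visible_text(REACT_SHELL)))
```

The first call finds the article text; the second finds nothing, because the content would only appear after a browser runs the script.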

I run a local chat app built with Griptape and I use it to give me summaries of web pages or answer specific questions all the time :)

1. https://github.com/adbar/trafilatura/

2. https://www.adept.ai/blog/act-1


Do you see an advantage with Trafilatura compared to, say, BeautifulSoup and other packages for scraping?


I think BeautifulSoup is great, but it's more of a set of building blocks that requires developers to implement their own scraping logic. Trafilatura is awesome because it just works out of the box for most common web scraping tasks.
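A small side-by-side sketch of that difference, assuming both `beautifulsoup4` and `trafilatura` are installed. The function names are hypothetical, and the list of tags to strip in the BeautifulSoup version is just one guess at what "custom logic" might look like for a given site.

```python
def paragraphs_with_bs4(html: str) -> str:
    """Building-block style: you decide what counts as content."""
    # Imported here so the sketch still loads if bs4 is not installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    # Hand-written cleanup rules; these usually need tuning per site.
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    return "\n".join(p.get_text(" ", strip=True) for p in soup.find_all("p"))


def main_text_with_trafilatura(html: str):
    """Out-of-the-box style: one call, the library picks the main content."""
    # Imported here so the sketch still loads if trafilatura is not installed.
    import trafilatura

    # Returns the extracted text, or None if nothing usable was found.
    return trafilatura.extract(html)
```

For a live page, `trafilatura.fetch_url(url)` can supply the HTML string for either function.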


Ah that’s nice to know, thank you.



