Hey Curt, most of my own runs, using the default of two small VMs, resulted in 3 normalized hours of usage, which worked out to around 25 cents per run.
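(A quick back-of-the-envelope check, purely illustrative: the per-normalized-hour rate below is an assumption inferred from the numbers above, not a quoted EMR price.)

    # Rough cost-per-run estimate. RATE_PER_NORMALIZED_HOUR is an
    # assumed figure inferred from "3 normalized hours ~= 25 cents";
    # it is not an official EMR price.
    NORMALIZED_HOURS_PER_RUN = 3
    RATE_PER_NORMALIZED_HOUR = 0.085  # USD, assumed

    cost_per_run = NORMALIZED_HOURS_PER_RUN * RATE_PER_NORMALIZED_HOUR
    print(f"Estimated cost per run: ${cost_per_run:.2f}")  # ~$0.26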


that's for the crawl sample, not the entire 4TB index, right?

how much data was that?


That was just for the crawl sample, yes, and it was approximately 100 MB of data, though you can specify as much as you'd prefer.

The cool thing about running this job inside Elastic MapReduce right now is that you can read the S3 data for free from within AWS; accessing it from outside costs a bit, but both are pretty reasonable. Right now you can analyze the entire dataset for around $150, and if you build a good enough algorithm you'll be able to get a lot of good information back.
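For what it's worth, assuming this is the Common Crawl corpus, the data is hosted as a public dataset on S3, so you can poke at it without any AWS credentials. A minimal sketch with boto3 — the bucket name and prefix here are assumptions about the current public layout, so check the docs before relying on them:

    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    # Anonymous S3 client: the corpus is a public dataset, so no
    # credentials are required just to list or download objects.
    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

    # Bucket and prefix are assumptions for illustration only.
    resp = s3.list_objects_v2(Bucket="commoncrawl",
                              Prefix="crawl-data/", MaxKeys=10)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"])

Reading it from EC2/EMR in the same region avoids the data-transfer charge; pulling it over the public internet is where the out-of-network cost comes in.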

We're working to index this information so you can process it even more inexpensively, so stay tuned for more updates!


How is the $150 broken down?



