Ask HN: Working with Twitter Decahose
3 points by srvmshr on June 10, 2022
I have been following this discussion [1], and the challenges of using the datahose came up in a few places. That leads me to earnestly ask:

We are a brand-new research group with just a few hands. This Twitter data is enormous (45-50 GB/day for East Asia, in JSON). We have limited experience, so for now we are saving it as daily logs in flat JSON files.
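
To make "daily logs in flat JSON files" concrete, here is a minimal sketch of what such logging could look like. It is an illustration only, not our actual pipeline: the function names, the directory layout, and the `timestamp_ms` field name are all assumptions. It writes one JSON object per line into a gzip-compressed file per UTC day, and searching is a plain linear scan over a day's file.

```python
import gzip
import json
from datetime import datetime, timezone
from pathlib import Path


def daily_log_path(root, ts):
    """Return root/YYYY/MM/DD.jsonl.gz for a timezone-aware datetime (hypothetical layout)."""
    day = ts.astimezone(timezone.utc)
    return Path(root) / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}.jsonl.gz"


def append_records(root, records, ts_field="timestamp_ms"):
    """Append each record as one JSON line to the gzip file for its UTC day.

    ts_field is assumed to hold epoch milliseconds; adjust to the real payload.
    """
    handles = {}
    try:
        for rec in records:
            ts = datetime.fromtimestamp(int(rec[ts_field]) / 1000, tz=timezone.utc)
            path = daily_log_path(root, ts)
            if path not in handles:
                path.parent.mkdir(parents=True, exist_ok=True)
                # Append mode adds a new gzip member; readers see one continuous stream.
                handles[path] = gzip.open(path, "at", encoding="utf-8")
            handles[path].write(json.dumps(rec, ensure_ascii=False) + "\n")
    finally:
        for fh in handles.values():
            fh.close()
    return sorted(handles)


def scan(path, predicate):
    """Yield records from one daily file that satisfy predicate (a linear scan, no index)."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            rec = json.loads(line)
            if predicate(rec):
                yield rec
```

For example, `list(scan(path, lambda r: "keyword" in r.get("text", "")))` would read an entire day's file to answer one query, which is exactly the searching problem we are running into.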

For people using the Decahose: what kind of system architecture have you put in place for storing and searching such data? We explored AWS DynamoDB and a MongoDB data lake, but the cost seemed far too high. Any feedback or suggestions would be appreciated.

[1] Twitter plans to comply with Musk’s demands for data: https://news.ycombinator.com/item?id=31686055



