
wenc on Jan 25, 2018, on: It’s About Time for Time Series Databases

That's a good pattern for straight data retrieval.

Unfortunately, if you need to run aggregate queries across all of the SQLite tables, things may be challenging.

But if you could somehow connect Spark to a folder (on a distributed FS) of these SQLite files...
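For a small number of files, one way to get an aggregate across them without Spark is SQLite's own `ATTACH DATABASE`. This is only a sketch: the `readings` table, its `ts` and `value` columns, and the helper names are made up for illustration, and SQLite caps the number of attached databases (10 by default), so it would not scale to a large folder of files.

```python
import sqlite3


def make_shard(path, rows):
    """Create one SQLite file with a hypothetical `readings` table of (ts, value) rows."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE readings (ts INTEGER, value REAL)")
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
        conn.commit()
    finally:
        conn.close()


def aggregate_attached(paths):
    """Attach every file to one connection and aggregate over a UNION ALL of them."""
    conn = sqlite3.connect(":memory:")
    try:
        selects = []
        for i, path in enumerate(paths):
            # The schema alias (shard0, shard1, ...) is generated here, not user input.
            conn.execute(f"ATTACH DATABASE ? AS shard{i}", (path,))
            selects.append(f"SELECT value FROM shard{i}.readings")
        sql = (
            "SELECT COUNT(*), SUM(value) FROM ("
            + " UNION ALL ".join(selects)
            + ")"
        )
        return conn.execute(sql).fetchone()
    finally:
        conn.close()
```

Calling `aggregate_attached(["a.db", "b.db"])` returns a single `(count, sum)` row computed over both files.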

Edit: Also, SQLite has a limitation that only one process can write to a database at a given time. For this particular use case, though, it shouldn't be a problem unless you have rewrites coming from various sources (which can happen when correcting data).
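The single-writer limitation is easy to see with two connections to the same file. A rough sketch using Python's built-in `sqlite3` module (the `readings` table and the function name are hypothetical): the first connection takes the write lock with `BEGIN IMMEDIATE`, and the second, opened with `timeout=0` so it does not wait, fails when it tries to do the same.

```python
import sqlite3


def second_writer_is_blocked(path):
    """Return True if a second connection cannot start a write while the first holds one."""
    # isolation_level=None means autocommit mode, so transactions are
    # controlled explicitly with BEGIN / COMMIT below.
    first = sqlite3.connect(path, isolation_level=None)
    # timeout=0 makes the second connection fail immediately instead of waiting.
    second = sqlite3.connect(path, isolation_level=None, timeout=0)
    try:
        first.execute("CREATE TABLE IF NOT EXISTS readings (ts INTEGER, value REAL)")
        first.execute("BEGIN IMMEDIATE")  # first writer takes the write lock
        first.execute("INSERT INTO readings VALUES (1, 1.0)")
        try:
            second.execute("BEGIN IMMEDIATE")  # second writer asks for the same lock
        except sqlite3.OperationalError:
            return True  # typically "database is locked"
        second.execute("ROLLBACK")
        return False
    finally:
        if first.in_transaction:
            first.execute("COMMIT")
        first.close()
        second.close()
```

With a non-zero `timeout`, the second writer would instead wait for the first to commit, which is usually fine when writes are short and come from one source.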

foota on Jan 25, 2018

I mean, it depends on the aggregation you need, imo. It shouldn't be too hard (tm) to rig up some distributed query pipeline (as long as you are OK with coding per query, instead of the convenience of SQL).
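As a rough idea of what "coding per query" could look like, here is a sketch of a global mean computed map-reduce style: each file returns a partial `(count, sum)`, and the partials are combined at the end. The `readings` table, the `value` column, and the function names are assumptions for illustration, and a thread pool on one machine stands in for a real distributed pipeline.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(path):
    """Map step: compute (count, sum) inside a single SQLite file."""
    conn = sqlite3.connect(path)  # each worker opens its own connection
    try:
        return conn.execute(
            "SELECT COUNT(*), COALESCE(SUM(value), 0.0) FROM readings"
        ).fetchone()
    finally:
        conn.close()


def global_mean(paths, max_workers=4):
    """Reduce step: combine per-file partials into one mean (None if there are no rows)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(partial_aggregate, paths))
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return total / count if count else None
```

This works for a mean because it decomposes into sums and counts. Aggregations that do not decompose this way (an exact median, for example) would need a different combine step, which is where the per-query effort comes in.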
