Hacker News | kernelsanderz's comments


I’ve been excited about lancedb and its support for vector indexes and efficient row-level lookups. I wonder if this approach would work for their design goals while still allowing broader backwards compatibility with the parquet ecosystem. I’ve also been intrigued by Ducklake, which has leaned into parquet. Perhaps this approach will allow more flexible indexing while still supporting the broader parquet ecosystem, which is significant.


Marimo is really special and solves most of the problems that you have with Jupyter. For the Marimo-curious, I strongly recommend checking out their YouTube channel. So much effort has gone into making these videos really great. https://youtube.com/@marimo-team?si=ZGaf8Zgq5WN3LKRg


I’ll read this tomorrow


For another library that has great performance, plus features like full-text indexing and the ability to version changes, I’d recommend lancedb: https://lancedb.github.io/lancedb/

Yes, it’s a vector database and has more complexity. But you can use it without creating indexes, and it also has excellent zero-copy Arrow support for polars and pandas.


Since a lot of ML data is stored as parquet, I found this to be a useful tidbit from lancedb's documentation:

> Data storage is columnar and is interoperable with other columnar formats (such as Parquet) via Arrow

https://lancedb.github.io/lancedb/concepts/data_management/

Edit: That said, I am personally a fan of parquet, arrow, and ibis. There are so many data wrangling options out there that it's easy to get analysis paralysis.


Lance is made for this stuff; parquet is not.


How well does it scale?


Also worth checking out https://github.com/jasonwhite/rudolfs

Been using it to store datasets via lfs. Written in rust and has been very reliable.


I’ve been using https://github.com/jasonwhite/rudolfs - which is written in rust. It’s high performance but doesn’t have all the features (auth) that you might need.




Some heroes don’t wear capes. They wield scripts, API calls, and a bit of luck.


100%

