hawaiianSpork's comments

How well do succinct data structures work with vector operations on the CPU?

I'm not sure whether they count as succinct, but the Apache Arrow format encodes data in several ways that are compact in memory while still allowing operations directly on the encoded structures.
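To make that concrete, here is a minimal sketch of dictionary encoding, one of the layouts Arrow's columnar format defines. This is plain Python that illustrates the idea only; it is not Arrow's API, and the names `DictEncoded`, `dict_encode`, and `count_equal` are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class DictEncoded:
    """Hypothetical stand-in for a dictionary-encoded column."""
    dictionary: list  # unique values, in first-seen order
    indices: list     # one small integer per row, pointing into dictionary


def dict_encode(values):
    """Replace each value with an index into a table of unique values."""
    positions = {}
    indices = []
    for v in values:
        if v not in positions:
            positions[v] = len(positions)
        indices.append(positions[v])
    # dicts preserve insertion order, so the keys line up with the indices
    return DictEncoded(list(positions), indices)


def count_equal(col, target):
    """Count matching rows without decoding the column.

    The target is looked up in the dictionary once; the scan itself
    only compares integers.
    """
    try:
        code = col.dictionary.index(target)
    except ValueError:
        return 0  # value never occurs in the column
    return sum(1 for i in col.indices if i == code)


col = dict_encode(["nyc", "sfo", "nyc", "chi", "nyc"])
print(col.dictionary, col.indices, count_equal(col, "nyc"))
```

The point of the sketch is that the comparison touches the repeated strings once (in the dictionary) and then works on a dense integer array, which is the kind of data that vectorizes well.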


Parquet has been the lakehouse file format of choice for nearly half a decade. But we are starting to see other contenders that are optimized for lower latency, such as Lance: https://github.com/lancedb/lance


5 years is not a super long time. It just can feel that way sometimes.


I looked for the York Abstract Machine used in the paper and couldn't find anything about it outside of this paper: https://www.cs.york.ac.uk/plasma/publications/pdf/ManningPlu...

It would be nice to play with the code.


I wonder if exposing this as a language server would be helpful?


Yes, I've been considering that since the beginning. The website (backend in C++, frontend in Svelte) takes priority, because the current solution is good enough so far, and I'd really like to have access to my ZK on my phone. It probably won't be a website meant for the open internet: I have a server at home and use a WireGuard VPN, so my phone can connect to local services/sites at all times.


If you are looking to do data validation from the JVM, you could try Baleen (written in Kotlin): https://github.com/ShopRunner/baleen/

I'm one of the contributors. We created a DSL in the language to describe the data and create tests. You can then use that data description to validate against JSON, CSV, Avro... One of the neat things we came up with was the concept of a data trace, which is like a stack trace, except that it is a path through the data to a particular error.
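A rough sketch of the data trace idea, in Python rather than Kotlin and not using Baleen's actual DSL or API: the `validate` and `format_trace` functions and the dict/list/type spec convention are all assumptions made up for this illustration. Each error carries the path that was walked through the data to reach it.

```python
def validate(value, spec, trace=("$",)):
    """Return a list of (trace, message) pairs.

    Hypothetical spec convention: a dict describes an object, a
    one-element list describes a list of that element, and a type
    (int, str, ...) describes a leaf value.
    """
    errors = []
    if isinstance(spec, dict):
        if not isinstance(value, dict):
            return [(trace, "expected an object")]
        for key, sub_spec in spec.items():
            if key not in value:
                errors.append((trace + (key,), "missing field"))
            else:
                errors.extend(validate(value[key], sub_spec, trace + (key,)))
    elif isinstance(spec, list):
        if not isinstance(value, list):
            return [(trace, "expected a list")]
        for i, item in enumerate(value):
            errors.extend(validate(item, spec[0], trace + (f"[{i}]",)))
    elif not isinstance(value, spec):
        errors.append(
            (trace, f"expected {spec.__name__}, got {type(value).__name__}")
        )
    return errors


def format_trace(trace):
    """Render the path the way a stack trace renders frames."""
    return " / ".join(trace)


spec = {"order": {"id": int, "items": [{"sku": str, "qty": int}]}}
data = {"order": {"id": 7, "items": [{"sku": "A1", "qty": 2},
                                     {"sku": "B2", "qty": "three"}]}}

for trace, message in validate(data, spec):
    print(f"{format_trace(trace)}: {message}")
```

Carrying the trace as an immutable tuple means each recursive call extends its own copy, so sibling branches never see each other's path segments.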

