The API for virtual tables is tricky, because you never consume the API in Rust yourself: SQLite does. However, as it is presented, I think the interface is unsafe.
In the sqlite-xsv VTab::open impl for XsvTable [0], you pass a &'vtab XsvTable to XsvCursor. But that means that if open is called again, to create a second cursor, a `&mut XsvTable` and a `&XsvTable` reference exist at the same time, which is undefined behavior in Rust. Note that sqlite-xsv doesn't actually keep the `&XsvTable`, so it's fine in practice, but it serves as an example of where a safety problem could surface.
The same problem also applies to VTabWriteable::update. The fix is to make both of these methods take an immutable reference and force implementors to use interior mutability. Note this isn't hypothetical, it's actually unsafe, and it would surface if you had a unit test that implemented a writable virtual table backed by a Rust data structure and attempted to iterate and update at the same time.
I have an as-yet-unpublished code base that experiments with this. It's not published yet because it doesn't cover everything I want it to, and I'm trying to avoid publishing something whose API might have to break. My goals are explicitly safety first, performance second. https://github.com/CGamesPlay/sqlite3_ext
Please consider a GIS/spatial extension. SpatiaLite is pretty bad; for example, the KNN feature is deprecated and its replacement has not yet made it into a release.
Where can I get a good understanding of the limitations of SQLite for production use cases?
I’ve used it frequently for prototyping, but the dialect is different enough from Postgres that I usually switch over pretty early. But for production, people seem to be seriously advancing it as a solution for applications, which doesn’t fit with the “When To Use” guide on the website. https://www.sqlite.org/whentouse.html
The idea that SQLite shouldn't be used for web applications is about a decade out-of-date at this point.
But... actual guidance as to how to use it there is still pretty thin on the ground!
Short version: use WAL mode (which is not the default). Only send writes from a single process (maybe via a queue). Run your own load tests before you go live. Don't use it if you're going to want to horizontally scale to handle more than 1,000 requests/second or so (though vertically scaling will probably work really well).
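As a sketch of that checklist in code - assuming Python's stdlib sqlite3, with the path, table name, and timeout value purely illustrative - the connection setup might look like:

```python
import os
import sqlite3
import tempfile

def open_db(path: str) -> sqlite3.Connection:
    # isolation_level=None -> autocommit; start transactions explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    # WAL lets readers proceed alongside the single writer; the journal
    # mode persists in the database file once it has been set.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    assert mode == "wal"
    # Wait up to 2s for a competing write lock instead of erroring at once.
    conn.execute("PRAGMA busy_timeout=2000")
    return conn

db_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = open_db(db_path)
conn.execute("CREATE TABLE IF NOT EXISTS hits (path TEXT, ts REAL)")
```

WAL requires an on-disk database (an in-memory one reports journal mode "memory"), which is why the sketch uses a temp file.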
I'd love to see more useful written material about this. I hope to provide more myself at some point.
Depending on your use case, using multiple writer processes can be fine, e.g. multiple writer processes storing periodic metrics (once per minute) into a single SQLite database. The occasional "database is locked" error (it does happen a few times a day) is handled by simply returning and trying again at the next interval. That's fine for this particular use case - the metrics for the last interval are just delayed a bit.
One thing that confused me initially with sqlite - if you want it to wait for write locks to be free instead of erroring on write contention, you have to start write transactions with BEGIN IMMEDIATE. If you instead start with a read transaction that gets upgraded to a write partway through, SQLite can't really wait for the lock, because you can't keep a consistent snapshot of an old version of the DB while also writing on top of the new one.
Yes, we use BEGIN IMMEDIATE with a 2 second busy timeout. That has a worst-case wait time of 2 seconds, but the 95th percentile wait time is around 100 to 1000 microseconds and the 99th percentile is around 1 to 10 milliseconds. For a high-throughput and/or low-latency service, a worst-case wait time of 2 seconds might be unacceptable, but it's fine in our case.
> The idea that SQLite shouldn't be used for web applications is about a decade out-of-date at this point
I think it's still very relevant. I mean, you have service XYZ replicated two times behind a load balancer - how do you use SQLite out of the box? You can't; you have to use one of those services that proxies/queues queries, so why even try to use SQLite?
The fact that it's just a file on disk makes it a bad solution for a lot of simple use cases.
If you're worried about high availability should your instance fail (which is something that's worth worrying about, though many smaller web apps will likely work perfectly fine on a single instance with good scheduled backups) you should absolutely investigate https://litestream.io - it's the missing piece of the SQLite-for-web-apps story.
You could partition your data into two parts, and direct your users to whichever service serves their partition of the data. Every database turns out to be just a set of files on disk, but to use SQLite you have to think differently than with most databases, due to the extreme locality of the data.
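That routing can be as simple as hashing a stable user ID to pick a partition; a hypothetical sketch (shard count, paths, and schema are all made up):

```python
import hashlib
import os
import sqlite3
import tempfile

SHARDS = 2  # two partitions, one per service instance
base_dir = tempfile.mkdtemp()

def shard_path(user_id: str) -> str:
    # A stable hash means a given user always lands on the same partition.
    n = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % SHARDS
    return os.path.join(base_dir, f"shard-{n}.db")

def connect_for(user_id: str) -> sqlite3.Connection:
    conn = sqlite3.connect(shard_path(user_id))
    conn.execute("CREATE TABLE IF NOT EXISTS events (user_id TEXT, payload TEXT)")
    return conn

conn = connect_for("alice")
conn.execute("INSERT INTO events VALUES (?, ?)", ("alice", "login"))
conn.commit()
```

The load balancer (or the services themselves) would apply the same hash to route each request to the instance owning that shard file.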
Has anyone tried https://github.com/rqlite/rqlite?
How well can it handle horizontal scaling? Even with the added distributed features, it is not as good as other RDBMSs (Postgres or MySQL).
I worked on a set of extensions for mattn/go-sqlite3, written in Go, inspired by, but a little further along (in functionality and test coverage) than, SQLean. Useful if you want to see examples of this in Go, or if you're using Go rather than Rust to embed SQLite.
Just in time! I've been planning a SQLite extension written in Rust for Marmot (https://github.com/maxpert/marmot) for evaluation purposes. I didn't want to touch C/C++ because I feel more comfortable with Rust tooling. Thank you for doing this!
I'm quietly confident that you could compile sqlite with a rust extension into a single .wasm file.
The first thing I'd try is to use crate-type = ["staticlib"] to compile the sqlite extension to a static .a file (cargo build --target wasm32-unknown-unknown), and compile sqlite to a staticlib separately then link them together into a .wasm file and run wasm-opt & friends on that.
There might be some configuration needed to let the extension know that the sqlite functions it's calling will be linked in later, but that should all be possible. Maybe with some whack-a-mole of wading through compiler versions and build flags and whatnot.
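For reference, the crate side of that first step might look like the following Cargo.toml fragment (a sketch; everything beyond the crate-type setting mentioned above is an assumption):

```toml
[lib]
# Emit a static .a archive instead of a cdylib, so it can be linked
# together with a separately compiled sqlite3 object file.
crate-type = ["staticlib"]

# Then: cargo build --release --target wasm32-unknown-unknown
```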
The workflow is basically prototype your virtual tables in python and then write some optimized code in a lower level language later. Just an idea for those interested.
I've never written an extension for Postgres, but for the ones I've installed, it's amounted to copying some files and running "CREATE EXTENSION". This feels pretty trivial to me already.
Direct GitHub link: https://github.com/asg017/sqlite-loadable-rs
A soon-to-be-released sqlite-loadable extension that's the fastest CSV parser for SQLite, and rivals DuckDB's parser on non-analytical queries: https://github.com/asg017/sqlite-xsv
A regular expression extension in Rust, the fastest SQLite regex implementation: https://github.com/asg017/sqlite-regex
And in the near future, expect more extensions for Postgres, Parquet, XML, S3, image manipulation, and more!