Implementing a B+ tree was the final assignment in a grad-level Algorithms & Data Structures call that I took. I thought it was a reasonable assignment in that context. It felt like a good capstone to the class, building on what we had learned and finally crossing the line from toy examples to something that approached real-world usefulness.
It seems very out-of-place for a databases course. Yes, B-Trees are important for a database, but I don't see how implementing one teaches you about how a database uses one or why it's important.
As another data point, I implemented a B+ tree with rebalancing deletion functionality back in CS3090 at UVU circa 2008 from Dr. Curtis Wellborn.
It was an undergraduate course, but the professor emphasized that everywhere else it would be a graduate-level of course.
My B+ tree implementation served as the foundation for my own (shitty) DBMS ;D
As a funny sidenote, the professor claimed that even Stanford DB courses had a buggy / broken algorithm for balanced deletes. While preparing the course, he'd spent a chunk of the summer doing researching it, and had developed his alogrithm in Prolog. He shared the concepts and our implementations worked fine. However, it was ultimately mostly an academic exercise, as balanced deletions come with significant tradeoffs which make it not as useful as I initially thought.
Even full-fledged database systems don't typically implement B+ tree indexes with deletion+rebalancing support (the usual technique is marking nodes deleted and cleaning them up when the index is rebuilt).
> Even full-fledged database systems don't typically implement B+ tree indexes with deletion+rebalancing support (the usual technique is marking nodes deleted and cleaning them up when the index is rebuilt).
It's intuitively clear that worst-case performance [1] can degrade to a function of the number of insertions since a rebuild (as opposed to the number of currently live items) when you use relaxed deletion. If you can afford to globally rebuild the tree you get to play the usual amortization tricks, but global rebuilds can be problematic in a large-scale live database; one solution is to rebuild from a snapshot in the background (on another thread or machine) and then replay late arriving operations before switching over to the rebuilt tree. But if someone is okay with the amortization spikes from rehashing a hash table, they should be okay with non-incremental global rebuilds for search trees; a hash table where deletions leave behind tombstones and the table is rehashed when the ratio of tombstones gets too high is basically the same idea.
It seems very out-of-place for a databases course. Yes, B-Trees are important for a database, but I don't see how implementing one teaches you about how a database uses one or why it's important.