
Not even close.

The real problems you will have to solve when working with a database anyone would want to use are, for example:

* Memory layout of the buffers used to write / read / cache this data. In Python, you don't even have a concept of memory alignment / layout -- it's all happening somewhere in the interpreter.

* How to best service multiple requests concurrently. Again, Python offers nothing here, and nothing to experiment with. But this is a huge part of working with databases. Two-phase commit, transactions, consistency guarantees -- these are the central point of databases, but Python gives you no tools to even try anything like that.

* Working with various underlying storage... most of it has no Python interface (but does have a C interface). E.g., if you want to understand how to optimize storing data by using some particular filesystem -- those filesystems often expose interfaces that go beyond VFS, but they won't be immediately available to Python.

* Working with vectorization of queries -- again, Python doesn't have a concept of ISA, doesn't have any way to instruct the code to utilize any particular CPU instructions... but this is where a lot of work is done by people who work on real databases. Without meaningful access to how queries are executed, query optimization and planning become pointless -- what are your concerns going to be when you write a query planner if you still have no idea how the plan is going to be executed?

* Similar stuff goes for networking -- whenever you need to solve something that goes beyond the absolutely trivial, you will at best rely on Python bindings to some library that actually does the work, rather than on Python code proper. In other words, if your goal was to learn how to do it, you will not achieve that goal, because the actual work will happen elsewhere.

So... I don't know... there is no way you can really learn how to make databases in Python. You can probably learn something, depending on what your baseline is, but you will not be ready to make a real thing if all you have is Python. It's the difference between playing with a toy doctor set and learning to be a surgeon...




My learning style is piecemeal: I do a bit of this, a bit of that, with the long-term goal of combining everything I did into one project. It's a proof-of-concept or prototype approach.

The experts are doing databases in C, C++ and Rust. But I'm not a C, C++ or Rust expert, so I need to start from where I am today.

I start with small, achievable goals to get the idea of the problem and its solution, so I'm not distracted by boilerplate in C, C++ or Rust. My multiversion concurrency control is in Java.

You might think all these things are easy, but they're not easy for everyone. We have to start somewhere, and one way to start is to write the parser in a simple language so you're not wrestling with memory management.
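To give a rough idea of what that kind of first step can look like, here is a minimal sketch of a parser for a tiny SELECT statement. The toy grammar and the names `tokenize` and `parse_select` are made up for illustration; they are not taken from the post being discussed, and a real SQL parser handles far more than this.

```python
import re

# Hypothetical toy grammar (an assumption for illustration):
#   SELECT * | col[, col...] FROM table [WHERE col = literal]
TOKEN_RE = re.compile(r"\s*(?:(\d+)|'([^']*)'|([A-Za-z_][A-Za-z_0-9]*)|([,=*]))")
KEYWORDS = ("SELECT", "FROM", "WHERE")


def tokenize(sql):
    """Split a query string into (kind, value) tuples."""
    tokens, pos = [], 0
    sql = sql.strip()
    while pos < len(sql):
        m = TOKEN_RE.match(sql, pos)
        if not m:
            raise SyntaxError(f"cannot tokenize input at offset {pos}")
        number, string, word, punct = m.groups()
        if number is not None:
            tokens.append(("number", int(number)))
        elif string is not None:
            tokens.append(("string", string))
        elif word is not None:
            upper = word.upper()
            kind = "keyword" if upper in KEYWORDS else "name"
            tokens.append((kind, upper if kind == "keyword" else word))
        else:
            tokens.append(("punct", punct))
        pos = m.end()
    return tokens


def parse_select(sql):
    """Parse the toy SELECT grammar into a plain dict."""
    tokens = tokenize(sql)
    pos = 0

    def expect(kind, value=None):
        # Consume one token of the given kind (and value, if given).
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of query")
        tok = tokens[pos]
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1]!r}")
        pos += 1
        return tok[1]

    expect("keyword", "SELECT")
    if pos < len(tokens) and tokens[pos] == ("punct", "*"):
        pos += 1
        columns = ["*"]
    else:
        columns = [expect("name")]
        while pos < len(tokens) and tokens[pos] == ("punct", ","):
            pos += 1
            columns.append(expect("name"))

    expect("keyword", "FROM")
    table = expect("name")

    where = None
    if pos < len(tokens):
        expect("keyword", "WHERE")
        column = expect("name")
        expect("punct", "=")
        if pos >= len(tokens) or tokens[pos][0] not in ("number", "string"):
            raise SyntaxError("expected a literal after '='")
        where = (column, tokens[pos][1])
        pos += 1

    if pos != len(tokens):
        raise SyntaxError(f"unexpected trailing token {tokens[pos][1]!r}")
    return {"columns": columns, "table": table, "where": where}
```

Calling `parse_select("SELECT id, name FROM users WHERE id = 42")` gives back a dict with the column list, the table name and the WHERE condition, which is enough to start experimenting with how a query might be executed.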

If I tried to do all the things you mentioned in C++, Rust or C it would be too much work in one step. I need to start small to have an achievable result.

Not everyone is Stonebraker or Linus Torvalds.


I guess, to get back to the post: he did it for learning, not to build an exact duplicate of, or a competitor to, other RDBMSs.

Often when learning, you do not implement every difficult edge case or the most complex parts. You are just trying to get the gist of it. You want a smaller problem to solve.

Maybe, for a learning project, it isn't important to have concurrency, networking or memory management. If any of those things happens to be something you also want to learn about, then add it back in.

I don't think this is trying to be an argument that Python is a good language to build an RDB in. (Of course it isn't, for all the reasons you list.)

Python just happens to be an easy language for beginners, so why not build a basic RDB in it to learn about databases too?


Because, especially for learning, you need tools adequate for the task... Masters can be more flexible and use inferior tools and still be successful; students need every bit of help they can get, and they need the tools that aren't going to betray them every step of the way.

Giving someone Python to make a database is like giving a student in a culinary school a dull knife: the job is hard enough with the right tools, but it becomes mission impossible when you are also crippled by your tools.

It's the same analogy I used before: using a toy "doctor set" vs. learning to be a surgeon. There's no path that will take you from using a toy set to being a surgeon. The toy serves a different purpose: entertainment / roleplay. You don't mean to roleplay as a programmer by using Python, right?


You need tools adequate to the task, but the task isn't necessarily what other people think it is. In this case, I'd bet the tasks are to learn what's going on inside a database and become better at programming in Python, not to write a high-performance production-ready database implementation.



