Show HN: Mathesar – open-source collaborative UI for Postgres databases

bendews · on March 3, 2023

Nothing constructive to say other than this is amazing. A very common idea but rarely done this well, great job!

kgodey · on March 3, 2023

Thank you very much!

derefr · on March 3, 2023

Not to distract from this project — it's very cool! — but I'd love to know if there's something similar in principle to this, but that's focused less on collaborative querying, and more on collaborative database administration.

Specifically, I've been in the market for a while now for a webapp you can run on a server, that presents a UX for submitting, watching the progress, and managing, long-running schema changes / data migrations (think: index creations, table re-partitionings) that have been written out as long screeds of custom SQL, and then either submitted through said UI, or which are uploaded to the server out-of-band and then pointed at. (And also, support for kicking off + observing + managing native DB-server commands like Postgres's `vacuumdb --analyze-only` or `pg_dump` / `pg_restore`, and relating them to their DB activity.)

Ideally, the UX would be designed so that you could have many of these jobs running in parallel, without everything descending into incoherence. (I'm picturing: the UI of a torrent client. A table of progress bars + error counts; click in to see the SQL execution log and per-job metrics.)

Ideally also, there'd be a library of SQL templates you could create, where you could generate and fire off N parallel copies of a query with the template-variables populated from some combination of running other SQL queries, and supplied static lists of values; with the meta-job having a configurable max concurrency limit for how many concrete jobs it will ask the DB to schedule at once, separate from the max concurrency of jobs in the DB generally.

Maybe as well, some macro-language for those templates, to let you write "if table_exists(foo), do: [sql], else: [other sql]", or "create_index_if_none_satisfy(table, {colA, colB})", etc — rather than having to look up the unintuitive tables and columns in ANSI-SQL INFORMATION_SCHEMA for the Nth time to then write either "compile-time" SQL or sproc SQL to get this effect.

Right now at $JOB we're doing this kind of thing ad-hoc, by writing scripts that take arguments and query the DB to generate directories full of .sql files, that we then run concurrently using shuf(1), split(1), bash for loops, and named shared tmux(1) sessions; and then visualizing progress in various weak ways, e.g. querying `pg_stat_progress_create_index` during index creation.

mathemancer · on March 3, 2023

Mathesar dev here.

I like the idea of being able to track multiple data migrations or schema changes in parallel. We do currently support basic point-and-click DDL, and the data gets migrated when you (for example) set up a linked table based on some columns from an initial table, but we don't really have a way to track multiple such requests in one view.

I suppose one general solution would be to provide a UI for `pg_stat_activity` or other similar system tables that lets you filter easily.

winrid · on March 3, 2023

Mongo's Ops Manager is kind of like this with all its automation (managing machines, shards, index rollouts, etc) and monitoring, but not really the long running query part (there is a profiler though).

It's also not free and breaks in all kinds of fun ways.

debarshri · on March 2, 2023

Interesting how there are some airtable inspired UI elements combined with postgres.

Super interesting, will definitely use it.

PS. I would recommend making the demo account ready-only.

seancolsen · on March 2, 2023

(Mathesar core team member here.) Thanks for the kind words! Our demo server actually spins up a unique database for each session, giving you the capability to make changes to the database and try Mathesar's features to the fullest extent. Your changes are only visible to you unless you send the link to someone else.

debarshri · on March 2, 2023

Could you elaborate that a bit?

That sounds interesting too.

kgodey · on March 2, 2023

Yes! We have set up a Postgres template database[1] with our demo data sets. When a user logs in, we create a new database based on the template database and associate the database with the session key set up for that login. So every session is associated with a different database and we have some Django middleware that routes you to the URL associated with your session's database. If you log out and log back in (thus ending your session), you'll be routed to a brand new database.

However, all databases are accessible to the demo user, so if you send a link to someone else, they will see your database (since the URL includes the database identifier).

You can see our code for it here: https://github.com/centerofci/mathesar/tree/develop/demo

[1] https://www.postgresql.org/docs/current/manage-ag-templatedb...

xupybd · on March 2, 2023

How on earth are you handling HN front page traffic if you spin up a template copy per session? That is impressive.

koolba · on March 2, 2023

You’d be (pleasantly) surprised how easy it is to copy a postgres database. It literally copies the data files for the database to a new directory so the operation is relatively quick if the total volume of data is not large.

What’s more challenging is handling many concurrent connections as a different database would mandate a separate connection and you can’t perform any meaningful connection pooling.

dmos62 · on March 3, 2023

That's accurate! Another bottleneck we have is Mathesar-level state management. We keep some state to reduce redundant Postgres queries. During most of development we focused on single-database use cases, but for the demo we had to expand to managing state for multiple databases in as isolated way as possible. That forced us to face some technical debt in assumptions we made about our state and how we organize it. State-related technical debt is the most challenging, imo.

debarshri · on March 2, 2023

Thanks!

metasj · on March 2, 2023

The schema migrations are a delight. And I appreciate the shoutout to Dabble! Thanks for working on this + congrats on the release.

An optimized hosted version with permalinks to Explorer views would open access to even more users.

pavish · on March 3, 2023

Thank you!

A hosted version is definitely on our plans for Mathesar, and being able to share tables and explorations publicly via a permalink is part of our roadmap[1][2].

[1] https://mathesar.org/roadmap.html [2] https://github.com/centerofci/mathesar/discussions/2266

nickreese · on March 3, 2023

Wow, this is the best explanation of db relationship types I've seen. https://capture.dropbox.com/A3a1aybKNDy1ttEx

Really polished. Congrats on the launch!

kgodey · on March 3, 2023

Thank you very much! We found that a lot of similar "relational database" tools ended up not actually using foreign keys for relationships because they are difficult to explain to non-technical users. We took that as a UX challenge. :). The power of relational databases comes from the relationships, so we want to help users understand and use them.

nickreese · on March 3, 2023

Tried tossing in some of the datasets found here: https://data.cms.gov/provider-data/archived-data/nursing-hom... and none of the file uploads worked.

Either way, really stoked for this.

pavish · on March 3, 2023

We tested importing the files from the link and right on, some of the imported tables do not seem to work as expected. You've just helped us discover a bug.

Here's the issue that tracks it: https://github.com/centerofci/mathesar/issues/2634

Thanks!

mathemancer · on March 3, 2023

I think some of those data sets are likely too big for our poor demo server. It's had a stressful day.

I did try with some of the smaller sets, though, and they seem to work. An advantage of Mathesar is that you can always self-host, and provision enough resources for whatever data size you have!

kgodey · on March 2, 2023

Our live demo is having trouble keeping up with traffic; we're spinning up a load balancer right now and the server will be down temporarily. I'll edit this post when it's fixed.

EDIT: demo server is back up.

bitwize · on March 2, 2023

Love the reference to GalaxyQuest. (Mathesar was the Thermian leader.)

pavish · on March 3, 2023

Haha, yes!

We also intend to release our Svelte component-library as a separate package and considering if we should establish a GalaxyQuest theme when naming it, perhaps Laliari.

Here's the discussion: https://github.com/centerofci/mathesar/discussions/833

kgodey · on March 3, 2023

Never give up, never surrender!

Originally, Mathesar was just an internal codename (we needed to name the repo something), but we decided to stick with it.

earleybird · on March 3, 2023

We need your help.

vladsanchez · on March 2, 2023

So happy to realize your stack includes Python, Postgres, Typescript and Svelte because I once flirted with such idea for a personal project. I'm inspired.

vhanda · on March 2, 2023

If I may ask, have you found using a Typed language such as Typescript has changed your tendency to use Python for other tasks, given that it isn't typed? [0]

[0] - I understand that types can be added on top, but I never found the integration to work that well with mypy when I last tried it many years ago.

seancolsen · on March 2, 2023

Yeah, interesting question! Our [team][0] is eight people who would probably give eight different answers. We've had some [debate][1] about whether or not to use types in Python, and eventually we settled on not using them. However we do use TypeScript. That makes our team a bit stratified across these different typing paradigms. This stratification seems to work okay though because each engineer's role is focused on either front-end work in TypeScript or back-end work in Python (without types). The front end people tend to like types, and the back end people tend not to.

Personally, I love types! (Can you tell that I'm on the front end team??) I used mypy extensively at my last job and liked it okay, but mypy is just not as nice as TypeScript. It's come a long way, but it can still be pretty cumbersome. TypeScript on the other hand is fantastic. I'm certainly never going back to JavaScript. But to get back to your original question: yes, for me using TypeScript has changed my tendency to use Python for other tasks. I just don't want to use Python as much. For backend work, I'd reach either for TypeScript, or for Rust. But keep in mind I'm just a front end dev! :)

[0]: https://mathesar.org/about.html [1]: https://github.com/centerofci/mathesar/discussions/913

dmos62 · on March 3, 2023

I'm on Mathesar's backend team (hi Sean!) and I love types. It's actually a controversial topic amongst us (as might be gleaned from the discussion linked in above comment). I'd say we've come down to using a reduced set of Python features for the sake of community accessibility and maintainability. Maintainability argument being that types can be misused! I personally enjoy having the fanciest toolset, and I imagine that future community members will appreciate more interesting language features being used (because I expect them to be like me), but I also can't say that my personal viewpoint is more valid than the conservative approach we've chosen to adopt. That said, we've been focused on launch and, when that's done, language-related choices might yet be revisited!

There's something to be said about balancing what's interesting and what's accessible. If anyone has stories to share, I'd love to hear it.

noisy_boy · on March 3, 2023

I work on back and front-end and universally like types. Compile time guarantees are a huge reliability booster.

vladsanchez · on March 2, 2023

I haven't because I'm Python biased. ;) However, for Web I've found myself leaning to TypeScript for type-safety and DX reasons.

Have you ever tried [Pydantic](https://docs.pydantic.dev)!? It may be what you need for type safety / data validation.

westurner · on March 2, 2023

JSONLD types are specified with @type, and the range of a @type attribute includes rdfs:Class.

icontract and pycontracts (Design-by-Contract programming) have runtime type and constraint checking; data validation. Preconditions, Command, Postconditions (assertions, assertions of invariance after command C_funcname executed) https://github.com/Parquery/icontract

pydantic_schemaorg: https://github.com/lexiq-legal/pydantic_schemaorg

> Pydantic_schemaorg contains all the models defined by schema.org. The pydantic classes are auto-generated from the schema.org model definitions that can be found on https://schema.org/version/latest/schemaorg-current-https.js... [ https://github.com/schemaorg/schemaorg/tree/main/data/releas... ]

jamesfmilne · on March 2, 2023

I started using type annotations heavily in my Python code recently, and coupled with IDEs like VSCode and PyCharm that leverage the typing, and the underlying tools like pyright/pylance, it's a game changer.

I wouldn't try and write a decent-sized Python project without typing now.

erinnh · on March 2, 2023

Lots of stuff has changed for typed Python over the last few years. (starting with 3.7, but picking up with 3.8)

Waterluvian · on March 2, 2023

Typescript has made me stop loving Python. I wish Python typing was powerful enough to be as complete as TS.

revskill · on March 3, 2023

Vanilla JS is just fine.

Existenceblinks · on March 3, 2023

Their local development docs is also nice: https://docs.mathesar.org/contributing/local-dev/

I'm jumping into a django/react project right now and found that there's so much hassle from integrating frontend + django + livereload + deployment. This repo is a good example.

hultner · on March 3, 2023

Looks cool however I don’t like that it needs to be setup with a super-user and that it’s limited to your three predetermined access roles. I have a project in mind where this tool could be useful but everything is built with row level security policies and roles, the database includes sensitive data and authentication user for the connection have basically no access by itself, so it looks like Mathesar wouldn’t be useful here.

I also know of another similar project heavily utilizing RLS where end users actually have a 1:1 database user with access to their own data, here something like Mathesar could be nice as well if it would play nice with ACL/RLS.

mathemancer · on March 3, 2023

I agree that we have some work to do on access control. In fact, we have setting up Mathesar users with the 1:1 database user mapping in our roadmap[1]. Assuming you're referring to the Database super user as being a problem, we also have giving more granular control over that to the installer in the future. I completely understand reluctance to give a webapp super user access to a production database.

[1] https://mathesar.org/roadmap.html

hultner · on March 3, 2023

Very nice to hear, I will follow your project. One of the projects I was thinking about is financial systems used by government actors such as correctional services. For these customers there’s strong legal requirements for data handling so a database super user with full access is a big no-no even for the most trusted developers internally.

kgodey · on March 3, 2023

We completely understand. We want to have the ability for Mathesar to gracefully handle DB users with different permissions (i.e. we'd disable features in the UI based on the permissions of the DB user set up during installation), but we couldn't get that done in time for launch. We'll continue working on this.

hebrox · on March 3, 2023

Looks really cool!

I tried the install script and it asked me for my computer password. It feels really dangerous to give a random script from the internet my password. Could you maybe change this, so this is not necessary?

dmos62 · on March 3, 2023

We might do well to add to the script a more in-depth explanation of why elevated privileges are needed. We did take care to not run the whole script as sudo, but only parts of it (always read a script before giving it privileges!).

So elevated privileges are needed because we're installing using Docker Compose and you can't use Docker without some form of elevated privileges (most people always run it with sudo). MacOS is a slightly different story, but anyway.

The reason we use Docker Compose is because we need to not only deploy Mathesar, but also a Postgres database (if you're not using your own), and we want maximally simple installs and in-app upgrading (upgrading is implemented via Watchtower, which is basically fancy docker pulls with automatic container restarts).

x98asfd · on March 3, 2023

Thanks for the explanation. Is there a way to NOT use docker. For those of us who are NOT on linux, and are quite comfortable with postgres install and running. Would prefer just a postgres connection string.

silentninja · on March 3, 2023

We are actively working on adding a new install process to help users install Mathesar without Docker. We have an issue[1] to track it.

[1] https://github.com/centerofci/mathesar/issues/2427

kgodey · on March 3, 2023

To add to what silentninja said, we are already deploying Mathesar without Docker on some test servers, we just need to document our setup. Coming soon!

mathemancer · on March 3, 2023

At the moment, the only installation method we support uses Docker, which requires elevated privileges on most systems. We're probably going to reduce the privileges needed for some macOS installations in the near future, since Docker runs differently there.

jtr1 · on March 2, 2023

This is super cool! I really dig the way airtable makes relational databases accessible to non-developers, but I've been looking for open source alternatives. Going to give this a spin this weekend :)

seancolsen · on March 2, 2023

Thanks! We have taken a lot of inspiration from Airtable -- but we've deliberately avoided calling ourselves an "Open Source Airtable alternative" because we are actually trying to stay more true to the concept of "relational data" than Airtable does. Airtable sweeps a lot under the rug. Primary keys and foreign keys are heavily abstracted in Airtable, giving rise to some surprising behavior for anyone more accustomed to actual databases.

In Mathesar, we want the abstractions to be as thin as possible, while still giving non-technical users the capability to model and edit their data. You can point Mathesar and an existing Postgres database and, as long as all the foreign keys are in place, you should be able to use it a lot like Airtable. But when you set up a new relationship between tables, you'll find that it works very differently in Mathesar than in basically any other spreadsheet-database hybrid tool.

One-to-many relationships use a foreign key column in one table (instead of in two tables in Airtable). Many-to-many relationships use two FK columns in a join table (instead of by showing a column in each of the two related tables). All of the data is kept normalized. That normalization can make it hard to work with the data though! So we are developing a robust graphical query builder (called the "Data Explorer") which lets you to see data joined across multiple tables.

iambcam · on March 2, 2023

I'm sorry if this question is stupid, I'm still learning a little about technology. The little bit of knowledge I have tells me that my product team shouldn't directly access the same production database as my application. The idea for this product would be to create a copy of my production base and give access to it through this tool? Wouldn't the cost of this replica be too expensive?

pavish · on March 3, 2023

(Mathesar core team member here)

My previous job was in a biggish e-commerce company and the business teams would send spreadsheets with corrections to the data (like updating descriptions, changes in pricing etc.,) almost every day, and the developers had to make those changes in production.

We ended up building an inhouse application to let the product team edit production data directly, with some restrictions. One of Mathesar's goals is to solve collaboration issues like these.

Mathesar allows setting up users with different access levels[1], so it's easier to create users with limited access. As part of our roadmap[2], we also intend to allow configuring permissions at a more granular level, such as row & column level access to tables. So, developers could breathe freely when providing business teams access to the underlying database.

[1] https://docs.mathesar.org/product/users/ [2] https://mathesar.org/roadmap.html

dmos62 · on March 3, 2023

To add to what Pavish said, reads are always safe, so you can examine, monitor, make reports and what we call Explorations (our user-friendly spin on Postgres views) of your data without worries. When it comes to adding data and updating it, that can be safe as well in many cases, but definitely should consult with someone that knows the specifics of your application. Schema changes, however, are very likely to break something, hence the permission system Pavish described.

mistersys · on March 2, 2023

In my experience in tech, most teams directly access and manipulate production frequently. Bigger companies usually have restrictions on this but in the start up world not so much.

TeeWEE · on March 3, 2023

That’s a big red flag if those teams are not engineering teams. Don’t you have CI CD pipelines setup with something like flyway?

TeeWEE · on March 3, 2023

You’re actually right. You don’t want people to edit the DB or schema in this manner.

Schema migrations should go via CI/CD and data changes via your app.

Otherwise it’s gets messy indeed.

kgodey · on March 3, 2023

Mathesar is designed to be used for a bunch of different types of use cases.

I agree you shouldn't allow users to change the schema if you're connecting Mathesar to a production database (that would mess a lot of things up!) We've designed our user roles to account for this – you would assign "Editor" permissions to users, and they can only edit data, but not change anything related to the structure.

On the other hand, if you're using Mathesar as your primary tool to work with your data, it's really nice to be able to change the schema to align with changes in your mental models or workflows.

mrslave · on March 3, 2023

I won't repeat supportive comments except to say well done.

Feature request: historical tracking. E.g. on update, expire the old record with who & when.

pavish · on March 3, 2023

Thank you! We're glad to take your feature request into consideration.

We have had discussions regarding auditing in Mathesar[1], though it's not explicitly documented in the roadmap yet. We do have plans to implement Undo & Redo support for data editing operations in our roadmap[2] which would depend on the auditing implementation.

[1] https://github.com/centerofci/mathesar/discussions/113#discu... [2] https://mathesar.org/roadmap.html

samanator · on March 2, 2023

Wow I wish this had been published a year ago!

This is exactly what I've been looking for (but now I'm heavily invested in a full fledged CRM)!

Thank you for this

kgodey · on March 3, 2023

Thank you! :)

ryan_platofy · on March 3, 2023

This looks great and congratulations on the alpha launch! Reminds me of another open source project in this space, NocoDB. Are you guys familiar with that project? If so, what are the main differences between them and what you are offering?

pavish · on March 3, 2023

Thank you!

We've documented the differences between Mathesar and other similar products including NocoDB on the FAQ section[1] in our website.

[1] https://mathesar.org/faq.html#how-is-mathesar-different-from...

ryan_platofy · on March 3, 2023

Apologies, somehow missed that. Thanks for the speedy response. Your explanation is very clear. Something that is keeping me tied to Airtable is their 'interface designer'. Allows for rapidly building internal admin tools. Something similar is apparently on NocoDB's roadmap (they are calling it page designer I believe). Am I correct in saying that this not something that is on your current Roadmap (had a look, but did not spot something similar, but could have misread)?

pavish · on March 3, 2023

You're right, it's not in our current roadmap, which focuses mainly on features that improve data entry, analysis, visualization, and collaboration. Having said that, Mathesar could grow as a product and have a diverse set of features we've not yet planned, in the next few years.

A feature to build interfaces for internal admin tools sounds very cool, and while we cannot say that it's coming for sure, we'll definitely be discussing it. Please feel free to open a feature request issue on our Github repo, we'd appreciate it.

kgodey · on March 3, 2023

To add to what Pavish said: Our designer is excited about building an Airtable-like interface designer, but there's a lot of complexity involved, and it will cater to a whole different category of users. We want to get our current features to a more stable state before jumping into that. It's definitely on our radar, though.

cadbox1 · on March 2, 2023

Nice work! It looks like Sequel Pro for Postgres.

This might be the first free, user-friendly interface for Postgres out there. Closest would be TablePlus but it’s paid.

kgodey · on March 2, 2023

Thank you! I was pretty surprised that nothing like Mathesar existed already.

Unlike SequelPro and TablePlus, we're a web application (although we do want to do a desktop version eventually). And Mathesar works even if you don't have an existing DB to connect to – you can set up a DB from scratch during installation.

phamilton · on March 2, 2023

https://github.com/beekeeper-studio/beekeeper-studio is pretty decent.

minusf · on March 2, 2023

tableplus is freemium like. in the free version can have 2 tabs open. it also works with way more than postgres.

victorbjorklund · on March 3, 2023

Looks really nice. Is there any way to export an exploration as SQL? I guess this can be useful when creating complicated views etc.

mathemancer · on March 3, 2023

Not at the moment, but it is in our roadmap[1]. In fact, we're planning to help users start or tinker with data explorer, then switch to an SQL editor starting from their exploration so far. That way, they can use features or queries beyond what we currently support in the GUI.

[1] https://mathesar.org/roadmap.html

victorbjorklund · on March 3, 2023

Awesome! Looks great!

winrid · on March 3, 2023

Was just thinking about how it looks like a fancy Django Admin

Then I saw - its actually built on Django! Neat. :)

kgodey · on March 3, 2023

Thanks! We are currently using a freakish mix of Django and SQLAlchemy, but we're aiming to simplify that soon-ish (by removing SQLAlchemy).