During my tenure as CTO at a fintech company I built a banking engine using post...

SanderNL · on Sept 25, 2023

Interesting how the immediate reaction is “postgres does not scale” when there is a single table lacking an index.

This also shows how important competence and knowledge of the system are. People who came in new and didn’t know the system like you do probably lacked the confidence/skills to just “get in” like that.

zacksiri · on Sept 25, 2023

Yeah, though I think what happened in this scenario probably happens a lot everywhere else too. In my entire career, this type of scenario has been very typical: lots of meetings, discussions, standups and useless talk without coming face to face with the actual problem. Things get in the way of the science and facts. Which is why it's important to remove fear, think from first principles, break things down and get your hands dirty.

ansc · on Sept 25, 2023

>There was a lot of back and forth between engineers that discussed whether we should add the index.

Jeez. What was the idea behind not adding it? Disk space I presume?

zacksiri · on Sept 25, 2023

There was a fear that creating an index on a table that large would take a long time, and I think some of it was also ego: "I intentionally didn't add it, because of so-and-so reason". This was why I dug in and did my thing: debunk all the fear / opinions / rationalization. Sometimes you just gotta be able to tell people they're wrong, supported with empirical evidence. That's how the team will grow. There is just no need to dance around facts. I remember having to tell the team, "taking a long time to run an index is no reason to avoid creating the index".

ezekiel68 · on Sept 25, 2023

>> I simply conducted a small experiment and PG analyze clearly showed a missing index in one of the key tables.

Based on this sentence, I interpreted that part as representing that the engineers did not believe the missing index was causing the problem (until the experiment was run).

zacksiri · on Sept 25, 2023

Yes, one of the theories was that the index wasn't the problem because there was already a multi-column index that included that particular column. However, the PG analyze tool showed that a particular query didn't use that index, so there needed to be a separate index just for that column.
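
For illustration only, here is a minimal sketch of that effect. It uses Python's built-in sqlite3 module rather than Postgres, so the planner and the plan text are different, and the `transfers` table, its columns and both index names are hypothetical, not the schema from this story. It shows a query that filters only on the second column of a multi-column index and how the plan changes once that column gets its own index.

```python
import sqlite3

# Hypothetical schema for illustration; SQLite stands in for Postgres here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transfers ("
    "id INTEGER PRIMARY KEY, account_id INTEGER, status TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO transfers (account_id, status, amount) VALUES (?, ?, ?)",
    [(i % 1000, "pending" if i % 7 == 0 else "done", i) for i in range(10_000)],
)

# Multi-column index in which status is only the second column.
conn.execute("CREATE INDEX idx_account_status ON transfers (account_id, status)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last field of each row is the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM transfers WHERE status = 'pending'"

# The filter skips the leading column, so the multi-column index is not used.
plan_before = plan(query)

# Add a separate index just for the filtered column and check the plan again.
conn.execute("CREATE INDEX idx_status_only ON transfers (status)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

In Postgres the equivalent check would be running `EXPLAIN` or `EXPLAIN ANALYZE` on the slow query before and after creating the index.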

sgarland · on Sept 25, 2023

There are many reasons why an RDBMS, especially Postgres, can choose not to use an index. Sometimes it’s your fault, sometimes it’s the table statistics’ fault.

Good on you for actually empirically determining reality.

mixmastamyk · on Sept 25, 2023

Nice. How did you split the data and queue records? Tables, dbs, partitions, etc?

zacksiri · on Sept 26, 2023

The job queue had its own table. It's basically whatever was the default of the job queue library we were using.