Hey folks—I wrote the post! This was my biggest Postgres project to date, and it proved quite tricky since I didn't rehearse with a test database of the same size. I learned a bunch about Postgres, not least the incredibly powerful NOT VALID option for safely and quickly adding constraints.
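For the curious, it looks roughly like this (table and constraint names made up for illustration). ADD CONSTRAINT ... NOT VALID skips the full-table scan, so it only needs a brief lock, while new and updated rows are checked immediately; a later VALIDATE CONSTRAINT checks the existing rows under a weaker lock that doesn't block writes:

    -- Add the constraint without scanning existing rows (brief lock only);
    -- rows written from this point on are checked as normal.
    ALTER TABLE orders
      ADD CONSTRAINT orders_user_id_fkey
      FOREIGN KEY (user_id) REFERENCES users (id) NOT VALID;

    -- Later, validate the existing rows without blocking reads or writes.
    ALTER TABLE orders VALIDATE CONSTRAINT orders_user_id_fkey;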
Happy to stick around and answer any questions you have.
Not in my experience. You can use the --link flag to get it to use hard links so it doesn't need to move the data at all. I've been through this process myself a few times and it only took seconds on a 50-100GB db. I'm always a little surprised by how well it works.
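For reference, a typical invocation looks something like this (the paths are placeholders for wherever your distro puts each version):

    # Run as the postgres OS user, with the new cluster initdb'd but not started.
    # --link hard-links data files into the new cluster instead of copying them.
    pg_upgrade \
      --old-bindir  /usr/lib/postgresql/9.6/bin \
      --new-bindir  /usr/lib/postgresql/13/bin \
      --old-datadir /var/lib/postgresql/9.6/main \
      --new-datadir /var/lib/postgresql/13/main \
      --link

Running it with --check first does a dry run. The one caveat with --link: once the new cluster has been started, the old cluster can't safely be used anymore, so keep a backup.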
I'm curious why a test run on a properly sized replica database wasn't in the testing plans. That's something I've been making sure happens on similar projects for a while now.
I'm probably missing something, but it sounds like using Warp has a bunch of downsides vs "just" creating a read-only replica using logical replication and then failing over. Did you choose Warp only because of Azure's limitations, or were there other reasons?
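By "just" logical replication I mean the built-in publication/subscription flow, something like this (names made up):

    -- On the old (source) server:
    CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

    -- On the new server, after loading the schema with pg_dump --schema-only:
    CREATE SUBSCRIPTION upgrade_sub
      CONNECTION 'host=old-db dbname=app user=replicator'
      PUBLICATION upgrade_pub;

Though I realize built-in logical replication only arrived in Postgres 10, so a 9.6 source would need an extension like pglogical.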
Great post! Did you migrate your old database to Azure Flexible Server, Hyperscale (Citus), or a standalone VM? Postgres 13 does not seem to be available for Azure Postgres Single Server.
This technique saved us from seriously increasing the cost of our Heroku Postgres instance. Thank goodness it exists and works so well. Multiple 80+ GB indexes shrank to less than 10GB after just a couple of hours.
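(For anyone who hasn't seen it: assuming the technique here is rebuilding bloated indexes in place, on Postgres 12+ it's one statement per index, with the index name below made up. On older versions, pg_repack or CREATE INDEX CONCURRENTLY plus a swap gets the same result.)

    -- Rebuild the index without blocking reads or writes; the bloated
    -- copy is swapped out and dropped once the new one is ready.
    REINDEX INDEX CONCURRENTLY orders_created_at_idx;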
Exact strategy to be determined—we're looking at various data layers at the moment. I wish we could do something simple like a rotating log file, but we want to be able to query it in the app (for instance, to show recent logins).
Have you considered an OLAP database like ClickHouse or QuestDB? An OLAP database would be a much better fit for an audit table: the writes are append-only, the reads are analytical, the data would compress better, and you might be able to slot it in without changing any app code by using a PostgreSQL foreign data wrapper.
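To sketch the FDW idea (server, table, and option names here are hypothetical; the exact options depend on the wrapper, and this assumes a ClickHouse FDW such as clickhouse_fdw is installed):

    CREATE EXTENSION clickhouse_fdw;

    CREATE SERVER clickhouse_srv
      FOREIGN DATA WRAPPER clickhouse_fdw
      OPTIONS (host 'clickhouse.internal', dbname 'audit');

    CREATE USER MAPPING FOR CURRENT_USER
      SERVER clickhouse_srv
      OPTIONS (user 'app', password 'secret');

    -- The app keeps querying audit_events as if it were a local table.
    CREATE FOREIGN TABLE audit_events (
      id         bigint,
      user_id    bigint,
      action     text,
      created_at timestamptz
    ) SERVER clickhouse_srv OPTIONS (table_name 'audit_events');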
You can partition your audit/event table by time period and archive old events [1] or you can avoid the records hitting the database in the first place by generating the events elsewhere to begin with [2].
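[1] in a nutshell, with made-up table and column names (declarative partitioning, Postgres 10+):

    CREATE TABLE audit_events (
      id         bigserial,
      user_id    bigint,
      action     text,
      created_at timestamptz NOT NULL
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE audit_events_2021_10 PARTITION OF audit_events
      FOR VALUES FROM ('2021-10-01') TO ('2021-11-01');

    -- Archiving a month later on is a fast, metadata-only operation:
    ALTER TABLE audit_events DETACH PARTITION audit_events_2021_10;
    -- ...then dump that table to cold storage and drop it.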
I'm not OP, but they were upgrading from Postgres 9.6, which at least implies that this initial db was from ~2017.
This is barely past the initial release of CockroachDB. It would have been kind of crazy for the Retool team to build their product on an experimental database with little track record, especially since the product didn't depend on any experimental new features.