Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I always find it hard to think of a good reason for a (computed) virtual column

Why would you ever force your db to multiply a value by 12 to another column, or parse a json path, if it’s not for filtering?

Move that effort to your clients so you’re not needlessly consuming db resources.



I've worked in an environment where 3 applications accessed the same database, one of the applications wasn't really maintained.

Having computed (stored or virtual) columns would've been awesome.

The use case isn't really "multiply a value by 12", but more like "we have a single boolean is_active column, and want to migrate to a more extensive status model" or "migrate from an is_active column to a (begin, end) timestamp tuple" or so.

With a virtual column, you can present a read-only, compatible column to the legacy application, while the other applications can use the more detailed, new columns, without having to keep the legacy column in sync.


I think it /can/ also be to 'multiply a number by 12'. For example lets say I'm a supermarket, who gets their order data out of a 20 year old IT system and as a result I've got a orders table which has a id, user_id and data column with an array of 'sku, price and qty'. If I regularly want to query/sort/filter based on total order value the easiest and most performant solution absolutely is to use a computed column.


If you need to retrieve the values, the fastest would probably be a stored generated column, not a virtual computed on the fly.

In case you only want to filter without returning values, you could also index directly on the expression without needing to add a stored generated column with an index on it


Similar experience here. I have used it to make several legacy migrations much smoother.


Think about schema migrations, and the need to be compatible with the new and old versions of the schema.


Much faster to do it in the database than in the client, especially in a normal situation where the postgres server is on a different machine that's possibly not even in the same building as the app server or where it allows you to use existing indexes.

I agree that they're often a symptom of bad schema design or data normalisation though. Sometimes that can't be helped however.


Why should each one of your clients reimplement the calculated value, each in a slightly different way?


I'm not a fan of stored procedures but this is lightweight enough that I like how it simplifies by removing responsibility from the code.

I imagine the computed column could be indexed or materialized if needed.


> I imagine the computed column could be indexed or materialized if needed.

The article mentions that "you cannot create indexes on VIRTUAL generated columns".


Oh thanks I missed the distinction between a virtual and regular generated column.


Since you can index expressions I wonder if that's because you essentially emhave to store the value in the index anyway and that wouldn't be expected for a virtual column?


I don't really understand why this restriction would be necessary. Wouldn't the computed value just get stored in the index btree keys, exactly the same as an indexed expression?

For comparison, MySQL permits indexing of virtual columns, and indexes on expressions share the same underlying implementation to support this. In other words, in MySQL an index on an expression is literally just an index on an internally-hidden virtual column.


I've used them for common transforms such as timezones (e g. data comes in as uutc and we query everything in Chicago time) just to make querying easier.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: