Reading about the "semantic layer" it very much reminds me of the kind of things people do in an ORM. That is: how do tables relate, refinements of data types like strings where a column might have specific semantics... this post doesn't go into much so I don't know if Malloy also allows specifying things like how updates should happen (do you update in place or create new records?), reusable queries (especially given its nesting), knowledge of indexed vs unindexed queries, etc. All of this stuff usually either gets stuffed in the ORM layer, or exists only as folk wisdom about specific databases.
It is peculiar that databases typically lack referential integrity, something that we've decided is absolutely essential in other programming environments.
I am confused here. Referential integrity can only be implemnented in the database. If you try in the application there are race conditions that will break it. RI is a major reason to use a RMDBS (Look it is a Referential Database System)
Semantic layers usually go an extra mile beyond just ORM/E-R modeling. It is best thought of as an abstraction layer between the actual physical data system and the consumption tool (usually a BI or analytics app.)
They do things like define KPIs (how do we as an organization calculate "cost of goods sold") and business logic (what is an "invalid" order?), harmonize entities and attributes across multiple data sources (System A calls something foo, System B calls the same thing bar), provide localization options (currency, date formatting) and so on.
It can also do "basic" things like referential integrity, E-R modeling, aliasing columns, fixing data types for downstream consumption, etc.
Usually it is agnostic to the actual underlying data system (warehouse, lake, SaaS API ...)
It looks closer to a Data Fabric where you have an ORM as a service on top of all your hetereogenerous datasources and services: a semantic layer that enables you to define models across all your sources, and a data virtualization query engine that gets the data from these different sources without replicating all the data.
It is peculiar that databases typically lack referential integrity, something that we've decided is absolutely essential in other programming environments.