
My general method for things like this is to:

1. Get a .dot file of the database. Many tools will export this.
2. Open the .dot in a tool I built for the purpose.
3. Select the tables I'm interested in, and export a subset of the .dot file representing just those tables and relationships.
4. Hand that subset .dot file to the LLM and say, "given this schema, write a query -- here's what I want: <rest of the request here>"

That gets the job done 60% of the time. Sometimes, when an incorrect shortcut relationship results in the wrong join, I'll have to redirect with something like, "You need to go through <table list> to relate <table X> to <table Y>." That gets my success rate above 95%. I'm not doing ridiculous queries, but I am doing recursive aggregations successfully.
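In case it's useful, here's a minimal sketch of the subset-extraction step (step 3) in Python, assuming pydot and that node names map directly to table names -- some exporters emit record-style "table:column" ports, which this strips:

    import pydot

    def subset_schema(dot_path, tables):
        # Load the exported schema graph (pydot returns a list of graphs).
        graph = pydot.graph_from_dot_file(dot_path)[0]
        keep = {t.lower() for t in tables}

        def name(raw):
            # Drop any record-style port ("orders":"id" -> "orders") and quotes.
            return raw.split(":")[0].strip('"').lower()

        sub = pydot.Dot(graph_type="digraph")
        for node in graph.get_nodes():
            if name(node.get_name()) in keep:
                sub.add_node(node)
        for edge in graph.get_edges():
            if name(edge.get_source()) in keep and name(edge.get_destination()) in keep:
                sub.add_edge(edge)
        return sub.to_string()

    schema = subset_schema("schema.dot", ["orders", "order_items", "customers"])
    prompt = f"Given this schema, write a query -- here's what I want: ...\n\n{schema}"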




You are doing a lot of the work a semantic layer would do for you. I wonder if you would have better luck having the LLM talk to a semantic layer instead of directly to the database.


Can you talk more about what an implementation of such a semantic layer would look like?


There are a number of semantic layer tools out there these days. Each has its own approach, but essentially it's a meta layer on top of your database that can be used to do things like form queries or provide a consolidated API to your data (which may live in multiple databases).

Some comments on this thread mention popular semantic layer tools like cube.dev. I also made an open source one that I use regularly, though it's currently in I-hope-to-put-more-time-into-this-someday mode. Been busy with an acquisition this year.

https://github.com/totalhack/zillion
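To make the difference concrete: instead of handing the LLM raw DDL and asking for SQL, it emits a request against named metrics and dimensions, and the layer works out the joins. With zillion the shape is roughly this (from memory of the README -- treat the config and field names as placeholders):

    from zillion import Warehouse

    # The warehouse config declares datasources, metrics, and dimensions,
    # so the layer knows how tables relate.
    wh = Warehouse(config="my_warehouse_config.json")

    # The LLM only has to pick metrics/dimensions/criteria -- no join
    # paths to get wrong.
    result = wh.execute(
        metrics=["revenue", "leads"],
        dimensions=["partner_name"],
        criteria=[("campaign_name", "like", "Campaign%")],
    )
    print(result.df)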


Not OP, but a naive guess is that it would mean having your schema defined in an ORM (for example, Prisma). The advantage here is that the LLM gets context on both the schema and how the schema is used throughout the application.
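Prisma's schema is its own DSL, but the same idea sketched in Python with SQLAlchemy would be generating DDL straight from the ORM models and using that as the LLM's context (models here are made up):

    from sqlalchemy import Column, ForeignKey, Integer, String
    from sqlalchemy.orm import declarative_base
    from sqlalchemy.schema import CreateTable

    Base = declarative_base()

    class Customer(Base):
        __tablename__ = "customers"
        id = Column(Integer, primary_key=True)
        email = Column(String, nullable=False)

    class Order(Base):
        __tablename__ = "orders"
        id = Column(Integer, primary_key=True)
        customer_id = Column(Integer, ForeignKey("customers.id"), nullable=False)
        status = Column(String)

    # CREATE TABLE statements, in dependency order, for the LLM's context.
    ddl = "\n".join(str(CreateTable(t)) for t in Base.metadata.sorted_tables)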


I use Weaviate and let the model create GraphQL queries to take advantage of both the semantic and data layers. Not sure how efficient it is, but it's worked for me.
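For reference, with the v3 Python client that pattern looks something like this (class and property names are made up):

    import weaviate

    client = weaviate.Client("http://localhost:8080")

    # The model picks the concepts and fields; Weaviate's GraphQL layer
    # handles both the vector search and the structured data.
    result = (
        client.query
        .get("Article", ["title", "publishedAt"])
        .with_near_text({"concepts": ["database schema design"]})
        .with_limit(5)
        .do()
    )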


Combining this with structured/constrained generation via grammars or Pydantic models can supercharge this, btw.
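A minimal sketch of the Pydantic half, assuming you ask the model for JSON matching this schema (via a grammar, JSON mode, or function calling):

    from pydantic import BaseModel, Field

    class SqlAnswer(BaseModel):
        sql: str = Field(description="A single read-only SELECT statement")
        tables_used: list[str]

    # Hand the JSON Schema to whatever constrains generation...
    json_schema = SqlAnswer.model_json_schema()

    # ...then validate the raw reply before running anything.
    raw = '{"sql": "SELECT 1", "tables_used": []}'  # stand-in for the LLM's output
    answer = SqlAnswer.model_validate_json(raw)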


Can you illustrate this a little? Or should I be asking an LLM for advice? :-)


Something like this is what I do also. Is there another way? How are others doing it?


Export the DDL of one or more relevant tables, then run a query to get a random sample of 5 records from each table. It's really quick to gather all this, and it's enough context to handle most query-writing tasks with some guidance.
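A SQLite-flavored sketch of that gathering step (file and table names are placeholders):

    import sqlite3

    conn = sqlite3.connect("app.db")
    pieces = []
    for table in ("orders", "customers"):  # whatever is relevant to the question
        # The DDL as stored by SQLite.
        ddl = conn.execute(
            "SELECT sql FROM sqlite_master WHERE type='table' AND name=?",
            (table,),
        ).fetchone()[0]
        # A small random sample; fine at LIMIT 5 even though RANDOM() scans.
        sample = conn.execute(
            f"SELECT * FROM {table} ORDER BY RANDOM() LIMIT 5"
        ).fetchall()
        pieces.append(f"{ddl};\n-- sample rows: {sample}")
    context = "\n\n".join(pieces)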


Schema + samples. Same thing a skilled person would use.



