I feel like a graph database is a solution to an issue I've faced (and, continue...

colinmcd · on Feb 10, 2022

EdgeDB employee here. I couldn't have asked for a better question to demonstrate the power of subqueries! Here's how I'd do this in EdgeQL:

  with tag_names := {"angular", "nestjs", "cypress", "nx"},
  select BlogPost {
    title,
    tag_names := .tags.name,
    match_count := count((select .tags filter .name in tag_names))
  }
  order by .match_count desc;

Which would give you a result like this:

  [
    {
      title: 'All the frameworks!',
      tag_names: ['angular', 'nestjs', 'cypress', 'nx'],
      match_count: 4,
    },
    {
      title: 'Nest + Cypress',
      tag_names: ['nestjs', 'cypress'],
      match_count: 2,
    },
    {
      title: 'NX is cool',
      tag_names: ['nx'],
      match_count: 1,
    },
  ];

maxhq · on Feb 11, 2022

Do I understand this correctly in that if the list goes on, it will also show posts with unrelated tags in tag_names? As the filter is only applied to match_count?

So to only get blog posts with matching tags we would need to add a filter „match_count > 0“, right?

Update: I am very excited about EdgeDB :)

colinmcd · on Feb 11, 2022

Yep, you understand correctly!

  with tag_names := {"angular", "nestjs", "cypress", "nx"},
  select BlogPost {
    title,
    tag_names := .tags.name,
    match_count := count((select .tags filter .name in tag_names))
  }
  filter .match_count > 0
  order by .match_count desc;

thaumasiotes · on Feb 10, 2022

So as I read TheSpiciestDev's comment, he's complaining that making his query in PostgreSQL is slow. It looks like EdgeDB is a frontend to PostgreSQL; how will it help with TheSpiciestDev's problem?

RedCrowbar · on Feb 10, 2022

The problem sounds like something that could be solved with a GIST index. EdgeDB doesn't yet have a way to specify the index type, though, mostly because we aren't sure what would be the best way to do it without things becoming too Postgres-specific in schemas.

eenell · on Feb 11, 2022

What do you mean by "too Postgres-specific"? Will you be supporting other DBs behind the EdgeDB interface in the future?

RedCrowbar · on Feb 11, 2022

This is not something we plan to do in the near future, but it’s also not outside the realm of possibility. We picked Postgres because of its power, quality and unparalleled extensibility, but we are also very careful to not leak any implementation details into our interfaces.

electroly · on Feb 11, 2022

I'm curious how this squares up with what someone linked elsewhere: https://github.com/edgedb/edgedb/discussions/3403

> EdgeDB does not treat Postgres as a simple standard SQL store. The opposite is true. To realize the full potential of the graph-relational model and EdgeQL efficiently, we must squeeze every last bit of functionality out of PostgreSQL's implementation of SQL and its schema.

I don't see how this and what you're saying can both be true at the same time. Is EdgeDB tightly coupled to the implementation of PostgreSQL, or isn't it? Is there really a chance that EdgeDB could support other databases, or not really? I don't think there's anything wrong with the answers being "yes" and "no", respectively; that's actually what I'd expect. It would be more unusual to try to do this in an implementation-agnostic way.

JadedBlueEyes · on Feb 11, 2022

I think what they mean by this is that EdgeDB's query language should not be coupled to Postgres, but EdgeDB itself should use Postgres specific SQL features to maximise performance - so you couldn't drop in MariaDB without changing code in EdgeDB, but in theory you could write another backend that takes the same queries and uses MariaDB or MongoDB or something custom under the hood.

colinmcd · on Feb 11, 2022

Exactly, implementation coupling vs interface coupling.

da39a3ee · on Feb 11, 2022

I'm feeling very disappointed to see that you're using two space-separated words for "order by" in your language. Do "order" and "by" have meanings on their own in independence such that the meaning of "order by" arises naturally via their composition/conjunction? If not then surely it should be "orderby" in your language.

SQL definitely must be replaced, but the silly pseudo English syntax is one of the things we want to get rid of, not retain.

TrapLord_Rhodo · on Feb 13, 2022

Do you have any timeline on when i can use this with rust?

Found this: https://github.com/edgedb/edgedb-rust

contingencies · on Feb 10, 2022

select posts.name as post, count(post_tags.id) as matches from posts,post_tags where post_tags.post=posts.id and post_tags.tag in ("angular","nestjs","cypress","nx") group by post_tags.post order by matches desc; Test data @ http://pratyeka.org/hn.sqlite3

klohto · on Feb 10, 2022

Is it though? Simple IN with an ORDER BY on the same match will return the correct ranking. More info on ranking here https://www.postgresql.org/docs/current/textsearch-controls....

d_watt · on Feb 10, 2022

Am I right in reading this as the parent comment envisioning a “post” table and a “tag” table, and you’re suggest the “post” table just have a “tag” column?

klohto · on Feb 10, 2022

I see just one table

> Every post is tagged with a given category. There could be 100+ categories in the blog system and a blog post could be tagged with any number of these system categories.

My point is, I don’t see SQL query as expensive for this kind of use case. There are easy and native ways to do it.

In case you would like a top notch performance, Redis might be a way to do it. Even a reverse-index would achieve great performance.

thaumasiotes · on Feb 10, 2022

That's a standard many-to-many relationship that would normally be implemented by three tables:

    +---------+-------+---------+-------+-----------+
    | post_id | title | content | other | fields... |

    +--------+------+
    | tag_id | name |

    +------------+---------+--------+
    | tagging_id | post_id | tag_id |

But it seems like the core of the request is still something like:

    SELECT post_id, count(1) AS count
    FROM taggings
    WHERE tag_id IN (3, 8, 255)
    GROUP BY post_id
    ORDER BY count DESC

(off the top of my head; I haven't checked this for any kind of correctness)

And I don't see why that query suffers as you add tags...?

------------

EDIT responding to below [HN believes I am a problem user who should only be allowed to make so many comments per day]:

< that is pretty much what I meant by “I see just one table” as you don’t need any joins

Well, assuming you're doing this because a user is interacting with your site via some kind of web interface, you can set the interface up to deliver you tag_id values directly, but you'll still need to do a join with the posts table so you can present a list of posts back to the user instead of a list of internal post_id values.

So I guess

    SELECT t.post_id, count(1) AS count, p.title, p.url
    FROM taggings t JOIN posts p ON t.post_id = p.post_id
    ...

klohto · on Feb 10, 2022

Confusing but that is pretty much what I meant by “I see just one table” as you don’t need any joins (atleast with the same design you outline)

FractalHQ · on Feb 10, 2022

I do these often with standard GraphQL queries, often over Postgres. Now I’m curious about the performance difference compared to an SQL ORDERBY or similar EdgeDB implementation!

rmbyrro · on Feb 10, 2022

Thanks for posting this. Kind of comment that adds value to the discussion by illustrating how a piece of tech can or cannot be useful.

I just happen to have a very similar requirement to yours and was also wondering.

AtNightWeCode · on Feb 10, 2022

Maybe I misunderstand what you are saying but it sounds pretty straight forward in SQL.

rmbyrro · on Feb 10, 2022

I'm currently investigating whether Redis Bloom [1] could be a good tool for similar requirement.

[1] https://github.com/RedisBloom/RedisBloom