I'm just waiting for zzzeek to publish a blog post on how to implement this on t...

zzzeek · on May 26, 2014

the AST-on-top-of-SQLAlchemy idea has been discussed for many years prior to Pony's existence. Robert Brewer's Geniusql http://www.aminus.net/geniusql/chrome/common/doc/trunk/manag... does the same thing (for Geniusql, you pass it lambdas; interpreting generators directly was more of a "future" thing, but by "future" we're talking, like four years ago :) ). He presented it at Pycon maybe in 2009, 2010 and right after I had the usual flurry of questions "Can SQLAlchemy do this?" and I said "sure! just write an adapter, easy enough". I think Robert was even interested in that.

at the end of the day the AST idea looks very nifty but IMO is too rigid to translate to SQL in a flexible enough way, and also works against the main reason you use code structures to produce SQL which is composability. When I last saw the Pony creators do a talk, the approach seemed that each time you have a given SELECT, and you'd like to add some extra criterion to it, it pretty much will keep producing subqueries of the original, because each time you can only wrap the AST construct you already have. It similarly had no ability to produce a JOIN - at that time at least, the only way to join things together was by rendering inefficient correlated subqueries. This was asked explicitly.

If they've found a way to resolve these issues while keeping true to the "AST all the way" approach and not dropping into a SQLAlchemy-style approach, that would be great. There's no reason SQLA ORM or Core couldn't be behind a similar approach as well except that nobody's really had the interest in producing it.

amalashkevich · on May 26, 2014

Hi Mike,

Pony had the ability to produce JOINs from the very beginning, but during that presentation we found that Pony produced subqueries for MySQL and that was not very performant, correct. Since than we've improved Pony and now it got a query optimizer which replaces subqueries with efficient JOINs where it is possible. Here is the query from that presentation:

    >>> select(c for c in Customer if sum(c.orders.total_price) > 1000)[:]

The straightforward way is to use a subquery here, but Pony's optimizer produces LEFT JOIN because such query usually has better performance:

    SELECT `c`.`id`
    FROM `customer` `c`
      LEFT JOIN `order` `order-1`
        ON `c`.`id` = `order-1`.`customer`
    GROUP BY `c`.`id`
    HAVING coalesce(SUM(`order-1`.`total_price`), 0) > 1000

In our opinion this is the main advantage of Pony ORM - the possibility to perform semantic transformations of a query in order to produce performat SQL while keeping the text of Python query as high level as possible.

zzzeek · on May 26, 2014

OK, here's something I should understand. Say we start with:

    myquery = select(c for c in Customer if sum(c.orders.total_price) > 1000)

I'm inside of a query builder. Based on conditional logic, I also want to alter the above statement to include customer.name > 'G'. Intuitively, I'd do this:

    mynewquery = select(c for c in myquery if c.name > 'G')

which will take the original SELECT, wrap it in a whole new SELECT. Right?

Given "myquery", how do I add, after the fact, a simple "WHERE customer.name > 'G"" to the SELECT? Just continuously wrapping in subqueries is obviously not feasible.

amalashkevich · on May 26, 2014

There is no need to wrap it with a query. You can just add a filter:

    mynewquery = myquery.filter(lambda c: c.name > 'G')

The new query will produce the following SQL:

    SELECT `c`.`id`
    FROM `customer` `c`
      LEFT JOIN `order` `order-1`
        ON `c`.`id` = `order-1`.`customer`
    WHERE `c`.`name` > 'G'
    GROUP BY `c`.`id`
    HAVING coalesce(SUM(`order-1`.`total_price`), 0) > 1000

zzzeek · on May 26, 2014

Well right, this is exactly a SQLAlchemy-style syntax, except a tad more verbose :). This is the "dropping into a SQLAlchemy-style approach" I referred to.

As far as the "AST allows caching" advantage, over at https://bitbucket.org/zzzeek/sqlalchemy/issue/3054/new-idea-... we're working out a way to give people access to the "lambda: <X>" -> cached SQL in a similar way, if they want it.