
Your damn animated gif has sucked up >4.7GB in my browser and crashed it. Not impressed.

Did the animation actually add anything significant anyway?


I'm not disputing that the image may not add anything to the article, but if a 10 MiB GIF image can crash your browser, maybe you should look at other vendors. I'm on Firefox Developer Edition with approximately 100 tabs open and didn't notice that image as a resource hog.


It's animated (and there are 3 on that page totalling 35MB). Some systems unfold each frame into its own image and store that.

Please open Firefox to that page, leave it open for a few minutes and observe the memory. I suspect FX will bloat too.
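
As a rough back-of-the-envelope sketch of why this happens when a decoder keeps every frame as an uncompressed RGBA bitmap (the frame size and count below are assumptions for illustration, not measurements from that page):

    /* Back-of-envelope: decoded GIF frames held as 32-bit RGBA bitmaps.
       Width, height and frame count are assumed purely for illustration. */
    #include <stdio.h>

    int main(void) {
        long long width = 800, height = 600;   /* assumed frame size  */
        long long frames = 800;                /* assumed frame count */
        long long bytes_per_pixel = 4;         /* RGBA                */
        long long total = width * height * bytes_per_pixel * frames;
        printf("%.2f GiB\n", total / (1024.0 * 1024 * 1024));
        /* ~1.4 GiB for one such GIF; three of them get into the same
           ballpark as the figure reported above. */
        return 0;
    }

So a GIF that is only a few MiB compressed can plausibly expand to over a gigabyte once every frame is decoded and kept.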


I'm doing a project perhaps relevant to this. Talk of declarative schemas (or what you imply they offer) is very interesting and I'd like to know more, but I can't find anything relevant (just Magento and SQLAlchemy). Indeed, searching for <<<"Declarative schemas" "sql">>> in Google gets this very thread on the first results page.

Any links to clear, actionable and reasonably comprehensive examples of these would be most helpful. Obviously abstract statements of the required semantics are also needed, but I also need to see what actual code would look like.

TIA


Magento (the new XML-based version, not the old create/update/revert-script version) gives a lot of these properties. Making it part of the database instead of a third-party tool would be better, though - it lets you cover more features, and with deeper integration you can get transactional updates and history logs.


Mornington Crescent!


Yours is the toxic comment. Appeal to authority is fine if the authority is an authority and WB has proven his. Suggestions of intention to flatter and accusations of hero worship are pretty gross. @kingaillas's point is entirely valid IMO.


Can you give an example (or three) where that lets us down? That would be very helpful to me I suspect.


Examples of which bit?

Languages that are context-sensitive? C, C++, Java, JavaScript.

Examples of tools based on the starting expectation that languages are context-free? Yacc, Bison, Jay.

Examples of the problems this causes? Well we're using the wrong tool for the job, right from the start. Instead of using an appropriate tool the first thing we do is bend our tools out of shape. We use side-effect actions to subvert the tool's model in an uncontrolled way. We don't get the full benefits of the tool's original model and can't rely on its guarantees.
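
The classic concrete case is C's typedef names: whether `foo * bar;` is a declaration or a multiplication depends on what `foo` means at that point, which a context-free grammar can't express on its own. A minimal sketch of the usual workaround (the "lexer hack"), with made-up names purely for illustration:

    /* Sketch of the "lexer hack": a parser action registers typedef
       names in a table the lexer consults, so the same spelling comes
       back as IDENTIFIER or TYPE_NAME depending on context.  Names and
       structure are illustrative, not from any particular grammar. */
    #include <stdio.h>
    #include <string.h>

    enum token { IDENTIFIER, TYPE_NAME };

    static const char *typedef_names[64];
    static int n_typedefs;

    /* Called from a semantic action when a typedef declaration is reduced. */
    static void register_typedef(const char *name) {
        typedef_names[n_typedefs++] = name;
    }

    /* Called by the lexer: the token kind depends on earlier input,
       which is the context-sensitivity a pure CFG cannot express. */
    static enum token classify(const char *name) {
        for (int i = 0; i < n_typedefs; i++)
            if (strcmp(typedef_names[i], name) == 0)
                return TYPE_NAME;
        return IDENTIFIER;
    }

    int main(void) {
        printf("%d\n", classify("my_t"));  /* 0: plain IDENTIFIER       */
        register_typedef("my_t");          /* as if "typedef int my_t;" */
        printf("%d\n", classify("my_t"));  /* 1: TYPE_NAME from now on  */
        return 0;
    }

The action feeding the table is exactly the kind of side effect that steps outside the tool's context-free model.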


Text -> parser -> AST -> job done. If it's any different in an IDE vs anything else I'd like to know how.


Your IDE parser will be unusable if it goes bananas while you're typing the characters needed to get from one fully, correctly parseable state to the next.

It needs to be able to handle:

    printf("hello");
and also:

    prin
and also:

    printf("He
It also needs to be able to autocomplete function signatures that exist below the current line being edited, so the parser can't simply bail out as soon as it reaches the first incomplete or incorrect line.


Isn't that pretty standard for a parser? When was the last time a compiler bailed on the very first error it hit and refused to do anything else?

The solution is to pick synchronisation points to start parsing again, e.g. ; at the end of a statement or } at the end of a block.
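
A rough sketch of that panic-mode recovery, with a toy character stream standing in for a real lexer (nothing here is from an actual compiler):

    /* Panic-mode recovery: on a parse error, skip ahead to the next
       synchronisation point (';' or '}') and resume parsing there.
       The input and the "error" token are stand-ins for illustration. */
    #include <stdio.h>

    static const char *tokens = "int x = @ ; y = 1 ; }";  /* '@' = bad token */
    static int pos;

    static int  at_end(void)  { return tokens[pos] == '\0'; }
    static char peek(void)    { return tokens[pos]; }
    static void advance(void) { if (!at_end()) pos++; }

    static void synchronize(void) {
        while (!at_end()) {
            char c = peek();
            advance();
            if (c == ';' || c == '}')
                return;               /* next statement can parse cleanly */
        }
    }

    int main(void) {
        while (!at_end()) {
            if (peek() == '@') {              /* pretend this is a parse error */
                printf("error at offset %d, resyncing\n", pos);
                synchronize();                /* rest of the input still gets seen */
            } else {
                advance();
            }
        }
        return 0;
    }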


> When was the last time a compiler bailed on the very first error it hit and refused to do anything else?

Make still does this. (That's the "Stop." in the famous "*** missing separator. Stop.") Many errors in Python still do this.

As late as 2010 I still saw some major C compilers do this.

99% of the toy compilers written for DSLs do this, or worse.

Good error recovery / line blaming is still an active field of development.


> Good error recovery / line blaming is still an active field of development.

True. But let's get terminology straight: that's not compiler science, that's parsing science. And it's no more compiler science than parsing a natural language is.


What terminology are you talking about? Neither "compiler science" nor "parsing science" are terms I used, or that the industry or academia use.

Parsing - formal theory like taxonomies of grammars, and practical concerns like speed and error recovery - remains a core part of compiler design both inside and outside of academia.


How can you be sure that that } is the end of a certain defined block? Most importantly, this affects scoping, and in many cases it's ambiguous. IDEs do have rich metadata beyond the source code, but then the parser has to be aware of it.


You're ignoring the ; which are sync points.

> How can you be sure that that } is the end of a certain defined block

If it's not in a string, what else is it but a typo? If it's a typo, it fails to parse, but so long as it doesn't crash, fine.


Maybe my wording wasn't accurate; imagine the following (not necessarily idiomatic) C code:

    int main() {
        int x;
        {
            int x[];
        // <-- caret here
        x += 42;
    }
This code doesn't compile, so the IDE tries to produce a partial AST. A naive approach will result in the first } matching the second {, so `x += 42;` will cause a type error. But as is noticeable from the indentation, it is more believable that there was, or will be, a } matching the second { at the caret position, and that `x += 42;` refers to the outer scope.

Yes, of course parsers can account for the indentation in this case. But more generally this kind of parsing is sensitive to a series of edit sequences, not just the current code. This makes incremental parsing a much different problem from ordinary parsing, and is also likely why ibains and folks use packrat parsing (which can easily be made incremental).


Partial parse state and recovery are critical. You don't want the entire bottom half of a file to lose semantic analysis while the programmer figures out what to put before a closing ).


Does that issue go away if you use packrat vs some other means?


Packrat parsers are notably faster than recursive descent parsers (also critical for IDE use) and by turning them "inside out" (replacing their memoization with dynamic programming) you get a pika parser which has very good recovery ability.
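
For anyone unfamiliar, the memoisation is the whole trick: each (rule, position) result is computed once and cached, which is what buys the linear-time bound. A minimal sketch with a one-rule toy grammar (A <- 'a' A / 'a'), purely illustrative:

    /* Packrat memoisation sketch: cache each (rule, position) result so
       no suffix of the input is ever parsed twice by the same rule.
       The grammar A <- 'a' A / 'a' is a toy chosen for brevity. */
    #include <stdio.h>

    #define FAIL    -1
    #define UNKNOWN -2

    static const char *input = "aaaa";
    static int memo_A[64];            /* memo_A[pos] = chars matched, or FAIL */

    static int parse_A(int pos) {
        if (memo_A[pos] != UNKNOWN)   /* cached: this suffix was already tried */
            return memo_A[pos];
        int result = FAIL;
        if (input[pos] == 'a') {
            int rest = parse_A(pos + 1);              /* alternative: 'a' A */
            result = (rest != FAIL) ? rest + 1 : 1;   /* fallback:   'a'    */
        }
        return memo_A[pos] = result;
    }

    int main(void) {
        for (int i = 0; i < 64; i++)
            memo_A[i] = UNKNOWN;
        printf("matched %d chars\n", parse_A(0));     /* prints 4 */
        return 0;
    }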

There are varying techniques to improve error recovery for all forms of parsing, but hacked-up recursive descent parsers (certainly the most common kind of parser I still write for my hacked-up DSLs!) have poor error recovery unless you put in the work. Most LR parsers are also awful by default.

When I was in university most focus was on LL and LR parsers, with no discussion of error recovery and more focus on memory usage/bounds than speed. I also have no idea how common it is to teach combinator-based parser grammars these days; stuff like ANTLR and yacc dominated during my studies. This would add another level of unfamiliarity for students going to work on a "real compiler".


> Packrat parsers are notably faster than recursive descent parsers

I think this needs to be qualified. I don't think a packrat parser is going to beat a deterministic top-down parser. Maybe the packrat parser will win if the recursive descent parser is backtracking.


True - the performance of a top-down parser is going to depend on if/how often it backtracks and how much lookahead it requires. But this requires control over your grammar, which you might not have, e.g. if you are parsing a standardized programming language. Practically speaking, unless you want two separate parsing engines in your IDE, the fact that Lisps are actually LL(1) and Python and Java are "almost LL(1)" doesn't get you closer to parsing C++.

Packrat parsers are equivalent to arbitrary lookahead in either LL or LR grammars.


u/morelisp may have meant faster to develop, not runtime performance.


Thanks for mentioning pika parsers. I've just had an enjoyable read of the pika parser paper.


Yes. Go and look up Tree Sitter and its references.


Parsing (the use of it rather than the theory) matters, as it affects my work. So I followed up.

See https://youtu.be/Jes3bD6P0To

Tree sitter is based on LR parsing (see 23:30 in above video) extended to GLR parsing (see 38:30).

I've had enough of fools on HN posting unverified crap to make themselves feel cool and knowledgeable (and don't kid yourself that you helped me find the right answer by posting the wrong one). Time I withdrew. Goodbye HN.


I'm not sure what you think you're refuting but Tree Sitter definitely does some different stuff to allow recoverable parsing.


Prev post by me, if it's any help https://news.ycombinator.com/item?id=25210523


I have some idea what I'm talking about, see my post https://news.ycombinator.com/item?id=25210523 and I don't know why you'd use a packrat over anything else. I pull something off the shelf, ANTLR for my latest project, then get on with it.

The parser is the most trivial side of things; if you consider that a key skill, it comes across as an issue from the 1970s, not today. Writing an RD parser by hand is not a big deal anyway, as someone else said.

If you're into optimising compilers you're focusing on a very odd area to criticise. I'd have thought more about dataflow using lattices or whatever, graph stuff, etc.

So parsing aside, what are the 'relevant' skills you feel are missing?

(and what are 'Database Compilers'? Nearest I can think of that may fit are query optimisers)


MDMA works on what's there, it doesn't conjure up what was clearly not there in you at the time.


> Americans aren’t capable of dealing with a problem where their actions impact others more than themselves.

While it's true USA'ians are probably more individualistic than many other cultures, and that may translate some way even into selfishness, the above quote is a gross, incorrect and offensive description, and I'm a Brit.

Please don't broadbrush entire cultures.

