
> Grammar covers the correct use of the syntax of a language. With syntax alone you cannot know whether a program is a correct use of a language -- that is, it will parse or compile, even if its output is nonsensical. A grammar lets you do that.

Suppose I am writing a Pascal compiler. I might start with a grammar for Pascal, expressed in EBNF. I might write a parser from that grammar, either by hand, or by using some parser generator. My parser converts text (or a sequence of tokens generated by an independent lexical analysis stage) to some kind of parse tree or abstract syntax tree.

Since Pascal is a statically typed language, I also need to write some code to do type-checking. If you look at a tutorial on writing a compiler, you might see this presented as a subsequent compilation stage, after parsing is complete. It takes the AST and symbol table, annotates it with type information (possibly even doing type inference), and checks various rules are obeyed.
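
To make that separation concrete, here is a minimal sketch in Python (not Pascal, and not how any particular real compiler is organised). The toy grammar and the names `tokenize`, `parse` and `type_of` are all made up for illustration. The parser is derived from the grammar alone; the type rule lives in a separate function that runs over the finished tree.

```python
import re

# Hypothetical toy grammar, in EBNF:
#   expr = term { "+" term } ;
#   term = NUMBER | STRING ;
TOKEN = re.compile(r'\s*(?:(\d+)|"([^"]*)"|(\+))')

def tokenize(text):
    """Lexical analysis: turn text into (kind, value) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.group(1) is not None:
            tokens.append(("num", int(m.group(1))))
        elif m.group(2) is not None:
            tokens.append(("str", m.group(2)))
        else:
            tokens.append(("plus", "+"))
        pos = m.end()
    return tokens

def parse(text):
    """Syntax only: build a tree according to the grammar above."""
    tokens = tokenize(text)
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] == "plus":
            raise SyntaxError("expected a number or string")
        node = tokens[pos]
        pos += 1
        return node

    node = term()
    while pos < len(tokens):
        if tokens[pos][0] != "plus":
            raise SyntaxError("expected '+'")
        pos += 1
        node = ("add", node, term())
    return node

def type_of(node):
    """Separate stage: a typing rule the grammar says nothing about."""
    kind = node[0]
    if kind in ("num", "str"):
        return kind
    left, right = type_of(node[1]), type_of(node[2])
    if left != right:
        raise TypeError(f"cannot add {left} and {right}")
    return left
```

With this sketch, `parse('1 + "a"')` succeeds and returns a tree, because the input matches the grammar; it is only `type_of` on that tree that raises `TypeError`. By contrast, `parse("1 +")` fails with `SyntaxError` before any type rule is consulted.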

An EBNF grammar doesn't tell us how to do that type-checking. Generally you'd either express the typing rules informally, or, if you were going to use some kind of formalism, you'd use a rather different one from those in which programming language grammars are commonly expressed. There are "typed grammar" formalisms which combine type-checking with parsing, but you don't see them commonly used by people implementing programming languages in the real world. In my experience they are mainly popular in academia (and I've seen them presented more often in the context of natural language processing than programming languages).

I don't get the impression you are thinking of one of those "typed grammar" formalisms; I get more the impression that you are using the word "grammar" in an idiosyncratic way which elides the conceptual distinction between syntax and semantics, and ignores the standard usage in which the word "grammar" is primarily associated with the former, not the latter.




The fact remains -- you cannot know that a Tcl program is syntactically correct without running it, can you? Because the full syntax is determined by the extensions.

Can you write a parser for Tcl in Yacc?

https://compilers.iecc.com/comparch/article/99-08-100

http://computer-programming-forum.com/57-tcl/7aaac08c64c8c61...

This is why I say Tcl doesn't have a grammar. It literally lacks one by definition. Unless something fundamental has changed.

IMO it's a terrible language which should have been left in the last century with perl4.


> The fact remains -- you cannot know that a Tcl program is syntactically correct without running it, can you? Because the full syntax is determined by the extensions.

Okay, but the same is true of Common Lisp. In Common Lisp, you can't know whether a program is syntactically correct without compiling it–and compiling it executes those parts of it declared to run at compile-time (macros, EVAL-WHEN, etc). Common Lisp supports reader macros (SET-MACRO-CHARACTER), so your program can define some entirely new language syntax and immediately start using it. I found a good example [0] someone made of using it to extend Common Lisp to support embedded JSON. You slip the json-reader.lisp file into your Common Lisp application, and then you can do stuff like this:

    ;;; example.lisp
    (in-package #:example)
    (enable-json-syntax)
    (defun example-object () {
       "someArray": [1, 2, 3],
       "someBoolean": true,
       "someNull": null,
       "nestedObject": {
          "type": "example"
       }
    })
    (disable-json-syntax)

There is no way you can tell that's valid Common Lisp without actually compiling and executing the code in the json-reader.lisp file. When the compiler gets to (enable-json-syntax), it sees that it is a macro so it executes its code at compile-time. That code then redefines the meaning of the { and [ characters to be the start of JSON objects/arrays (instead of their normal meaning in Common Lisp), and installs code telling Common Lisp how to parse JSON syntax, which is triggered when those characters are encountered in input. The example-object function shows an example of using it. Finally, (disable-json-syntax) is another macro, also executed at compile time, which undoes those changes to the Common Lisp syntax.
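
The mechanism is easier to see in a stripped-down model. Below is a minimal sketch in Python of a reader with a mutable dispatch table, loosely modelled on the idea of a Lisp readtable. Everything in it is hypothetical: `read_program`, `read_bracket_list` and the `(enable-brackets)` / `(disable-brackets)` directives are names I made up, and none of this is how Common Lisp or json-reader.lisp is actually implemented. In real Common Lisp the installed handler is arbitrary user code rather than one hard-wired function.

```python
def read_bracket_list(text, pos):
    """Reader extension: parse "[1 2 3]" starting at text[pos] == "["."""
    end = text.find("]", pos)
    if end == -1:
        raise SyntaxError("unterminated [")
    return [int(item) for item in text[pos + 1:end].split()], end + 1

def read_program(text):
    """Read forms left to right, executing directives as they are read."""
    macro_chars = {}   # the mutable "readtable": character -> handler
    forms, pos = [], 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch in macro_chars:
            # An installed handler takes over parsing from this character.
            value, pos = macro_chars[ch](text, pos)
            forms.append(value)
        else:
            end = pos
            while end < len(text) and not text[end].isspace():
                end += 1
            word = text[pos:end]
            pos = end
            if word == "(enable-brackets)":
                macro_chars["["] = read_bracket_list   # syntax changes here
            elif word == "(disable-brackets)":
                macro_chars.pop("[", None)             # and is undone here
            elif word.isdigit():
                forms.append(int(word))
            else:
                raise SyntaxError(f"unreadable token {word!r}")
    return forms
```

In this toy, `read_program("(enable-brackets) [1 2 3] (disable-brackets) 7")` reads fine, while `read_program("[1 2 3]")` on its own raises `SyntaxError`. Whether the bracketed text is valid depends on a directive having been executed earlier in the same input, which is the property at issue.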

Common Lisp did not invent the idea of allowing a program to dynamically alter the language syntax as it was being parsed; I believe the idea was first introduced in Maclisp, and from there spread to other Lisp dialects (such as Franz Lisp and Interlisp) before finally being standardised in Common Lisp. The designers of Scheme intentionally decided to leave this feature out, but some implementations/descendants of Scheme have added it back in, such as Racket [1], Guile [2], and Chicken Scheme [3]. Other Lisp-family languages (such as Clojure or Emacs Lisp) have resisted calls to add it, despite encouragement from certain quarters.

Earlier you spoke of "Tcl's lack of a grammar up against Scheme, Lisp", but if runtime-extensible syntax is a form of "lack of a grammar", then many Lisp-family languages lack a grammar too.

Another language for which this is true (albeit to a significantly lesser degree) is Java. With Java, you can run arbitrary code at compile time by defining an annotation processor. The annotation processor code is run when the annotation is encountered, and it is allowed to issue compilation errors. It can also generate more source code for additional classes, which gets added to the compilation process. Your program may refer to some of those generated classes, and hence need that generated code in order to compile successfully. For an example, see [4] from the Google AutoValue docs, where the "AutoValue_Animal" class being constructed in the create() method is dynamically generated by a processor which is triggered by the @AutoValue annotation. So, if your Java program contains annotations with an associated processor, and if you define "syntactically correct" as "compiles without any errors", then you can't know if the Java program is syntactically correct without executing arbitrary Java code.

[0] https://gist.github.com/chaitanyagupta/9324402

[1] https://docs.racket-lang.org/reference/Reader_Extension.html

[2] https://www.gnu.org/software/guile/manual/html_node/Reader-E...

[3] https://wiki.call-cc.org/man/5/Module%20(chicken%20read-synt...

[4] https://github.com/google/auto/blob/master/value/userguide/i...



