Expanded thoughts

A great language should let you think useful thoughts you have never been able to think before. This new project I'm working on, whose code name is "Smalltalk of Modeling", is expanding my mind.

Here is one idea. Lets generalize the idea of grammars. A grammar is normally used to parse text to create a parse tree. But if you annotate the grammar with constructors then the parse tree can be instantiated create arbitrary information structures, which we can view as the semantics of the text. If you are careful, then you will find that there can be quite a bit of divergence between the syntax and the semantics; for example, while syntax is tree-based, the semantics may be a graph. We can also run the process backwards, where we take some semantic object and render it into the grammar to create a parse tree, which can then be written out to text. This amounts to putting some clothes onto the abstract semantic structure.

Text --parse(Grammar)---> ParseTree --instantiate(Factory)--> Semantic Structure
Semantic Structure --render(Grammar)--> ParseTree --
------display----------> Text

Note that parse and render, which both use the Grammar, are not inverses of each other. Instead, they both output ParseTrees, but they do it from two different directions.

Now the interesting thing is that a Grammar is really just a ParseTree with added structure: to allow for alternation, repetition, and variable data. We don't need to add sequencing, because ParseTree already has that. That is, we can view a grammar as

ParseTreeGrammarStructure = Kleene(ParseTreeStructure)

This is probably not too surprising. The interesting thing is that once we make structural descriptions be values, then we can write the Kleene function, which "grammarizes" any structure. (Its called Kleene after the person who first formalized the idea of alternation, repetition, and sequencing.)

My new language lets me think about, and write down, the Kleene function. And use it in more general ways. For example, I can now contemplate constructing

DiagramGrammarStrucutre = Kleene(DiagramStructure)

This creates a semantic structure for the grammar of diagrams. I can then use it with my generic render routine to create diagrams of any semantic structure:

Semantic Structure ---render(DiagramGrammar)---> Diagram

This brings up the question of parsing diagrams. It turns out that we will edit them rather than parse them, but it will be the topic for a future post.

1 comment:

gasche said...

I found this post really hard to understand, partly because I don't know the topics you're referring to, or know them from different point of views, and partly because it is a bit too vague in some places (bug that's to be expected from an "Expanded Thoughts" post).

If I understand correctly, the "instantiation" you're talking about corresponds to adding "actions" to a grammar : for each rule, describe a "result" of the rule parsing, depending on the (result of) the subcomponents.

So the "grammar" in your vision would be the descriptive (syntax) part, without actions, that can be then attached different set of actions to produce different "semantics" from it.

I understand that idea, but I'm not sure considering actions/instantiations as a core business of a grammar is so useful. One may say that a grammar only produce a ParseTree, and them define further semantics by simply defining transformation from a tree to other things (this is the "elimination" of the tree, and in a functional language it would be performed by a "fold", or a recursive traversal using pattern matching). It turns out that presenting grammars as "things that produce results out of the parsed input" and "just trees built from the input" is equivalent, as you can build the tree as a particular semantics (each action would then be "build the corresponding tree", and in a very precise sense it's the most general thing you can do), or process the semantics from the tree after-the-fact. Given that the notion of "producing something else from a tree" is also useful for other trees that the ones resulting from parsing, I'm not sure it gives you so much to bundle that notion inside the "grammar" concept.

What I don't understand then is how you go in the other direction, from a semantics to a parsetree again. From my understanding of the first part, you would need to compute an "inverse" of the grammar action, but certainly those actions (instantiations) are in general not inversible. If the language I parse is arithmetic formulas over integers, my action evaluate it as float (or rational numbers), what parse-tree will I get back from "1.5" which is the semantics of the "3 / 2" input? I think an example would help here.

It also took me a long time to make sense of you "ParseTreeGrammarStructure = Kleene(ParseTreeStructure)" example. It don't help that you don't really say what those Structure are. It is also terribly confusing in that context that your Kleene function apparently does not mean "Kleene star", which is the first idea that came to my mind, and I suspect would be quite suggestive to most people that do know Kleene's name (even if he has done a lot of more profound things that only studying the Kleene star operator).

What I understand is: your "Kleene" transformation goes from a "grammar without action (a simple tree description)" to a "grammar with actions (a tree->semantics transformer)". So the Kleene of the Grammar structure would be a description of the things that consume grammars and produce some semantics value from them. Is it correct? Again, in functional programming terms, that would be a generic transformation that, from a datatype description, returns the folding function over that datatype.