Implement "Principled parsing for indentation-sensitive languages" #235

Open
TeofilC opened this issue Apr 22, 2022 · 6 comments

Comments

@TeofilC

TeofilC commented Apr 22, 2022

Principled parsing for indentation-sensitive languages [pdf] lays out a way to extend parser generators like Happy so that they can deal directly with indentation-sensitive languages like Haskell. The paper mentions that a patch was made to extend Happy with this functionality, but as far as I can tell it was never merged into Happy.

Why hasn't this been added to Happy yet? Did nobody have the time to finish and optimise this work, or is there another reason?

@Ericson2314
Collaborator

Thanks for opening this. I was pointed to that paper too, and was also saddened that we are not already using it.

@Ericson2314
Collaborator

Ericson2314 commented Aug 22, 2022

Here is the PDF from the author's website, without a paywall: https://michaeldadams.org/papers/layout_parsing/LayoutParsing.pdf

@Ericson2314
Collaborator

https://michaeldadams.org/projects/happy-indent/happy.indent.tar.gz (from https://michaeldadams.org/projects/) is the source code behind the paper.

Note that Happy was still in darcs at the time, so we will need to do some careful surgery to extract just the fork's changes as a branch off the right commit in this git repo.

@Ericson2314
Collaborator

Ericson2314 commented Aug 22, 2022

happy.indent.tar.gz is a copy of the download, in case something happens to the author's website.

@andreasabel
Member

Is the work of Michael Adams production-ready? Looking at his paper, it seems that he has given a new grammar formalism, IS-CFG (indentation-sensitive context-free grammars), and explained how to generate a parser for it.
However, there isn't any effort to design a nice surface syntax. To get layout into your grammar, you have to annotate each and every symbol in each production. Looking at his examples, it seems that many of these annotations are schematic. I'd find it tedious to figure out all of the annotations for a realistic grammar.
Experiments such as the layout mechanism of BNFC suggest that there are higher-level approaches to indentation-sensitivity. BNFC's method is insufficient so far (see its open issues), but it is now being actively worked on (results expected in 2023).
It might be that Adams' approach could be one layer on top of which sugar is defined (in the same way that fixity declarations are sugar and can be reduced to pure CFGs). However, there are other approaches to indentation-sensitivity that should be explored before committing to Adams' solution (e.g. Erdweg, Rendel, Kästner, Ostermann, Layout-sensitive Generalized Parsing).
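
To make the per-symbol annotation burden concrete, here is a minimal sketch of the idea as I read it: every symbol on the right-hand side of a production carries a relation between its own indentation and that of its parent. This is a simplified illustration only. It is not Happy's syntax, not the patch from the paper, and the names `Node`, `RELATIONS`, and `well_indented` are hypothetical; the toy `do_expr` grammar is also an assumption made up for the example.

```python
import operator
from dataclasses import dataclass, field

# Assumed set of indentation relations: each compares a child's
# indentation with its parent's. "*" means "no constraint".
RELATIONS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "*": lambda child, parent: True,
}


@dataclass
class Node:
    symbol: str
    indent: int  # indentation (column) assigned to this symbol
    # Each child is paired with the relation annotated on it in the production.
    children: list = field(default_factory=list)


def well_indented(node: Node) -> bool:
    """Check that every annotated relation holds throughout the tree."""
    return all(
        RELATIONS[rel](child.indent, node.indent) and well_indented(child)
        for rel, child in node.children
    )


# Toy productions (hypothetical):
#   do_expr -> "do"(=) block(>)
#   block   -> stmt(=) stmt(=)
good = Node("do_expr", 0, [
    ("=", Node("do", 0)),
    (">", Node("block", 4, [
        ("=", Node("stmt", 4)),
        ("=", Node("stmt", 4)),
    ])),
])

# Same shape, but the second statement is not aligned with its block.
bad = Node("do_expr", 0, [
    ("=", Node("do", 0)),
    (">", Node("block", 4, [
        ("=", Node("stmt", 4)),
        ("=", Node("stmt", 2)),
    ])),
])

print(well_indented(good), well_indented(bad))
```

Even in this two-production toy, four symbols need an explicit relation, and most of them are the same `=`. That is the kind of schematic repetition a surface-syntax layer would presumably want to hide.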

@TeofilC
Author

TeofilC commented Oct 2, 2024

See also https://gitlab.haskell.org/ghc/ghc/-/issues/25322, which is another solution for getting phase separation between the lexer and the parser.
