Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interactive reads #439

Merged
merged 4 commits into from
Dec 13, 2022
Merged

Interactive reads #439

merged 4 commits into from
Dec 13, 2022

Conversation

jjttjj
Copy link

@jjttjj jjttjj commented Dec 1, 2022

Addresses #438

I had mainly wanted to be able to upgrade my clojure repl to a convex repl and that has been working well for me. But there could be things I'm missing, and edge cases and implications of this change that I'm still missing. Any feedback is welcome. Below are a few things I'm unsure of

EOF handling

One thing I'm unsure of is if there should be better EOF handling with this feature. Should there be an EOF exception thrown when a read is attempted on EOF? The following currently just throws a ParseException:

(let [r (java.io.PushbackReader. (java.io.StringReader. "(+ 1 2 3)"))]
  [(AntlrReader/readOne r)
   (AntlrReader/readOne r)])

Should there be more options for EOF handling, as the clojure.core/read allows?

Testing

For now the tests I added are fairly minimal, just checking for things I ran into during development. I have tried out using readOne in place of read for all the relevant test cases and verified that things work as expected. I could try to refactor the reader tests to do in all appropriate tests as well. Or just copy and paste several other of the read tests to a readOne version.

[eE] (DIGITS | SIGNED_DIGITS);
fragment
DOUBLE_TAIL:
[.eE] [-0-9.eE]*;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this is required to properly throw a ParseException on 1.0e0.1234, instead of just reading 1.0 and leaving the rest on the stream.

@@ -148,7 +140,7 @@ QUOTING: '\'' | '`' | '~' | '~@';


KEYWORD:
':' NAME;
':' NAME?;
Copy link
Author

@jjttjj jjttjj Dec 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allows for a parse excepton to be thrown on : input


p++;
currentCharIndex++;
//sync(1);
Copy link
Author

@jjttjj jjttjj Dec 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two consume implementations are the same as the Unbuffered(Char|Token)Stream class that they extend except without the sync call at the end, which consumes more characters/tokens from the input


ParseTree tree = parser.form();

r.unread(cs.LA(1));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even after disabling the syncing in the streams with our custom consume implementations below, there is a single additional character that is read from the stream when a form is parsed. I'm currently unsure of the exact mechanism that causes this but I believe it involves consume calls in antlr's LexerATNSimulator class. It's possible that there's a better way to handle this there. But just unreading from the stream here seems to work well.

@@ -203,6 +195,6 @@ fragment
COMMENT: ';' ~[\r\n]* ;

TRASH
: ( WS | COMMENT ) -> channel(HIDDEN)
: ( WS | COMMENT ) -> skip
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The built in CommonTokenStream that is used for the other reads filters for tokens only from the default channel. This feature doesn't use that token stream and instead modifies UnbufferedTokenStream, which doesn't include channel filtering.

It would be possible to add that functionality back into the token stream we use, but since we are not doing anything with the hidden token channel, it should be fine to just skip these tokens instead.

@mikera
Copy link
Member

mikera commented Dec 1, 2022

This looks very good, happy to merge when you think it is ready. Are you on Discord @jjttjj and if so what is your handle?

Regarding EOF handling, I think some way to control this might be useful. I can see a couple of options:

  • Optional extra argument to say what happens when EOF is reached. Unfortunately, just retuning null isn't useful because that is also a valid form but
  • Have a static dummy ACell instance that acts as a marker? This wouldn't be a valid ACell in any way but a singleton that can be used to detect EOF
  • Have a function like hasAnotherForm that detects a next form. Might be inefficient though if used on large input...

@mikera mikera merged commit 664e7df into Convex-Dev:master Dec 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants