HHH-18754 Improve HQLParser's error listener usage #9140

Merged: 1 commit, Nov 7, 2024

@NathanQingyangXu NathanQingyangXu commented Oct 23, 2024

https://hibernate.atlassian.net/browse/HHH-18754

Below is the current code pattern for HQL parsing (in org.hibernate.query.hql.internal.StandardHqlTranslator):

// Build the lexer
final HqlLexer hqlLexer = HqlParseTreeBuilder.INSTANCE.buildHqlLexer( hql );

// Build the parse tree
final HqlParser hqlParser = HqlParseTreeBuilder.INSTANCE.buildHqlParser( hql, hqlLexer );

ANTLRErrorListener errorListener = new ANTLRErrorListener() {
	@Override
	public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
		throw new SyntaxException( prettifyAntlrError( offendingSymbol, line, charPositionInLine, msg, e, hql, true ), hql );
	}

	@Override
	public void reportAmbiguity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, boolean exact, BitSet ambigAlts, ATNConfigSet configs) {
	}

	@Override
	public void reportAttemptingFullContext(Parser recognizer, DFA dfa, int startIndex, int stopIndex, BitSet conflictingAlts, ATNConfigSet configs) {
	}

	@Override
	public void reportContextSensitivity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, int prediction, ATNConfigSet configs) {
	}
};

// try to use SLL(k)-based parsing first - it's faster
hqlLexer.addErrorListener( errorListener );
hqlParser.getInterpreter().setPredictionMode( PredictionMode.SLL );
hqlParser.removeErrorListeners();
hqlParser.addErrorListener( errorListener );
hqlParser.setErrorHandler( new BailErrorStrategy() );

try {
	return hqlParser.statement();
}
catch ( ParseCancellationException e) {
	// reset the input token stream and parser state
	hqlLexer.reset();
	hqlParser.reset();

	// fall back to LL(k)-based parsing
	hqlParser.getInterpreter().setPredictionMode( PredictionMode.LL );
	hqlParser.setErrorHandler( new DefaultErrorStrategy() );

	return hqlParser.statement();
}
catch ( ParsingException ex ) {
	// Note that this is supposed to represent a bug in the parser
	// We wrap and rethrow in order to attach the HQL query to the error
	throw new QueryException( "Failed to interpret HQL syntax [" + ex.getMessage() + "]", hql, ex );
}

First, it is confusing that the error listener is registered on the lexer before the prediction mode is set to SLL and then registered on the parser afterwards (after clearing the parser's listeners), as if the listener were needed while the mode is being set. It is not, because setPredictionMode() simply assigns a field:

hqlLexer.addErrorListener( errorListener );
hqlParser.getInterpreter().setPredictionMode( PredictionMode.SLL );
hqlParser.removeErrorListeners();
hqlParser.addErrorListener( errorListener ); 

But this is minor.

I guess the two-step approach comes from the article "Improving the performance of an ANTLR parser" (Strumenta). However, there is no reason to attach the error listener to both steps, for the following reasons:

If SLL fails and the error listener fires, the second LL step may still succeed, and the user is then confused by an error message for a query that actually parsed.

If SLL fails and LL fails as well, the user is notified twice. The LL step does not skip error listener notification, and I think the LL step's message suffices in this scenario.

Most seriously, because the listener's syntaxError() method throws a SyntaxException, the LL step would be skipped entirely!
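
The last point can be illustrated with a toy model that needs no ANTLR dependency. This is only a sketch: the class and method names (ThrowingListenerDemo, runStage, CancelException, SyntaxProblem) are hypothetical stand-ins, and it assumes, as argued above, that the listener is notified before the bail strategy cancels the parse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model of the current flow; these names are illustrative, not ANTLR or Hibernate APIs.
public class ThrowingListenerDemo {

    /** Models ParseCancellationException: the signal that triggers the LL fallback. */
    static class CancelException extends RuntimeException {}

    /** Models SyntaxException: what the error listener throws. */
    static class SyntaxProblem extends RuntimeException {
        SyntaxProblem(String msg) { super(msg); }
    }

    /** One parse attempt. ASSUMPTION: the listener is notified before the parse is cancelled. */
    static String runStage(String stage, boolean fails, Consumer<String> listener, List<String> stagesRun) {
        stagesRun.add(stage);
        if (!fails) {
            return "tree from " + stage;
        }
        listener.accept(stage + " error"); // a throwing listener exits here
        throw new CancelException();       // never reached when the listener throws
    }

    /** Same throwing listener on both stages, as in the current code. */
    static String parseWithSharedListener(boolean sllFails, boolean llFails, List<String> stagesRun) {
        Consumer<String> throwing = msg -> { throw new SyntaxProblem(msg); };
        try {
            return runStage("SLL", sllFails, throwing, stagesRun);
        } catch (CancelException e) {
            // Fallback: only reachable if the first stage ends with CancelException.
            return runStage("LL", llFails, throwing, stagesRun);
        }
    }

    public static void main(String[] args) {
        List<String> stages = new ArrayList<>();
        try {
            // SLL fails, LL would have succeeded.
            parseWithSharedListener(true, false, stages);
        } catch (SyntaxProblem e) {
            // prints "SLL error, stages run: [SLL]"
            System.out.println(e.getMessage() + ", stages run: " + stages);
        }
    }
}
```

In this model the LL stage never runs even though it would have succeeded, because the exception thrown by the listener is not the one the fallback catches.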

All in all, there seems to be no reason to use an error listener for the first SLL step; what really matters is the final step. So moving the error listener creation and registration into the LL step makes more sense (needless to say, it would improve perf by avoiding unnecessary processing), as below:

hqlParser.getInterpreter().setPredictionMode( PredictionMode.SLL );
hqlParser.removeErrorListeners();
hqlParser.setErrorHandler( new BailErrorStrategy() );

try {
	return hqlParser.statement();
}
catch ( ParseCancellationException e) {
	// reset the input token stream and parser state
	hqlLexer.reset();
	hqlParser.reset();

	// fall back to LL(k)-based parsing
	hqlParser.getInterpreter().setPredictionMode( PredictionMode.LL );
	hqlParser.setErrorHandler( new DefaultErrorStrategy() );
	
	ANTLRErrorListener errorListener = new ANTLRErrorListener() {
		@Override
		public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
			throw new SyntaxException( prettifyAntlrError( offendingSymbol, line, charPositionInLine, msg, e, hql, true ), hql );
		}

		@Override
		public void reportAmbiguity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, boolean exact, BitSet ambigAlts, ATNConfigSet configs) {
		}

		@Override
		public void reportAttemptingFullContext(Parser recognizer, DFA dfa, int startIndex, int stopIndex, BitSet conflictingAlts, ATNConfigSet configs) {
		}

		@Override
		public void reportContextSensitivity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, int prediction, ATNConfigSet configs) {
		}
	};
	hqlParser.addErrorListener( errorListener );

	return hqlParser.statement();
}
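
The control flow of the proposed version can be sketched with the same kind of toy model, again without ANTLR. All names here (TwoStageParseSketch, runStage, CancelException, SyntaxProblem) are hypothetical, and the model only mirrors the shape of the code above: a silent first stage that bails, and a final stage that is the only one to report errors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model of the proposed flow; these names are illustrative, not ANTLR or Hibernate APIs.
public class TwoStageParseSketch {

    /** Models ParseCancellationException: the fast stage gave up. */
    static class CancelException extends RuntimeException {}

    /** Models SyntaxException: the error finally shown to the user. */
    static class SyntaxProblem extends RuntimeException {
        SyntaxProblem(String msg) { super(msg); }
    }

    /** One parse attempt; a null listener means errors are not reported at this stage. */
    static String runStage(String stage, boolean fails, Consumer<String> listener, List<String> stagesRun) {
        stagesRun.add(stage);
        if (!fails) {
            return "tree from " + stage;
        }
        if (listener != null) {
            listener.accept(stage + " error");
        }
        throw new CancelException();
    }

    static String parse(boolean sllFails, boolean llFails, List<String> stagesRun) {
        try {
            // Fast stage: no listener, so a failure only triggers the fallback.
            return runStage("SLL", sllFails, null, stagesRun);
        } catch (CancelException e) {
            // Final stage: the only place where errors are reported.
            Consumer<String> throwing = msg -> { throw new SyntaxProblem(msg); };
            return runStage("LL", llFails, throwing, stagesRun);
        }
    }

    public static void main(String[] args) {
        List<String> stages = new ArrayList<>();
        // prints "tree from LL after [SLL, LL]"
        System.out.println(parse(true, false, stages) + " after " + stages);

        try {
            parse(true, true, new ArrayList<>());
        } catch (SyntaxProblem e) {
            // prints "reported once: LL error"
            System.out.println("reported once: " + e.getMessage());
        }
    }
}
```

With the listener confined to the final stage, an input that only the LL stage can parse produces no error, and an input that fails both stages is reported once.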

hibernate-github-bot bot commented Oct 23, 2024

Thanks for your pull request!

This pull request appears to follow the contribution rules.

› This message was automatically generated.

@NathanQingyangXu NathanQingyangXu changed the title HHH-18754 improve HQLParser's error listener usage in StandardHqlTran… HHH-18754 improve HQLParser's error listener usage Oct 23, 2024
@NathanQingyangXu NathanQingyangXu changed the title HHH-18754 improve HQLParser's error listener usage HHH-18754 Improve HQLParser's error listener usage Oct 23, 2024

@gavinking gavinking left a comment

LGTM

@sebersole

> needless to say, it would improve perf by avoiding unnecessary processing

I'm curious, have you actually verified that?

NathanQingyangXu commented Nov 2, 2024 via email

NathanQingyangXu commented Nov 2, 2024 via email

@sebersole sebersole merged commit 2eeb615 into hibernate:main Nov 7, 2024
5 of 6 checks passed