[EPIC] Improve sqlparser performance #1557

alamb · 2024-11-26T14:07:00Z

What problem are you trying to solve?

Normally, in a SQL processing system, parsing SQL is not a major bottleneck compared to actually processing data. Still, given how many SQL strings are parsed by this crate, I think there is significant benefit to improving the parser's performance.

At the same time, I think it is important to minimize the impact on downstream crates as much as possible.

Recently, we started introducing locations into the parser (thanks again @Nyrox!), which we found slows things down a bit (see #1435 (comment)).

Thankfully, I think there is significant room for improvement. As part of adding the location information, I spent some time profiling, and I think there are some obvious ways to improve the performance without impacting downstream crates.

Here is the flamegraph for anyone who is interested (you can download it locally to get zoom / etc):

[Image: fixed-flamegraph]

What would you like to see?

The idea would be to:

1. Run the benchmarks (instructions in Document micro benchmarks #1555)
2. Maybe add additional benchmarks so they are more representative
3. Improve the benchmarks
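
For quick before/after comparisons while experimenting, a tiny timing loop can be enough. The sketch below is not the crate's benchmark suite (see #1555 for that). `mean_time_per_call` and `count_pieces` are hypothetical names, and the workload is only a stand-in for a real call into the parser.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `work` `iterations` times and returns the
/// mean wall-clock time per call.
pub fn mean_time_per_call<T>(iterations: u32, mut work: impl FnMut() -> T) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box discourages the optimizer from discarding the result.
        black_box(work());
    }
    start.elapsed() / iterations
}

/// Stand-in workload (an assumption for illustration): counts the
/// whitespace-separated pieces of a SQL string. Replace it with a real
/// parse call when measuring the parser.
pub fn count_pieces(sql: &str) -> usize {
    sql.split_whitespace().count()
}
```

A call such as `mean_time_per_call(1_000, || count_pieces(sql))` gives a rough number to compare across changes. A proper benchmark harness is more trustworthy for anything that ends up in a PR description.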

Ideas to improve performance:

- The most obvious one is for next_token / peek to not clone each Token (which involves copying strings): Improve performance by not copying Tokens as much #1558
- Make Parser generic around dialect #1381
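
To make the first idea concrete, here is a minimal sketch of the difference between a peek that clones and one that borrows. Everything in it is an assumption for illustration: `MiniParser`, `peek_cloned`, `peek_ref`, `next_ref`, and the cut-down `Token` enum are hypothetical and are not the crate's actual types or signatures.

```rust
/// Simplified stand-in for a lexer token (hypothetical, far smaller than
/// the crate's real token type).
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Word(String),
    Number(String),
    Comma,
    EOF,
}

/// Shared end-of-input token, so the borrowing methods can always return
/// a reference.
static EOF_TOKEN: Token = Token::EOF;

/// Hypothetical mini-parser over an already tokenized input.
pub struct MiniParser {
    tokens: Vec<Token>,
    index: usize,
}

impl MiniParser {
    pub fn new(tokens: Vec<Token>) -> Self {
        Self { tokens, index: 0 }
    }

    /// Cloning style: each call copies the token, which allocates a new
    /// String for `Word` and `Number`.
    pub fn peek_cloned(&self) -> Token {
        self.tokens.get(self.index).cloned().unwrap_or(Token::EOF)
    }

    /// Borrowing style: no allocation, the caller only looks at the token.
    pub fn peek_ref(&self) -> &Token {
        self.tokens.get(self.index).unwrap_or(&EOF_TOKEN)
    }

    /// Advances past the current token and returns a reference to it.
    pub fn next_ref(&mut self) -> &Token {
        let i = self.index;
        if i < self.tokens.len() {
            self.index += 1;
        }
        self.tokens.get(i).unwrap_or(&EOF_TOKEN)
    }
}

/// Example consumer: counts `Word` tokens without cloning any of them.
pub fn count_words(parser: &mut MiniParser) -> usize {
    let mut n = 0;
    loop {
        match parser.next_ref() {
            Token::Word(_) => n += 1,
            Token::EOF => break,
            _ => {}
        }
    }
    n
}
```

Most call sites only inspect the token's variant, so a borrowed token is enough. A caller that needs ownership can still call `.clone()` explicitly at that one site.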

This was referenced Nov 26, 2024

Implement Spanned to retrieve source locations on AST nodes #1435 (Merged)

Improve performance by not copying Tokens as much #1558 (Open)
