Improve performance of conversion #190

Open
kavitharaju opened this issue Nov 7, 2022 · 2 comments

Comments

@kavitharaju
Collaborator

Parsing with the tree-sitter module is quite fast, even for large and complex USFM files. We then walk the resulting syntax tree sequentially to convert it to USX, JSON, etc., and this conversion step is where performance suffers badly. We need to look into alternative programming approaches, such as callbacks, to improve this.
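To make the callback idea concrete, here is a minimal sketch of one possible shape: a table of handlers keyed by node type, driven by a single iterative pass over the tree. Everything in it is an assumption for illustration. `Node` is a hypothetical stand-in, not the tree-sitter node API, the node type names and the emitted markup are simplified placeholders rather than real USX, and nothing here has been measured against the current converter.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a syntax-tree node (NOT the tree-sitter API)."""
    type: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


# Registry mapping a node type to the callback that converts it.
HANDLERS: Dict[str, Callable[[Node, List[str]], None]] = {}


def handles(node_type: str):
    """Decorator that registers a callback for one node type."""
    def register(func):
        HANDLERS[node_type] = func
        return func
    return register


@handles("chapter")
def on_chapter(node: Node, out: List[str]) -> None:
    out.append(f'<chapter number="{node.text}"/>')


@handles("verse")
def on_verse(node: Node, out: List[str]) -> None:
    out.append(f'<verse number="{node.text}"/>')


@handles("text")
def on_text(node: Node, out: List[str]) -> None:
    out.append(node.text)


def convert(root: Node) -> str:
    """Visit every node once, in document order, and dispatch to its callback."""
    out: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        handler = HANDLERS.get(node.type)
        if handler is not None:
            handler(node, out)
        # Push children reversed so the leftmost child is popped first.
        stack.extend(reversed(node.children))
    return "".join(out)
```

Usage, again with made-up node types: `convert(Node("book", children=[Node("chapter", "1"), Node("verse", "1"), Node("text", "In the beginning")]))`. Whether a dispatch table like this actually helps depends on where the time is going, so it is worth profiling before restructuring.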

@shadow-light

Yes, just to give some real-world stats:

https://github.com/schierlm/BibleMultiConverter can do USFM->USX conversion for a whole Bible in ~6 seconds.

usfm-grammar (Node) takes 3-60 seconds per book, so probably ~2000 seconds for a whole Bible. I didn't run the whole thing, as it might have taken half an hour.

But there are different use cases, and it looks like this could be really useful as a more feature-rich converter. I'd be keen to hear if there are performance improvements.

@kavitharaju
Collaborator Author

Note: In Python, we could use https://docs.python.org/3/library/profile.html to find out where improvement is needed.
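A minimal sketch of how that profiling could look, using the standard-library `cProfile` and `pstats` modules. `convert_to_usx` is a hypothetical placeholder for whatever conversion call is being measured, not a real usfm-grammar function.

```python
import cProfile
import io
import pstats


def convert_to_usx(tokens):
    """Hypothetical placeholder for the real conversion step."""
    return "".join(f"<t>{tok}</t>" for tok in tokens)


def profile_conversion(tokens, top=10):
    """Run the conversion under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = convert_to_usx(tokens)
    profiler.disable()

    # Collect the report in a string instead of printing to stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()
```

Calling `profile_conversion(tokens)` and printing the second return value lists the functions with the highest cumulative time, which should show whether the tree traversal or the output building dominates.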
