Consider using ahocorasick-rs instead of pyahocorasick #9825
Labels: enhancement, good first issue, hacktoberfest, help wanted
nijel added the hacktoberfest, help wanted, and good first issue labels on Aug 29, 2023
This issue seems to be a good fit for new contributors. You are welcome to contribute to Weblate! Don't hesitate to ask any questions you have while implementing this. You can learn how to get started in our contributors documentation.
I've run a very basic benchmark and it seems reasonable to switch:

from weblate.checks.data import IGNORE_WORDS
import ahocorasick
import ahocorasick_rs
import timeit

# Building is expensive, so use far fewer runs than timeit's default of 1,000,000.
BUILD_RUNS = 100


def build_py():
    # pyahocorasick: add every term, then finalize the automaton.
    automaton = ahocorasick.Automaton()
    for term in IGNORE_WORDS:
        automaton.add_word(term, term)
    automaton.make_automaton()
    return automaton


def build_rs():
    # ahocorasick-rs: the automaton is built directly from the patterns.
    return ahocorasick_rs.AhoCorasick(
        IGNORE_WORDS,
        implementation=ahocorasick_rs.Implementation.ContiguousNFA,
        store_patterns=False,
    )


print("Build")
# Call the builders; timing the bare name "build_py" would only measure a name lookup.
print(timeit.timeit("build_py()", globals={"build_py": build_py}, number=BUILD_RUNS))
print(timeit.timeit("build_rs()", globals={"build_rs": build_rs}, number=BUILD_RUNS))

ac_py = build_py()
ac_rs = build_rs()

print("Find")
print(
    timeit.timeit(
        "list(ac_py.iter('Please enter the correct username and password.'))",
        globals={"ac_py": ac_py},
    )
)
print(
    timeit.timeit(
        "ac_rs.find_matches_as_indexes('Please enter the correct username and password.')",
        globals={"ac_rs": ac_rs},
    )
)
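A single `timeit.timeit` call can be noisy, so a benchmark like this may be easier to trust when each measurement is repeated and the best run is kept. The following is only a minimal sketch of that idea using the standard library: `best_time_per_call`, `WORDS`, `TEXT`, `find_naive`, and `find_split` are hypothetical names introduced for illustration, and the two stand-in functions are not the pyahocorasick or ahocorasick-rs matchers, so the sketch runs without either library installed.

```python
import timeit


def best_time_per_call(func, number=1000, repeat=5):
    """Return the lowest average seconds per call over several timed runs."""
    # timeit.repeat runs `func` `number` times per run, for `repeat` runs.
    totals = timeit.repeat(func, number=number, repeat=repeat)
    # The minimum is usually the run least disturbed by other processes.
    return min(totals) / number


# Hypothetical stand-in data; the real benchmark uses Weblate's IGNORE_WORDS.
WORDS = ("the", "and", "enter")
TEXT = "Please enter the correct username and password."


def find_naive():
    # Substring search: one pass over the text per word.
    return [(word, TEXT.find(word)) for word in WORDS if word in TEXT]


def find_split():
    # Token lookup: split the text once, then test set membership.
    tokens = set(TEXT.lower().rstrip(".").split())
    return [word for word in WORDS if word in tokens]


if __name__ == "__main__":
    for name, func in (("naive", find_naive), ("split", find_split)):
        seconds = best_time_per_call(func)
        print(f"{name}: {seconds * 1e6:.3f} microseconds per call")
```

To apply this to the real comparison, the stand-ins would be replaced by callables wrapping `ac_py.iter(...)` and `ac_rs.find_matches_as_indexes(...)`.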
nijel added a commit to nijel/weblate that referenced this issue on Sep 5, 2023:
This delivers a better performance. Fixes WeblateOrg#9825
nijel added a commit that referenced this issue on Sep 5, 2023:
This delivers a better performance. Fixes #9825
Thank you for your report; the issue you have reported has just been fixed.
Describe the problem
https://pypi.org/project/ahocorasick-rs/ seems to be a faster alternative to pyahocorasick.
Describe the solution you'd like
It would be useful to benchmark it against Weblate's use case and switch to it if it outperforms pyahocorasick.
Describe alternatives you've considered
No response
Screenshots
No response
Additional context