[#2091] Minor Enhancements to Existing Regex Code #2115
LGTM! It seems that many other places in the code use matchers in split function calls as well. Would it be worth creating a matcher in StringsUtil for all these other split calls?
@sopa301 It does seem worthwhile to consolidate common Regex patterns into a single class to avoid creating multiple Pattern/Matcher objects. I will look into doing that!
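For readers following along, the consolidation idea could look roughly like the sketch below. It is only an illustration: the class name `RegexPatterns`, the constants, and the helper `splitOnComma` are hypothetical and are not taken from this PR. It relies only on the fact that `java.util.regex.Pattern` instances are immutable and safe to share once compiled.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical holder for shared, precompiled patterns.
 * All names here are illustrative assumptions, not the PR's actual code.
 */
final class RegexPatterns {
    // Compiled once when the class is loaded; Pattern objects are immutable
    // and thread-safe, so every caller can reuse the same instance.
    static final Pattern COMMA_SPLITTER = Pattern.compile("\\s*,\\s*");
    static final Pattern WHITESPACE_SPLITTER = Pattern.compile("\\s+");

    private RegexPatterns() {
        // Utility class: no instances.
    }

    /** Splits on commas, ignoring whitespace around each comma. */
    static String[] splitOnComma(String input) {
        return COMMA_SPLITTER.split(input.trim());
    }
}
```

A call site would then use `RegexPatterns.splitOnComma(line)` instead of `line.split("\\s*,\\s*")`, so the regex text lives in one place.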
Do you intend to do this in this PR? Also, in case you've done the profiling, I'm curious about the extent of the performance hit that
@gok99 I will work on the aforementioned comment soon and update you with the results from the profiler!
@gok99 After some preliminary testing and profiling, it seems that the refactoring does not have much of an impact on the overall performance of the code. The increases or decreases are likely the result of randomness when executing the program. Here are the findings:
Time
Space
These findings were made before we consolidated Regex patterns into a single class to avoid duplication, hence results may vary later on.
@sopa301 and @gok99, I have implemented the suggestion to consolidate all commonly used Regex patterns into the So far, I'm not too sure if reusing |
Thanks @asdfghjkxd for consolidating the patterns! Do you have any data on the effect of this refactoring on the performance? I agree that if the performance increase is negligible, we'll probably be better off leaving those patterns alone. I was under the impression that these matchers would be created repeatedly, especially given how frequently these functions are called. Perhaps the compiler has automatically optimised it for us.
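One way to check this kind of question is to compare an inline `String.split(regex)` call against a precompiled `Pattern`. The sketch below is an assumption-laden illustration, not the profiling that was done for this PR: the class `SplitComparison`, its methods, and the delimiter are all hypothetical. Per the Javadoc, `String.split(regex)` behaves like `Pattern.compile(regex).split(str)`, so the two helpers return the same result and differ only in when the pattern is compiled. Note also that OpenJDK's `String.split` has a fast path for simple single-character delimiters that skips regex compilation entirely, which may partly explain small measured differences.

```java
import java.util.regex.Pattern;

/** Hypothetical comparison harness; names are illustrative only. */
final class SplitComparison {
    // Precompiled once and reused across calls.
    private static final Pattern SEMICOLON = Pattern.compile("\\s*;\\s*");

    private SplitComparison() {
    }

    /** Compiles the regex on every call (no fast path for this multi-char regex). */
    static String[] splitInline(String line) {
        return line.split("\\s*;\\s*");
    }

    /** Reuses the shared Pattern instance. */
    static String[] splitPrecompiled(String line) {
        return SEMICOLON.split(line);
    }

    /** Very rough wall-clock timing; a proper benchmark would use a tool such as JMH. */
    static long timeNanos(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }
}
```

For example, `SplitComparison.timeNanos(() -> SplitComparison.splitInline("a ; b;c"), 100_000)` can be compared against the same call with `splitPrecompiled`. Results from such a loop are noisy and sensitive to JIT warm-up, so they should be read as a rough indication only.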
I definitely think this is worth doing, if only for the improved readability and handy regex utils (are there other non-trivial but handy regexes elsewhere that could go here?). I don't think coupling is as much of a concern since these are just temporary static helper functions. Will leave @reposense/active-developers to comment / approve first.
LGTM! As discussed, there are no significant performance improvements, but compiling the patterns in one place makes the code easier to modify. It will take some effort to get used to this way of doing things, but I don't think it's significantly difficult.
LGTM, @asdfghjkxd feel free to add other common patterns to utils before this gets merged, if you spot them
@gok99 Sure thing, will look through the codebase to find other places with repeated regex patterns!
LGTM, I think consolidating these patterns into a single util is great as we can avoid having to visit multiple files to hunt down changes. I think that any additional patterns found can be done in a separate PR.
[#2091] Suggestions on improvement for memory performance regarding Regex matching
Proposed commit message
Other information
This was missed in the previous PR that aimed to resolve issue #2091.