Added support for reverse variable matching #54

Merged


@nrktkt nrktkt commented Aug 15, 2016

Addresses #2

Bumped Java version to 8

  • Named capture groups not supported in 6
  • I like to lambda
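
As a rough illustration of why named capture groups matter here (they have been part of java.util.regex since Java 7), the sketch below hand-translates a hypothetical template, /users/{id}/posts/{post}, into a regex with one named group per variable. The class name, the template, and the regex are illustrative assumptions, not code from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the PR's implementation.
final class NamedGroupSketch {
    // Hand-written regex for the assumed template /users/{id}/posts/{post}.
    // Each template variable becomes a named capture group.
    private static final Pattern PATTERN =
            Pattern.compile("/users/(?<id>[^/]+)/posts/(?<post>[^/]+)");

    // Returns the variable values extracted from the URI, or an empty map if it does not match.
    static Map<String, String> match(String uri) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = PATTERN.matcher(uri);
        if (m.matches()) {
            values.put("id", m.group("id"));
            values.put("post", m.group("post"));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(match("/users/42/posts/hello"));
    }
}
```

Looking values up by group name avoids tracking group indices, which shift as soon as the generated regex contains nested or optional groups.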

Added Either monad class to deal with repeated variables.

This should currently support the majority of use cases; however, more work can be done to better support level 4 templates.

Future work:

  • Make matchParameters and matchSegments inspect each VarSpec for allowed characters
  • Make matchParameters and matchSegments inspect each VarSpec for explode modifier
    • Currently multiple parameters will automatically become a list, regardless of presence of explode
    • Currently multiple segments are not supported
  • Make buildMatchingPattern match more strictly; it seems to work OK now, but could result in issue reports later
  • Integrate with existing test suites and generally expand testing.
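
One way to picture the repeated-variable handling described above is an Either that holds a single value or a list of values, where a second occurrence of a name promotes the single value to a list. This is a minimal sketch under that assumption; the Either class, the put helper, and the String/List<String> typing are hypothetical and may differ from what the PR actually does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical minimal Either: exactly one side holds a value.
final class Either<L, R> {
    private final L left;
    private final R right;
    private final boolean isLeft;

    private Either(L left, R right, boolean isLeft) {
        this.left = left;
        this.right = right;
        this.isLeft = isLeft;
    }

    static <L, R> Either<L, R> left(L value) { return new Either<>(value, null, true); }
    static <L, R> Either<L, R> right(R value) { return new Either<>(null, value, false); }

    // Applies the function that corresponds to the populated side.
    <T> T fold(Function<? super L, ? extends T> onLeft, Function<? super R, ? extends T> onRight) {
        return isLeft ? onLeft.apply(left) : onRight.apply(right);
    }
}

final class RepeatedVariableSketch {
    // Records a value; a repeated name turns the stored single value into a list.
    static void put(Map<String, Either<String, List<String>>> values, String name, String value) {
        Either<String, List<String>> existing = values.get(name);
        if (existing == null) {
            values.put(name, Either.left(value));
            return;
        }
        List<String> merged = existing.<List<String>>fold(
                single -> new ArrayList<>(Collections.singletonList(single)),
                many -> new ArrayList<>(many));
        merged.add(value);
        values.put(name, Either.right(merged));
    }

    public static void main(String[] args) {
        Map<String, Either<String, List<String>>> values = new LinkedHashMap<>();
        put(values, "id", "42");
        put(values, "tag", "a");
        put(values, "tag", "b");
        values.forEach((name, v) ->
                System.out.println(name + " -> " + v.<String>fold(s -> s, l -> l.toString())));
    }
}
```

Note that this sketch promotes to a list regardless of any explode modifier, which mirrors the limitation listed under future work rather than fixing it.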

@nrktkt nrktkt force-pushed the 2.2-reverse-matching branch 3 times, most recently from bcf9ad3 to 05b8e06 on August 15, 2016 15:54
@damnhandy damnhandy merged commit decb623 into damnhandy:2.2-reverse-matching Aug 23, 2016
@damnhandy
Owner

Thanks!

.append('?');
regex.deleteCharAt(regex.length() -1);

Pattern pattern = Pattern.compile(regex.toString());

In order to get reasonable performance, this pattern will need to be precompiled so it can be used over and over.

Author

Quite right; this and the pattern in matchParameters should be compiled lazily and only once.
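
A minimal sketch of "compiled late and once", assuming the regex source is already available as a string: the Pattern is built on first use and then reused. The class and method names are hypothetical, and double-checked locking on a volatile field is just one possible way to do this, not necessarily what the library ended up with.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of lazy, one-time pattern compilation.
final class LazyPatternSketch {
    private final String regex;
    // Stays null until the first call to pattern(); volatile makes the published value visible to all threads.
    private volatile Pattern compiled;

    LazyPatternSketch(String regex) {
        this.regex = regex;
    }

    // Compiles the regex on first use and returns the same Pattern instance afterwards.
    Pattern pattern() {
        Pattern p = compiled;
        if (p == null) {
            synchronized (this) {
                p = compiled;
                if (p == null) {
                    p = Pattern.compile(regex);
                    compiled = p;
                }
            }
        }
        return p;
    }

    boolean matches(String input) {
        return pattern().matcher(input).matches();
    }

    public static void main(String[] args) {
        LazyPatternSketch sketch = new LazyPatternSketch("/users/(?<id>[^/]+)");
        System.out.println(sketch.matches("/users/42"));
        System.out.println(sketch.pattern() == sketch.pattern());
    }
}
```

Pattern instances are immutable and safe to share between threads, so only the Matcher needs to be created per call.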

.append('>')
.append('\\')
.append(prefix)
.append(".{1,9001}")

Any reason for limiting this to 9001 characters? You can use .+ if you want to match at least one character.

Author

Yes, there was an issue with unbounded matching; it would cause a stack overflow, IIRC. 9001 is sufficiently large for a single sane expression, and investigating and documenting the phenomenon further is intended to fall under the TODO.

@@ -82,6 +85,14 @@
*/
private Pattern matchPattern;

private String groupName = uid();

private static Random RANDOM = new SecureRandom();

@jroper jroper Jan 5, 2017

SecureRandom is a very expensive way to generate random names. In particular, it's blocking: only one thread can generate a name at a time and, depending on your JVM configuration, it can block indefinitely if your system has no entropy. If the aim is to ensure that each name this class generates is unique within the scope of this class, then I'd just use an AtomicLong and generate names like var1, var2, var3 using incrementAndGet. This is fast, non-blocking, and guarantees uniqueness up to Long.MAX_VALUE names (technically there's no such guarantee with SecureRandom).

Author

A good point. My original thought was that the collision chance with 12 bytes was practically nil, and that a counter could max out on a long-running server.
Now that you bring it up, however, I think the performance hit of SecureRandom outweighs the practical likelihood of exhausting an 8-byte counter (and if that becomes an issue, later code could handle it by resetting the counter or by creating new counters and appending them together as strings).
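
The counter-based naming suggested in this thread could look roughly like the sketch below. The class name and the "var" prefix are illustrative assumptions; the only requirement taken from Java itself is that a regex group name starts with a letter and contains only letters and digits.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of unique group names from a shared counter.
final class GroupNameSketch {
    // One counter for the whole class; incrementAndGet is atomic and non-blocking.
    private static final AtomicLong COUNTER = new AtomicLong();

    // Produces var1, var2, ... which are valid java.util.regex group names.
    static String next() {
        return "var" + COUNTER.incrementAndGet();
    }

    public static void main(String[] args) {
        String name = next();
        // Use the generated name as a named capture group.
        Matcher m = Pattern.compile("(?<" + name + ">.+)").matcher("value");
        if (m.matches()) {
            System.out.println(name + " = " + m.group(name));
        }
    }
}
```

Uniqueness here holds only within one JVM and one class loader, which appears to be the scope the reviewers have in mind.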

3 participants