Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Three problems about regular expressions #52

Open
x5g opened this issue Apr 21, 2022 · 4 comments
Open

Three problems about regular expressions #52

x5g opened this issue Apr 21, 2022 · 4 comments

Comments

@x5g
Copy link

x5g commented Apr 21, 2022

Hi, I have three questions about regular expressions,

  1. What is the difference between re. from_ecma2020 and re.from.str ?

  2. Does OSTRICH currently support regular expressions with backreferences? E.g., (.)\1.

  3. I found that (?=a).|c -> (model (define-fun w () String "a")), but ((?=a).|c) -> (model (define-fun w () String "\u{0}")), but \u0000 is not accepted by ((?=a).|c).

@amandasystems
Copy link
Collaborator

Hi, I have three questions about regular expressions,

Hi! Since nobody's responded to you I'll do my best.

  1. What is the difference between re. from_ecma2020 and re.from.str ?

re.from_str should either convert to/from textbook regular expressions (i.e. not the full ECMA standard), or it's a different kind of parser. Essentially what you want is from_ecma2020, IIRC (@uuverifiers can fill you in!), it should be strictly better.

  1. Does OSTRICH currently support regular expressions with backreferences? E.g., (.)\1.

Not in general since they in the general case make the language not regular and therefore cannot be handled in Ostrich, but @RiccardoDeMasellis and @uuverifiers are working on accommodating some practical use cases. I don't remember how much of that is in the main tree yet though.

  1. I found that (?=a).|c -> (model (define-fun w () String "a")), but ((?=a).|c) -> (model (define-fun w () String "\u{0}")), but \u0000 is not accepted by ((?=a).|c).

That sounds like a bug to me. I know we have had some problems with look-arounds in general so this might be another one.

@pruemmer
Copy link
Collaborator

pruemmer commented Apr 26, 2022 via email

@x5g
Copy link
Author

x5g commented Apr 27, 2022

Thank you for your answers. @amandasystems @pruemmer

Supporting backreferences is really difficult since it makes the regex language not regular.

For question 3, I wonder if, assuming that the regex is parsed well, i.e., L(r) is converted to an equivalent SMT constraint S. When I solve S using OSTRICH, is its model always consistent with L(r)?

Again, thank you for your works. :)

@pruemmer
Copy link
Collaborator

pruemmer commented Apr 27, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants