Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce Sendable constraints of Regex passed to ValidationRule #44

Open
1 task done
Supereg opened this issue Aug 26, 2024 · 5 comments
Open
1 task done

Enforce Sendable constraints of Regex passed to ValidationRule #44

Supereg opened this issue Aug 26, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@Supereg
Copy link
Member

Supereg commented Aug 26, 2024

Problem

Regex is currently not sendable. We currently just suppress the compiler warning that is generated when creating a ValidationRule from a Regex instance as Regex is most likely sendable even when used with transformers (transformers are basically mapping functions, e.g., mapping a String input to a type by calling the rawValue initializer). However, these operations are not required to be Sendable and Apple is seemingly missing the last point where they can change Sendable constraints of transformers without inducing breaking changes. Therefore, we will need to think how we want to treat Sendability of Regex in the future.

Solution

One solution could be to just accept a subset of valid Regex expressions. Making the initializer failable. Swift Foundation used a similar approach when they introduced Regex support for the #Predicate macro. We cite from SF-0004:

This PredicateRegex type will be the Codable & Sendable storage for an underlying RegexComponent. Rather than storing the
RegexComponent (which is not Codable & Sendable) directly in a PredicateExpressions.Value, this build_Arg overload allows us to
store it inside of our wrapper type. We cannot catch all cases of unsupported regular expressions at runtime, so the build_Arg overload
will fatalError for cases where the developer has constructed a non-representable regex. The PredicateRegex initializer is failable
allowing developers performing manual predicate construction to determine appropriate behavior for non-representable regular
expressions. We support all regular expressions that can be transformed to a textual representation; unsupported expressions include those
built with capture transform closures or custom parsers.

What they decided for was to just not support any Regex that might contain concurrency-unsafe code. So any Regex that contain transform closures even if they would be technically @Sendable and concurrency safe, greatly limiting the usefulness of the RegexBuilder.

Additional context

We could generally investigate if we would want to support accepting Predicate instances as an input to a ValidationRule. We could then use the Regex support for the #Predicate macro to automatically convert regex to a Predicate instance.

Note: wholeMatch is not available as of right now: see here. The regex support for Predicate only supports calling contains which would require to specify start and end anchors.

What do you think @PSchmiedmayer

Code of Conduct

  • I agree to follow this project's Code of Conduct and Contributing Guidelines
@Supereg Supereg added the enhancement New feature or request label Aug 26, 2024
@StanfordSpezi StanfordSpezi deleted a comment Aug 26, 2024
@StanfordSpezi StanfordSpezi deleted a comment Aug 26, 2024
@StanfordSpezi StanfordSpezi deleted a comment Aug 27, 2024
@PSchmiedmayer
Copy link
Member

Thank you for writing this up @Supereg!
That's an unfortunate situation; interesting that they chose not to add a sendable conformance to Regex and transformers as part of the push for Swift 6.

I agree that just rejecting every element that might not be sendable is a clear limitation that would be great to be avoided. I think the clearer way forward would be to clearly document that a Regex should be sendable (even when using transformers) and assert this in the documentation. When we encounter a regex that might not to be sendable it might make sense to emit a runtime warning in the console pointing the developer to ensure that the regex is actually sendable.

An other alternative would be to add a more sophisticated check that would test this for, e.g., DEBUG builds (e.g. by executing multiple regex evaluations at the same time and seeing if the program crashes due to shared state that is not isolated?). Would not really be a deterministic crash behavior and I think that we could also not really catch it at runtime (or maybe we could using XCTRuntimeAssertions?). Not sure if building that infrastructure is really worth it given the fact that it all boils down to a missing constraint in the standard lib?

@Supereg
Copy link
Member Author

Supereg commented Aug 28, 2024

I agree, that rejecting every element that might not be sendable is way too strict and defeats the purpose of having such a nice, type-safe Regex API. SF-0004 had the additional challenge that they needed Regex to be Codable. So the decision to just now support transformers makes much more sense in that context. With ValidationRule we just have legacy support for the Decodable protocol to allow decoding a regex string, but we do not support encoding.

Just to provide full context what the issue with Sendable for regex is. Looking at a code example like below, the transform closure is what might introduce data races as it is not required to be @Sendable. In my opinion, this closure is meant to be @Sendable just by how it is supposed to be used.

let doubleValueRegex = Regex {
    "$"
    Capture {
        OneOrMore(.digit)
        "."
        Repeat(.digit, count: 2)
    } transform: { Double($0)! }
    Anchor.endOfLine
}

for match in transactions.matches(of: doubleValueRegex) {
    if match.1 >= 100.0 {
        print("Large amount: \(match.1)")
    }
}

Another possibility would be to delay the creation of the Regex to have it created in the isolation context the ValidationRule is used. E.g., an escaping, sendable auto-closure or even an escaping RegexComponentBuilder closure. The second one might reduce the flexibility as one cannot pass existing instances and generally we would break the current API surface.

@Supereg
Copy link
Member Author

Supereg commented Aug 28, 2024

An other alternative would be to add a more sophisticated check that would test this for, e.g., DEBUG builds (e.g. by executing multiple regex evaluations at the same time and seeing if the program crashes due to shared state that is not isolated?). Would not really be a deterministic crash behavior and I think that we could also not really catch it at runtime (or maybe we could using XCTRuntimeAssertions?). Not sure if building that infrastructure is really worth it given the fact that it all boils down to a missing constraint in the standard lib?

I don't think there is a good way for checking that. A race condition doesn't necessarily lead to a crash. They typically just break the data integrity (e.g., two threads incrementing the same mutable state).

@PSchmiedmayer
Copy link
Member

Thanks for the context @Supereg! Given the input above I would suggest that we allow non-sendable regex even if it might contain unsafe code. It should be something that we mention in the documentation; add a warning, and maybe even see if we can push for this on a Swift-level in the long run, but I see your point that the ship might have already sailed on this.

Agree that some automated checks and even a delay might not be worth the tradeoff here if we generally expect the these transformers will be very isolated.

@Supereg
Copy link
Member Author

Supereg commented Sep 1, 2024

Sounds good. I would consider this issue to be complete if we add elaborate documentation around the topic.

Another solution could be to explore the sending keyword here. Maybe it is enough for our API surface to express that the calcite doesn't maintain reference to this non-sendable type. If the regex does unsafe things itself is not part of our problem then. But I generally agree, that we should clearly document this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants
@Supereg @PSchmiedmayer and others