Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce some tests of non-determinism #60

Merged
merged 8 commits into from
Mar 1, 2024

Conversation

glyn
Copy link
Contributor

@glyn glyn commented Feb 29, 2024

This is not necessarily an exhaustive set of tests of the non-deterministic features of RFC 9535. However, it establishes a convention for how to write such tests.

A non-deterministic test has a "results" key instead of a "result" key. The value corresponding to "results" is an array or possible results. If an implementation produces any one of these, the test should pass. If an implementation produces any other result, the test should fail.

Ref #9

@glyn glyn requested a review from gregsdennis February 29, 2024 06:09
cts.schema.json Show resolved Hide resolved
@glyn
Copy link
Contributor Author

glyn commented Feb 29, 2024

@cabo and @hiltontj: since you pass the CTS, you may be interested in this PR.

@gregsdennis
Copy link
Collaborator

I'll need to update my runner to account for this change. I can work on that tomorrow.

glyn added 2 commits February 29, 2024 07:55
It needs to be an array of arrays with at least two elements.

This will help diagnose the likely mistake of using "results"
instead of "result", since it should be fairly unusual for the
results of a query to be an array of two or more arrays.
cts.schema.json Show resolved Hide resolved
]
],
"properties": {
"invalid_selector": false,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregsdennis I took the liberty of tightening this up, even though it's not needed for the current PR.

]
],
"properties": {
"document": false,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregsdennis I took the liberty of tightening this up, even though it's not needed for the current PR.

Copy link
Collaborator

@gregsdennis gregsdennis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating my runner wasn't too difficult. All tests pass.

It's going to be "fun" filling out the rest of the tests...

@f3ath
Copy link
Collaborator

f3ath commented Feb 29, 2024

Have we considered adding a flag (e.g. "indeterministic": true) to allow arbitrary order of the result array instead of trying to list it in every possible order?

I might not understand the idea fully, but if we want to allow indeterministic order somewhere deep inside of the complex result object, we could maybe introduce a list of json pointers pointing to the particular elements of the result which are allowed to be indeterministic.

@gregsdennis
Copy link
Collaborator

gregsdennis commented Feb 29, 2024

The non-determinism is actually only for object selection. As you can see in the new test on line 480, while the values from the object properties are selected in any order, the values are always a then d. Similarly, selection from array values will always be in the same order.

It gets weird when you have something like $..* from

{
  "a": [1, 2, 3],
  "b": [4, 5, 6]
}

Here, you must always have 1, 2, and 3 together and in order, as well as 4, 5, and 6 together and in order, but 4, 5, and 6 could come before or after 1, 2, and 3.

So it's not "in any order," it's "one of these specific orders."

@gregsdennis
Copy link
Collaborator

That said, if you'd like to propose an alternative, I'm happy to consider it.

@glyn
Copy link
Contributor Author

glyn commented Mar 1, 2024

Updating my runner wasn't too difficult. All tests pass.

Yes, that's expected since your implementation is more deterministic than the spec.

It's going to be "fun" filling out the rest of the tests...

Agreed. If we had a full implementation of the RFC which was deliberately non-deterministic in the way it enumerated objects, then we'd have a better chance of automatically discovering tests that should be made non-deterministic (since those tests would fail, at least some of the time).

As it is, there is a risk that we'll miss some tests which should be made non-deterministics. In a sense, that's ok because we can always increase the amount of non-determinism in the CTS, so long as we don't go beyond the spec.

@gregsdennis
Copy link
Collaborator

Yes, that's expected since your implementation is more deterministic than the spec.

I might be able to set up the tests in a way to search for objects and scramble them so that the lib pulls the properties out of order. That would introduce some non-determinism.

It would only be the test runner and wouldn't affect the lib at all. (I kinda want to play with that now.)

@glyn
Copy link
Contributor Author

glyn commented Mar 1, 2024

Yes, that's expected since your implementation is more deterministic than the spec.

I might be able to set up the tests in a way to search for objects and scramble them so that the lib pulls the properties out of order. That would introduce some non-determinism.

It would only be the test runner and wouldn't affect the lib at all. (I kinda want to play with that now.)

That would produce some of the non-determinism allowed by the RFC, but not necessarily all of it (e.g. it wouldn't reproduce the non-determinism of the query $.o[*, *] shown in Table 6 of RFC 9535). But it would be a great way of finding some of the tests which need to be made non-deterministic.

@jg-rp
Copy link
Contributor

jg-rp commented Mar 1, 2024

For what it's worth, the nondeterministic branch of JSON P3 adds some randomness to object iteration.

After a clone (inc submodules) and npm install --production=false, you can run npm run test query to test against the main branch of the CTS. Or npm run test nondeterministic to test against the nondeterminism branch of glyn/jsonpath-compliance-test-suite.

After a few of runs, I've only seen one more case that needs updating, "basic, multiple selectors, wildcard and name", which is less than I was expecting. I might be missing something.

@glyn
Copy link
Contributor Author

glyn commented Mar 1, 2024

For what it's worth, the nondeterministic branch of JSON P3 adds some randomness to object iteration.

After a clone (inc submodules) and npm install --production=false, you can run npm run test query to test against the main branch of the CTS. Or npm run test nondeterministic to test against the nondeterminism branch of glyn/jsonpath-compliance-test-suite.

That's really helpful, thanks for that! (BTW please would you consider adding JSON P3 to the JSONPath comparison project? A first step is to raise an issue there to add JSON P3.)

After a few of runs, I've only seen one more case that needs updating, "basic, multiple selectors, wildcard and name", which is less than I was expecting. I might be missing something.

I wasn't expecting large numbers of the current tests to exhibit non-determinism as it requires a combination of factors:

  • At least one object (with two or more members) in the query argument and at least one of the following:
  1. A wildcard (*) in the query applied to one of those objects
  2. A filter in the query applied to one of those objects which selects two or more values
  3. A descendant segment applied to one of those objects or to one of those object's ancestors

(We could use these criteria to design further non-deterministic tests to plug any gaps.)

@gregsdennis
Copy link
Collaborator

@jg-rp thanks! I'll go ahead and merge this, then update that test in another PR.

@gregsdennis gregsdennis merged commit a74462d into jsonpath-standard:main Mar 1, 2024
1 check passed
@gregsdennis
Copy link
Collaborator

(I kinda want to play with that now.) - me

Yeah, I tried shuffling objects and it wasn't precise enough. I get false positives where it thinks a test that is deterministic shouldn't be.

@hiltontj
Copy link

hiltontj commented Mar 2, 2024

@glyn - thanks for this. I updated serde_json_path's compliance integration test to handle the non-determinism tests, and its all ✅ now: hiltontj/serde_json_path#85

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants