Non-singular queries and function well-typedness (#35)
* Enforce JSONPath singular query usage and function well-typedness

* Make standard filter function type aware

* Enforce non-singular query and function comparison rules

* Add well-typedness test cases.

* Handle comparison expressions as JSONPath function arguments

* Handle nested function calls as arguments to JSONPath functions.

* Remove unnecessary flag on type-aware function implementations.

* Refactor filter expression comparisons.

* Add CLI option for disabling JSONPath expression type checks.

* Update docs with info about well-typed function extensions.
jg-rp authored Oct 1, 2023
1 parent f2bc762 commit 3c2a1d5
Showing 25 changed files with 774 additions and 368 deletions.
7 changes: 6 additions & 1 deletion CHANGELOG.md
@@ -4,13 +4,18 @@

**Breaking Changes**

- We now enforce JSONPath filter expression "well-typedness" by default. That is, filter expressions are checked at compile time according to the IETF JSONPath Draft function extension type system and rules regarding non-singular query usage. If an expression is deemed to not be well-typed, a `JSONPathTypeError` is raised. This can be disabled in Python JSONPath by setting the `well_typed` argument to `JSONPathEnvironment` to `False`, or using `--no-type-checks` on the command line.
- The JSONPath lexer now yields distinct tokens for single and double quoted string literals. This is so the parser can do a better job of detecting invalid escape sequences.
- Changed the canonical representation of a JSONPath string literal to use double quotes instead of single quotes.
- The built-in implementation of the standard `length()` filter function is now a class and is renamed to `jsonpath.function_extensions.Length`.
- The built-in implementation of the standard `value()` filter function is now a class and is renamed to `jsonpath.function_extensions.Value`.

**Fixes**

- We no longer silently ignore invalid escape sequences in JSONPath string literals. For example, `$['\"']` used to be OK; it now raises a `JSONPathSyntaxError`.
- Fixed parsing of JSONPath integer literals that use scientific notation. Previously we raised a `JSONPathSyntaxError` for literals such as `1e2`.
- Fixed parsing of JSONPath comparison and logical expressions as filter function arguments. Previously we raised a `JSONPathSyntaxError` if a comparison or logical expression appeared as a filter function argument. Note that none of the built-in, standard filter functions accept arguments of `LogicalType`.
- Fixed parsing of nested JSONPath filter functions, where a function is used as an argument to another.

## Version 0.9.0

38 changes: 29 additions & 9 deletions docs/advanced.md
@@ -41,27 +41,47 @@ user_names = jsonpath.findall(

Add, remove or replace [filter functions](functions.md) by updating the [`function_extensions`](api.md#jsonpath.env.JSONPathEnvironment.function_extensions) attribute of a [`JSONPathEnvironment`](api.md#jsonpath.env.JSONPathEnvironment). It is a regular Python dictionary mapping filter function names to any [callable](https://docs.python.org/3/library/typing.html#typing.Callable), like a function or class with a `__call__` method.

### Type System for Function Expressions

[Section 2.4.1](https://datatracker.ietf.org/doc/html/draft-ietf-jsonpath-base-21#section-2.4.1) of the IETF JSONPath Draft specification defines a type system for function expressions and requires that we check that filter expressions are well-typed. With that in mind, you are encouraged to implement custom filter functions by extending [`jsonpath.function_extensions.FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction), which forces you to be explicit about the [types](api.md#jsonpath.function_extensions.ExpressionType) of arguments the function extension accepts and the type of its return value.

!!! info

[`FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction) was new in Python JSONPath version 0.10.0. Prior to that we did not enforce function expression well-typedness. To use any arbitrary [callable](https://docs.python.org/3/library/typing.html#typing.Callable) as a function extension - or if you don't want built-in filter functions to raise a `JSONPathTypeError` for function expressions that are not well-typed - set [`well_typed`](api.md#jsonpath.env.JSONPathEnvironment.well_typed) to `False` when constructing a [`JSONPathEnvironment`](api.md#jsonpath.env.JSONPathEnvironment).
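To make the idea of well-typedness more concrete, here is a minimal, library-independent sketch of a compile-time check. The three `ExpressionType` members follow the types named in the IETF draft (ValueType, LogicalType and NodesType), but `Signature`, `SIGNATURES`, `check_call` and `WellTypedError` are hypothetical names used only for illustration. They are not part of Python JSONPath, and the sketch ignores the implicit type conversions the draft allows.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class ExpressionType(Enum):
    """The three function expression types named in the IETF draft."""

    VALUE = 1
    LOGICAL = 2
    NODES = 3


class WellTypedError(Exception):
    """Hypothetical stand-in for a compile-time type error."""


@dataclass(frozen=True)
class Signature:
    """Declared argument types and return type of a function extension."""

    arg_types: List[ExpressionType]
    return_type: ExpressionType


# Hypothetical registry describing three of the draft's standard functions.
SIGNATURES: Dict[str, Signature] = {
    "length": Signature([ExpressionType.VALUE], ExpressionType.VALUE),
    "count": Signature([ExpressionType.NODES], ExpressionType.VALUE),
    "match": Signature(
        [ExpressionType.VALUE, ExpressionType.VALUE], ExpressionType.LOGICAL
    ),
}


def check_call(name: str, arg_types: List[ExpressionType]) -> ExpressionType:
    """Return the call's result type, or raise if it is not well-typed."""
    signature = SIGNATURES.get(name)
    if signature is None:
        raise WellTypedError(f"unknown function {name!r}")

    if len(arg_types) != len(signature.arg_types):
        raise WellTypedError(
            f"{name}() takes {len(signature.arg_types)} argument(s), "
            f"got {len(arg_types)}"
        )

    for position, (expected, actual) in enumerate(
        zip(signature.arg_types, arg_types)
    ):
        if expected is not actual:
            raise WellTypedError(
                f"{name}() argument {position} must be {expected.name}, "
                f"found {actual.name}"
            )

    return signature.return_type
```

Because `check_call` returns the declared return type, a nested call such as `length(count(@.*))` can be checked from the inside out: `check_call("length", [check_call("count", [ExpressionType.NODES])])` succeeds, while `check_call("count", [ExpressionType.VALUE])` raises `WellTypedError`.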

### Example

As an example, we'll add a `min()` filter function, which will return the minimum of a sequence of values. If any of the values are not comparable, we'll return the special `undefined` value instead.

```python
from typing import Iterable

import jsonpath
from jsonpath.function_extensions import ExpressionType
from jsonpath.function_extensions import FilterFunction


class MinFilterFunction(FilterFunction):
    """A JSONPath function extension returning the minimum of a sequence."""

    # Declare the argument and return types so calls can be type checked.
    arg_types = [ExpressionType.VALUE]
    return_type = ExpressionType.VALUE

    def __call__(self, value: object) -> object:
        if not isinstance(value, Iterable):
            return jsonpath.UNDEFINED

        try:
            return min(value)
        except TypeError:
            # The values are not comparable with each other.
            return jsonpath.UNDEFINED


env = jsonpath.JSONPathEnvironment()
env.function_extensions["min"] = MinFilterFunction()

example_data = {"foo": [{"bar": [4, 5]}, {"bar": [1, 5]}]}
print(env.findall("$.foo[?min(@.bar) > 1]", example_data))
```

Now, when we use `env.findall()`, `env.finditer()` or `env.compile()`, our `min` function will be available for use in filter expressions.
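For readers who want to see the selection logic in isolation, the following standalone sketch mimics what the filter `?min(@.bar) > 1` does for each item, using only the standard library. It is an approximation under stated assumptions: `UNDEFINED` is a local sentinel rather than `jsonpath.UNDEFINED`, `min_or_undefined` and `select` are hypothetical helper names, and treating an undefined result as "no match" is our assumption about the comparison. Unlike the example above, it also catches the `ValueError` that Python's `min()` raises for an empty sequence.

```python
from collections.abc import Iterable
from typing import Any, Dict, List

# Local sentinel standing in for the library's special undefined value.
UNDEFINED = object()


def min_or_undefined(value: object) -> object:
    """Return the minimum of `value`, or UNDEFINED if there isn't one."""
    if not isinstance(value, Iterable):
        return UNDEFINED

    try:
        return min(value)
    except (TypeError, ValueError):
        # TypeError: the values are not comparable with each other.
        # ValueError: the sequence is empty.
        return UNDEFINED


def select(items: List[Dict[str, Any]], threshold: int) -> List[Dict[str, Any]]:
    """Keep items whose "bar" minimum is defined and above `threshold`."""
    selected = []
    for item in items:
        smallest = min_or_undefined(item.get("bar"))
        if smallest is not UNDEFINED and smallest > threshold:
            selected.append(item)
    return selected


example_data = {"foo": [{"bar": [4, 5]}, {"bar": [1, 5]}]}
print(select(example_data["foo"], 1))  # [{'bar': [4, 5]}]
```

Only the first item is kept, because the minimum of `[1, 5]` is `1`, which is not greater than `1`.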
@@ -117,7 +137,7 @@ env = MyEnv()

### Compile Time Validation

Calls to [type-aware](#type-system-for-function-expressions) function extensions are validated automatically at JSONPath compile time. If [`well_typed`](api.md#jsonpath.env.JSONPathEnvironment.well_typed) is set to `False`, or a custom function extension does not inherit from [`FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction), its arguments can be validated by implementing the function as a class with a `__call__` method and a `validate` method. `validate` will be called after parsing the function, giving you the opportunity to inspect its arguments and raise a `JSONPathTypeError` should any arguments be unacceptable. If defined, `validate` must take a reference to the current environment, an argument list and the token pointing to the start of the function call.

```python
def validate(
6 changes: 6 additions & 0 deletions docs/api.md
@@ -11,6 +11,12 @@
::: jsonpath.CompoundJSONPath
handler: python

::: jsonpath.function_extensions.FilterFunction
handler: python

::: jsonpath.function_extensions.ExpressionType
handler: python

::: jsonpath.JSONPointer
handler: python

22 changes: 15 additions & 7 deletions docs/cli.md
@@ -4,7 +4,6 @@

Python JSONPath includes a script called `json`, exposing [JSONPath](quickstart.md#findallpath-data), [JSON Pointer](quickstart.md#pointerresolvepointer-data) and [JSON Patch](quickstart.md#patchapplypatch-data) features on the command line. Use the `--version` argument to check the current version of Python JSONPath, and the `--help` argument to display command information.


```console
$ json --version
python-jsonpath, version 0.9.0
@@ -62,6 +61,7 @@ optional arguments:
  -o OUTPUT, --output OUTPUT
                        File to write resulting objects to, as a JSON array. Defaults to the standard
                        output stream.
  --no-type-checks      Disables filter expression well-typedness checks.
```

## Global Options
@@ -73,14 +73,14 @@ These arguments apply to any subcommand and must be listed before the command.
Enable debugging. Display full stack traces, if available, when errors occur. Without the `--debug` option, the following example shows a short "json path syntax error" message.

```console
$ json path -q "$.1" -f /tmp/source.json
json path syntax error: unexpected token '1', line 1, column 2
```

With the `--debug` option, we get the stack trace triggered by `JSONPathSyntaxError`.

```console
$ json --debug path -q "$.1" -f /tmp/source.json
Traceback (most recent call last):
  File "/home/james/.local/share/virtualenvs/jsonpath_cli-8Tb3e-ir/bin/json", line 8, in <module>
    sys.exit(main())
@@ -102,14 +102,14 @@ jsonpath.exceptions.JSONPathSyntaxError: unexpected token '1', line 1, column 2
Enable pretty formatting when outputting JSON. Adds newlines and indentation to output specified with the `-o` or `--output` option. Without the `--pretty` option, the following example output is on one line.

```console
$ json pointer -p "/categories/1/products/0" -f /tmp/source.json
{"title": "Cap", "description": "Baseball cap", "price": 15.0}
```

With the `--pretty` option, we get nicely formatted JSON output.

```console
$ json --pretty pointer -p "/categories/1/products/0" -f /tmp/source.json
{
"title": "Cap",
"description": "Baseball cap",
@@ -122,7 +122,7 @@ $ json --pretty pointer -p "/categories/1/products/0" -f /tmp/source.json
Disable decoding of UTF-16 escape sequences, including surrogate pairs. This can improve performance if you know your paths and pointers don't contain UTF-16 escape sequences.

```console
$ json --no-unicode-escape path -q "$.price_cap" -f /tmp/source.json
```

## Commands
@@ -185,6 +185,12 @@ $ json path -q "$.price_cap" -f /tmp/source.json -o result.json
$ json path -q "$.price_cap" -f /tmp/source.json --output result.json
```

#### `--no-type-checks`

_New in version 0.10.0_

Disables JSONPath filter expression well-typedness checks. The well-typedness of a filter expression is defined by the IETF JSONPath Draft specification.
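As a rough illustration of how a negative command line flag like this can drive a positive environment option, here is a cut-down `argparse` sketch. The flag definition and the `well_typed=not args.no_type_checks` inversion follow the `jsonpath/cli.py` change in this commit. `build_parser`, `environment_options` and the `json-sketch` program name are hypothetical, and the real command defines many more options.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Build a cut-down parser with only the options this sketch needs."""
    parser = argparse.ArgumentParser(prog="json-sketch")
    parser.add_argument(
        "-q",
        "--query",
        required=True,
        help="JSONPath query string.",
    )
    parser.add_argument(
        "--no-type-checks",
        action="store_true",
        help="Disables filter expression well-typedness checks.",
    )
    return parser


def environment_options(argv: List[str]) -> Dict[str, bool]:
    """Translate command line arguments into environment keyword arguments."""
    args = build_parser().parse_args(argv)
    # The flag is phrased negatively, the option positively, so invert it.
    return {"well_typed": not args.no_type_checks}


print(environment_options(["-q", "$.price_cap"]))
print(environment_options(["-q", "$.price_cap", "--no-type-checks"]))
```

With `action="store_true"`, the flag defaults to `False`, so type checks stay enabled unless `--no-type-checks` is given explicitly.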

### `pointer`

Resolve a JSON Pointer against a JSON document. One of `-p`/`--pointer` or `-r`/`--pointer-file` must be given. `-p` is a JSON Pointer given on the command line as a string, and `-r` is the path to a file containing a JSON Pointer.
@@ -236,6 +242,7 @@ The path to a file to write the resulting object to. If omitted or a hyphen (`-`
```console
$ json pointer -p "/categories/0/name" -f /tmp/source.json -o result.json
```

```console
$ json pointer -p "/categories/0/name" -f /tmp/source.json --output result.json
```
@@ -289,6 +296,7 @@ The path to a file to write the resulting object to. If omitted or a hyphen (`-`
```console
$ json patch /tmp/patch.json -f /tmp/target.json -o result.json
```

```console
$ json patch /tmp/patch.json -f /tmp/target.json --output result.json
```
@@ -303,4 +311,4 @@ $ json patch /tmp/patch.json -f /tmp/target.json -u

```console
$ json patch /tmp/patch.json -f /tmp/target.json --uri-decode
```
1 change: 1 addition & 0 deletions docs/pointers.md
@@ -7,6 +7,7 @@ JSON Pointer ([RFC 6901](https://datatracker.ietf.org/doc/html/rfc6901)) is a st
JSON Pointers are a fundamental part of JSON Patch ([RFC 6902](https://datatracker.ietf.org/doc/html/rfc6902)). Each patch operation must have at least one pointer, identifying the target value.

!!! note

We have extended RFC 6901 to handle our non-standard JSONPath [keys selector](syntax.md#keys-or) and index/property pointers from [Relative JSON Pointer](#torel).

## `resolve(data)`
9 changes: 5 additions & 4 deletions docs/syntax.md
@@ -187,15 +187,16 @@ This is a list of things that you might find in other JSONPath implementation th

And this is a list of areas where we deviate from the [IETF JSONPath draft](https://datatracker.ietf.org/doc/html/draft-ietf-jsonpath-base-13).

- We don't follow all "singular query" rules when evaluating a filter comparison. Note that we support membership operators `in` and `contains`, plus list literals, so testing non-singular queries for membership is OK.
- We don't yet force the result of some filter functions to be compared.
- The root token (default `$`) is optional and paths starting with a dot (`.`) are OK. `.thing` is the same as `$.thing`, as is `thing`, `$[thing]` and `$["thing"]`.
- Whitespace is mostly insignificant unless inside quotes.
- The root token (default `$`) is optional.
- Paths starting with a dot (`.`) are OK. `.thing` is the same as `$.thing`, as is `thing`, `$[thing]` and `$["thing"]`.
- The built-in `match()` and `search()` filter functions use Python's standard library `re` module, which, at least, doesn't support Unicode properties. We might add an implementation of `match()` and `search()` using the third party [regex](https://pypi.org/project/regex/) package in the future.
- We don't require property names to be quoted inside a bracketed selection, unless the name contains reserved characters.
- We don't require the recursive descent segment to have a selector. `$..` is equivalent to `$..*`.
- We support explicit comparisons to `undefined` as well as implicit existence tests.

And this is a list of features that are uncommon or unique to Python JSONPath.

- We support membership operators `in` and `contains`, plus list/array literals.
- `|` is a union operator, where matches from two or more JSONPaths are combined. This is not part of the Python API, but built-in to the JSONPath syntax.
- `&` is an intersection operator, where we exclude matches that don't exist in both left and right paths. This is not part of the Python API, but built-in to the JSONPath syntax.
- `#` is the current key/property or index identifier when filtering a mapping or sequence.
15 changes: 11 additions & 4 deletions jsonpath/cli.py
@@ -53,6 +53,12 @@ def path_sub_command(parser: argparse.ArgumentParser) -> None: # noqa: D103
        ),
    )

    parser.add_argument(
        "--no-type-checks",
        action="store_true",
        help="Disables filter expression well-typedness checks.",
    )


def pointer_sub_command(parser: argparse.ArgumentParser) -> None:  # noqa: D103
    parser.set_defaults(func=handle_pointer_command)
@@ -234,14 +240,15 @@ def handle_path_command(args: argparse.Namespace) -> None: # noqa: PLR0912
    """Handle the `path` sub command."""
    # Empty string is OK.
    if args.query is not None:
        query = args.query
    else:
        query = args.query_file.read().strip()

    try:
        path = jsonpath.JSONPathEnvironment(
            unicode_escape=not args.no_unicode_escape,
            well_typed=not args.no_type_checks,
        ).compile(query)
    except JSONPathSyntaxError as err:
        if args.debug:
            raise