diff --git a/CHANGELOG.md b/CHANGELOG.md index 4b27000..04966d2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,13 +4,18 @@ **Breaking Changes** +- We now enforce JSONPath filter expression "well-typedness" by default. That is, filter expressions are checked at compile time according to the IETF JSONPath Draft function extension type system and rules regarding non-singular query usage. If an expression is deemed to not be well-typed, a `JSONPathTypeError` is raised. This can be disabled in Python JSONPath by setting the `well_typed` argument to `JSONPathEnvironment` to `False`, or using `--no-type-checks` on the command line. - The JSONPath lexer now yields distinct tokens for single and double quoted string literals. This is so the parser can do a better job of detecting invalid escape sequences. - Changed the canonical representation of a JSONPath string literal to use double quotes instead of single quotes. +- The built-in implementation of the standard `length()` filter function is now a class and is renamed to `jsonpath.function_extensions.Length`. +- The built-in implementation of the standard `value()` filter function is now a class and is renamed to `jsonpath.function_extensions.Value`. **Fixes** - We no longer silently ignore invalid escape sequences in JSONPath string literals. For example, `$['\"']` used to be OK, it now raises a `JSONPathSyntaxError`. -- Fixed parsing of JSONPath integer literals that use scientific notation. +- Fixed parsing of JSONPath integer literals that use scientific notation. Previously we raised a `JSONPathSyntaxError` for literals such as `1e2`. +- Fixed parsing of JSONPath comparison and logical expressions as filter function arguments. Previously we raised a `JSONPathSyntaxError` if a comparison or logical expression appeared as a filter function argument. Note that none of the built-in, standard filter functions accept arguments of `LogicalType`. 
+- Fixed parsing of nested JSONPath filter functions, where a function is used as an argument to another. ## Version 0.9.0 diff --git a/docs/advanced.md b/docs/advanced.md index 3a5ca17..b6ddda1 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -41,27 +41,47 @@ user_names = jsonpath.findall( Add, remove or replace [filter functions](functions.md) by updating the [`function_extensions`](api.md#jsonpath.env.JSONPathEnvironment.function_extensions) attribute of a [`JSONPathEnvironment`](api.md#jsonpath.env.JSONPathEnvironment). It is a regular Python dictionary mapping filter function names to any [callable](https://docs.python.org/3/library/typing.html#typing.Callable), like a function or class with a `__call__` method. +### Type System for Function Expressions + +[Section 2.4.1](https://datatracker.ietf.org/doc/html/draft-ietf-jsonpath-base-21#section-2.4.1) of the IETF JSONPath Draft specification defines a type system for function expressions and requires that we check that filter expressions are well-typed. With that in mind, you are encouraged to implement custom filter functions by extending [`jsonpath.function_extensions.FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction), which forces you to be explicit about the [types](api.md#jsonpath.function_extensions.ExpressionType) of arguments the function extension accepts and the type of its return value. + +!!! info + + [`FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction) was new in Python JSONPath version 0.10.0. Prior to that we did not enforce function expression well-typedness. 
To use any arbitrary [callable](https://docs.python.org/3/library/typing.html#typing.Callable) as a function extension - or if you don't want built-in filter functions to raise a `JSONPathTypeError` for function expressions that are not well-typed - set [`well_typed`](api.md#jsonpath.env.JSONPathEnvironment.well_typed) to `False` when constructing a [`JSONPathEnvironment`](api.md#jsonpath.env.JSONPathEnvironment). + ### Example As an example, we'll add a `min()` filter function, which will return the minimum of a sequence of values. If any of the values are not comparable, we'll return the special `undefined` value instead. ```python from typing import Iterable + import jsonpath +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction -def min_filter(obj: object) -> object: - if not isinstance(obj, Iterable): - return jsonpath.UNDEFINED +class MinFilterFunction(FilterFunction): + """A JSONPath function extension returning the minimum of a sequence.""" - try: - return min(obj) - except TypeError: - return jsonpath.UNDEFINED + arg_types = [ExpressionType.VALUE] + return_type = ExpressionType.VALUE + + def __call__(self, value: object) -> object: + if not isinstance(value, Iterable): + return jsonpath.UNDEFINED + + try: + return min(value) + except TypeError: + return jsonpath.UNDEFINED env = jsonpath.JSONPathEnvironment() -env.function_extensions["min"] = min_filter +env.function_extensions["min"] = MinFilterFunction() + +example_data = {"foo": [{"bar": [4, 5]}, {"bar": [1, 5]}]} +print(env.findall("$.foo[?min(@.bar) > 1]", example_data)) ``` Now, when we use `env.finall()`, `env.finditer()` or `env.compile()`, our `min` function will be available for use in filter expressions. @@ -117,7 +137,7 @@ env = MyEnv() ### Compile Time Validation -A function extension's arguments can be validated at compile time by implementing the function as a class with a `__call__` method, and a `validate` method. 
`validate` will be called after parsing the function, giving you the opportunity to inspect its arguments and raise a `JSONPathTypeError` should any arguments be unacceptable. If defined, `validate` must take a reference to the current environment, an argument list and the token pointing to the start of the function call. +Calls to [type-aware](#type-system-for-function-expressions) function extensions are validated at JSONPath compile-time automatically. If [`well_typed`](api.md#jsonpath.env.JSONPathEnvironment.well_typed) is set to `False` or a custom function extension does not inherit from [`FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction), its arguments can be validated by implementing the function as a class with a `__call__` method, and a `validate` method. `validate` will be called after parsing the function, giving you the opportunity to inspect its arguments and raise a `JSONPathTypeError` should any arguments be unacceptable. If defined, `validate` must take a reference to the current environment, an argument list and the token pointing to the start of the function call. ```python def validate( diff --git a/docs/api.md b/docs/api.md index 9bddc47..a893f3a 100644 --- a/docs/api.md +++ b/docs/api.md @@ -11,6 +11,12 @@ ::: jsonpath.CompoundJSONPath handler: python +::: jsonpath.function_extensions.FilterFunction + handler: python + +::: jsonpath.function_extensions.ExpressionType + handler: python + ::: jsonpath.JSONPointer handler: python diff --git a/docs/cli.md b/docs/cli.md index 8ea32fc..0c50034 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -4,7 +4,6 @@ Python JSONPath includes a script called `json`, exposing [JSONPath](quickstart.md#findallpath-data), [JSON Pointer](quickstart.md#pointerresolvepointer-data) and [JSON Patch](quickstart.md#patchapplypatch-data) features on the command line. Use the `--version` argument to check the current version of Python JSONPath, and the `--help` argument to display command information. 
- ```console $ json --version python-jsonpath, version 0.9.0 @@ -62,6 +61,7 @@ optional arguments: -o OUTPUT, --output OUTPUT File to write resulting objects to, as a JSON array. Defaults to the standard output stream. + --no-type-checks Disables filter expression well-typedness checks. ``` ## Global Options @@ -73,14 +73,14 @@ These arguments apply to any subcommand and must be listed before the command. Enable debugging. Display full stack traces, if available, when errors occur. Without the `--debug` option, the following example shows a short "json path syntax error" message. ```console -$ json path -q "$.1" -f /tmp/source.json +$ json path -q "$.1" -f /tmp/source.json json path syntax error: unexpected token '1', line 1, column 2 ``` With the `--debug` option, we get the stack trace triggered by `JSONPathSyntaxError`. ```console -$ json --debug path -q "$.1" -f /tmp/source.json +$ json --debug path -q "$.1" -f /tmp/source.json Traceback (most recent call last): File "/home/james/.local/share/virtualenvs/jsonpath_cli-8Tb3e-ir/bin/json", line 8, in sys.exit(main()) @@ -102,14 +102,14 @@ jsonpath.exceptions.JSONPathSyntaxError: unexpected token '1', line 1, column 2 Enable pretty formatting when outputting JSON. Adds newlines and indentation to output specified with the `-o` or `--output` option. Without the `--pretty` option, the following example output is on one line. ```console -$ json pointer -p "/categories/1/products/0" -f /tmp/source.json +$ json pointer -p "/categories/1/products/0" -f /tmp/source.json {"title": "Cap", "description": "Baseball cap", "price": 15.0} ``` With the `--pretty` option, we get nicely formatted JSON output. 
```console -$ json --pretty pointer -p "/categories/1/products/0" -f /tmp/source.json +$ json --pretty pointer -p "/categories/1/products/0" -f /tmp/source.json { "title": "Cap", "description": "Baseball cap", @@ -122,7 +122,7 @@ $ json --pretty pointer -p "/categories/1/products/0" -f /tmp/source.json Disable decoding of UTF-16 escape sequences, including surrogate paris. This can improve performance if you know your paths and pointers don't contain UTF-16 escape sequences. ```console -$ json --no-unicode-escape path -q "$.price_cap" -f /tmp/source.json +$ json --no-unicode-escape path -q "$.price_cap" -f /tmp/source.json ``` ## Commands @@ -185,6 +185,12 @@ $ json path -q "$.price_cap" -f /tmp/source.json -o result.json $ json path -q "$.price_cap" -f /tmp/source.json --output result.json ``` +#### `--no-type-checks` + +_New in version 0.10.0_ + +Disables JSONPath filter expression well-typedness checks. The well-typedness of a filter expression is defined by the IETF JSONPath Draft specification. + ### `pointer` Resolve a JSON Pointer against a JSON document. One of `-p`/`--pointer` or `-r`/`--pointer-file` must be given. `-p` being a JSON Pointer given on the command line as a string, `-r` being the path to a file containing a JSON Pointer. @@ -236,6 +242,7 @@ The path to a file to write the resulting object to. If omitted or a hyphen (`-` ```console $ json pointer -p "/categories/0/name" -f /tmp/source.json -o result.json ``` + ```console $ json pointer -p "/categories/0/name" -f /tmp/source.json --output result.json ``` @@ -289,6 +296,7 @@ The path to a file to write the resulting object to. 
If omitted or a hyphen (`-` ```console $ json patch /tmp/patch.json -f /tmp/target.json -o result.json ``` + ```console $ json patch /tmp/patch.json -f /tmp/target.json --output result.json ``` @@ -303,4 +311,4 @@ $ json patch /tmp/patch.json -f /tmp/target.json -u ```console $ json patch /tmp/patch.json -f /tmp/target.json --uri-decode -``` \ No newline at end of file +``` diff --git a/docs/pointers.md b/docs/pointers.md index b05a9fb..6f8738a 100644 --- a/docs/pointers.md +++ b/docs/pointers.md @@ -7,6 +7,7 @@ JSON Pointer ([RFC 6901](https://datatracker.ietf.org/doc/html/rfc6901)) is a st JSON Pointers are a fundamental part of JSON Patch ([RFC 6902](https://datatracker.ietf.org/doc/html/rfc6902)). Each patch operation must have at least one pointer, identifying the target value. !!! note + We have extended RFC 6901 to handle our non-standard JSONPath [keys selector](syntax.md#keys-or) and index/property pointers from [Relative JSON Pointer](#torel). ## `resolve(data)` diff --git a/docs/syntax.md b/docs/syntax.md index 6ebddb9..1f6fda4 100644 --- a/docs/syntax.md +++ b/docs/syntax.md @@ -187,15 +187,16 @@ This is a list of things that you might find in other JSONPath implementation th And this is a list of areas where we deviate from the [IETF JSONPath draft](https://datatracker.ietf.org/doc/html/draft-ietf-jsonpath-base-13). -- We don't follow all "singular query" rules when evaluating a filter comparison. Note that we support membership operators `in` and `contains`, plus list literals, so testing non-singular queries for membership is OK. -- We don't yet force the result of some filter functions to be compared. +- The root token (default `$`) is optional and paths starting with a dot (`.`) are OK. `.thing` is the same as `$.thing`, as is `thing`, `$[thing]` and `$["thing"]`. - Whitespace is mostly insignificant unless inside quotes. -- The root token (default `$`) is optional. -- Paths starting with a dot (`.`) are OK. 
`.thing` is the same as `$.thing`, as is `thing`, `$[thing]` and `$["thing"]`. - The built-in `match()` and `search()` filter functions use Python's standard library `re` module, which, at least, doesn't support Unicode properties. We might add an implementation of `match()` and `search()` using the third party [regex](https://pypi.org/project/regex/) package in the future. +- We don't require property names to be quoted inside a bracketed selection, unless the name contains reserved characters. +- We don't require the recursive descent segment to have a selector. `$..` is equivalent to `$..*`. +- We support explicit comparisons to `undefined` as well as implicit existence tests. And this is a list of features that are uncommon or unique to Python JSONPath. +- We support membership operators `in` and `contains`, plus list/array literals. - `|` is a union operator, where matches from two or more JSONPaths are combined. This is not part of the Python API, but built-in to the JSONPath syntax. - `&` is an intersection operator, where we exclude matches that don't exist in both left and right paths. This is not part of the Python API, but built-in to the JSONPath syntax. - `#` is the current key/property or index identifier when filtering a mapping or sequence. diff --git a/jsonpath/cli.py b/jsonpath/cli.py index d9af70a..e79d2fd 100644 --- a/jsonpath/cli.py +++ b/jsonpath/cli.py @@ -53,6 +53,12 @@ def path_sub_command(parser: argparse.ArgumentParser) -> None: # noqa: D103 ), ) + parser.add_argument( + "--no-type-checks", + action="store_true", + help="Disables filter expression well-typedness checks.", + ) + def pointer_sub_command(parser: argparse.ArgumentParser) -> None: # noqa: D103 parser.set_defaults(func=handle_pointer_command) @@ -234,14 +240,15 @@ def handle_path_command(args: argparse.Namespace) -> None: # noqa: PLR0912 """Handle the `path` sub command.""" # Empty string is OK. 
if args.query is not None: - path = args.query + query = args.query else: - path = args.query_file.read().strip() + query = args.query_file.read().strip() try: path = jsonpath.JSONPathEnvironment( - unicode_escape=not args.no_unicode_escape - ).compile(path) + unicode_escape=not args.no_unicode_escape, + well_typed=not args.no_type_checks, + ).compile(query) except JSONPathSyntaxError as err: if args.debug: raise diff --git a/jsonpath/env.py b/jsonpath/env.py index 01afedc..6aa16a8 100644 --- a/jsonpath/env.py +++ b/jsonpath/env.py @@ -2,7 +2,7 @@ from __future__ import annotations import re -from collections.abc import Collection +from decimal import Decimal from operator import getitem from typing import TYPE_CHECKING from typing import Any @@ -20,7 +20,15 @@ from . import function_extensions from .exceptions import JSONPathNameError from .exceptions import JSONPathSyntaxError +from .exceptions import JSONPathTypeError from .filter import UNDEFINED +from .filter import VALUE_TYPE_EXPRESSIONS +from .filter import FilterExpression +from .filter import FunctionExtension +from .filter import InfixExpression +from .filter import Path +from .function_extensions import ExpressionType +from .function_extensions import FilterFunction from .function_extensions import validate from .lex import Lexer from .match import JSONPathMatch @@ -74,6 +82,11 @@ class attributes `root_token`, `self_token` and `filter_context_token`. where possible. unicode_escape: If `True`, decode UTF-16 escape sequences found in JSONPath string literals. + well_typed: Control well-typedness checks on filter function expressions. + If `True` (the default), JSONPath expressions are checked for + well-typedness at compile time. 
+ + **New in version 0.10.0** Attributes: filter_context_token (str): The pattern used to select extra filter context @@ -120,6 +133,7 @@ def __init__( *, filter_caching: bool = True, unicode_escape: bool = True, + well_typed: bool = True, ) -> None: self.filter_caching: bool = filter_caching """Enable or disable filter expression caching.""" @@ -128,6 +142,9 @@ def __init__( """Enable or disable decoding of UTF-16 escape sequences found in JSONPath string literals.""" + self.well_typed: bool = well_typed + """Control well-typedness checks on filter function expressions.""" + self.lexer: Lexer = self.lexer_class(env=self) """The lexer bound to this environment.""" @@ -311,11 +328,11 @@ async def finditer_async( def setup_function_extensions(self) -> None: """Initialize function extensions.""" self.function_extensions["keys"] = function_extensions.keys - self.function_extensions["length"] = function_extensions.length + self.function_extensions["length"] = function_extensions.Length() self.function_extensions["count"] = function_extensions.Count() self.function_extensions["match"] = function_extensions.Match() self.function_extensions["search"] = function_extensions.Search() - self.function_extensions["value"] = function_extensions.value + self.function_extensions["value"] = function_extensions.Value() self.function_extensions["isinstance"] = function_extensions.IsInstance() self.function_extensions["is"] = self.function_extensions["isinstance"] self.function_extensions["typeof"] = function_extensions.TypeOf() @@ -336,12 +353,75 @@ def validate_function_extension_signature( f"function {token.value!r} is not defined", token=token ) from err + # Type-aware function extensions use the spec's type system. + if self.well_typed and isinstance(func, FilterFunction): + self.check_well_typedness(token, func, args) + return args + + # A callable with a `validate` method? 
if hasattr(func, "validate"): args = func.validate(self, args, token) assert isinstance(args, list) return args + + # Generic validation using introspection. return validate(self, func, args, token) + def check_well_typedness( + self, + token: Token, + func: FilterFunction, + args: List[FilterExpression], + ) -> None: + """Check the well-typedness of a function's arguments at compile-time.""" + # Correct number of arguments? + if len(args) != len(func.arg_types): + raise JSONPathTypeError( + f"{token.value!r}() requires {len(func.arg_types)} arguments", + token=token, + ) + + # Argument types + for idx, typ in enumerate(func.arg_types): + arg = args[idx] + if typ == ExpressionType.VALUE: + if not ( + isinstance(arg, VALUE_TYPE_EXPRESSIONS) + or (isinstance(arg, Path) and arg.path.singular_query()) + or (self._function_return_type(arg) == ExpressionType.VALUE) + ): + raise JSONPathTypeError( + f"{token.value}() argument {idx} must be of ValueType", + token=token, + ) + elif typ == ExpressionType.LOGICAL: + if not isinstance(arg, (Path, InfixExpression)): + raise JSONPathTypeError( + f"{token.value}() argument {idx} must be of LogicalType", + token=token, + ) + elif typ == ExpressionType.NODES and not ( + isinstance(arg, Path) + or self._function_return_type(arg) == ExpressionType.NODES + ): + raise JSONPathTypeError( + f"{token.value}() argument {idx} must be of NodesType", + token=token, + ) + + def _function_return_type(self, expr: FilterExpression) -> Optional[ExpressionType]: + """Return the type returned from a filter function. + + If _expr_ is not a `FunctionExtension` or the registered function definition is + not type-aware, return `None`. + """ + if not isinstance(expr, FunctionExtension): + return None + func = self.function_extensions.get(expr.name) + if isinstance(func, FilterFunction): + return func.return_type + return None + def getitem(self, obj: Any, key: Any) -> Any: """Sequence and mapping item getter used throughout JSONPath resolution. 
@@ -374,16 +454,17 @@ def is_truthy(self, obj: object) -> bool: Returns: `True` if the object exists and is not `False` or `0`. """ + if isinstance(obj, NodeList) and len(obj) == 0: + return False if obj is UNDEFINED: return False - if isinstance(obj, Collection): - return True if obj is None: return True return bool(obj) - # ruff: noqa: PLR0912, PLR0911 - def compare(self, left: object, operator: str, right: object) -> bool: + def compare( # noqa: PLR0911 + self, left: object, operator: str, right: object + ) -> bool: """Object comparison within JSONPath filters. Override this to customize filter expression comparison operator @@ -398,71 +479,62 @@ def compare(self, left: object, operator: str, right: object) -> bool: `True` if the comparison between _left_ and _right_, with the given _operator_, is truthy. `False` otherwise. """ - if isinstance(left, NodeList): - left = left.values_or_singular() - if isinstance(right, NodeList): - right = right.values_or_singular() - if operator == "&&": return self.is_truthy(left) and self.is_truthy(right) if operator == "||": return self.is_truthy(left) or self.is_truthy(right) - if operator == "==": - return bool(left == right) - if operator in ("!=", "<>"): - return bool(left != right) - - if isinstance(right, Sequence) and operator == "in": + return self._eq(left, right) + if operator == "!=": + return not self._eq(left, right) + if operator == "<": + return self._lt(left, right) + if operator == ">": + return self._lt(right, left) + if operator == ">=": + return self._lt(right, left) or self._eq(left, right) + if operator == "<=": + return self._lt(left, right) or self._eq(left, right) + if operator == "in" and isinstance(right, Sequence): return left in right - - if isinstance(left, Sequence) and operator == "contains": + if operator == "contains" and isinstance(left, Sequence): return right in left - - if left is UNDEFINED or right is UNDEFINED: - return operator == "<=" - if operator == "=~" and isinstance(right, 
re.Pattern) and isinstance(left, str): return bool(right.fullmatch(left)) + return False - if isinstance(left, str) and isinstance(right, str): - if operator == "<=": - return left <= right - if operator == ">=": - return left >= right - if operator == "<": - return left < right - - assert operator == ">" - return left > right - - # This will catch booleans too. - if isinstance(left, (int, float)) and isinstance(right, (int, float)): - if operator == "<=": - return left <= right - if operator == ">=": - return left >= right - if operator == "<": - return left < right - - assert operator == ">" - return left > right - - if ( - isinstance(left, Mapping) - and isinstance(right, Mapping) - and operator == "<=" - ): - return left == right + def _eq(self, left: object, right: object) -> bool: # noqa: PLR0911 + if isinstance(right, NodeList): + left, right = right, left - if ( - isinstance(left, Sequence) - and isinstance(right, Sequence) - and operator == "<=" - ): - return left == right + if isinstance(left, NodeList): + if isinstance(right, NodeList): + return left == right + if left.empty(): + return right is UNDEFINED + if len(left) == 1: + return left[0] == right + return False - if left is None and right is None and operator in ("<=", ">="): + if left is UNDEFINED and right is UNDEFINED: return True + # Remember 1 == True and 0 == False in Python + if isinstance(right, bool): + left, right = right, left + + if isinstance(left, bool): + return isinstance(right, bool) and left == right + + return left == right + + def _lt(self, left: object, right: object) -> bool: + if isinstance(left, str) and isinstance(right, str): + return left < right + + if isinstance(left, (int, float, Decimal)) and isinstance( + right, (int, float, Decimal) + ): + return left < right + return False diff --git a/jsonpath/filter.py b/jsonpath/filter.py index e8fbc67..b5d3d7d 100644 --- a/jsonpath/filter.py +++ b/jsonpath/filter.py @@ -7,6 +7,8 @@ from abc import ABC from abc import 
abstractmethod from typing import TYPE_CHECKING +from typing import Any +from typing import Callable from typing import Generic from typing import Iterable from typing import List @@ -15,7 +17,10 @@ from typing import Sequence from typing import TypeVar +from jsonpath.function_extensions.filter_function import ExpressionType + from .exceptions import JSONPathTypeError +from .function_extensions import FilterFunction from .match import NodeList from .selectors import Filter as FilterSelector @@ -101,6 +106,13 @@ def set_children(self, children: List[FilterExpression]) -> None: # noqa: ARG00 class _Undefined: __slots__ = () + def __eq__(self, other: object) -> bool: + return ( + other is UNDEFINED_LITERAL + or other is UNDEFINED + or (isinstance(other, NodeList) and other.empty()) + ) + def __str__(self) -> str: return "" @@ -108,6 +120,7 @@ def __repr__(self) -> str: return "" +# This is equivalent to the spec's special `Nothing` value. UNDEFINED = _Undefined() @@ -117,7 +130,11 @@ class Undefined(FilterExpression): __slots__ = () def __eq__(self, other: object) -> bool: - return isinstance(other, Undefined) or other is UNDEFINED + return ( + isinstance(other, Undefined) + or other is UNDEFINED + or (isinstance(other, NodeList) and len(other) == 0) + ) def __str__(self) -> str: return "undefined" @@ -228,34 +245,6 @@ def __str__(self) -> str: return f"/{pattern}/{''.join(flags)}" -class RegexArgument(FilterExpression): - """A compiled regex.""" - - __slots__ = ("pattern",) - - def __init__(self, pattern: Pattern[str]) -> None: - self.pattern = pattern - super().__init__() - - def __eq__(self, other: object) -> bool: - return isinstance(other, RegexArgument) and other.pattern == self.pattern - - def __str__(self) -> str: - return repr(self.pattern.pattern) - - def evaluate(self, _: FilterContext) -> object: - return self.pattern - - async def evaluate_async(self, _: FilterContext) -> object: - return self.pattern - - def children(self) -> List[FilterExpression]: - 
return [] - - def set_children(self, children: List[FilterExpression]) -> None: # noqa: ARG002 - return - - class ListLiteral(FilterExpression): """A list literal.""" @@ -352,17 +341,25 @@ def __eq__(self, other: object) -> bool: ) def evaluate(self, context: FilterContext) -> bool: - if isinstance(self.left, Undefined) and isinstance(self.right, Undefined): - return True left = self.left.evaluate(context) + if isinstance(left, NodeList) and len(left) == 1: + left = left[0].obj + right = self.right.evaluate(context) + if isinstance(right, NodeList) and len(right) == 1: + right = right[0].obj + return context.env.compare(left, self.operator, right) async def evaluate_async(self, context: FilterContext) -> bool: - if isinstance(self.left, Undefined) and isinstance(self.right, Undefined): - return True left = await self.left.evaluate_async(context) + if isinstance(left, NodeList) and len(left) == 1: + left = left[0].obj + right = await self.right.evaluate_async(context) + if isinstance(right, NodeList) and len(right) == 1: + right = right[0].obj + return context.env.compare(left, self.operator, right) def children(self) -> List[FilterExpression]: @@ -494,48 +491,31 @@ def __init__(self, path: JSONPath) -> None: def __str__(self) -> str: return "@" + str(self.path)[1:] - def evaluate(self, context: FilterContext) -> object: # noqa: PLR0911 - if isinstance(context.current, str): + def evaluate(self, context: FilterContext) -> object: + if isinstance(context.current, str): # TODO: refactor if self.path.empty(): return context.current - return UNDEFINED + return NodeList() if not isinstance(context.current, (Sequence, Mapping)): if self.path.empty(): return context.current - return UNDEFINED - - try: - matches = NodeList(self.path.finditer(context.current)) - except json.JSONDecodeError: # this should never happen - return UNDEFINED + return NodeList() - if not matches: - return UNDEFINED - return matches + return NodeList(self.path.finditer(context.current)) - async def 
evaluate_async(self, context: FilterContext) -> object: # noqa: PLR0911 - if isinstance(context.current, str): + async def evaluate_async(self, context: FilterContext) -> object: + if isinstance(context.current, str): # TODO: refactor if self.path.empty(): return context.current - return UNDEFINED + return NodeList() if not isinstance(context.current, (Sequence, Mapping)): if self.path.empty(): return context.current - return UNDEFINED + return NodeList() - try: - matches = NodeList( - [ - match - async for match in await self.path.finditer_async(context.current) - ] - ) - except json.JSONDecodeError: - return UNDEFINED - - if not matches: - return UNDEFINED - return matches + return NodeList( + [match async for match in await self.path.finditer_async(context.current)] + ) class RootPath(Path): @@ -553,18 +533,12 @@ def __str__(self) -> str: return str(self.path) def evaluate(self, context: FilterContext) -> object: - matches = NodeList(self.path.finditer(context.root)) - if not matches: - return UNDEFINED - return matches + return NodeList(self.path.finditer(context.root)) async def evaluate_async(self, context: FilterContext) -> object: - matches = NodeList( + return NodeList( [match async for match in await self.path.finditer_async(context.root)] ) - if not matches: - return UNDEFINED - return matches class FilterContextPath(Path): @@ -583,21 +557,15 @@ def __str__(self) -> str: return "_" + path_repr[1:] def evaluate(self, context: FilterContext) -> object: - matches = NodeList(self.path.finditer(context.extra_context)) - if not matches: - return UNDEFINED - return matches + return NodeList(self.path.finditer(context.extra_context)) async def evaluate_async(self, context: FilterContext) -> object: - matches = NodeList( + return NodeList( [ match async for match in await self.path.finditer_async(context.extra_context) ] ) - if not matches: - return UNDEFINED - return matches class FunctionExtension(FilterExpression): @@ -625,23 +593,36 @@ def evaluate(self, 
context: FilterContext) -> object: try: func = context.env.function_extensions[self.name] except KeyError: - return UNDEFINED + return UNDEFINED # TODO: should probably raise an exception args = [arg.evaluate(context) for arg in self.args] - if getattr(func, "with_node_lists", False): - return func(*args) - return func(*self._unpack_node_lists(args)) + return func(*self._unpack_node_lists(func, args)) async def evaluate_async(self, context: FilterContext) -> object: try: func = context.env.function_extensions[self.name] except KeyError: - return UNDEFINED + return UNDEFINED # TODO: should probably raise an exception args = [await arg.evaluate_async(context) for arg in self.args] + return func(*self._unpack_node_lists(func, args)) + + def _unpack_node_lists( + self, func: Callable[..., Any], args: List[object] + ) -> List[object]: + if isinstance(func, FilterFunction): + _args: List[object] = [] + for idx, arg in enumerate(args): + if func.arg_types[idx] != ExpressionType.NODES and isinstance( + arg, NodeList + ): + _args.append(arg.values_or_singular()) + else: + _args.append(arg) + return _args + + # Legacy way to indicate that a filter function wants node lists as arguments. 
if getattr(func, "with_node_lists", False): - return func(*args) - return func(*self._unpack_node_lists(args)) + return args - def _unpack_node_lists(self, args: List[object]) -> List[object]: return [ obj.values_or_singular() if isinstance(obj, NodeList) else obj for obj in args @@ -690,3 +671,12 @@ def walk(expr: FilterExpression) -> Iterable[FilterExpression]: yield expr for child in expr.children(): yield from walk(child) + + +VALUE_TYPE_EXPRESSIONS = ( + Nil, + Undefined, + Literal, + ListLiteral, + CurrentKey, +) diff --git a/jsonpath/function_extensions/__init__.py b/jsonpath/function_extensions/__init__.py index 983f3ba..cd7e8c9 100644 --- a/jsonpath/function_extensions/__init__.py +++ b/jsonpath/function_extensions/__init__.py @@ -1,22 +1,26 @@ # noqa: D104 -from .arguments import validate +from .arguments import validate # noqa: I001 +from .filter_function import ExpressionType +from .filter_function import FilterFunction from .count import Count from .is_instance import IsInstance from .keys import keys -from .length import length +from .length import Length from .match import Match from .search import Search from .typeof import TypeOf -from .value import value +from .value import Value __all__ = ( "Count", + "ExpressionType", + "FilterFunction", "IsInstance", "keys", - "length", + "Length", "Match", "Search", "TypeOf", "validate", - "value", + "Value", ) diff --git a/jsonpath/function_extensions/count.py b/jsonpath/function_extensions/count.py index 875bfa8..153f4b5 100644 --- a/jsonpath/function_extensions/count.py +++ b/jsonpath/function_extensions/count.py @@ -2,45 +2,20 @@ from __future__ import annotations from typing import TYPE_CHECKING -from typing import List -from jsonpath.exceptions import JSONPathTypeError -from jsonpath.filter import Literal -from jsonpath.filter import Nil +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction if TYPE_CHECKING: - from jsonpath.env import 
JSONPathEnvironment from jsonpath.match import NodeList - from jsonpath.token import Token -class Count: +class Count(FilterFunction): """The built-in `count` function.""" - with_node_lists = True + arg_types = [ExpressionType.NODES] + return_type = ExpressionType.VALUE def __call__(self, node_list: NodeList) -> int: """Return the number of nodes in the node list.""" return len(node_list) - - def validate( - self, - _: "JSONPathEnvironment", - args: List[object], - token: "Token", - ) -> List[object]: - """Function argument validation.""" - if len(args) != 1: # noqa: PLR2004 - raise JSONPathTypeError( - f"{token.value!r} requires 1 arguments, found {len(args)}", - token=token, - ) - - if isinstance(args[0], (Literal, Nil)): - raise JSONPathTypeError( - f"{token.value!r} requires a node list, " - f"found {args[0].__class__.__name__}", - token=token, - ) - - return args diff --git a/jsonpath/function_extensions/filter_function.py b/jsonpath/function_extensions/filter_function.py new file mode 100644 index 0000000..7391323 --- /dev/null +++ b/jsonpath/function_extensions/filter_function.py @@ -0,0 +1,32 @@ +"""Classes modeling the JSONPath spec type system for function extensions.""" +from abc import ABC +from abc import abstractmethod +from enum import Enum +from typing import Any +from typing import List + + +class ExpressionType(Enum): + """The type of a filter function argument or return value.""" + + VALUE = 1 + LOGICAL = 2 + NODES = 3 + + +class FilterFunction(ABC): + """Base class for typed function extensions.""" + + @property + @abstractmethod + def arg_types(self) -> List[ExpressionType]: + """Argument types expected by the filter function.""" + + @property + @abstractmethod + def return_type(self) -> ExpressionType: + """The type of the value returned by the filter function.""" + + @abstractmethod + def __call__(self, *args: Any, **kwds: Any) -> Any: + """Call the filter function.""" diff --git a/jsonpath/function_extensions/is_instance.py
b/jsonpath/function_extensions/is_instance.py index 568c670..c1b6f59 100644 --- a/jsonpath/function_extensions/is_instance.py +++ b/jsonpath/function_extensions/is_instance.py @@ -5,19 +5,34 @@ from jsonpath.filter import UNDEFINED from jsonpath.filter import UNDEFINED_LITERAL +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction +from jsonpath.match import NodeList -class IsInstance: +class IsInstance(FilterFunction): """A non-standard "isinstance" filter function.""" - def __call__(self, obj: object, t: str) -> bool: # noqa: PLR0911 + arg_types = [ExpressionType.NODES, ExpressionType.VALUE] + return_type = ExpressionType.LOGICAL + + def __call__(self, nodes: NodeList, t: str) -> bool: # noqa: PLR0911 """Return `True` if the type of _obj_ matches _t_. This function allows _t_ to be one of several aliases for the real Python "type". Some of these aliases follow JavaScript/JSON semantics. """ - if obj is UNDEFINED or obj is UNDEFINED_LITERAL: + if not nodes: + return t in ("undefined", "missing") + + obj = nodes.values_or_singular() + if ( + obj is UNDEFINED + or obj is UNDEFINED_LITERAL + or (isinstance(obj, NodeList) and len(obj) == 0) + ): return t in ("undefined", "missing") + if obj is None: return t in ("null", "nil", "None", "none") if isinstance(obj, str): diff --git a/jsonpath/function_extensions/length.py b/jsonpath/function_extensions/length.py index 059d0c3..3189a51 100644 --- a/jsonpath/function_extensions/length.py +++ b/jsonpath/function_extensions/length.py @@ -2,10 +2,19 @@ from collections.abc import Sized from typing import Optional +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction -def length(obj: Sized) -> Optional[int]: - """Return an object's length, or `None` if the object does not have a length.""" - try: - return len(obj) - except TypeError: - return None + +class Length(FilterFunction): + """A type-aware 
implementation of the standard `length` function.""" + + arg_types = [ExpressionType.VALUE] + return_type = ExpressionType.VALUE + + def __call__(self, obj: Sized) -> Optional[int]: + """Return an object's length, or `None` if the object does not have a length.""" + try: + return len(obj) + except TypeError: + return None diff --git a/jsonpath/function_extensions/match.py b/jsonpath/function_extensions/match.py index 8fd2fbd..7bc8749 100644 --- a/jsonpath/function_extensions/match.py +++ b/jsonpath/function_extensions/match.py @@ -1,56 +1,21 @@ """The standard `match` function extension.""" import re -from typing import TYPE_CHECKING -from typing import List -from typing import Pattern -from typing import Union -from jsonpath.exceptions import JSONPathTypeError -from jsonpath.filter import RegexArgument -from jsonpath.filter import StringLiteral +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction -if TYPE_CHECKING: - from jsonpath.env import JSONPathEnvironment - from jsonpath.token import Token +class Match(FilterFunction): + """A type-aware implementation of the standard `match` function.""" -class Match: - """The built-in `match` function. - - This implementation uses the standard _re_ module, without attempting to map - I-Regexps to Python regex. - """ - - def __call__(self, string: str, pattern: Union[str, Pattern[str], None]) -> bool: - """Return `True` if _pattern_ matches the given string, `False` otherwise.""" - # The IETF JSONPath draft requires us to return `False` if the pattern was - # invalid. We use `None` to indicate the pattern could not be compiled. 
- if string is None or pattern is None: - return False + arg_types = [ExpressionType.VALUE, ExpressionType.VALUE] + return_type = ExpressionType.LOGICAL + def __call__(self, string: str, pattern: str) -> bool: + """Return `True` if _string_ matches _pattern_, or `False` otherwise.""" try: + # re.fullmatch caches compiled patterns internally return bool(re.fullmatch(pattern, string)) except (TypeError, re.error): return False - - def validate( - self, - _: "JSONPathEnvironment", - args: List[object], - token: "Token", - ) -> List[object]: - """Function argument validation.""" - if len(args) != 2: # noqa: PLR2004 - raise JSONPathTypeError( - f"{token.value!r} requires 2 arguments, found {len(args)}", - token=token, - ) - - if isinstance(args[1], StringLiteral): - try: - return [args[0], RegexArgument(re.compile(args[1].value))] - except re.error: - return [None, None] - - return args diff --git a/jsonpath/function_extensions/search.py b/jsonpath/function_extensions/search.py index 8ecc8b7..ed88635 100644 --- a/jsonpath/function_extensions/search.py +++ b/jsonpath/function_extensions/search.py @@ -1,56 +1,21 @@ """The standard `search` function extension.""" import re -from typing import TYPE_CHECKING -from typing import List -from typing import Pattern -from typing import Union -from jsonpath.exceptions import JSONPathTypeError -from jsonpath.filter import RegexArgument -from jsonpath.filter import StringLiteral +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction -if TYPE_CHECKING: - from jsonpath.env import JSONPathEnvironment - from jsonpath.token import Token +class Search(FilterFunction): + """A type-aware implementation of the standard `search` function.""" -class Search: - """The built-in `search` function. - - This implementation uses the standard _re_ module, without attempting to map - I-Regexps to Python regex. 
- """ - - def __call__(self, string: str, pattern: Union[str, Pattern[str], None]) -> bool: - """Return `True` if the given string contains _pattern_, `False` otherwise.""" - # The IETF JSONPath draft requires us to return `False` if the pattern was - # invalid. We use `None` to indicate the pattern could not be compiled. - if string is None or pattern is None: - return False + arg_types = [ExpressionType.VALUE, ExpressionType.VALUE] + return_type = ExpressionType.LOGICAL + def __call__(self, string: str, pattern: str) -> bool: + """Return `True` if _string_ contains _pattern_, or `False` otherwise.""" try: + # re.search caches compiled patterns internally return bool(re.search(pattern, string)) except (TypeError, re.error): return False - - def validate( - self, - _: "JSONPathEnvironment", - args: List[object], - token: "Token", - ) -> List[object]: - """Function argument validation.""" - if len(args) != 2: # noqa: PLR2004 - raise JSONPathTypeError( - f"{token.value!r} requires 2 arguments, found {len(args)}", - token=token, - ) - - if isinstance(args[1], StringLiteral): - try: - return [args[0], RegexArgument(re.compile(args[1].value))] - except re.error: - return [None, None] - - return args diff --git a/jsonpath/function_extensions/typeof.py b/jsonpath/function_extensions/typeof.py index 16f7eec..c0f9eae 100644 --- a/jsonpath/function_extensions/typeof.py +++ b/jsonpath/function_extensions/typeof.py @@ -5,9 +5,12 @@ from jsonpath.filter import UNDEFINED from jsonpath.filter import UNDEFINED_LITERAL +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction +from jsonpath.match import NodeList -class TypeOf: +class TypeOf(FilterFunction): """A non-standard "typeof" filter function. Arguments: @@ -15,15 +18,23 @@ class TypeOf: otherwise we'll use "int" and "float" respectively. Defaults to `True`. 
""" + arg_types = [ExpressionType.NODES] + return_type = ExpressionType.VALUE + def __init__(self, *, single_number_type: bool = True) -> None: self.single_number_type = single_number_type - def __call__(self, obj: object) -> str: # noqa: PLR0911 + def __call__(self, nodes: NodeList) -> str: # noqa: PLR0911 """Return the type of _obj_ as a string. The strings returned from this function use JSON terminology, much like the result of JavaScript's `typeof` operator. """ + if not nodes: + return "undefined" + + obj = nodes.values_or_singular() + if obj is UNDEFINED or obj is UNDEFINED_LITERAL: return "undefined" if obj is None: diff --git a/jsonpath/function_extensions/value.py b/jsonpath/function_extensions/value.py index fa33db4..94cd10c 100644 --- a/jsonpath/function_extensions/value.py +++ b/jsonpath/function_extensions/value.py @@ -1,15 +1,24 @@ """The standard `value` function extension.""" -from typing import Sequence +from __future__ import annotations + +from typing import TYPE_CHECKING from jsonpath.filter import UNDEFINED +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction + +if TYPE_CHECKING: + from jsonpath.match import NodeList + + +class Value(FilterFunction): + """A type-aware implementation of the standard `value` function.""" + arg_types = [ExpressionType.NODES] + return_type = ExpressionType.VALUE -def value(obj: object) -> object: - """Return the first object in the sequence if the sequence has only one item.""" - if isinstance(obj, str): - return obj - if isinstance(obj, Sequence): - if len(obj) == 1: - return obj[0] + def __call__(self, nodes: NodeList) -> object: + """Return the first node in a node list if it has only one item.""" + if len(nodes) == 1: + return nodes[0].obj return UNDEFINED - return obj diff --git a/jsonpath/match.py b/jsonpath/match.py index 1aecdc5..1f39059 100644 --- a/jsonpath/match.py +++ b/jsonpath/match.py @@ -98,3 +98,10 @@ def values_or_singular(self) -> 
object: if len(self) == 1: return self[0].obj return [match.obj for match in self] + + def empty(self) -> bool: + """Return `True` if this node list is empty.""" + return not bool(self) + + def __str__(self) -> str: + return f"NodeList{super().__str__()}" diff --git a/jsonpath/parse.py b/jsonpath/parse.py index d7430f8..19ca802 100644 --- a/jsonpath/parse.py +++ b/jsonpath/parse.py @@ -11,7 +11,11 @@ from typing import Optional from typing import Union +from jsonpath.function_extensions.filter_function import ExpressionType +from jsonpath.function_extensions.filter_function import FilterFunction + from .exceptions import JSONPathSyntaxError +from .exceptions import JSONPathTypeError from .filter import CURRENT_KEY from .filter import FALSE from .filter import NIL @@ -25,6 +29,7 @@ from .filter import InfixExpression from .filter import IntegerLiteral from .filter import ListLiteral +from .filter import Path from .filter import PrefixExpression from .filter import RegexLiteral from .filter import RootPath @@ -177,6 +182,17 @@ class Parser: TOKEN_RE: "=~", } + SINGULAR_QUERY_COMPARISON_OPERATORS = frozenset( + [ + "==", + ">=", + ">", + "<=", + "<", + "!=", + ] + ) + PREFIX_OPERATORS = frozenset( [ TOKEN_NOT, @@ -249,6 +265,7 @@ def __init__(self, *, env: JSONPathEnvironment) -> None: TOKEN_ROOT: self.parse_root_path, TOKEN_SELF: self.parse_self_path, TOKEN_FILTER_CONTEXT: self.parse_filter_context_path, + TOKEN_FUNCTION: self.parse_function_extension, } def parse(self, stream: TokenStream) -> Iterable[JSONPathSelector]: @@ -430,14 +447,25 @@ def parse_selector_list(self, stream: TokenStream) -> ListSelector: # noqa: PLR def parse_filter(self, stream: TokenStream) -> Filter: tok = stream.next_token() - expr = BooleanExpression(self.parse_filter_selector(stream)) + expr = self.parse_filter_selector(stream) + + if self.env.well_typed and isinstance(expr, FunctionExtension): + func = self.env.function_extensions.get(expr.name) + if ( + func + and isinstance(func, 
FilterFunction) + and func.return_type == ExpressionType.VALUE + ): + raise JSONPathTypeError( + f"result of {expr.name}() must be compared", token=tok + ) if stream.peek.kind == TOKEN_RPAREN: raise JSONPathSyntaxError("unbalanced ')'", token=stream.current) stream.next_token() stream.expect(TOKEN_FILTER_END, TOKEN_RBRACKET) - return Filter(env=self.env, token=tok, expression=expr) + return Filter(env=self.env, token=tok, expression=BooleanExpression(expr)) def parse_boolean(self, stream: TokenStream) -> FilterExpression: if stream.current.kind == TOKEN_TRUE: @@ -475,11 +503,17 @@ def parse_infix_expression( ) -> FilterExpression: tok = stream.next_token() precedence = self.PRECEDENCES.get(tok.kind, self.PRECEDENCE_LOWEST) - return InfixExpression( - left, - self.BINARY_OPERATORS[tok.kind], - self.parse_filter_selector(stream, precedence), - ) + right = self.parse_filter_selector(stream, precedence) + operator = self.BINARY_OPERATORS[tok.kind] + + self._raise_for_non_singular_query(left, tok) # TODO: store tok on expression + self._raise_for_non_singular_query(right, tok) + + if operator in self.SINGULAR_QUERY_COMPARISON_OPERATORS: + self._raise_for_non_comparable_function(left, tok) + self._raise_for_non_comparable_function(right, tok) + + return InfixExpression(left, operator, right) def parse_grouped_expression(self, stream: TokenStream) -> FilterExpression: stream.next_token() @@ -558,15 +592,24 @@ def parse_function_extension(self, stream: TokenStream) -> FilterExpression: while stream.current.kind != TOKEN_RPAREN: try: - function_arguments.append( - self.function_argument_map[stream.current.kind](stream) - ) + func = self.function_argument_map[stream.current.kind] except KeyError as err: raise JSONPathSyntaxError( f"unexpected {stream.current.value!r}", token=stream.current, ) from err + expr = func(stream) + + # The argument could be a comparison or logical expression + peek_kind = stream.peek.kind + while peek_kind in self.BINARY_OPERATORS: + 
stream.next_token() + expr = self.parse_infix_expression(stream, expr) + peek_kind = stream.peek.kind + + function_arguments.append(expr) + if stream.peek.kind != TOKEN_RPAREN: if stream.peek.kind == TOKEN_FILTER_END: break @@ -575,10 +618,10 @@ def parse_function_extension(self, stream: TokenStream) -> FilterExpression: stream.next_token() - function_arguments = self.env.validate_function_extension_signature( - tok, function_arguments + return FunctionExtension( + tok.value, + self.env.validate_function_extension_signature(tok, function_arguments), ) - return FunctionExtension(tok.value, function_arguments) def parse_filter_selector( self, stream: TokenStream, precedence: int = PRECEDENCE_LOWEST @@ -624,3 +667,27 @@ def _decode_string_literal(self, token: Token) -> str: raise JSONPathSyntaxError(str(err).split(":")[1], token=token) from None return token.value + + def _raise_for_non_singular_query( + self, expr: FilterExpression, token: Token + ) -> None: + if ( + self.env.well_typed + and isinstance(expr, Path) + and not expr.path.singular_query() + ): + raise JSONPathSyntaxError( + "non-singular query is not comparable", token=token + ) + + def _raise_for_non_comparable_function( + self, expr: FilterExpression, token: Token + ) -> None: + if not self.env.well_typed or not isinstance(expr, FunctionExtension): + return + func = self.env.function_extensions.get(expr.name) + if ( + isinstance(func, FilterFunction) + and func.return_type != ExpressionType.VALUE + ): + raise JSONPathTypeError(f"result of {expr.name}() is not comparable", token) diff --git a/jsonpath/path.py b/jsonpath/path.py index 0c8fc7c..9a97f68 100644 --- a/jsonpath/path.py +++ b/jsonpath/path.py @@ -17,6 +17,9 @@ from jsonpath._data import load_data from jsonpath.match import FilterContextVars from jsonpath.match import JSONPathMatch +from jsonpath.selectors import IndexSelector +from jsonpath.selectors import ListSelector +from jsonpath.selectors import PropertySelector if TYPE_CHECKING: from io 
import IOBase @@ -206,6 +209,20 @@ def empty(self) -> bool: """Return `True` if this path has no selectors.""" return not bool(self.selectors) + def singular_query(self) -> bool: + """Return `True` if this JSONPath query is a singular query.""" + for selector in self.selectors: + if isinstance(selector, (PropertySelector, IndexSelector)): + continue + if ( + isinstance(selector, ListSelector) + and len(selector.items) == 1 + and isinstance(selector.items[0], (PropertySelector, IndexSelector)) + ): + continue + return False + return True + class CompoundJSONPath: """Multiple `JSONPath`s combined.""" diff --git a/tests/test_cli.py b/tests/test_cli.py index 4c433ca..16d7918 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -178,7 +178,7 @@ def test_json_path_type_error( sample_target: str, capsys: pytest.CaptureFixture[str], ) -> None: - """Test that we handle a JSONPath with a syntax error.""" + """Test that we handle a JSONPath with a type error.""" args = parser.parse_args( ["path", "-q", "$.foo[?count(@.bar, 'baz')]", "-f", sample_target] ) @@ -195,7 +195,7 @@ def test_json_path_type_error_debug( parser: argparse.ArgumentParser, sample_target: str, ) -> None: - """Test that we handle a JSONPath with a syntax error.""" + """Test that we handle a JSONPath with a type error.""" args = parser.parse_args( ["--debug", "path", "-q", "$.foo[?count(@.bar, 'baz')]", "-f", sample_target] ) @@ -204,6 +204,47 @@ def test_json_path_type_error_debug( handle_path_command(args) +def test_json_path_no_well_typed_checks( + parser: argparse.ArgumentParser, + sample_target: str, + capsys: pytest.CaptureFixture[str], +) -> None: + """Test that we can disable well-typedness checks.""" + # `count()` must be compared + query = "$[?count(@..*)]" + + args = parser.parse_args( + [ + "path", + "-q", + query, + "-f", + sample_target, + ] + ) + + with pytest.raises(SystemExit) as err: + handle_path_command(args) + + captured = capsys.readouterr() + assert err.value.code == 1 + assert 
captured.err.startswith("json path type error") + + args = parser.parse_args( + [ + "path", + "-q", + query, + "--no-type-checks", + "-f", + sample_target, + ] + ) + + # does not raise + handle_path_command(args) + + def test_json_path_index_error( parser: argparse.ArgumentParser, sample_target: str, diff --git a/tests/test_compliance.py b/tests/test_compliance.py index 10018a8..c737337 100644 --- a/tests/test_compliance.py +++ b/tests/test_compliance.py @@ -37,40 +37,6 @@ class Case: "functions, match, filter, match function, unicode char class negated, uppercase": "\\P not supported", # noqa: E501 "functions, search, filter, search function, unicode char class, uppercase": "\\p not supported", # noqa: E501 "functions, search, filter, search function, unicode char class negated, uppercase": "\\P not supported", # noqa: E501 - "filter, non-singular query in comparison, slice": "TODO", - "filter, non-singular query in comparison, all children": "TODO", - "filter, non-singular query in comparison, descendants": "TODO", - "filter, non-singular query in comparison, combined": "TODO", - "filter, relative non-singular query, index, equal": "TODO", - "filter, relative non-singular query, index, not equal": "TODO", - "filter, relative non-singular query, index, less-or-equal": "TODO", - "filter, relative non-singular query, name, equal": "TODO", - "filter, relative non-singular query, name, not equal": "TODO", - "filter, relative non-singular query, name, less-or-equal": "TODO", - "filter, relative non-singular query, combined, equal": "TODO", - "filter, relative non-singular query, combined, not equal": "TODO", - "filter, relative non-singular query, combined, less-or-equal": "TODO", - "filter, relative non-singular query, wildcard, equal": "TODO", - "filter, relative non-singular query, wildcard, not equal": "TODO", - "filter, relative non-singular query, wildcard, less-or-equal": "TODO", - "filter, relative non-singular query, slice, equal": "TODO", - "filter, relative 
non-singular query, slice, not equal": "TODO", - "filter, relative non-singular query, slice, less-or-equal": "TODO", - "filter, absolute non-singular query, index, equal": "TODO", - "filter, absolute non-singular query, index, not equal": "TODO", - "filter, absolute non-singular query, index, less-or-equal": "TODO", - "filter, absolute non-singular query, name, equal": "TODO", - "filter, absolute non-singular query, name, not equal": "TODO", - "filter, absolute non-singular query, name, less-or-equal": "TODO", - "filter, absolute non-singular query, combined, equal": "TODO", - "filter, absolute non-singular query, combined, not equal": "TODO", - "filter, absolute non-singular query, combined, less-or-equal": "TODO", - "filter, absolute non-singular query, wildcard, equal": "TODO", - "filter, absolute non-singular query, wildcard, not equal": "TODO", - "filter, absolute non-singular query, wildcard, less-or-equal": "TODO", - "filter, absolute non-singular query, slice, equal": "TODO", - "filter, absolute non-singular query, slice, not equal": "TODO", - "filter, absolute non-singular query, slice, less-or-equal": "TODO", "filter, multiple selectors": "TODO", "filter, multiple selectors, comparison": "TODO", "filter, multiple selectors, overlapping": "TODO", @@ -79,11 +45,6 @@ class Case: "filter, multiple selectors, filter and slice": "TODO", "filter, multiple selectors, comparison filter, index and slice": "TODO", "filter, equals number, decimal fraction, no fractional digit": "TODO", - "functions, length, result must be compared": "ignore", - "functions, count, result must be compared": "ignore", - "functions, match, result cannot be compared": "ignore", - "functions, search, result cannot be compared": "ignore", - "functions, value, result must be compared": "ignore", "whitespace, selectors, space between dot and name": "flexible whitespace policy", # noqa: E501 "whitespace, selectors, newline between dot and name": "flexible whitespace policy", # noqa: E501
"whitespace, selectors, tab between dot and name": "flexible whitespace policy", # noqa: E501 diff --git a/tests/test_ietf_comparison.py b/tests/test_ietf_comparison.py index 5cad7cd..fb73b25 100644 --- a/tests/test_ietf_comparison.py +++ b/tests/test_ietf_comparison.py @@ -43,6 +43,7 @@ from jsonpath import JSONPathEnvironment from jsonpath.filter import UNDEFINED +from jsonpath.match import NodeList @dataclasses.dataclass @@ -56,6 +57,7 @@ class Case: DATA = {"obj": {"x": "y"}, "arr": [2, 3]} + TEST_CASES = [ Case( description="$.absent1 == $.absent2", @@ -64,6 +66,27 @@ class Case: right=UNDEFINED, want=True, ), + Case( + description="$.absent1 == $.absent2, empty node lists", + left=NodeList(), + op="==", + right=NodeList(), + want=True, + ), + Case( + description="$.absent1 == $.absent2, empty node list and undefined", + left=NodeList(), + op="==", + right=UNDEFINED, + want=True, + ), + Case( + description="$.absent1 == $.absent2, undefined and empty node list", + left=UNDEFINED, + op="==", + right=NodeList(), + want=True, + ), Case( description="$.absent1 <= $.absent2", left=UNDEFINED, diff --git a/tests/test_ietf_well_typedness.py b/tests/test_ietf_well_typedness.py new file mode 100644 index 0000000..9babd45 --- /dev/null +++ b/tests/test_ietf_well_typedness.py @@ -0,0 +1,195 @@ +"""Function well-typedness test derived from IETF spec examples. + +The test cases defined here are taken from version 11 of the JSONPath +internet draft, draft-ietf-jsonpath-base-11. In accordance with +https://trustee.ietf.org/license-info, Revised BSD License text +is included below. + +See https://datatracker.ietf.org/doc/html/draft-ietf-jsonpath-base-20 + +Copyright (c) 2023 IETF Trust and the persons identified as authors +of the code.
All rights reserved. Redistribution and use in source and +binary forms, with or without modification, are permitted provided +that the following conditions are met: + +- Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the + distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the + names of specific contributors, may be used to endorse or promote + products derived from this software without specific prior written + permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +“AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+""" # noqa: D205 +import dataclasses +import operator + +import pytest + +from jsonpath import JSONPathEnvironment +from jsonpath.exceptions import JSONPathTypeError +from jsonpath.function_extensions import ExpressionType +from jsonpath.function_extensions import FilterFunction +from jsonpath.match import NodeList + + +@dataclasses.dataclass +class Case: + description: str + path: str + valid: bool + + +TEST_CASES = [ + Case( + description="length, singular query, compared", + path="$[?length(@) < 3]", + valid=True, + ), + Case( + description="length, non-singular query, compared", + path="$[?length(@.*) < 3]", + valid=False, + ), + Case( + description="count, non-singular query, compared", + path="$[?count(@.*) == 1]", + valid=True, + ), + Case( + description="count, int literal, compared", + path="$[?count(1) == 1]", + valid=False, + ), + Case( + description="nested function, LogicalType -> NodesType", + path="$[?count(foo(@.*)) == 1]", + valid=True, + ), + Case( + description="match, singular query, string literal", + path="$[?match(@.timezone, 'Europe/.*')]", + valid=True, + ), + Case( + description="match, singular query, string literal, compared", + path="$[?match(@.timezone, 'Europe/.*') == true]", + valid=False, + ), + Case( + description="value, non-singular query param, comparison", + path="$[?value(@..color) == 'red']", + valid=True, + ), + Case( + description="value, non-singular query param", + path="$[?value(@..color)]", + valid=False, + ), + Case( + description="function, singular query, value type param, logical return type", + path="$[?bar(@.a)]", + valid=True, + ), + Case( + description=( + "function, non-singular query, value type param, logical return type" + ), + path="$[?bar(@.*)]", + valid=False, + ), + Case( + description=( + "function, non-singular query, nodes type param, logical return type" + ), + path="$[?bn(@.*)]", + valid=True, + ), + Case( + description=( + "function, non-singular query, logical type param, logical return type" + 
), + path="$[?bl(@.*)]", + valid=True, + ), + Case( + description="function, logical type param, comparison, logical return type", + path="$[?bl(1==1)]", + valid=True, + ), + Case( + description="function, logical type param, literal, logical return type", + path="$[?bl(1)]", + valid=False, + ), + Case( + description="function, value type param, literal, logical return type", + path="$[?bar(1)]", + valid=True, + ), +] + + +class MockFoo(FilterFunction): + arg_types = [ExpressionType.NODES] + return_type = ExpressionType.NODES + + def __call__(self, nodes: NodeList) -> NodeList: # noqa: D102 + return nodes + + +class MockBar(FilterFunction): + arg_types = [ExpressionType.VALUE] + return_type = ExpressionType.LOGICAL + + def __call__(self) -> bool: # noqa: D102 + return False + + +class MockBn(FilterFunction): + arg_types = [ExpressionType.NODES] + return_type = ExpressionType.LOGICAL + + def __call__(self, _: object) -> bool: # noqa: D102 + return False + + +class MockBl(FilterFunction): + arg_types = [ExpressionType.LOGICAL] + return_type = ExpressionType.LOGICAL + + def __call__(self, _: object) -> bool: # noqa: D102 + return False + + +@pytest.fixture() +def env() -> JSONPathEnvironment: + environment = JSONPathEnvironment() + environment.function_extensions["foo"] = MockFoo() + environment.function_extensions["bar"] = MockBar() + environment.function_extensions["bn"] = MockBn() + environment.function_extensions["bl"] = MockBl() + return environment + + +@pytest.mark.parametrize("case", TEST_CASES, ids=operator.attrgetter("description")) +def test_ietf_well_typedness(env: JSONPathEnvironment, case: Case) -> None: + if case.valid: + env.compile(case.path) + else: + with pytest.raises(JSONPathTypeError): + env.compile(case.path)
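
Reviewer note: the argument-unpacking rule that this change introduces in `_unpack_node_lists` can be illustrated in isolation. What follows is a minimal, self-contained sketch and not the library's code. `Node`, `NodeList`, `Count`, `Length` and the free function `unpack_node_lists` are simplified stand-ins written for this example (assumptions, so that nothing needs to be installed); only the decision logic is meant to mirror the diff above.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Callable, List


class ExpressionType(Enum):
    VALUE = 1
    LOGICAL = 2
    NODES = 3


class FilterFunction(ABC):
    """Stand-in for the typed base class added in filter_function.py."""

    @property
    @abstractmethod
    def arg_types(self) -> List[ExpressionType]: ...

    @property
    @abstractmethod
    def return_type(self) -> ExpressionType: ...

    @abstractmethod
    def __call__(self, *args: Any, **kwds: Any) -> Any: ...


class Node:
    """Hypothetical stand-in for a JSONPath match; only `obj` is modeled."""

    def __init__(self, obj: object) -> None:
        self.obj = obj


class NodeList(list):
    """Hypothetical stand-in for jsonpath.match.NodeList."""

    def values_or_singular(self) -> object:
        if len(self) == 1:
            return self[0].obj
        return [node.obj for node in self]


class Count(FilterFunction):
    arg_types = [ExpressionType.NODES]  # wants the node list itself
    return_type = ExpressionType.VALUE

    def __call__(self, node_list: NodeList) -> int:
        return len(node_list)


class Length(FilterFunction):
    arg_types = [ExpressionType.VALUE]  # wants a plain value
    return_type = ExpressionType.VALUE

    def __call__(self, obj: Any) -> Any:
        try:
            return len(obj)
        except TypeError:
            return None


def unpack_node_lists(func: Callable[..., Any], args: List[object]) -> List[object]:
    """Sketch of the unpacking decision, mirroring the logic in the diff."""
    if isinstance(func, FilterFunction):
        # Typed function: unpack a node list only where a non-NODES type is declared.
        return [
            arg.values_or_singular()
            if func.arg_types[idx] != ExpressionType.NODES
            and isinstance(arg, NodeList)
            else arg
            for idx, arg in enumerate(args)
        ]
    # Legacy opt-in flag: pass node lists through untouched.
    if getattr(func, "with_node_lists", False):
        return args
    # Untyped callable: always unpack node lists.
    return [a.values_or_singular() if isinstance(a, NodeList) else a for a in args]


pair = NodeList([Node("a"), Node("b")])
single = NodeList([Node("abc")])
print(Count()(*unpack_node_lists(Count(), [pair])))
print(Length()(*unpack_node_lists(Length(), [single])))
```

In this sketch, `Count` receives the node list unchanged because it declares `ExpressionType.NODES`, while `Length` receives the single node's value because it declares `ExpressionType.VALUE`. Plain callables keep the pre-existing behavior, including the `with_node_lists` flag.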