A URI with 'file' protocol is not handled as it should #252

Open
Scanframe opened this issue Feb 27, 2023 · 6 comments

@Scanframe
Scanframe commented Feb 27, 2023

Problem

When the $id uses the file protocol, in this case file:///mnt/server/userdata/source/json-schemas/schema/customer.schema.json, an error is reported as soon as other schema files are referenced for definitions.

For comparison, the validator from the Linux package python3-jsonschema
only allows the file:// protocol for local files, which is the most logical choice in my opinion.
(The problem there is that it does not handle relative file paths.)

Directory Structure & Command

Files

<project-dir>
├── json
│   └── test.customer.json
└── schema
    ├── address.schema.json
    ├── customer.schema.json
    └── defs.schema.json

Command

Both commands are executed when the current directory is the project root.

Python

jsonschema -i json/test.customer.json schema/customer.schema.json

C++ json-schema-validator

json-schema-validate schema/customer.schema.json < json/test.customer.json

Main Schema File

The file below references other files.
Those files can be found at this location.

{
	"$id": "file:///mnt/server/userdata/source/json-schemas/schema/customer.schema.json",
	"$schema": "http://json-schema.org/draft-07/schema#",
	"type": "object",
	"additionalProperties": false,
	"properties": {
		"name": {
			"type": "object",
			"additionalProperties": false,
			"properties": {
				"first": {
					"$ref": "defs.schema.json#/definitions/firstName"
				},
				"middle": {
					"$ref": "defs.schema.json#/definitions/middleName"
				},
				"last": {
					"$ref": "defs.schema.json#/definitions/lastName"
				}
			},
			"required": [
				"first",
				"middle",
				"last"
			]
		},
		"shipping_address": {
			"$ref": "address.schema.json"
		},
		"billing_address": {
			"$ref": "address.schema.json"
		},
		"parcel_size": {
			"type": "object",
			"additionalProperties": false,
			"properties": {
				"height": {
					"$ref": "defs.schema.json#/definitions/parcelSizeHeight"
				},
				"width": {
					"$ref": "defs.schema.json#/definitions/parcelSizeWidth"
				},
				"depth": {
					"$ref": "defs.schema.json#/definitions/parcelSizeDepth"
				}
			}
		}
	},
	"required": [
		"name",
		"shipping_address",
		"billing_address",
		"parcel_size"
	]
}

Error Log

setting root schema failed
could not open file:///mnt/server/userdata/source/json-schemas/schema/address.schema.json tried with .//mnt/server/userdata/source/json-schemas/schema/address.schema.json
ERROR: '"/billing_address"' - '{"city":"'s-Gravenhage","postal_code":"2514GL","state":"Zuid-Holland","street_address":"Noordeinde 68"}': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/address.schema.json # 
ERROR: '"/name/first"' - '"Prins"': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/firstName
ERROR: '"/name/last"' - '"Oranje"': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/lastName
ERROR: '"/name/middle"' - '"van"': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/middleName
ERROR: '"/parcel_size/depth"' - '30': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/parcelSizeDepth
ERROR: '"/parcel_size/height"' - '200': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/parcelSizeHeight
ERROR: '"/parcel_size/width"' - '80': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/parcelSizeWidth
ERROR: '"/shipping_address"' - '{"city":"'s-Gravenhage","postal_code":"2513BJ","state":"Zuid-Hoilland","street_address":"Molenstraat 27"}': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/address.schema.json # 
schema validation failed
@pboettch
Owner

The validator program you're using is just a test program, an example showing how to use the library.

The simple loader callback is actually doing a good job, because it uses the URL path of the root schema to find the other sub-schemas.

If you use the library please write your own loader-script matching your infrastructure.

To solve your problem, the validator library needs to be aware of the initial filename and path of the root schema. As of today it isn't. It seems the Python one does that.

If you don't want to integrate the library in your program and just want to use an executable, why not stick with the Python one?

Otherwise, do not hesitate to suggest a patch for the example so that it does what you want.

Btw. Isn't it very strange that the $id-tag contains a local file path?

@Scanframe
Author

Scanframe commented Mar 3, 2023

Thanks for responding.

Btw. Isn't it very strange that the $id-tag contains a local file path?

When your system/application has no access to webservers then this is the only option.

I assumed the $id-tag in the main schema can only contain a URI to identify its resource.
The other linked or referenced schemas can use a relative location to the main one.
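
To illustrate that resolution rule, here is a minimal C++ sketch of how a same-directory $ref could be combined with the $id of the main schema. The helper name resolve_sibling_ref is hypothetical and not part of any library. It deliberately ignores ../ segments, absolute paths and full URIs, so it is not a complete RFC 3986 resolver.

```cpp
#include <string>

// Hypothetical helper (not part of any library): resolve a simple relative
// reference such as "defs.schema.json#/definitions/firstName" against the
// base URI taken from "$id". Only same-directory references are handled.
std::string resolve_sibling_ref(const std::string &base_id, const std::string &ref)
{
	// Drop any fragment from the base before looking for the last '/'.
	std::string base = base_id.substr(0, base_id.find('#'));
	// Keep everything up to and including the last '/' of the base URI.
	std::string::size_type slash = base.rfind('/');
	if (slash == std::string::npos)
		return ref; // no directory part to resolve against
	return base.substr(0, slash + 1) + ref;
}
```

With the $id from the main schema above and the reference address.schema.json, this yields the file:///mnt/server/userdata/source/json-schemas/schema/address.schema.json URI that appears in the error log.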

I tried to fix it in the code, but the path gets prefixed with ./ which is fine for the http protocol but not for the file protocol.
It became too complex from there to figure out what to change in a short time to make it work.

@pboettch
Owner

pboettch commented Mar 3, 2023

Btw. Isn't it very strange that the $id-tag contains a local file path?

When your system/application has no access to webservers then this is the only option.

No, it seems to be common practice to use HTTP addresses as $ids, even though nothing is looked up on the internet.

I tried to fix it in the code, but the path gets prefixed with ./ which is fine for the http protocol but not for the file protocol. It became too complex from there to figure out what to change in a short time to make it work.

The library also does not support ../-relative path references. This might be related. Someone with time needs to take a look.
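
For whoever picks this up, a minimal sketch of what collapsing ../ segments could look like, written in the spirit of the remove_dot_segments step of RFC 3986. The function below is a hypothetical standalone helper, not code from the library. It assumes an absolute, /-separated path and also drops empty segments and any trailing slash, which should be acceptable for schema file names.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collapse "." and ".." segments in an absolute,
// '/'-separated path. Empty segments (from "//") are dropped as well.
std::string remove_dot_segments(const std::string &path)
{
	std::vector<std::string> kept;
	std::istringstream in(path);
	std::string segment;
	while (std::getline(in, segment, '/')) {
		if (segment.empty() || segment == ".")
			continue; // skip empty and "current directory" segments
		if (segment == "..") {
			if (!kept.empty())
				kept.pop_back(); // step up one directory, never above the root
			continue;
		}
		kept.push_back(segment);
	}
	std::string result;
	for (const std::string &s : kept)
		result += "/" + s;
	return result.empty() ? "/" : result;
}
```

A reference like ../defs.schema.json would first be appended to the directory of the base path and then passed through this function.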

@Scanframe
Author

No, it seems to be common practice to use HTTP addresses as $ids, even though nothing is looked up on the internet.

My understanding is that when the $id is omitted from the main schema file, the secondary referenced schema files are not found at all. The $id sets the location where the other schema files are to be found.
When using only a single schema file nothing in the $id tag matters since nothing is externally referenced.

@pboettch
Owner

pboettch commented Mar 3, 2023

The other schema-validators I saw all use callbacks for the user to handle the loading of additional schemas. So, it's up to the application handling the evaluation of the URL of $id.
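
The callback pattern described here can be sketched roughly as follows. All names (schema_loader, schema_store) are hypothetical and no real validator API is implied. The point is only that the application, not the validator, decides how a schema URI maps to a document.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical types for illustration only. The application supplies a
// callback that turns a schema URI into the schema document text, however
// it likes (local file, database, embedded table, ...).
using schema_loader = std::function<std::string(const std::string &uri)>;

class schema_store
{
public:
	explicit schema_store(schema_loader loader) : loader_(std::move(loader)) {}

	// Return the document for `uri`, invoking the callback only on first use.
	const std::string &get(const std::string &uri)
	{
		auto it = cache_.find(uri);
		if (it == cache_.end())
			it = cache_.emplace(uri, loader_(uri)).first;
		return it->second;
	}

private:
	schema_loader loader_;
	std::map<std::string, std::string> cache_;
};
```

An application would construct the store with its own lambda, for example one that opens the file named by a file:// URI, and the validator side would only ever call get().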

The problem you have is probably that file:// is not handled correctly in the URL class.

OK, but you are also using an example program which is not really designed to be generic. Maybe we can fix it there? In the loader callback, if the protocol is file, we skip the ./ prefix?
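
A minimal sketch of that idea, assuming the loader already has the scheme and the path of the URI as separate strings. The function local_filename is a hypothetical helper and not the example program's actual code. For non-file schemes it prepends a single . so that the path is looked up below the current directory.

```cpp
#include <string>

// Hypothetical helper, not the example program's actual code: choose the
// local file name a loader callback would try to open for a schema URI.
// `scheme` and `path` are assumed to be the already-parsed URI parts,
// e.g. "file" and "/schemas/address.schema.json".
std::string local_filename(const std::string &scheme, const std::string &path)
{
	if (scheme == "file")
		return path; // absolute local path: use it unchanged
	return "." + path; // other schemes: look below the current directory
}
```

For the URI from the error log, this would try /mnt/server/userdata/source/json-schemas/schema/address.schema.json instead of the .//mnt/... path that failed.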

@Scanframe
Author

OK, but you are also using an example program which is not really designed to be generic.
Maybe we can fix it there? In the loader callback, if the protocol is file, we skip the ./ prefix?

I can contribute an attempt at fixing it.

BTW...

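I used the FetchContent_xxxxx CMake functions instead of the Hunter ones.
CMake 3.14 or newer is needed for this, because FetchContent_MakeAvailable was added in 3.14 (the FetchContent module itself arrived in 3.11).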

file: cmake/nlohmann_jsonConfig.cmake

# FetchContent added in CMake 3.11, downloads during the configure step
include(FetchContent)
# Import Json library.
FetchContent_Declare(
	nlohmann-json
	GIT_REPOSITORY https://github.com/nlohmann/json
	GIT_TAG v3.8.0
	)
# Populates the dependency and adds the nlohmann_json::nlohmann_json target (needs CMake 3.14+).
FetchContent_MakeAvailable(nlohmann-json)

Addition in main CMakeLists.txt

# Make it so our own packages are found and also the ones in the sub-module library.
list(APPEND CMAKE_PREFIX_PATH "${CMAKE_CURRENT_LIST_DIR}/cmake")
