JSON schema importer
Add the ability to import a schema defined in JSON format, either via a file path or directly via a JSON string.
This implementation doesn't migrate or update already defined schema elements, nor does it remove existing ones.
SchemaInitStrategy is designed with migration and more advanced schema management in mind, even though the current JSON schema importer implementation is fairly simple. This work is intended to simplify schema definition for beginners, speed up development of JanusGraph-based prototypes, and simplify testing.

Signed-off-by: Oleksandr Porunov <[email protected]>
porunov committed Oct 28, 2024
1 parent e849077 commit ac7fddb
Showing 66 changed files with 4,270 additions and 17 deletions.
22 changes: 13 additions & 9 deletions docs/changelog.md
@@ -105,15 +105,14 @@ For more information on features and bug fixes in 1.1.0, see the GitHub mileston
Inlining vertex properties into a Composite Index structure can offer significant performance and efficiency benefits.
See [documentation](./schema/index-management/index-performance.md#inlining-vertex-properties-into-a-composite-index) on how to inline vertex properties into a composite index.

**Important Notes on Compatibility**

1. **Backward Incompatibility**
Once a JanusGraph instance adopts this new schema feature, it cannot be rolled back to a prior version of JanusGraph.
The changes in the schema structure are not compatible with earlier versions of the system.

2. **Migration Considerations**
It is critical that users carefully plan their migration to this new version, as there is no automated or manual rollback process
to revert to an older version of JanusGraph once this feature is used.
!!! warning
Important Notes on Compatibility.
1. Backward Incompatibility:
Once a JanusGraph instance adopts this new schema feature, it cannot be rolled back to a prior version of JanusGraph.
The changes in the schema structure are not compatible with earlier versions of the system.
2. Migration Considerations:
It is critical that users carefully plan their migration to this new version, as there is no automated or manual rollback process
to revert to an older version of JanusGraph once this feature is used.

##### BerkeleyJE ability to overwrite arbitrary settings applied at `EnvironmentConfig` creation

@@ -124,6 +123,11 @@ All configurations values should be specified as `String` and be formated the sa
[documentation](https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentConfig.html).
Example: `storage.berkeleyje.ext.je.lock.timeout=5000 ms`

##### JSON schema initializer

For simplicity, JSON schema initialization options have been added to JanusGraph.
See [documentation](./schema/schema-init-strategies.md) to learn more about the JSON schema initialization process.

### Version 1.0.1 (Release Date: ???)

/// tab | Maven
23 changes: 23 additions & 0 deletions docs/configs/janusgraph-cfg.md
@@ -384,6 +384,29 @@ Schema related configuration options
| schema.default | Configures the DefaultSchemaMaker to be used by this graph. Either one of the following shorthands can be used: <br> - `default` (a blueprints compatible schema maker with MULTI edge labels and SINGLE property keys),<br> - `tp3` (same as default, but has LIST property keys),<br> - `none` (automatic schema creation is disabled)<br> - `ignore-prop` (same as none, but simply ignore unknown properties rather than throw exceptions)<br> - or to the full package and classname of a custom/third-party implementing the interface `org.janusgraph.core.schema.DefaultSchemaMaker` | String | default | MASKABLE |
| schema.logging | Controls whether logging is enabled for schema makers. This only takes effect if you set `schema.default` to `default` or `ignore-prop`. For `default` schema maker, warning messages will be logged before schema types are created automatically. For `ignore-prop` schema maker, warning messages will be logged before unknown properties are ignored. | Boolean | false | MASKABLE |

### schema.init
Configuration options for schema initialization on startup.


| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| schema.init.drop-before-startup | Drops the entire schema with graph data before JanusGraph schema initialization. Note that the schema will be dropped regardless of the selected initialization strategy, including when `schema.init.strategy` is set to `none`. | Boolean | false | LOCAL |
| schema.init.strategy | Specifies the strategy for schema initialization before starting JanusGraph. You must provide the full class path of a class that implements the `SchemaInitStrategy` interface and has a parameterless constructor.<br>The following shortcuts are also available:<br>- `none` - Skips schema initialization.<br>- `json` - Schema initialization via provided JSON file or JSON string.<br> | String | none | LOCAL |
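
The `schema.init.strategy` description above (two shortcuts, otherwise a fully qualified class name with a parameterless constructor) can be illustrated with a minimal, self-contained sketch. This is not JanusGraph's actual implementation: the `InitStrategy` interface, its `name()` method, and the two strategy classes are hypothetical stand-ins for `SchemaInitStrategy`, whose real methods are not shown in this diff.

```java
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

public class SchemaInitStrategyResolverSketch {

    /** Hypothetical stand-in for the SchemaInitStrategy interface (assumption). */
    public interface InitStrategy {
        String name();
    }

    /** Hypothetical strategy behind the `none` shortcut. */
    public static class NoneStrategy implements InitStrategy {
        @Override
        public String name() { return "none"; }
    }

    /** Hypothetical strategy behind the `json` shortcut. */
    public static class JsonStrategy implements InitStrategy {
        @Override
        public String name() { return "json"; }
    }

    // Shortcut names taken from the option description.
    private static final Map<String, Supplier<InitStrategy>> SHORTCUTS = Map.of(
        "none", NoneStrategy::new,
        "json", JsonStrategy::new);

    /** Resolves a shortcut first, then falls back to loading the class by name. */
    public static InitStrategy resolve(String configured) {
        Objects.requireNonNull(configured, "configured");
        Supplier<InitStrategy> shortcut = SHORTCUTS.get(configured);
        if (shortcut != null) {
            return shortcut.get();
        }
        try {
            Class<?> clazz = Class.forName(configured);
            if (!InitStrategy.class.isAssignableFrom(clazz)) {
                throw new IllegalArgumentException(configured + " does not implement InitStrategy");
            }
            // Requires an accessible parameterless constructor, as the option description states.
            return (InitStrategy) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate strategy " + configured, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("json").name());
        System.out.println(resolve(NoneStrategy.class.getName()).name());
    }
}
```

The shortcut lookup runs before reflection so that the short names never need to be valid class names.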

### schema.init.json
Options for JSON schema initialization strategy.


| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| schema.init.json.await-index-status-timeout | Timeout, in milliseconds, for awaiting an index status. If the wait times out, an exception is thrown during the schema initialization process. | Long | 180000 | LOCAL |
| schema.init.json.file | File path to the JSON-formatted schema definition. | String | (no default value) | LOCAL |
| schema.init.json.force-close-other-instances | Force closes other JanusGraph instances before schema initialization, regardless of whether they are active. This is a dangerous operation. This option exists to help users who struggle with zombie JanusGraph instances initialize the schema. It is not recommended unless you know what you are doing. Instead of using this parameter, consider the `graph.unique-instance-id` and `graph.replace-instance-if-exists` options to avoid creating zombie instances in the cluster. | Boolean | false | LOCAL |
| schema.init.json.indices-activation | Indices activation type:<br>- `reindex_and_enable_updated_only` - Reindex process will be triggered for any updated index. After this all updated indexes will be enabled.<br>- `reindex_and_enable_non_enabled` - Reindex process will be triggered for any index which is not enabled (including previously created indices). After reindexing all indices will be enabled.<br>- `skip_activation` - Skip reindex process for any updated indexes.<br>- `force_enable_updated_only` - Force enable all updated indexes without running any reindex process (previous data may not be available for such indices).<br>- `force_enable_non_enabled` - Force enable all indexes (including previously created indices) without running any reindex process (previous data may not be available for such indices).<br> | String | reindex_and_enable_non_enabled | LOCAL |
| schema.init.json.skip-elements | Skip creation of VertexLabel, EdgeLabel, and PropertyKey. | Boolean | false | LOCAL |
| schema.init.json.skip-indices | Skip creation of indices. | Boolean | false | LOCAL |
| schema.init.json.string | JSON-formatted schema definition string. This option takes precedence if both `file` and `string` are used. | String | (no default value) | LOCAL |
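
As a rough illustration of how the options in the two tables above fit together, the following sketch assembles them with `java.util.Properties` and applies the stated precedence rule (`string` over `file`). The option names and default values come from the tables; the file path and the `{}` placeholder are hypothetical, and the `schemaSource` helper is an assumption made for illustration rather than JanusGraph code.

```java
import java.util.Optional;
import java.util.Properties;

public class JsonSchemaInitConfigSketch {

    /** Builds an example configuration that selects the JSON initialization strategy. */
    public static Properties exampleConfig() {
        Properties p = new Properties();
        p.setProperty("schema.init.strategy", "json");
        // Hypothetical path; point this at your own schema file.
        p.setProperty("schema.init.json.file", "/path/to/schema.json");
        // The next two values repeat the documented defaults explicitly.
        p.setProperty("schema.init.json.indices-activation", "reindex_and_enable_non_enabled");
        p.setProperty("schema.init.json.await-index-status-timeout", "180000");
        return p;
    }

    /**
     * Reports which option would supply the schema definition.
     * Per the table, `string` takes precedence when both are set.
     */
    public static Optional<String> schemaSource(Properties p) {
        if (p.getProperty("schema.init.json.string") != null) {
            return Optional.of("string");
        }
        if (p.getProperty("schema.init.json.file") != null) {
            return Optional.of("file");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Properties p = exampleConfig();
        System.out.println(schemaSource(p));
        // Placeholder value only; the schema JSON format is not shown in this diff.
        p.setProperty("schema.init.json.string", "{}");
        System.out.println(schemaSource(p));
    }
}
```

In a real deployment these keys would normally live in the graph's properties file alongside the storage settings.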

### storage
Configuration options for the storage backend. Some options are applicable only for certain backends.


1 comment on commit ac7fddb

@github-actions

Benchmark

| Benchmark suite | Current: ac7fddb | Previous: 213b754 | Ratio |
| ---- | ---- | ---- | ---- |
| org.janusgraph.JanusGraphSpeedBenchmark.basicAddAndDelete | 12605.539336231393 ms/op | 12994.438964091325 ms/op | 0.97 |
| org.janusgraph.GraphCentricQueryBenchmark.getVertices | 952.8487462157776 ms/op | 957.3251909284766 ms/op | 1.00 |
| org.janusgraph.MgmtOlapJobBenchmark.runClearIndex | 216.28348984239133 ms/op | 216.45303196086957 ms/op | 1.00 |
| org.janusgraph.MgmtOlapJobBenchmark.runReindex | 346.3720456992949 ms/op | 342.81005004892853 ms/op | 1.01 |
| org.janusgraph.JanusGraphSpeedBenchmark.basicCount | 221.8337648954928 ms/op | 207.33680618088454 ms/op | 1.07 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection | 4831.472849478207 ms/op | 4953.295327365606 ms/op | 0.98 |
| org.janusgraph.CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps | 16032.774814838689 ms/op | 16917.057558105356 ms/op | 0.95 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch | 18475.374324934346 ms/op | 18983.13907385985 ms/op | 0.97 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching | 53956.68135738333 ms/op | 56527.85002600001 ms/op | 0.95 |
| org.janusgraph.CQLMultiQueryDropBenchmark.dropVertices | 1546.4529868128632 ms/op | 1570.8428983417461 ms/op | 0.98 |
| org.janusgraph.CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex | 7900.053390151586 ms/op | 8433.13502817794 ms/op | 0.94 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesWithDoubleUnion | 391.6238220248095 ms/op | 384.2152506805113 ms/op | 1.02 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch | 3929.4785096569667 ms/op | 4227.1771161974975 ms/op | 0.93 |
| org.janusgraph.CQLMultiQueryBenchmark.getNames | 8023.037576744028 ms/op | 8339.221853925019 ms/op | 0.96 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection | 5485.912016191758 ms/op | 5604.356576582386 ms/op | 0.98 |
| org.janusgraph.CQLMultiQueryBenchmark.getLabels | 6961.58933332698 ms/op | 7082.884761983721 ms/op | 0.98 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesFilteredByAndStep | 431.02274935675473 ms/op | 430.31039337061094 ms/op | 1.00 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex | 12075.974151608048 ms/op | 12459.636105572155 ms/op | 0.97 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage | 359.9524059735096 ms/op | 357.5981502840734 ms/op | 1.01 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection | 14387.771568689044 ms/op | 14793.559446997619 ms/op | 0.97 |
| org.janusgraph.CQLMultiQueryBenchmark.getIdToOutVerticesProjection | 251.32587219487405 ms/op | 245.84974412075837 ms/op | 1.02 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch | 13446.031071442445 ms/op | 13806.414282860256 ms/op | 0.97 |
| org.janusgraph.CQLCompositeIndexInlinePropBenchmark.searchVertices | 1501.1486144378007 ms/op | 1511.142514571489 ms/op | 0.99 |
| org.janusgraph.CQLMultiQueryBenchmark.getNeighborNames | 8269.8524756692 ms/op | 8411.967305495045 ms/op | 0.98 |
| org.janusgraph.CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps | 8869.51980242747 ms/op | 9104.974810254043 ms/op | 0.97 |
| org.janusgraph.CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts | 8119.911418966441 ms/op | 8793.398072298722 ms/op | 0.92 |

This comment was automatically generated by workflow using github-action-benchmark.
