The nested JSON parser extension is used during Druid indexing to allow rows with nested JSON data to be unpacked and flattened into a single row to be ingested.
You can either use the prebuilt jar located in dist/
or build from source
using the directions below.
Once you have the compiled jar
, copy it to your druid installation and
follow the
including extension
druid documentation. You should end up adding a line similar to this to your
common.runtime.properties
file:
druid.extensions.loadList=["druid-nested-json-parser"]
Clone the druid repo and add this line to pom.xml
in the "Community extensions"
section:
<module>${druid-nested-json-parser-src-root}/nested-json-parser</module>
replacing ${druid-nested-json-parser-src-root}
with your path to this
repo.
Then, inside the druid repo, run:
mvn package -DskipTests=true -rf :druid-nested-json-parser
This will build the nested JSON parser extension and place it in
${druid-nested-json-parser-src-root}/nested-json-parser/target
The nested JSON parser uses a pivotSpec
to determine how to unpack nested JSON
values into a flat row.
XXXXXXXXXXX
{
...
"spec": {
"dataSchema": {
"dataSource": "MyDatasource",
"parser": {
"parseSpec": {
...
"format": "json"
},
"type": "nestedJson",
"pivotSpec": [{
"dimensionFieldName": "field",
"rowFieldName": "data",
"metricFieldName": "val"
}]
},
...
},
...
},
...
}
XXXXXXXXXXXXX