This plugin combines each input row with rows loaded from a table-like file, matching them on a common field.
- Plugin type: filter
- base_column: a column of the data Embulk loaded (hash, required)
  - name: name of the column
  - type: type of the column (see below)
  - format: format of the timestamp if type is timestamp
- counter_column: a column of the data loaded from the file (hash, default: {name: id, type: long})
  - name: name of the column
  - type: type of the column (see below)
  - format: format of the timestamp if type is timestamp
- joined_column_prefix: prefix added to joined data columns (string, default: "_joined_by_embulk_")
- file_path: path of the file (string, required)
- file_format: file format (string, required, supported: csv, tsv, yaml, json)
- columns: required columns of data from the file (array of hash, required)
  - name: name of the column
  - type: type of the column (see below)
  - format: format of the timestamp if type is timestamp
type of the column
name | description |
---|---|
boolean | true or false |
long | 64-bit signed integers |
timestamp | Date and time with nanosecond precision |
double | 64-bit floating point numbers |
string | Strings |
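
To make the type table concrete, here is a minimal Python sketch of how a raw text value could be converted for each type name. It is an illustration only, not the plugin's implementation: the function name `coerce` is hypothetical, and the timestamp branch assumes Python `strptime` directives, which may not match the format syntax the plugin accepts.

```python
from datetime import datetime


def coerce(value, column_type, timestamp_format=None):
    """Convert a raw string to a Python analogue of the named column type.

    Hypothetical helper for illustration; not part of the plugin.
    """
    if column_type == "boolean":
        return value.strip().lower() == "true"
    if column_type == "long":
        return int(value)
    if column_type == "double":
        return float(value)
    if column_type == "string":
        return value
    if column_type == "timestamp":
        # Python datetimes keep microseconds only, so nanosecond
        # digits cannot be represented in this sketch.
        return datetime.strptime(value, timestamp_format)
    raise ValueError(f"unsupported column type: {column_type}")


print(coerce("5", "long"))
print(coerce("2015-10-01", "timestamp", "%Y-%m-%d"))
```

An unknown type name raises `ValueError` rather than falling back to a string, so a typo in a column definition would surface early.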
filters:
  - type: join_file
    base_column: {name: name_id, type: long}
    counter_column: {name: id, type: long}
    joined_column_prefix: _joined_by_embulk_
    file_path: master.json
    file_format: json
    columns:
      - {name: id, type: long}
      - {name: name, type: string}
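
The effect of the configuration above can be sketched in Python as a lookup join. This is a rough model, not the plugin's code: `join_rows` is a hypothetical name, the input rows are invented for the example, and two behaviours are assumptions that the text does not confirm, namely that unmatched rows receive empty joined columns and that the counter column itself is also emitted with the prefix.

```python
def join_rows(rows, master, base_column, counter_column, prefix):
    """Attach master-file columns to each row whose base_column value
    equals a master record's counter_column value."""
    # Index the master records by the counter column for O(1) lookups.
    index = {record[counter_column]: record for record in master}
    joined_names = list(master[0].keys()) if master else []

    out = []
    for row in rows:
        match = index.get(row[base_column])
        merged = dict(row)
        for name in joined_names:
            # Assumption: rows without a match get None in joined columns.
            merged[prefix + name] = match[name] if match is not None else None
        out.append(merged)
    return out


master = [
    {"id": 0, "name": "civitaspo"},
    {"id": 2, "name": "mori.ogai"},
    {"id": 5, "name": "natsume.soseki"},
]
rows = [{"name_id": 2}, {"name_id": 9}]  # hypothetical input rows

result = join_rows(rows, master, "name_id", "id", "_joined_by_embulk_")
for row in result:
    print(row)
```

Under these assumptions the first row gains `_joined_by_embulk_id: 2` and `_joined_by_embulk_name: mori.ogai`, while the second row, which has no match, gets `None` in both joined columns.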
$ ./gradlew classpath
$ embulk run -I lib example/config.yml
- csv (not implemented)
- tsv (not implemented)
- yaml (not implemented)
- json
id,name
0,civitaspo
2,mori.ogai
5,natsume.soseki
Because a literal tab is hard to show, the following example writes each tab as \t.
id\tname
0\tcivitaspo
2\tmori.ogai
5\tnatsume.soseki
- id: 0
  name: civitaspo
- id: 2
  name: mori.ogai
- id: 5
  name: natsume.soseki
[
  {
    "id": 0,
    "name": "civitaspo"
  },
  {
    "id": 2,
    "name": "mori.ogai"
  },
  {
    "id": 5,
    "name": "natsume.soseki"
  }
]
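
A JSON file of this shape could be turned into a lookup table keyed by the counter column roughly as follows. This is a sketch under assumptions, not the plugin's loader: `load_master` is a hypothetical name, and dropping fields that are not listed in `columns` is assumed behaviour.

```python
import json


def load_master(json_text, counter_column, columns):
    """Parse a JSON array of records and index it by counter_column,
    keeping only the fields named in `columns`."""
    names = [column["name"] for column in columns]
    index = {}
    for record in json.loads(json_text):
        # Assumption: fields not declared in `columns` are ignored.
        index[record[counter_column]] = {n: record.get(n) for n in names}
    return index


sample = """
[
  {"id": 0, "name": "civitaspo", "extra": 1},
  {"id": 5, "name": "natsume.soseki"}
]
"""
columns = [{"name": "id", "type": "long"}, {"name": "name", "type": "string"}]
index = load_master(sample, "id", columns)
print(index[5]["name"])
```

If two records share the same key, the later one replaces the earlier one in this sketch; the text does not say how the plugin treats duplicate keys.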
$ ./gradlew gem # -t to watch for file changes and rebuild continuously