Join File filter plugin for Embulk

This plugin joins rows from a file containing table-like data onto the rows loaded by Embulk, based on a common field between them.

Overview

Plugin type: filter

Configuration

base_column: the column in the data loaded by Embulk to join on (hash, required)
- name: name of the column
- type: type of the column (see below)
- format: format of the timestamp if type is timestamp
counter_column: the column in the data loaded from the file to join on (hash, default: {name: id, type: long})
- name: name of the column
- type: type of the column (see below)
- format: format of the timestamp if type is timestamp
joined_column_prefix: prefix added to joined data columns (string, default: "_joined_by_embulk_")
file_path: path of file (string, required)
file_format: format of the file (string, required, one of: csv, tsv, yaml, json; currently only json is implemented)
columns: required columns of data from the file (array of hash, required)
- name: name of the column
- type: type of the column (see below)
- format: format of the timestamp if type is timestamp

type of the column

name	description
boolean	true or false
long	64-bit signed integers
timestamp	Date and time with nanosecond precision
double	64-bit floating point numbers
string	Strings
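
To illustrate how these type names relate to actual values, here is a minimal Python sketch that converts raw strings according to a column's type. It is not the plugin's code: the `convert` helper is hypothetical, the mapping to Python types is an assumption, and the timestamp format is assumed to be a strftime-style string. Python's `datetime` only carries microsecond precision, so the sketch does not reproduce nanosecond precision.

```python
from datetime import datetime


def convert(value, type_name, fmt=None):
    """Convert a raw string to the Python value for the given column type.

    Hypothetical helper for illustration only; not part of the plugin.
    """
    if type_name == "boolean":
        return value.strip().lower() == "true"
    if type_name == "long":
        number = int(value)
        # Enforce the 64-bit signed range described in the table above.
        if not -(2 ** 63) <= number < 2 ** 63:
            raise ValueError(f"{value} does not fit in a 64-bit signed integer")
        return number
    if type_name == "double":
        return float(value)
    if type_name == "timestamp":
        # Assumption: `fmt` is a strftime-style format string.
        return datetime.strptime(value, fmt)
    if type_name == "string":
        return value
    raise ValueError(f"unknown column type: {type_name}")


print(convert("5", "long"))
print(convert("2015-10-01 12:00:00", "timestamp", "%Y-%m-%d %H:%M:%S"))
```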

Example

filters:
  - type: join_file
    base_column: {name: name_id, type: long}
    counter_column: {name: id, type: long}
    joined_column_prefix: _joined_by_embulk_
    file_path: master.json
    file_format: json
    columns:
      - {name: id, type: long}
      - {name: name, type: string}
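
To make the effect of this configuration concrete, here is a minimal Python sketch of the join it describes. It is an illustration, not the plugin's implementation. The function name `join_rows` and the sample input rows are hypothetical, and the handling of rows with no match (joined columns set to null) is an assumption, since this README does not state it.

```python
def join_rows(rows, master, base_column, counter_column, columns,
              prefix="_joined_by_embulk_"):
    """Attach prefixed columns from `master` to every row whose
    base_column value equals a master record's counter_column value."""
    # Index the master records by the counter column for fast lookups.
    lookup = {record[counter_column]: record for record in master}
    joined = []
    for row in rows:
        match = lookup.get(row[base_column])
        out = dict(row)
        for name in columns:
            # Assumption: unmatched rows get None for each joined column.
            out[prefix + name] = match[name] if match is not None else None
        joined.append(out)
    return joined


# Contents of the file, as in the JSON example of this README.
master = [
    {"id": 0, "name": "civitaspo"},
    {"id": 2, "name": "mori.ogai"},
    {"id": 5, "name": "natsume.soseki"},
]
# Hypothetical rows loaded by Embulk.
rows = [{"name_id": 2}, {"name_id": 7}]

result = join_rows(rows, master, "name_id", "id", ["id", "name"])
print(result[0])
# {'name_id': 2, '_joined_by_embulk_id': 2, '_joined_by_embulk_name': 'mori.ogai'}
```

With the default prefix, the joined columns appear as `_joined_by_embulk_id` and `_joined_by_embulk_name`, which keeps them from colliding with columns already present in the input.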

Run Example

$ ./gradlew classpath
$ embulk run -I lib example/config.yml

Supported Data Format

csv ( not implemented )
tsv ( not implemented )
yaml ( not implemented )
json

Supported Data Format Example

CSV

id,name
0,civitaspo
2,mori.ogai
5,natsume.soseki

TSV

Because tab characters are hard to display, each tab is written as \t in the example below.

id\tname
0\tcivitaspo
2\tmori.ogai
5\tnatsume.soseki

YAML

- id: 0
  name: civitaspo
- id: 2
  name: mori.ogai
- id: 5
  name: natsume.soseki

JSON

[
  {
    "id": 0,
    "name": "civitaspo"
  },
  {
    "id": 2,
    "name": "mori.ogai"
  },
  {
    "id": 5,
    "name": "natsume.soseki"
  }
]
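
The JSON layout above, a top-level array of flat objects, can be read into a lookup table keyed by the counter column. The following Python sketch shows one way to do that under stated assumptions: `build_lookup` is a hypothetical helper, the file content is inlined as a string to keep the example self-contained, and rejecting any other top-level shape is this sketch's choice rather than documented plugin behavior.

```python
import json

# Inlined stand-in for the contents of a file such as master.json.
MASTER_JSON = """
[
  {"id": 0, "name": "civitaspo"},
  {"id": 5, "name": "natsume.soseki"}
]
"""


def build_lookup(text, counter_column, columns):
    """Parse JSON text and index its records by the counter column,
    keeping only the requested columns."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array of objects")
    lookup = {}
    for record in records:
        lookup[record[counter_column]] = {name: record[name] for name in columns}
    return lookup


lookup = build_lookup(MASTER_JSON, "id", ["id", "name"])
print(lookup[5]["name"])  # natsume.soseki
```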

Build

$ ./gradlew gem  # -t to watch for file changes and rebuild continuously
