Formats are organized in modules. Each module is associated with a category or, most of the time, a product vendor.
A module is made up of:
- _meta, a meta-information directory
- as many sub-directories as formats included in the module
The _meta directory holds meta-information about the module. It hosts two documents:
- A logo that illustrates the module. This logo should help identify the category or the vendor. It must be a PNG image with a transparent background, named logo.png, and must be lighter than 50 KB.
- A manifest.yml document. This document must contain a UUID that identifies the module, the name of the module, a short, URL-friendly name of the module, and a short description of the module. Here is an example from the AWS module manifest.
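As a rough illustration of the four pieces of information a module manifest carries (this is not the AWS manifest itself), the sketch below builds one in Python. The key names uuid, name, slug and description are assumptions made for the example, as is the choice of a random (version 4) UUID; check an existing manifest.yml for the exact keys.

```python
import json
import uuid


def build_manifest(name: str, slug: str, description: str) -> dict:
    """Collect the four pieces of information a module manifest must carry."""
    return {
        "uuid": str(uuid.uuid4()),  # assumed key name; random (version 4) UUID
        "name": name,
        "slug": slug,  # short, URL-friendly name (assumed key name)
        "description": description,
    }


def to_yaml(manifest: dict) -> str:
    """Render a flat mapping of strings as YAML, one key per line."""
    # JSON double-quoted strings are also valid YAML 1.2 scalars, so json.dumps
    # gives safe quoting without a third-party YAML library.
    lines = [f"{key}: {json.dumps(value)}" for key, value in manifest.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_yaml(build_manifest("My module", "my-module", "My first module")), end="")
```

In practice generate.py fills in these values from its prompts, so a hand-written helper like this is only useful for understanding what the manifest contains.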
A format is associated with a software technology. In general, a format corresponds to a vendor product.
Each format shares the same tree structure:
- _meta, a meta-information directory
- ingest/parser.yml, the parser of the format
- a set of test files
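The tree structure above can be checked with a few lines of Python. This is a minimal sketch, not part of the repository's tooling: the function name missing_entries is hypothetical, and the test files are assumed to live in a directory named tests inside the format.

```python
from pathlib import Path

# Entries every format directory is expected to contain.
# "tests" is the directory assumed to hold the test files.
REQUIRED_ENTRIES = ("_meta", "ingest/parser.yml", "tests")


def missing_entries(format_dir: Path) -> list[str]:
    """Return the required entries that are absent from a format directory."""
    return [entry for entry in REQUIRED_ENTRIES if not (format_dir / entry).exists()]


if __name__ == "__main__":
    import sys

    # Usage: python check_format.py path/to/module/format
    for entry in missing_entries(Path(sys.argv[1])):
        print(f"missing: {entry}")
```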
Like the module's _meta directory, this directory holds meta-information about the format. It consists of four files:
- A logo that identifies the product represented by the format. It must be a PNG image with a transparent background, named logo.png, and must be lighter than 50 KB.
- A manifest.yml document. This document must contain a UUID that identifies the format, the name of the format, a short, URL-friendly name, a short description of the software technology, and a list of data sources. Here is an example from the Trend Micro Deep Security manifest.
- A taxonomy file that describes the fields used in the parser. For inspiration, see the Windows taxonomy.
- Some smart descriptions (e.g. the Windows smart descriptions).
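The logo constraints can be partly checked before committing. The sketch below is an illustration under assumptions: the size limit is read as 50,000 bytes, the function name logo_problems is hypothetical, and the transparent background is not verified (that would need an image library).

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file
MAX_LOGO_BYTES = 50 * 1000  # assumption: the size limit read as 50,000 bytes


def logo_problems(logo: Path) -> list[str]:
    """Return a list of rule violations for a logo file (empty if none found)."""
    problems = []
    if logo.name != "logo.png":
        problems.append("file must be named logo.png")
    data = logo.read_bytes()
    if not data.startswith(PNG_SIGNATURE):
        problems.append("file is not a PNG image")
    if len(data) >= MAX_LOGO_BYTES:
        problems.append("file must be lighter than 50 KB")
    return problems
```

The same check applies to a module logo and to a format logo, since both follow the same rules.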
The parser transforms a raw event into a structured event.
It is a vocabulary that describes how information is extracted from the raw event. It consists of a pipeline, which sets up the sequence of data extraction, and stages, which define how the structured event is built.
The parser is written in a YAML dialect.
Refer to parser to understand the vocabulary.
To validate the parser, a set of test files should be hosted in the tests directory of the format.
See testing to understand test files and how to validate the parser with them.
To generate a new module or a new format, you can use utils/generate.py to guide you.
Go to the utils directory and execute generate.py with the new-module command. Fill in the prompts with the requested information.
$ cd utils
$ poetry install
$ poetry run generate.py new-module
module_name [SEKOIAIO]: My module
module_description [The description of the module]: My first module
module_dir [My module]:
$
Go to the utils directory and execute generate.py with the new-format command, passing the path to your module as argument. Fill in the prompts with the requested information.
$ cd utils
$ poetry run generate.py new-format ../My\ module
intake_name [The name of the intake]: My Format
intake_slug [my-format]:
intake_description [The description of the intake]: My First Format
$
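In the session above, the default slug my-format is derived from the name My Format. One plausible way to obtain such a URL-friendly name is sketched below; this is an assumption for illustration, and generate.py may derive its default differently.

```python
import re


def slugify(name: str) -> str:
    """Lower-case the name and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


if __name__ == "__main__":
    print(slugify("My Format"))
```

Whatever slug is chosen, keeping it lowercase and free of spaces makes it safe to use in URLs and directory names.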