This document explains the Elucidation Metadata Standard and our own implementation of the standard as a set of libraries and tools. The Elucidation Metadata Standard provides a way to create implementation-independent metadata standards using text, so that you can create arbitrary byte-based metadata and store it using technologies such as databases, without having to write your own libraries, lock yourself in to a particular library, or know schemas in advance. Elucidator itself is a set of libraries and tools intended to provide real-world use of this standard. Other implementors are free to use this standard to implement other tools and libraries as needed.
This standard specifies how to describe byte-based metadata using text, and how to interpret it on reading.
From here on, "The Standard" is an abbreviation of "The Elucidation Metadata Standard".
- Metadata is a collection of bytes that can be interpreted based on some Specification (singular: Metadatum).
- Group is a set of Metadata Specifications, often particular to a project or domain.
- Specification is the association of some Identifier with some set of rules for interpretation.
- Identifier is the string which is associated with some set of rules about how something should be interpreted. Identifiers must be UTF-8 encoded alphanumerical or underscore characters, beginning with an alphabetical character.
- Interpreter is a routine which can convert an individual Metadatum into the correct associated types.
- Metadata Specification is the Specification of a Metadata Designation and its associated, ordered Members.
- Designation is the Identifier associated with a particular class of Metadata.
- Member is a component of an individual Metadatum, which has an associated Identifier, Data Type, and Value. For a Member, the Identifier must be unique within an individual Metadata Specification, but need not be unique across all Metadata Specifications.
- Data Type indicates how a particular Value should be extracted from a collection of bytes. The Standard specifies a discrete set of possible types. This is often abbreviated as Dtype.
- Value is the contents of some Member interpreted through its Data Type.
- Member Specification defines the association of some Identifier with a Data Type for a particular Member.
- Array is an ordered set of values with homogeneous Data Type.
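The Identifier rule above can be checked mechanically. The following is a minimal sketch, not part of Elucidator: the function name `is_valid_identifier` is hypothetical, and it assumes that "alphabetical" and "alphanumerical" include non-ASCII Unicode letters and digits, and that an empty string is not a valid Identifier. The Standard as quoted here does not settle either point.

```python
def is_valid_identifier(text: str) -> bool:
    """Return True if `text` satisfies the Identifier rule (hypothetical helper)."""
    # Assumption: an empty string is not a valid Identifier.
    if not text:
        return False
    # The first character must be alphabetical.
    # Assumption: str.isalpha() (Unicode-aware) is an acceptable reading of the rule.
    if not text[0].isalpha():
        return False
    # Every character must be alphanumerical or an underscore.
    return all(ch.isalnum() or ch == "_" for ch in text)
```

A stricter implementation might restrict Identifiers to ASCII; that would be an equally plausible reading of the rule.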
Group
├── Metadata Specification
│   ├── Designation
│   ├── Member 0
│   │   ├── Identifier
│   │   └── Data Type
│   ├── Member 1
│   │   ├── Identifier
│   │   └── Data Type
│   ├── ...
│   └── Member n
│       ├── Identifier
│       └── Data Type
├── Metadata Specification
│   └── ...
├── ...
└── Metadata Specification
    └── ...
A Metadata Specification consists of a Designation and an ordered set of Members.
The Designation should be unique for a given set of related Metadata.
Implementors are allowed to make any link they please between a Designation and the ordered set of Members; for example, using two columns in a SQL database, one for the Designation and one for the Metadata Specification, which contains a textual representation of the Members.
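As an illustration of the SQL example, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `designation_spec`, the column names, the helper `lookup_members`, and the sample row are all hypothetical; The Standard leaves this linkage entirely to the implementor, and this is not how Elucidator is known to store it.

```python
import sqlite3

# In-memory database for illustration only.
conn = sqlite3.connect(":memory:")

# One column for the Designation, one for the textual Member list.
conn.execute(
    "CREATE TABLE designation_spec ("
    " designation TEXT PRIMARY KEY,"
    " specification TEXT NOT NULL)"
)

# Hypothetical sample row: a 'point' Designation with two f32 Members.
conn.execute(
    "INSERT INTO designation_spec VALUES (?, ?)",
    ("point", "x: f32, y: f32"),
)


def lookup_members(connection, designation):
    """Return the stored Member text for a Designation, or None if absent."""
    row = connection.execute(
        "SELECT specification FROM designation_spec WHERE designation = ?",
        (designation,),
    ).fetchone()
    return None if row is None else row[0]
```

Making the Designation the primary key is one way to enforce its uniqueness within a set of related Metadata.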
In the absence of an implementation-defined linkage, the following grammar should be used to indicate the mapping of designation to ordered member sets:
specification: Designation(member, member, member, ...)(context);
where Designation is the designation for this specification, context is an optional string with additional descriptive information, and each member is specified by the grammar
member: Identifier: Dtype
Compliant implementations may NOT use a context to perform any processing; this field is intended for human readability and information only, much like comments in source code.
Whitespace is ignored except within the context string, as Identifiers and Dtypes are not allowed to contain whitespace.
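To make the grammar concrete, the following sketch parses one specification string into its Designation, ordered Members, and context. It is an illustration under stated assumptions rather than Elucidator's parser: the name `parse_specification` is hypothetical, the trailing semicolon is treated as optional, and a context containing parentheses is not supported. The context is returned untouched and never used for processing.

```python
import re

# Hypothetical pattern: Designation(members)(optional context) optional ';'
SPEC_RE = re.compile(
    r"^\s*(?P<designation>[^\s()]+)\s*"        # Designation
    r"\((?P<members>[^()]*)\)\s*"               # (member, member, ...)
    r"(?:\((?P<context>[^()]*)\))?\s*;?\s*$"    # optional (context), optional ;
)


def parse_specification(text):
    """Split a specification into (designation, [(identifier, dtype)], context)."""
    match = SPEC_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid specification: {text!r}")
    members = []
    body = match.group("members")
    if body.strip():
        for chunk in body.split(","):
            identifier, sep, dtype = chunk.partition(":")
            if not sep:
                raise ValueError(f"member lacks ': Dtype': {chunk!r}")
            # Whitespace is insignificant outside the context string.
            members.append(("".join(identifier.split()), "".join(dtype.split())))
    # The context is passed through verbatim for human readers only.
    return match.group("designation"), members, match.group("context")
```

For example, `parse_specification("point(x: f32, y: f32)(a 2D point);")` yields the Designation `point`, two Members, and the context `a 2D point`. Member order is preserved because it determines the byte layout.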
The following table indicates all allowable data types. Compliant implementations must implement all data types.
Name | String Representation |
---|---|
Byte | u8 |
Unsigned 16-bit integer | u16 |
Unsigned 32-bit integer | u32 |
Unsigned 64-bit integer | u64 |
Signed 8-bit integer | i8 |
Signed 16-bit integer | i16 |
Signed 32-bit integer | i32 |
Signed 64-bit integer | i64 |
IEEE 32-bit floating point | f32 |
IEEE 64-bit floating point | f64 |
String | string |
All Data Types which are not String may be constructed as an Array.
An Array may be of fixed size in the Member Specification, or of dynamic size.
NOTE: signed integers used for dynamic sizing are NOT compliant with The Standard.
Arrays are specified using the following grammar for fixed size:
Dtype[literal]
and the following grammar for dynamic size:
Dtype[]
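The two array forms can be distinguished with a small parser. This is a sketch only: `parse_dtype` and its return convention are hypothetical, and it assumes that `literal` is a non-negative decimal integer, which the grammar above does not state explicitly.

```python
import re

# String representations taken from the table of allowable data types.
SCALAR_DTYPES = {
    "u8", "u16", "u32", "u64",
    "i8", "i16", "i32", "i64",
    "f32", "f64", "string",
}

# Base type, optionally followed by [] or [digits].
DTYPE_RE = re.compile(r"^(?P<base>[a-z0-9]+)(?:\[(?P<size>[0-9]*)\])?$")


def parse_dtype(text):
    """Return (base, shape): shape is None (scalar), an int (fixed), or 'dynamic'."""
    # Whitespace is ignored outside the context string.
    match = DTYPE_RE.match("".join(text.split()))
    if match is None or match.group("base") not in SCALAR_DTYPES:
        raise ValueError(f"unknown Dtype: {text!r}")
    base, size = match.group("base"), match.group("size")
    if size is None:
        return base, None
    if base == "string":
        # Only Data Types other than String may be constructed as an Array.
        raise ValueError("String may not be constructed as an Array")
    if size == "":
        return base, "dynamic"
    # Assumption: the literal is a non-negative decimal integer.
    return base, int(size)
```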
For all types, little endian byte ordering is required.
The String type consists of one unsigned 64-bit integer, followed by that number of bytes to represent the string.
NOTE: The String type is NOT nul-terminated.
For fixed arrays, the underlying data type is repeated for the size of the array with no padding.
For dynamic arrays, like Strings, the array begins with one unsigned 64-bit integer, followed by that number of elements of the designated type in byte representation.
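The byte layouts described above can be sketched with Python's standard `struct` module, where the `<` prefix selects little-endian ordering with no padding. The function names are hypothetical and are not Elucidator's API. The text encoding of String contents is assumed to be UTF-8, which The Standard as quoted here does not specify.

```python
import struct

# struct format codes for each non-String Data Type.
FORMATS = {
    "u8": "B", "u16": "H", "u32": "I", "u64": "Q",
    "i8": "b", "i16": "h", "i32": "i", "i64": "q",
    "f32": "f", "f64": "d",
}


def encode_string(text):
    """u64 byte count, then the raw bytes; no nul terminator."""
    payload = text.encode("utf-8")  # Assumption: UTF-8 contents.
    return struct.pack("<Q", len(payload)) + payload


def encode_fixed_array(dtype, values):
    """Elements back to back, little-endian, no length prefix, no padding."""
    return struct.pack(f"<{len(values)}{FORMATS[dtype]}", *values)


def encode_dynamic_array(dtype, values):
    """u64 element count, then the elements as in a fixed array."""
    return struct.pack("<Q", len(values)) + encode_fixed_array(dtype, values)


def decode_dynamic_array(dtype, buffer, offset=0):
    """Return (values, next_offset) for a dynamic array starting at `offset`."""
    (count,) = struct.unpack_from("<Q", buffer, offset)
    offset += 8  # size of the u64 element count
    fmt = f"<{count}{FORMATS[dtype]}"
    values = list(struct.unpack_from(fmt, buffer, offset))
    return values, offset + struct.calcsize(fmt)
```

Note that the String prefix counts bytes while the dynamic array prefix counts elements, so a `u16[]` holding two values has a prefix of 2 followed by four bytes of data.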
Elucidator contains the following components:
- A Rust-based library which implements manipulations of metadata based on The Standard
- A Rust-based library which adds database storage of metadata with spatiotemporal bounding boxes associated with each metadatum
- Python bindings for the Rust libraries
- C bindings for the Rust libraries
- A small set of utility tools