weight | title | description |
---|---|---|
70 | Data Storage | StreamingFast Firehose data storage |
Data and the locations where it is stored are important facets of Firehose deployment and operation.
Key Firehose data storage topics include data stores, serialization, merged blocks files (100-blocks files), and one-block files.
Firehose Stores are abstractions sitting on top of Object Storage.
{% hint style="info" %} Note: Object Storage is a data storage technique that manages data as objects, as opposed to other data storage architectures such as hierarchical file systems. {% endhint %}
Stores utilize the Firehose dstore abstraction library to provide support for local file systems, Azure, Google Cloud, Amazon S3, and other Amazon S3 API compatible object storage solutions such as MinIO or Ceph.
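To make the idea of a store abstraction concrete, the following is a minimal Python sketch. It is not the dstore API: the `Store` interface, the `LocalStore` and `MemoryStore` classes, and all method names are hypothetical and exist only to show how calling code can read and write named objects without knowing which backend holds them.

```python
import os
import tempfile
from typing import Dict, Protocol


class Store(Protocol):
    """Hypothetical minimal store interface (illustrative, not the dstore API)."""

    def write_object(self, name: str, data: bytes) -> None: ...
    def read_object(self, name: str) -> bytes: ...
    def exists(self, name: str) -> bool: ...


class LocalStore:
    """Backend that keeps each object as a file under a base directory."""

    def __init__(self, base_dir: str) -> None:
        self.base_dir = base_dir
        os.makedirs(base_dir, exist_ok=True)

    def _path(self, name: str) -> str:
        return os.path.join(self.base_dir, name)

    def write_object(self, name: str, data: bytes) -> None:
        with open(self._path(name), "wb") as f:
            f.write(data)

    def read_object(self, name: str) -> bytes:
        with open(self._path(name), "rb") as f:
            return f.read()

    def exists(self, name: str) -> bool:
        return os.path.isfile(self._path(name))


class MemoryStore:
    """Backend that keeps objects in a dict, standing in for a remote bucket."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def write_object(self, name: str, data: bytes) -> None:
        self._objects[name] = data

    def read_object(self, name: str) -> bytes:
        return self._objects[name]

    def exists(self, name: str) -> bool:
        return name in self._objects


def copy_object(src: Store, dst: Store, name: str) -> None:
    """Copy one object between any two backends through the shared interface."""
    dst.write_object(name, src.read_object(name))


if __name__ == "__main__":
    local = LocalStore(tempfile.mkdtemp())
    remote = MemoryStore()
    local.write_object("0000000100.bin", b"example payload")
    copy_object(local, remote, "0000000100.bin")
    print(remote.exists("0000000100.bin"))
```

Because `copy_object` depends only on the interface, swapping the in-memory backend for one that talks to a real object storage service would not change the calling code.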
For production deployments outside of cloud providers, StreamingFast recommends Ceph as the distributed storage instead of its compatible Amazon S3 API system.
Firehose primarily utilizes Protocol Buffers version 3 for serialization.
Merged blocks files are also referred to as 100-blocks files and merged bundles. These terms are used interchangeably within Firehose.
Merged blocks are binary files that use the dbin packing format to store a series of bstream block objects, serialized as protocol buffers.
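The general idea of packing a series of serialized messages into one binary file can be sketched with a simple length-prefixed framing. This is an illustration only: the `MAGIC` marker, the 4-byte big-endian length prefix, and the function names below are assumptions made for the example and do not describe the actual dbin layout. The byte strings stand in for serialized protocol buffer messages.

```python
import io
import struct
from typing import BinaryIO, Iterator, List

MAGIC = b"demo"  # hypothetical file marker, not the real dbin header


def write_messages(out: BinaryIO, messages: List[bytes]) -> None:
    """Write a marker, then each message preceded by its length."""
    out.write(MAGIC)
    for payload in messages:
        out.write(struct.pack(">I", len(payload)))  # 4-byte big-endian length
        out.write(payload)


def read_messages(src: BinaryIO) -> Iterator[bytes]:
    """Yield messages back in the order they were written."""
    if src.read(len(MAGIC)) != MAGIC:
        raise ValueError("unexpected file marker")
    while True:
        header = src.read(4)
        if not header:
            return  # clean end of file
        if len(header) != 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack(">I", header)
        payload = src.read(size)
        if len(payload) != size:
            raise ValueError("truncated payload")
        yield payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_messages(buf, [b"block-1", b"block-2"])
    buf.seek(0)
    for message in read_messages(buf):
        print(len(message))
```

Length prefixes let a reader skip from one message to the next without parsing the payload, which is why this style of framing suits files that hold many independently serialized objects.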
Firehose-enabled node components create merged blocks when they have been set with a special flag to work in catch-up mode.
In high-availability Firehose configurations, merged blocks are created by the Merger component. The Firehose-enabled node component provides the Merger component with one-block files.
The Merger component collates the one-block files into a single bundle of blocks.
Up to one hundred blocks can be contained within a single 100-blocks file.
The 100-blocks files can include multiple versions of a given block number, such as fork blocks, while ensuring continuity through the previous block link.
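How several versions of one block number can coexist while the chain stays traversable can be sketched as follows. The `BlockRef` type and its `number`, `block_id`, and `previous_id` fields are hypothetical stand-ins for block metadata, and `canonical_segment` is a helper invented for this example. It walks backwards from a chosen head through the previous block links, which selects one version per height and leaves fork blocks aside.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class BlockRef:
    """Hypothetical block metadata used only for this illustration."""
    number: int
    block_id: str
    previous_id: str


def canonical_segment(blocks: List[BlockRef], head_id: str) -> List[BlockRef]:
    """Follow previous-block links from head_id, returning oldest block first."""
    by_id: Dict[str, BlockRef] = {b.block_id: b for b in blocks}
    seen: Set[str] = set()
    chain: List[BlockRef] = []
    current = by_id.get(head_id)
    # Stop when the parent is not in this set of blocks (or on a repeated ID).
    while current is not None and current.block_id not in seen:
        seen.add(current.block_id)
        chain.append(current)
        current = by_id.get(current.previous_id)
    chain.reverse()
    return chain


if __name__ == "__main__":
    blocks = [
        BlockRef(100, "100a", "99a"),
        BlockRef(101, "101a", "100a"),
        BlockRef(101, "101b", "100a"),  # fork: second version of block 101
        BlockRef(102, "102a", "101a"),
    ]
    for ref in canonical_segment(blocks, "102a"):
        print(ref.number, ref.block_id)
```

In this example both `101a` and `101b` are kept in the input, but only the one reachable from the chosen head through the previous block links appears in the result.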
Nearly all components in Firehose rely on or utilize 100-blocks files. For example, the bstream library consumes 100-blocks files.
Protocol-specific decoded block objects, such as Ethereum blocks, are what circulate amongst all processes that work with executed block data in Firehose.
In high-availability configurations, one-block files are transient and ensure the Merger component gathers all visible forks from any Firehose-enabled node components.
{% hint style="warning" %}
Important: One-block files contain only one bstream.Block as a serialized protocol buffer.
{% endhint %}
One-block files are consumed by the Merger component and bundled into executed 100-blocks files. The resulting 100-blocks files are then stored to dstore storage and consumed by most of the other Firehose processes.
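The bundling step described above can be sketched as a simple grouping operation. This is a simplified illustration, not the Merger implementation: the `OneBlock` type and the `bundle_base` and `merge_one_blocks` helpers are invented for the example, as is the assumption that a bundle covers the block-number range starting at a multiple of 100. Duplicate one-block entries, as could arrive from several node components, are dropped by block ID.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set

BUNDLE_SIZE = 100  # assumed block-number range covered by one bundle


@dataclass(frozen=True)
class OneBlock:
    """Hypothetical in-memory view of a one-block file."""
    number: int
    block_id: str
    payload: bytes  # stands in for the serialized block


def bundle_base(number: int) -> int:
    """Return the first block number of the range that contains `number`."""
    return number - (number % BUNDLE_SIZE)


def merge_one_blocks(one_blocks: Iterable[OneBlock]) -> Dict[int, List[OneBlock]]:
    """Group one-block entries by range, dropping duplicates by block ID."""
    seen: Set[str] = set()
    bundles: Dict[int, List[OneBlock]] = defaultdict(list)
    for blk in one_blocks:
        if blk.block_id in seen:
            continue  # same block already received from another source
        seen.add(blk.block_id)
        bundles[bundle_base(blk.number)].append(blk)
    for base in bundles:
        bundles[base].sort(key=lambda b: (b.number, b.block_id))
    return dict(bundles)


if __name__ == "__main__":
    incoming = [
        OneBlock(199, "199a", b"..."),
        OneBlock(200, "200a", b"..."),
        OneBlock(199, "199a", b"..."),  # duplicate from a second node
        OneBlock(199, "199b", b"..."),  # fork at the same height
    ]
    for base, blocks in sorted(merge_one_blocks(incoming).items()):
        print(base, [b.block_id for b in blocks])
```

Grouping by block ID rather than block number is what allows two versions of the same height to end up in the same bundle.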