Merge pull request #9 from pinterest/update_readme
Update readme
jeffxiang authored Aug 23, 2024
2 parents a5a14dc + 64e00ca commit 6935a54
# Pinterest Tiered Storage for Apache Kafka®
Pinterest Tiered Storage for [Apache Kafka®](https://kafka.apache.org/) is a broker-independent framework that allows brokers
to offload finalized log segments to a remote storage system.
This allows Apache Kafka® to maintain a smaller disk footprint and reduce the need for expensive storage on the brokers.
The framework also provides a consumer client that can consume from both the broker and the remote storage system.

Pinterest's implementation of Tiered Storage for Apache Kafka® provides a ***broker-independent*** approach to Tiered Storage.
***See the differences between [Pinterest vs. Apache Kafka® Tiered Storage](#pinterest-vs-apache-kafka-tiered-storage)***.

It consists of two main components:
1. [Uploader](ts-segment-uploader): A continuous process that runs on each Apache Kafka® broker and uploads finalized log segments to a remote storage system (e.g. Amazon S3, with unique prefix per cluster and topic).
2. [Consumer](ts-consumer): A consumer client capable of consuming from both Tiered Storage log segments and the Apache Kafka® cluster.

A third module [ts-common](ts-common) contains common classes and interfaces that are used by the `ts-consumer` and `ts-segment-uploader` modules, such as Metrics, StorageEndpointProvider, etc.

Feel free to read into each module's README for more details.

# Why Tiered Storage?
[Apache Kafka®](https://kafka.apache.org/) is a distributed event streaming platform that stores partitioned and replicated log segments on disk for
a configurable retention period. However, as data volume and/or retention periods grow, the disk footprint of Apache Kafka® clusters can become expensive.
Tiered Storage allows brokers to offload finalized log segments to a more cost-effective remote storage system, reducing the need for expensive storage on the brokers.

With Tiered Storage, you can:
1. Maintain a smaller overall broker footprint, reducing operational costs
2. Retain data for longer periods of time while avoiding horizontal and vertical scaling of Apache Kafka® clusters
3. Reduce CPU, network, and disk I/O utilization on brokers by reading directly from remote storage

## Pinterest vs. Apache Kafka® Tiered Storage
### Apache Kafka® Tiered Storage
[KIP-405](https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage)
provides a native, open-source Tiered Storage offering for Apache Kafka® and is available starting from Apache Kafka® 3.6.0.
The native Tiered Storage implementation is broker-dependent, meaning that the broker process itself is responsible
for offloading finalized log segments to remote storage, and the ***broker is always in the critical path of consumption***.

### Pinterest Tiered Storage: Skip the broker
***Pinterest's implementation of Tiered Storage is broker-independent***, meaning that the Tiered Storage process runs as a separate process alongside the Apache Kafka® server process,
***and the broker is not always in the critical path of consumption***.
This allows for more flexibility in adopting Tiered Storage, and better accommodates unpredictable consumption patterns.
Some of the key advantages of a broker-independent approach are:

1. **You don't need to upgrade brokers**: While the native offering requires upgrading brokers to a version that supports Tiered Storage, a broker-independent approach does not.
2. **You can skip the broker entirely during consumption**: When in `TIERED_STORAGE_ONLY` mode, the consumption loop does not touch the broker itself, so unpredictable
spikes in consumption can be absorbed without affecting the broker. See [MemQ](https://github.com/pinterest/memq) for a PubSub system that uses this approach natively.
3. **Support consumer backfills and replays without affecting broker CPU**: When the broker is out of the critical path of consumption,
consumer backfills and replays can be done without needing to keep additional CPU buffer on the brokers just to support those surges.
4. **Avoid cross-AZ transfer costs**: While the native approach adds a cross-AZ network cost factor for consumers that are not AZ-aware,
this broker-independent approach avoids that cost for all consumers when reading directly from remote storage.
5. **Faster adoption, iteration, and improvements**: A broker-independent Tiered Storage solution lets you adopt and upgrade Tiered Storage without
waiting for Apache Kafka® upgrades. Improvements, bug fixes, and new features are released independently of Apache Kafka® releases.

# Highlights
- **Broker Independent**: The tiered storage solution is designed to be broker-independent. [Here's why we think it's better](#pinterest-vs-apache-kafka-tiered-storage).
- **Skip the broker entirely during consumption**: The consumer can read from both broker and Tiered Storage backend filesystem. When in `TIERED_STORAGE_ONLY` mode, the consumption loop does not touch the broker itself, allowing for reduction in broker resource utilization.
- **Pluggable Storage Backends**: The framework is designed to be backend-agnostic.
- **S3 Partitioning**: Prefix-entropy (salting) is configurable out-of-the-box to allow for prefix-partitioned S3 buckets, allowing for better scalability by avoiding request rate hotspots.
- **Fault Tolerant**: Broker restarts, replacements, leadership changes, and other common Apache Kafka® operations / issues are handled gracefully.
- **Metrics**: Comprehensive metrics are provided out-of-the-box for monitoring and alerting purposes.

# Quick Start
Detailed quickstart instructions are available [here](docs/quickstart.md).

# Usage
Using Pinterest Tiered Storage for Apache Kafka® consists of the following high-level steps:
1. Have a remote storage system ready to accept reads and writes of log segments (e.g. Amazon S3 bucket)
2. Configure and start [ts-segment-uploader](ts-segment-uploader) on each Apache Kafka® broker
3. Use [ts-consumer](ts-consumer) to read from either the broker or the remote storage system
4. Monitor and manage the tiered storage system using the provided metrics and tools

![Architecture](docs/images/architecture.png)

# Current Status
**Pinterest Tiered Storage for Apache Kafka® is currently under active development and the APIs may change over time.**

It currently supports the following remote storage systems:
- Amazon S3

Some planned features and improvements:

- KRaft support
- More storage system support (e.g. HDFS)
- Integration with [PubSub Client](https://github.com/pinterest/psc) (backend-agnostic client library)

Contributions are always welcome!

# Ecosystem
Check out some of the other Pinterest projects designed to make PubSub more automated, efficient, and reliable:
- [PubSub Client](https://github.com/pinterest/psc): A backend-agnostic client library for PubSub systems
- [MemQ](https://github.com/pinterest/memq): An efficient, scalable cloud native PubSub system
- [Orion](https://github.com/pinterest/orion): A generalized and pluggable management and automation platform for stateful distributed systems, such as Apache Kafka® and MemQ

# Maintainers
- Vahid Hashemian
- Jeff Xiang

# License
Pinterest Tiered Storage for Apache Kafka® is distributed under Apache License, Version 2.0.

# Trademark
Apache®, Apache Kafka, and Kafka are trademarks of the Apache Software Foundation.
# Tiered Storage Segment Uploader

## Overview
This module contains the uploader code that is used to upload Apache Kafka® log segments to the backing tiered storage filesystem.
It is designed to be a long-running and independent process that runs on each Apache Kafka® broker in order to upload
finalized (closed) log segments to a remote storage system. These log segments can later be read by a
[TieredStorageConsumer](../ts-consumer) even if the log segments have already been deleted from local storage on the broker, provided
that the retention period of the segments on the remote storage system is longer than that of the Apache Kafka® topic itself.

## Architecture
![Uploader Architecture](../docs/images/uploader.png)

The uploader process runs alongside the Apache Kafka® server process on every broker, and is responsible for uploading log segments
to the configured remote storage system.

The key components to the uploader are:
1. [KafkaSegmentUploader](src/main/java/com/pinterest/kafka/tieredstorage/uploader/KafkaSegmentUploader.java): The entrypoint class
2. [DirectoryTreeWatcher](src/main/java/com/pinterest/kafka/tieredstorage/uploader/DirectoryTreeWatcher.java): Watches the Apache Kafka® log directory for log rotations (closing and opening of new log segments)
3. [KafkaLeadershipWatcher](src/main/java/com/pinterest/kafka/tieredstorage/uploader/KafkaLeadershipWatcher.java): Monitors leadership changes in the Apache Kafka® cluster and updates the watched filepaths as necessary
4. [FileUploader](src/main/java/com/pinterest/kafka/tieredstorage/uploader/S3FileUploader.java): Uploads the log segments to the remote storage system
5. [KafkaEnvironmentProvider](src/main/java/com/pinterest/kafka/tieredstorage/uploader/KafkaEnvironmentProvider.java): Provides information about the Apache Kafka® environment for the uploader

If a broker is currently a leader for a topic-partition, the uploader will watch the log directory for that topic-partition.
When a previously-active log segment is closed, the uploader will upload the closed segment to the remote storage system. Typically,
a log segment is closed when it reaches a time or size-based threshold, configurable via Apache Kafka® broker properties.
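
For context, the time and size thresholds mentioned above are standard Apache Kafka® broker settings. A minimal example (the values shown are the common defaults, for illustration only, not recommendations):

```properties
# Roll a new segment once the active one reaches 1 GiB...
log.segment.bytes=1073741824
# ...or once it is 7 days old, whichever comes first.
log.roll.ms=604800000
```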

Each upload will consist of 3 parts: the segment file, the index file, and the time index file. Once these files are successfully
uploaded, an `offset.wm` file will also be uploaded for that topic partition which contains the offset of the last uploaded log segment.
This is used to resume uploads from the last uploaded offset in case of a failure.
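
That checkpoint-and-resume flow can be sketched as follows. This is a simplified, hypothetical illustration of the idea, not the module's actual code: the file layout and names like `segmentsToUpload` are assumptions. The uploader persists the offset of the last uploaded segment and, on restart, skips every segment whose base offset is at or below that checkpoint.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only; NOT the ts-segment-uploader implementation.
public class ResumeSketch {
    // Read the last uploaded offset from a watermark file, or -1 if none exists yet.
    static long lastUploadedOffset(Path offsetWm) throws IOException {
        if (!Files.exists(offsetWm)) {
            return -1L;
        }
        return Long.parseLong(Files.readString(offsetWm).trim());
    }

    // Kafka names segment files by their base offset, e.g. 00000000000000001234.log.
    static long baseOffset(String segmentFileName) {
        return Long.parseLong(segmentFileName.substring(0, segmentFileName.indexOf('.')));
    }

    // Keep only closed segments whose base offset is past the checkpoint.
    static List<String> segmentsToUpload(List<String> closedSegments, long checkpoint) {
        return closedSegments.stream()
                .filter(name -> baseOffset(name) > checkpoint)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> pending = segmentsToUpload(
                List.of("00000000000000000000.log",
                        "00000000000000001234.log",
                        "00000000000000005678.log"),
                1234L);
        // Only segments strictly after the checkpoint remain pending.
        System.out.println(pending);
    }
}
```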
## Usage
The segment uploader entrypoint class is `KafkaSegmentUploader`. At a minimum, running the segment uploader requires:
1. **KafkaEnvironmentProvider**: This system property should be provided upon running the segment uploader (i.e. `-DkafkaEnvironmentProvider`). It should
provide the FQDN of the class that provides the Apache Kafka® environment, which should be an implementation of [KafkaEnvironmentProvider](src/main/java/com/pinterest/kafka/tieredstorage/uploader/KafkaEnvironmentProvider.java).
2. **Uploader Configurations**: The directory in which uploader configurations live should be provided as the first argument to the segment uploader main class.
It should point to a directory which contains a `.properties` file that contains the configurations for the segment uploader. The file that
the uploader chooses to use in this directory is determined by the Apache Kafka® cluster ID provided by the `clusterId()` method
of the `KafkaEnvironmentProvider` implementation. Therefore, the properties file should be named `<clusterId>.properties`.
3. **Storage Endpoint Configurations**: The segment uploader requires a [StorageServiceEndpointProvider](../ts-common/src/main/java/com/pinterest/kafka/tieredstorage/common/discovery/StorageServiceEndpointProvider.java)
to be specified in the properties file. This provider's implementation dictates where each topic's log segments should be uploaded to.
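
The configuration lookup in step 2 can be sketched as follows. This is a hypothetical illustration: the single-method `EnvironmentProvider` interface shown here is a stand-in for the real `KafkaEnvironmentProvider` (which may declare additional methods), and only the `<clusterId>.properties` naming convention from the text is modeled.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch of how the uploader picks its properties file:
// the file name is derived from the environment provider's cluster ID.
public class ConfigResolutionSketch {
    // Stand-in for the class named via -DkafkaEnvironmentProvider;
    // only clusterId() is shown here.
    interface EnvironmentProvider {
        String clusterId();
    }

    static Path resolveProperties(Path configDirectory, EnvironmentProvider provider) {
        // The uploader reads <configDirectory>/<clusterId>.properties.
        return configDirectory.resolve(provider.clusterId() + ".properties");
    }

    public static void main(String[] args) {
        EnvironmentProvider provider = () -> "my_cluster";
        System.out.println(resolveProperties(Paths.get("/etc/ts-uploader"), provider));
    }
}
```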
With prefix entropy, one can pre-partition the S3 bucket using the prefix bits to avoid request rate hotspots.
Because the hash is generated via the cluster, topic, and partition ID combination, all log segments (and their associated `.index`, `.timeindex`, and `offset.wm` files)
for a given topic-partition on a given cluster will be uploaded to the same prefix in S3. This way, the [TieredStorageConsumer](../ts-consumer/src/main/java/com/pinterest/kafka/tieredstorage/consumer/TieredStorageConsumer.java)
can deterministically reconstruct the entire key during consumption. Due to this reason, **it is important to ensure that the `ts.segment.uploader.s3.prefix.entropy.bits`
configuration is consistent across all brokers in the Apache Kafka® cluster, and consistent in the consumer's configuration as well**.
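
The mechanism can be illustrated with a small sketch. This is an assumption-laden example, not the uploader's actual code: the real hash function and key layout may differ, and MD5 is used here only for demonstration. The point it shows is that a fixed number of hash bits derived from the cluster/topic/partition combination yields a prefix that spreads load across S3 partitions yet stays deterministic, so a consumer can reconstruct it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: derives a deterministic n-bit prefix from the
// cluster/topic/partition combination. The real hash and key format may differ.
public class PrefixEntropySketch {
    static String prefixBits(String cluster, String topic, int partition, int bits) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(
                    (cluster + "-" + topic + "-" + partition).getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < bits; i++) {
                // Take the i-th bit of the digest, most significant bit first.
                int bit = (digest[i / 8] >> (7 - (i % 8))) & 1;
                sb.append(bit);
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Every segment of the same cluster/topic/partition lands under one prefix,
        // so the consumer can deterministically rebuild the full key.
        String prefix = prefixBits("cluster01", "my_topic", 3, 5);
        System.out.println(prefix + "/cluster01/my_topic-3/00000000000000001234.log");
    }
}
```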

## Configuration
The segment uploader configurations are passed via the aforementioned properties file.
