Kafka topic cloner is a CLI that clones the content of a topic into another one.
The cloner supports two different hashers for key/partition assignment:
- Murmur2 (default), which is the standard hasher used in the Java Kafka community, including the Kafka scripts and Kafka Connect
- FNV-1a, which is the standard hasher used in a part of the Go Kafka community (e.g. Sarama producers)
The CLI was written in Go using spf13/cobra.
It is strongly recommended that you use a released version. You can find the released binaries here.
wget https://github.com/ricardo-ch/kafka-topic-cloner/releases/download/v0.5.0/kafka-topic-cloner
This project provides binaries for Linux (386) and, since version 0.5.0, for Windows. If your platform is not supported, you can build the project manually or open an issue asking us to add it to the supported ones. If you used an official release, no extra step is required before you can start cloning!
Note: the project was created using Go 1.10.1, so you will need to have it installed before proceeding with the next steps.
If you want to dig into the code, or build an executable for your own platform, you can download the source code with `go get`:
go get -u github.com/ricardo-ch/kafka-topic-cloner
You then need to download the dependencies of the project. We are using dep as our dependency manager, so you will need to install it (instructions available on their own github repository).
Once `dep` is installed, retrieve the source files of the dependencies:
dep ensure --update
You can then build an executable for your own platform using `go build`. If you want to build for another platform, you will need to set the `GOOS` and `GOARCH` environment variables to match the target system.
A standard use of the Kafka topic cloner would look like this:
kafka-topic-cloner --from-brokers localhost:9092 --from foo --to bar
This is going to consume every event from the `foo` topic and produce them into the `bar` topic.
If you choose the same hasher that was used to populate the source topic, and the source and target topics have the same number of partitions, the cloned topic will be an exact replica of the original one. If some events were mistakenly placed on the wrong partition (e.g. by manually producing them), the cloning will place them back on the right one.
By default, Kafka topic cloner uses Murmur2 as its partitioning hasher. It is the algorithm used by the Kafka scripts, Kafka Connect, and the majority of the Java Kafka libraries (including the Streams API). If you prefer to use FNV-1a instead, which is the hasher implemented in Sarama (Go's most popular Kafka library), you can use the `--hasher` parameter:
kafka-topic-cloner --from-brokers localhost:9092 --from foo --to bar --hasher FNV-1a
If you would like to see another hasher implemented, feel free to open an issue about this!
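To illustrate why the choice of hasher matters, here is a minimal Go sketch of how each hasher could map a key to a partition. It is not the cloner's actual code. `murmur2` is a port of the 32-bit Murmur2 variant used by the Java Kafka client (seed `0x9747b28c`), and `fnv1aPartition` follows the approach Sarama's hash partitioner is understood to take (FNV-1a 32-bit, then an absolute value of the signed remainder). The function names are assumptions made for this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// murmur2 is a port of the 32-bit Murmur2 variant used by the Java Kafka client.
func murmur2(data []byte) uint32 {
	const (
		seed uint32 = 0x9747b28c
		m    uint32 = 0x5bd1e995
		r           = 24
	)
	length := len(data)
	h := seed ^ uint32(length)
	// Mix the input four bytes at a time (little-endian).
	for i := 0; i+4 <= length; i += 4 {
		k := uint32(data[i]) | uint32(data[i+1])<<8 | uint32(data[i+2])<<16 | uint32(data[i+3])<<24
		k *= m
		k ^= k >> r
		k *= m
		h *= m
		h ^= k
	}
	// Mix the remaining one to three bytes, if any.
	tail := length &^ 3
	switch length % 4 {
	case 3:
		h ^= uint32(data[tail+2]) << 16
		fallthrough
	case 2:
		h ^= uint32(data[tail+1]) << 8
		fallthrough
	case 1:
		h ^= uint32(data[tail])
		h *= m
	}
	h ^= h >> 13
	h *= m
	h ^= h >> 15
	return h
}

// murmur2Partition clears the sign bit before taking the remainder,
// as the Java client does.
func murmur2Partition(key []byte, numPartitions int32) int32 {
	return int32(murmur2(key)&0x7fffffff) % numPartitions
}

// fnv1aPartition hashes with FNV-1a (32-bit) and folds a negative
// remainder back into range (assumed to match Sarama's behaviour).
func fnv1aPartition(key []byte, numPartitions int32) int32 {
	h := fnv.New32a()
	h.Write(key) // Write on a hash.Hash never returns an error
	p := int32(h.Sum32()) % numPartitions
	if p < 0 {
		p = -p
	}
	return p
}

func main() {
	for _, key := range []string{"user-1", "user-2", "user-3"} {
		fmt.Printf("%s: murmur2=%d fnv1a=%d\n", key,
			murmur2Partition([]byte(key), 6), fnv1aPartition([]byte(key), 6))
	}
}
```

The two functions generally disagree on the partition for a given key, which is why the target topic only mirrors the source layout when the hasher matches the one used by the original producers.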
You can clone a topic from a Kafka cluster to a different one by specifying the `--to-brokers` parameter:
kafka-topic-cloner --from-brokers localhost:9092 --to-brokers remote-cluster:9092 --from foo --to bar
Technically, a Kafka topic has no definite end, but it is nice to know when the application is done cloning every available event in the source topic. To do so, Kafka topic cloner comes with a built-in timeout that closes the application when no event could be cloned for a certain amount of time. The default timeout delay is 10 seconds. You can override this delay, in milliseconds, by using the `--timeout` parameter:
kafka-topic-cloner --from-brokers localhost:9092 --from foo --to bar --timeout 5000
As of today, there is no way to completely disable the timeout (to be implemented soon).
Loop-cloning, or same-topic cloning, is the action of cloning a topic into itself. Since it creates a continuous flow of new events inside the source topic, the cloning never ends and quickly multiplies the number of events. Because this can be quite dangerous if done unintentionally, it is protected by the `--loop` parameter. When loop-cloning, you should not specify the target topic; the source topic will be used as the target:
kafka-topic-cloner --from-brokers localhost:9092 --from foo --loop
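The safeguard described above could be expressed as a small validation step. The sketch below is an assumption about how such a check might look, not the cloner's actual code: the `cloneOptions` type, the `targetTopic` function and the error messages are all hypothetical, and the real CLI may handle `--to` combined with `--loop` differently.

```go
package main

import (
	"errors"
	"fmt"
)

// cloneOptions mirrors the CLI parameters relevant to loop protection.
// Hypothetical type, for illustration only.
type cloneOptions struct {
	FromBrokers string
	ToBrokers   string
	From, To    string
	Loop        bool
}

// targetTopic resolves the topic to produce into, refusing an accidental
// same-topic clone unless Loop is set.
func targetTopic(o cloneOptions) (string, error) {
	if o.Loop {
		if o.To != "" {
			return "", errors.New("--to must not be set together with --loop")
		}
		return o.From, nil // the source topic doubles as the target
	}
	if o.To == "" {
		return "", errors.New("--to is required")
	}
	// Same topic name on another cluster is a regular cross-cluster clone.
	sameCluster := o.ToBrokers == "" || o.ToBrokers == o.FromBrokers
	if sameCluster && o.To == o.From {
		return "", errors.New("source and target are the same topic; pass --loop to allow this")
	}
	return o.To, nil
}

func main() {
	fmt.Println(targetTopic(cloneOptions{From: "foo", To: "bar"}))
	fmt.Println(targetTopic(cloneOptions{From: "foo", To: "foo"}))
	fmt.Println(targetTopic(cloneOptions{From: "foo", Loop: true}))
}
```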
You can find the complete list of parameters below:
Argument | Shorthand | Description |
---|---|---|
from-brokers | F | Semicolon-separated list of the source Kafka brokers |
from | f | Source topic's name |
to-brokers | T | Semicolon-separated list of the target Kafka brokers; specify only for cross-cluster cloning |
to | t | Destination topic's name |
timeout | o | Consumer timeout in ms (defaults to 10000) |
hasher | p | Name of the hasher to use for partitioning; possible values: murmur2 (default), FNV-1a |
compression | c | Name of the compression codec to use; possible values: none, gzip (default), snappy, lz4 |
loop | L | Allow loop-cloning |
verbose | v | Verbose mode (defaults to false) |
help | h | Displays the CLI's help |
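As an illustration of the semicolon-separated broker format, the following Go sketch splits such a list into individual addresses. The `splitBrokers` helper and the broker host names are assumptions for this example; how the cloner itself parses the value is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBrokers turns "host1:9092;host2:9092" into a slice of addresses,
// trimming spaces and skipping empty entries. Hypothetical helper.
func splitBrokers(list string) []string {
	var brokers []string
	for _, b := range strings.Split(list, ";") {
		if b = strings.TrimSpace(b); b != "" {
			brokers = append(brokers, b)
		}
	}
	return brokers
}

func main() {
	fmt.Println(splitBrokers("kafka-1:9092;kafka-2:9092;kafka-3:9092"))
	// prints: [kafka-1:9092 kafka-2:9092 kafka-3:9092]
}
```

Since `;` is a command separator in most shells, a list with several brokers should be quoted when passed on the command line.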
Kafka topic cloner cannot be considered production-ready since it does not come with any tests at the moment. Use it at your own risk! :)
Contributions are always welcome and appreciated. The maintainers look at new issues and consider every pull request.