This article has a lot of useful context on how to write and deploy your own connector.
jcustenborder's GitHub repo has more examples of custom connectors.
This fork of Confluent's Kafka Connector project exists so that we can drop in our own custom Single Message Transform (SMT) that trims strings to lengths that can fit into Redshift's 65535-byte VARCHAR max.
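For orientation, here is a minimal sketch of what such an SMT can look like. The package, class name, and the decision to handle only plain String record values are illustrative assumptions for this example, not the actual implementation in this fork:

```java
// Hypothetical sketch: class and package names are illustrative, not the
// actual ones used in this fork.
package com.rescuetime.kafka.connect.transforms;

import java.nio.charset.StandardCharsets;
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.transforms.Transformation;

public class TrimStringToRedshiftMax<R extends ConnectRecord<R>> implements Transformation<R> {

    // Redshift VARCHAR columns hold at most 65535 bytes.
    private static final int MAX_BYTES = 65535;

    @Override
    public R apply(R record) {
        Object value = record.value();
        // Simplification: a real transform would also walk Struct/Map
        // fields; this sketch only handles plain String values.
        if (!(value instanceof String)) {
            return record;
        }
        String trimmed = truncateUtf8((String) value, MAX_BYTES);
        return record.newRecord(record.topic(), record.kafkaPartition(),
                record.keySchema(), record.key(),
                record.valueSchema(), trimmed, record.timestamp());
    }

    // Trim by bytes, not characters: a single UTF-8 character can occupy
    // up to 4 bytes, so a string well under 65535 characters can still
    // exceed 65535 bytes.
    static String truncateUtf8(String s, int maxBytes) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        if (bytes.length <= maxBytes) {
            return s;
        }
        int end = maxBytes;
        // Back up over UTF-8 continuation bytes (10xxxxxx) so we never
        // cut a multi-byte character in half.
        while (end > 0 && (bytes[end] & 0xC0) == 0x80) {
            end--;
        }
        return new String(bytes, 0, end, StandardCharsets.UTF_8);
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public void configure(Map<String, ?> configs) {
    }

    @Override
    public void close() {
    }
}
```

The byte-oriented truncation matters because Redshift's VARCHAR limit is counted in bytes, so trimming by character count alone could still overflow the column on multi-byte UTF-8 input.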
To build this project you need a boatload of upstream dependencies. They are laid out, not quite correctly, at https://github.com/confluentinc/kafka-connect-storage-common/wiki/FAQ; here are the steps that I followed successfully:
- git clone [email protected]:confluentinc/kafka.git
- cd kafka
- Make sure this line is in build.gradle at the top: apply plugin: 'maven-publish'
- And make sure buildscript has mavenLocal(): repositories { mavenLocal() ...
- gradle
- ./gradlew installAll
- ./gradlew build publishToMavenLocal -x test
- cd ..
- git clone [email protected]:confluentinc/common.git
- cd common
- mvn install -Dmaven.test.skip=true
- cd ..
- git clone [email protected]:confluentinc/rest-utils.git
- cd rest-utils
- mvn install -Dmaven.test.skip=true
- cd ..
- git clone [email protected]:confluentinc/schema-registry.git
- cd schema-registry
- mvn install -Dmaven.test.skip=true
- cd ..
- git clone [email protected]:confluentinc/kafka-connect-storage-common.git
- cd kafka-connect-storage-common
- mvn install -Dmaven.test.skip=true
- cd ..
- git clone [email protected]:RescueTime/kafka-connect-storage-cloud.git (this project's home)
- cd kafka-connect-storage-cloud
- mvn install -Dmaven.test.skip=true
- cp ./kafka-connect-s3/target/kafka-connect-s3-5.4.0-SNAPSHOT.jar ~/confluent-5.3.0/share/java/kafka-connect-s3/
- ~/confluent-5.3.0/bin/confluent local stop (if already running)
- ~/confluent-5.3.0/bin/confluent local start
The last steps assume you have the developer version of Confluent Platform installed in your home directory. If not, get it here: https://docs.confluent.io/current/quickstart/index.html
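After the local restart, it's worth sanity-checking that Connect actually picked up the rebuilt jar; for example, hitting Connect's REST API on its default port with curl http://localhost:8083/connector-plugins should list the S3 sink connector class from the new build.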
Once you're ready to ship the jar, put it in the rt-playbooks project, which is how it gets onto the Kafka brokers:
- cp ./kafka-connect-s3/target/kafka-connect-s3-5.4.0-SNAPSHOT.jar ~/dev/rt-playbooks/kafka/roles/confluent.kafka_connect/files/
Then run the appropriate Ansible playbook to deploy Kafka.
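For reference, once the jar is deployed, an SMT like this gets enabled through the S3 sink connector's configuration via Kafka Connect's standard transforms chain. The alias and fully qualified class name below are placeholders matching the sketch above, not necessarily the actual names in this fork:

```
transforms=TrimString
transforms.TrimString.type=com.rescuetime.kafka.connect.transforms.TrimStringToRedshiftMax
```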
kafka-connect-storage-cloud is the repository for Confluent's Kafka connectors designed to copy data from Kafka into Amazon S3.
Documentation for this connector can be found here.
Blogpost for this connector can be found here.
To build a development version you'll need a recent version of Kafka as well as a set of upstream Confluent projects, which you'll have to build from their appropriate snapshot branch. See the kafka-connect-storage-common FAQ for guidance on this process.
You can build kafka-connect-storage-cloud with Maven using the standard lifecycle phases.
- Source Code: https://github.com/confluentinc/kafka-connect-storage-cloud
- Issue Tracker: https://github.com/confluentinc/kafka-connect-storage-cloud/issues
This project is licensed under the Confluent Community License.