
Kafka Connect Connector for S3

Background on Writing Connectors

This article has a lot of useful context on how to write and deploy your own connector.

jcustenborder's GitHub repo has more examples of custom connectors.

Building the RescueTime Custom Connector

This fork of Confluent's Kafka Connector project exists so that we can drop in our own custom Single Message Transform (SMT) that trims strings to lengths that can fit into Redshift's 65535-byte VARCHAR max.
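For reference, here is a minimal sketch of what such an SMT can look like. This is an illustration only, not the fork's actual implementation: the package, class name, and exact trimming rules are assumptions.

    // Hypothetical sketch of a string-trimming SMT; the names and rules here
    // are assumptions, not the actual transform shipped in this fork.
    package com.rescuetime.kafka.connect.transforms;

    import java.nio.charset.StandardCharsets;
    import java.util.Map;
    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.connect.connector.ConnectRecord;
    import org.apache.kafka.connect.data.Field;
    import org.apache.kafka.connect.data.Struct;
    import org.apache.kafka.connect.transforms.Transformation;

    public class TrimStrings<R extends ConnectRecord<R>> implements Transformation<R> {

      // Redshift stores at most 65535 bytes in a VARCHAR column.
      private static final int MAX_BYTES = 65535;

      @Override
      public R apply(R record) {
        if (!(record.value() instanceof Struct)) {
          return record; // pass non-struct values through untouched
        }
        Struct value = (Struct) record.value();
        Struct trimmed = new Struct(value.schema());
        for (Field field : value.schema().fields()) {
          Object v = value.get(field);
          if (v instanceof String) {
            v = truncateUtf8((String) v, MAX_BYTES);
          }
          trimmed.put(field, v);
        }
        return record.newRecord(record.topic(), record.kafkaPartition(),
            record.keySchema(), record.key(), value.schema(), trimmed,
            record.timestamp());
      }

      // Cut the string so its UTF-8 encoding fits in maxBytes, backing up so
      // we never split a multi-byte character at the boundary.
      private static String truncateUtf8(String s, int maxBytes) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        if (bytes.length <= maxBytes) {
          return s;
        }
        int end = maxBytes;
        while (end > 0 && (bytes[end] & 0xC0) == 0x80) {
          end--; // skip UTF-8 continuation bytes (10xxxxxx)
        }
        return new String(bytes, 0, end, StandardCharsets.UTF_8);
      }

      @Override
      public ConfigDef config() {
        return new ConfigDef();
      }

      @Override
      public void configure(Map<String, ?> configs) {}

      @Override
      public void close() {}
    }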

To build it you need a boatload of dependencies that are laid out, not quite correctly, at https://github.com/confluentinc/kafka-connect-storage-common/wiki/FAQ -- here are the steps that I followed successfully:

  1. git clone git@github.com:confluentinc/kafka.git

  2. cd kafka

  3. Make sure this line is in build.gradle at the top:

    apply plugin: 'maven-publish'

  4. And make sure buildscript has mavenLocal():

        buildscript {
          repositories {
            mavenLocal()
            ...
          }
        }
  5. gradle (running plain gradle once bootstraps the Gradle wrapper used in the next steps)

  6. ./gradlew installAll

  7. ./gradlew build publishToMavenLocal -x test

  8. cd ..

  9. git clone git@github.com:confluentinc/common.git

  10. cd common

  11. mvn install -Dmaven.test.skip=true

  12. cd ..

  13. git clone git@github.com:confluentinc/rest-utils.git

  14. cd rest-utils

  15. mvn install -Dmaven.test.skip=true

  16. cd ..

  17. git clone git@github.com:confluentinc/schema-registry.git

  18. cd schema-registry

  19. mvn install -Dmaven.test.skip=true

  20. cd ..

  21. git clone git@github.com:confluentinc/kafka-connect-storage-common.git

  22. cd kafka-connect-storage-common

  23. mvn install -Dmaven.test.skip=true

  24. cd ..

  25. git clone git@github.com:RescueTime/kafka-connect-storage-cloud.git (this repository)

  26. cd kafka-connect-storage-cloud

  27. mvn install -Dmaven.test.skip=true

  28. cp ./kafka-connect-s3/target/kafka-connect-s3-5.4.0-SNAPSHOT.jar ~/confluent-5.3.0/share/java/kafka-connect-s3/

  29. ~/confluent-5.3.0/bin/confluent local stop (if already running)

  30. ~/confluent-5.3.0/bin/confluent local start

The last step assumes you have the developer version of Confluent installed in your home directory. If not, get it here: https://docs.confluent.io/current/quickstart/index.html

Once ready to ship the jar, put it in the rt-playbooks project, which is how it gets on the Kafka brokers:

  1. cp ./kafka-connect-s3/target/kafka-connect-s3-5.4.0-SNAPSHOT.jar ~/dev/rt-playbooks/kafka/roles/confluent.kafka_connect/files/

Then run the appropriate Ansible playbook to deploy Kafka.
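For example (the playbook and inventory names below are hypothetical; use whatever rt-playbooks actually defines):

    cd ~/dev/rt-playbooks/kafka
    ansible-playbook -i <inventory> site.yml   # playbook name is hypothetical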

Confluent's README Follows

kafka-connect-storage-cloud is the repository for Confluent's Kafka Connect connectors that copy data from Kafka into Amazon S3.
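As a quick orientation, a minimal sink configuration might look like the sketch below; the connector name, bucket, region, topic, and flush size are placeholder values.

    # Minimal S3 sink config sketch; all values below are placeholders.
    name=s3-sink
    connector.class=io.confluent.connect.s3.S3SinkConnector
    tasks.max=1
    topics=my-topic
    s3.bucket.name=my-bucket
    s3.region=us-east-1
    storage.class=io.confluent.connect.storage.s3.S3Storage
    format.class=io.confluent.connect.s3.format.json.JsonFormat
    flush.size=1000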

Kafka Connect Sink Connector for Amazon Simple Storage Service (S3)

Documentation for this connector can be found here.

A blog post for this connector can be found here.

Development

To build a development version you'll need a recent version of Kafka as well as a set of upstream Confluent projects, which you'll have to build from their appropriate snapshot branch. See the kafka-connect-storage-common FAQ for guidance on this process.

You can build kafka-connect-storage-cloud with Maven using the standard lifecycle phases.
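For example:

    # Build and run the tests:
    mvn clean package

    # Or skip the tests, as in the fork-specific steps above:
    mvn install -Dmaven.test.skip=true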

Contribute

  • Source Code: https://github.com/confluentinc/kafka-connect-storage-cloud
  • Issue Tracker: https://github.com/confluentinc/kafka-connect-storage-cloud/issues

License

This project is licensed under the Confluent Community License.

