This sample project illustrates different ways to integrate Kafka with Spring for processing a streams workload. It covers the following aspects:
- Producing messages to Kafka using `StreamBridge`
- Handling streams of Kafka messages using Spring Cloud Stream with the functional programming model and the Kafka Streams API (see the sketch after this list)
- Consuming streams of Kafka messages using consumer functions
- Maintaining application state using Kafka state stores (via `KTable`)
- Illustrating error handling for deserialization errors
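To give an idea of what the functional style looks like, here is a minimal sketch of such a Kafka Streams processor that aggregates per-probe telemetry into a `KTable`-backed state store. The record and store names are illustrative assumptions, not necessarily the sample's actual types, and serde configuration is omitted:

```java
import java.util.function.Function;

import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TelemetryAggregator {

    // Record names are assumptions for illustration only
    record Telemetry(String probeId, double distanceFeet, double speedMph) {}
    record AggregatedTelemetry(String probeId, double totalDistanceFeet, double maxSpeedMph) {}

    @Bean
    public Function<KStream<String, Telemetry>, KStream<String, AggregatedTelemetry>> aggregateTelemetry() {
        return input -> {
            // Aggregate per probe key into a KTable backed by a Kafka state store
            KTable<String, AggregatedTelemetry> aggregated = input
                    .groupByKey()
                    .aggregate(
                            () -> new AggregatedTelemetry(null, 0.0, 0.0),
                            (probeId, telemetry, agg) -> new AggregatedTelemetry(
                                    probeId,
                                    agg.totalDistanceFeet() + telemetry.distanceFeet(),
                                    Math.max(agg.maxSpeedMph(), telemetry.speedMph())),
                            Materialized.as("telemetry-aggregates"));
            // Emit every update downstream so receivers get frequent aggregate refreshes
            return aggregated.toStream();
        };
    }
}
```

For deserialization errors, the Kafka Streams binder offers the `spring.cloud.stream.kafka.streams.binder.deserializationExceptionHandler` property (e.g. `logAndContinue`); check the sample's configuration for the approach it actually uses.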
To simulate a data processing pipeline, 3 separate services produce, process, and consume messages. The interactions between the services can be observed using AKHQ and the application logs (see below for instructions on how to run it).
We are helping a space agency set up their telemetry data receivers, which means keeping track of the telemetry data we receive from the various space probes that are roaming the solar system (and beyond!). The space agency's requirements are:
- We need frequent updates on the aggregated telemetry data per probe, i.e. the total distance traveled by a given probe and the maximum speed it has reached on its journey so far.
- Every probe, regardless of whether it's a NASA or an ESA probe, sends its telemetry data in imperial units. However, we need to make sure that the data is converted if needed, based on the receiver's requirements:
- NASA probe data should be made available in the imperial system (i.e. speed in miles/hour, distances in feet)
- ESA probe data should be made available in the metric system (i.e. speed in kilometres/hour, distances in metres)
For the ESA probes, the received data thus needs to be converted into the metric system. As we learned from the Mars Climate Orbiter failure, this is pretty important to keep our probes from going up in a flaming fireball.
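The conversion itself boils down to two constant factors. A minimal sketch (class and method names are ours, not necessarily the sample's):

```java
public final class ImperialToMetric {

    // Exact conversion factors by international definition
    private static final double KM_PER_MILE = 1.609344;
    private static final double METRES_PER_FOOT = 0.3048;

    private ImperialToMetric() {}

    public static double milesPerHourToKilometresPerHour(double mph) {
        return mph * KM_PER_MILE;
    }

    public static double feetToMetres(double feet) {
        return feet * METRES_PER_FOOT;
    }
}
```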
Our setup consists of 5 components:
- Sample producer that simulates imperial and metric telemetry data that comes in from a number of different space probes
- Aggregator component that brings the data into the desired aggregated format
- Sample consumer that ingests the (aggregated) imperial telemetry data and converts it into metric units if necessary (a sketch follows after this list)
- Kafka cluster for data transfer
- AKHQ web UI for observing the Kafka cluster
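On the consuming side, Spring Cloud Stream's functional model only needs a `Consumer` bean bound to the aggregate topic. A minimal sketch under the same naming assumptions as above, reusing the conversion helper sketched earlier (how probes are attributed to agencies is a hypothetical detail here):

```java
import java.util.function.Consumer;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TelemetryConsumer {

    private static final Logger log = LoggerFactory.getLogger(TelemetryConsumer.class);

    // Payload type mirrors the aggregator sketch above; it is an assumption
    record AggregatedTelemetry(String probeId, double totalDistanceFeet, double maxSpeedMph) {}

    @Bean
    public Consumer<AggregatedTelemetry> ingestTelemetry() {
        return telemetry -> {
            // ESA probes want metric units; NASA probes keep the imperial values as-is
            if (isEsaProbe(telemetry.probeId())) {
                log.info("Probe {}: {} m travelled, max speed {} km/h",
                        telemetry.probeId(),
                        ImperialToMetric.feetToMetres(telemetry.totalDistanceFeet()),
                        ImperialToMetric.milesPerHourToKilometresPerHour(telemetry.maxSpeedMph()));
            } else {
                log.info("Probe {}: {} ft travelled, max speed {} mph",
                        telemetry.probeId(), telemetry.totalDistanceFeet(), telemetry.maxSpeedMph());
            }
        };
    }

    // Hypothetical helper: the sample's data model decides how probes map to agencies
    private boolean isEsaProbe(String probeId) {
        return probeId.startsWith("esa-");
    }
}
```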
In order to build & run this sample, you need Gradle and Docker Desktop. You can run it by executing the following commands:
- Start up the Kafka cluster and AKHQ: `docker compose --project-directory ./docker up`
- Once started, you can explore the Kafka topics and messages with AKHQ: http://localhost:9080
- Start up the respective services:
  - Sample producer: `gradle -p ./kafka-samples-producer bootRun`
  - Sample consumer: `gradle -p ./kafka-samples-consumer bootRun`
  - Streams processor: `gradle -p ./kafka-samples-streams bootRun`
The sample producer will emit a sample telemetry record for one of 10 different probes every second (starting 5 s after Spring context startup), so you should see the first aggregated telemetry records come in after a few seconds.
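For reference, producing on a schedule with `StreamBridge` can look like the following sketch. The binding name, payload, and random value ranges are assumptions; only the timings mirror the description above:

```java
import java.util.concurrent.ThreadLocalRandom;

import org.springframework.cloud.stream.function.StreamBridge;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@EnableScheduling
public class TelemetryProducer {

    // The binding name is an assumption; it has to match the producer binding in the configuration
    private static final String BINDING = "telemetry-out-0";

    private final StreamBridge streamBridge;

    public TelemetryProducer(StreamBridge streamBridge) {
        this.streamBridge = streamBridge;
    }

    record Telemetry(String probeId, double distanceFeet, double speedMph) {}

    // Emit one record per second, starting 5 s after context startup (as described above)
    @Scheduled(initialDelay = 5000, fixedRate = 1000)
    public void emitTelemetry() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        Telemetry telemetry = new Telemetry(
                "probe-" + random.nextInt(10),
                random.nextDouble(0, 5_000),    // feet travelled since the last reading
                random.nextDouble(0, 40_000));  // current speed in miles/hour
        streamBridge.send(BINDING, telemetry);
    }
}
```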
If you want to have a look at the Streams topology used in this sample, you'll find a visualization in the `doc` subdirectory. You can generate your own via Spring Boot Actuator and the excellent Kafka Streams Topology Visualizer.
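If the Streams processor exposes the Kafka Streams binder's topology actuator endpoint (an assumption; the endpoint id, its exposure, and the port depend on the configuration and binder version), the topology description can be fetched with `curl http://localhost:8080/actuator/kafkastreamstopology` and pasted into the visualizer.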
Unless you're running this example for the first time, your Docker containers might be struggling with a previous inconsistent state. Giving yourself a fresh start by running `docker compose down --remove-orphans -v` inside the `docker` directory should do the trick.
Have fun!