galarragas/event-aggregator


event-aggregator

Tool To Aggregate Events from Kafka Topics into Elasticsearch


Event Aggregator

The event aggregator is a simple infrastructure component designed to consume events written to Kafka in Avro format and write them into Elasticsearch. Event schemas are maintained in a Schema Registry; the only one currently supported is the Confluent Schema Registry.

The tool is essentially an open source rewrite of a similar tool I wrote and used extensively while working on an event-driven microservice system. It proved very useful to me, so I thought it would be good to share it with the community.

Feel free to use it and extend it.

Purpose

One of the major infrastructure requirements when implementing a distributed system, and especially a microservice architecture, is log aggregation. Developers and administrators need a centralised view of how a user request propagates through the system, and need to be able to trace where an error occurs and how it propagates.

An event-based system requires that the main interactions between services happen through events published to an event bus. Different (micro)services can then subscribe to different event topics and react to the events they receive.

In such an architecture it is therefore highly beneficial to be able to browse, search and inspect events in the same place as the service logs. This makes it possible to track the system's overall reaction to every event.

If you think about a REST-based microservice system, event aggregation plays a role similar to having the HTTP traffic in your log aggregator.

Well-designed events include tracing facilities such as correlation IDs, which are used to track the whole flow of events originating from the same source. This ID is then usually used to enrich the application logs (see also Log4J MDC).

Implementing the solutions mentioned above and having both logs and events in your log aggregation facility gives you a very powerful view of your system's behavior.
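As an illustration of the log enrichment idea, here is a minimal Python sketch of what an MDC-style setup could look like in a service that consumes events. It is not part of this tool: the `correlationId` field, the `order-service` logger name and the helper functions are all hypothetical, and the standard `logging` and `contextvars` modules stand in for Log4J MDC.

```python
import contextvars
import io
import logging

# Hypothetical context variable holding the correlation ID of the event in flight.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Copies the current correlation ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True  # never drop the record, only enrich it


def build_logger(stream) -> logging.Logger:
    """Create a logger whose lines carry the correlation ID in brackets."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s [%(correlation_id)s] %(message)s")
    )
    handler.addFilter(CorrelationIdFilter())
    logger = logging.getLogger("order-service")  # hypothetical service name
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_event(logger: logging.Logger, event: dict) -> None:
    """Log while the event's correlation ID is bound to the context."""
    # Assumption: the event carries its correlation ID in a "correlationId" field.
    token = correlation_id.set(event["correlationId"])
    try:
        logger.info("processing %s", event["type"])
    finally:
        correlation_id.reset(token)  # restore the previous value


if __name__ == "__main__":
    out = io.StringIO()
    handle_event(build_logger(out), {"correlationId": "abc-123", "type": "OrderCreated"})
    print(out.getvalue(), end="")
```

With the same ID present in both the aggregated events and the log lines, a single search on that ID in the log aggregator would return the full picture for one request.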

Status

The project is ready to be used, but I still need to complete publishing to Docker Hub. Feel free to start using it and contribute!

Features

  • Simple command line application deployed as a Docker Container
  • Designed to support workload sharing through the deployment of multiple instances
  • Aggregates events from multiple topics
  • Supports writing to Elasticsearch using either the native Elasticsearch transport or REST over HTTP (the latter is required if you are integrating with the AWS Elasticsearch Service)
  • Events are assumed to be written in Avro format; you could extend the tool to support different formats and I will be more than happy to merge your PR
  • Avro schemas are assumed to be managed via Confluent's Schema Registry
  • Connection to Elasticsearch is protected by a circuit breaker (connection to the Schema Registry is not, since the current client doesn't support it)
  • The target Elasticsearch index in which events are stored changes every day
  • Configurable path for event timestamp and ID extraction
  • Container status can be monitored via an HTTP Health Check Endpoint
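The daily index and the configurable timestamp and ID paths can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the tool's actual code: the function names, the dot-separated path syntax, the `prefix-YYYY.MM.DD` index naming and the epoch-millisecond timestamp are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Any


def extract_path(event: dict, path: str) -> Any:
    """Follow a dot-separated path such as 'header.id' through nested dicts."""
    value: Any = event
    for key in path.split("."):
        value = value[key]  # raises KeyError if the configured path is wrong
    return value


def daily_index_name(prefix: str, timestamp_ms: int) -> str:
    """Build an index name that changes once per UTC day."""
    day = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return f"{prefix}-{day:%Y.%m.%d}"


def to_index_request(event: dict, prefix: str, id_path: str, ts_path: str) -> dict:
    """Describe where a decoded event would be stored and under which ID."""
    return {
        "index": daily_index_name(prefix, extract_path(event, ts_path)),
        "id": str(extract_path(event, id_path)),
        "document": event,
    }


if __name__ == "__main__":
    # Hypothetical decoded event; the field names are illustrative only.
    sample = {
        "header": {"id": "evt-42", "timestamp": 1456833600000},
        "payload": {"orderId": 7},
    }
    print(to_index_request(sample, "events", "header.id", "header.timestamp"))
```

Deriving the index name from the event's own timestamp, rather than from the wall clock at write time, would keep late-arriving events in the index for the day they occurred. Whether the tool does this is not stated here.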

Extensions

A series of possible extensions that I might work on, and for which I will be happy to accept PRs:

  • Supporting different event serialization formats (Protobuf, Thrift, ...)
  • Supporting different log aggregation targets (Solr, ...)
  • Supporting different event buses (Kinesis, ...) (this might take more effort given the current implementation)

Technologies

The tool is based on
