DataHub uses Kafka as the pub-sub message queue in its backend. There are two Kafka topics used by DataHub: MetadataChangeEvent and MetadataAuditEvent.
- MetadataChangeEvent: This message is emitted by any data platform or crawler when there is a change in the metadata (a producer-side sketch follows this list).
- MetadataAuditEvent: This message is emitted by DataHub GMS to notify that a metadata change has been registered.
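To illustrate the producer side of this flow, the sketch below publishes a message to the MetadataChangeEvent topic with the plain Kafka producer API. In DataHub the payload is typically an Avro-encoded MetadataChangeEvent managed via a schema registry; the JSON-style string, broker address, and record key here are placeholders, not the real wire format.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class MceProducerSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");              // assumed broker address
    props.put("key.serializer", StringSerializer.class.getName());
    props.put("value.serializer", StringSerializer.class.getName());

    // In DataHub the value is an Avro-encoded MetadataChangeEvent; a plain
    // string stands in for that payload in this sketch.
    String placeholderPayload = "{\"proposedSnapshot\": {...}}";

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      producer.send(new ProducerRecord<>("MetadataChangeEvent",
          "urn:li:dataset:example",                                // assumed record key
          placeholderPayload));
      producer.flush();
    }
  }
}
```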
To consume from these two topics, DataHub runs two Kafka Streams jobs (a minimal consumer sketch follows the list):
- MCE Consumer Job: Writes to DataHub GMS
- MAE Consumer Job: Writes to Elasticsearch & Neo4j
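For orientation, here is a minimal Kafka Streams topology in the spirit of the MAE Consumer Job. It only subscribes to the MetadataAuditEvent topic and logs each record; the actual job deserializes Avro events and updates the Elasticsearch index and the Neo4j graph. The application id, broker address, and string serdes are assumptions for this sketch.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class MaeConsumerSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "mae-consumer-sketch");  // assumed app id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");    // assumed broker
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

    StreamsBuilder builder = new StreamsBuilder();
    // Read MetadataAuditEvents keyed by entity URN.
    KStream<String, String> maeStream = builder.stream("MetadataAuditEvent");
    maeStream.foreach((urn, event) -> {
      // Placeholder for the real indexing logic (search index + graph update).
      System.out.printf("Would index %s into Elasticsearch and Neo4j%n", urn);
    });

    KafkaStreams streams = new KafkaStreams(builder.build(), props);
    streams.start();
  }
}
```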