Update module-09-6-data-lineage-with-dataproc.md
anagha-google authored Sep 22, 2023
1 parent 7563317 commit f7295e3
Showing 1 changed file with 2 additions and 2 deletions.
@@ -16,9 +16,9 @@ Successful completion of prior lab modules

### Lineage support for Dataproc

-Once you enable the lineage feature in your Dataproc cluster or Dataproc Spark job, the jobs capture lineage events and publish them to the Dataplex Data Lineage API. Dataproc integrates with the Data Lineage API through OpenLineage, using the OpenLineage Spark plugin.<br>
+Dataproc on GCE supports lineage, which can be enabled at the cluster level or at the job level. Once you enable the lineage feature in your Dataproc cluster or Dataproc Spark job, the jobs capture lineage events and publish them to the Dataplex Data Lineage API. Dataproc integrates with the Data Lineage API through OpenLineage, using the OpenLineage Spark plugin.<br>

-Dependencies are: Data Lineage API and Data Catalog API <br>
+API dependencies are: Data Lineage API and Data Catalog API <br>

Required roles are detailed at: https://cloud.google.com/dataproc/docs/guides/lineage#required-roles <br>
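
As a rough illustration of the two enablement levels described above, the following sketch only assembles `gcloud` command lines; it does not run them. The property names (`dataproc:dataproc.lineage.enabled` for the cluster, `spark.extraListeners` set to the OpenLineage listener for a job) and the API service names are assumptions based on the public Dataproc lineage guide and should be checked against the current documentation. The function names, cluster name, project ID, region, and script URI are hypothetical placeholders.

```python
import shlex

# Assumed property values; verify against the Dataproc lineage guide.
LINEAGE_CLUSTER_PROPERTY = "dataproc:dataproc.lineage.enabled=true"
LINEAGE_JOB_PROPERTY = (
    "spark.extraListeners=io.openlineage.spark.agent.OpenLineageSparkListener"
)


def enable_apis_cmd(project_id):
    """Command that enables the two API dependencies (assumed service names)."""
    return [
        "gcloud", "services", "enable",
        "datalineage.googleapis.com",
        "datacatalog.googleapis.com",
        f"--project={project_id}",
    ]


def create_cluster_cmd(cluster, project_id, region):
    """Command that creates a cluster with lineage enabled at the cluster level."""
    return [
        "gcloud", "dataproc", "clusters", "create", cluster,
        f"--project={project_id}",
        f"--region={region}",
        "--scopes=cloud-platform",
        f"--properties={LINEAGE_CLUSTER_PROPERTY}",
    ]


def submit_job_cmd(cluster, region, script_uri):
    """Command that submits a PySpark job with lineage enabled at the job level."""
    return [
        "gcloud", "dataproc", "jobs", "submit", "pyspark", script_uri,
        f"--cluster={cluster}",
        f"--region={region}",
        f"--properties={LINEAGE_JOB_PROPERTY}",
    ]


if __name__ == "__main__":
    # Hypothetical placeholder values; replace with your own before running.
    for cmd in (
        enable_apis_cmd("my-project"),
        create_cluster_cmd("lineage-cluster", "my-project", "us-central1"),
        submit_job_cmd("lineage-cluster", "us-central1", "gs://my-bucket/job.py"),
    ):
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps the cluster-level and job-level switches side by side, so it is easy to see that only the `--properties` value differs between the two approaches.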
