
Releases: datahub-project/datahub

DataHub v0.8.2

18 Jun 13:42
550a9de

Release Notes

Bug-fix release that addresses installation, upgrade, and usability issues in v0.8.1, specifically around product analytics.
Read the release notes for v0.8.1 here.
The full list of improvements follows.

Changelog

DataHub v0.8.1

04 Jun 20:16
a483933

Release Notes

  • Bug fix release that fixes installation and upgrade issues with v0.8.0.
  • Read the release notes for v0.8.0 here.

Changelog

DataHub v0.8.0

03 Jun 20:38
97e9660

Notable Highlights

  • Product Analytics: Understand how your users are interacting with DataHub
  • Product Improvements: Auto-complete across types, Task list view under Pipelines
  • Features: Business Glossary (incubating)
  • Integration improvements
    • Looker, dbt, Hive, Redshift, Glue, MongoDB
    • Kafka Connect (incubating)

and finally,

NoCodeMetadata

This release introduces a major refactor that permits extension of DataHub’s metadata model without writing any imperative code.

Highlights:

  • Removed strongly-typed, entity-specific DAOs. Added more generic services.
  • Introduced Elastic settings & mappings generation, dynamic index registration & evolution
  • Decoupled persistence layer from Pegasus + Java by removing fully-qualified class names (aspects, relationships)
  • Introduced declarative, annotation-based mechanisms for defining indexed fields, foreign key fields, entities & aspects
  • Added an in-place upgrade CLI (datahub-upgrade) to help existing deployments adopt this change

For more information, see

The PR: #2629
Technical Overview
The DataHub Metadata Model
Extending the Metadata Model
No Code Upgrade Guide

Changelog


DataHub v0.7.1

23 Apr 07:48
ae4def2

Notable Highlights

  • Lineage Visualization
  • Pipelines and Tasks, Flows and Jobs
  • Airflow Lineage
  • Editable Field Descriptions
  • Nested Schema Viz
  • Search Improvements
  • datahub CLI
  • Official PyPI packages
  • Production-quality Helm scripts
  • New Integrations
    • Officially-supported Sources: Airflow, AWS Glue, dbt, Druid, Superset, MongoDB, Oracle

Changelog


DataHub v0.7.0

19 Mar 03:02
5e91014

Notable Highlights

  • New React Application re-written from the ground up
  • Support for GraphQL
  • New Metadata Ingestion Framework (Python)
    • Officially-supported Sources: Kafka, MySQL, SQL Server, Hive, Postgres, Snowflake, BigQuery, AWS Athena, Druid, LDAP
  • New Homepage and Hosted Docs redesign at datahubproject.io
  • Product Features: SSO (OIDC), Tags, Themes, Dashboards
  • Metadata Backend Implementations: MLModel ecosystem, DataFlow ecosystem
  • Move to Elasticsearch 7. Migration guide from 5.x here

Changelog


DataHub v0.6.1

03 Dec 17:31
5f9d967

Added

Changed

DataHub v0.6.0

29 Oct 18:00
0c92a8e

Added

Changed

Deleted

DataHub v0.5.0

07 Oct 19:13
273844c

Added

  • #1775 feat(dashboard): Dashboard metadata models @ksahin
  • #1818 doc(rfc): Add requirements / non requirements section to RFC. @jplaisted
  • #1805 Start adding java ETL examples, starting with kafka etl. @jplaisted
  • #1812 feat(ML models): RFC for ML models @jywadhwani
  • #1721 feat: add ML models @arunvasudevan
  • #1859 feat(platform): add "postgres" as a supported data platform @mars-lan
  • #1844 feat(frontend): Module consolidation for some test modules and reduces errors from unsupported API calls @catran
  • #1837 feat: add MCE ingestion support for CorpGroup @mars-lan
  • #1821 feat(frontend): Module consolidation - clean up for OS logic - init virtual assistant @catran

Changed

Removed

DataHub v0.5.0-beta

27 Aug 22:42
7299fd5
Pre-release

Changed

  • #1806 Updated the frontend code. The frontend code was very far (> 6 months) behind the internal frontend code. We're not caught up yet, hence the BETA release, but we did go pretty far. Major refactorings were included.

Added

  • #1810 Added open source NOTICE.
  • #1817 Added backend and mid-tier support for /dataPlatforms as an endpoint

DataHub v0.4.3

20 Aug 16:34
236d5e6

Added

  • #1782 improve security of k8s / helm charts
  • #1791 Add description of dataset to the search index
  • #1803 Add an example crawler for MS SQL
  • #1811 Sync our internal backend code externally to HEAD (we're caught up now!)
    • Added ESBulkWriterDAO to bulk write to Elasticsearch. Planned usage is for integration tests.
    • Add Strongly Consistent Secondary Index (SCSI) Implementation for MySQL.
    • Start adding code to generate aspect-entity specific metadata events, rather than our current single event approach.
    • Add support in the GMS for requesting no aspects on entities by setting the aspectNames param to null (omitting the param is still treated as a request for all aspects). Useful when checking whether an entity exists without incurring a large response (e.g. performing a search to get only URNs back, and nothing else).
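
The difference between an explicit null and an omitted aspectNames param can be illustrated with a minimal Python sketch. This is not GMS code: the function `get_entity`, the `ENTITY_ASPECTS` store, and the example URN are all hypothetical stand-ins, and only the null-versus-omitted behavior is taken from the note above.

```python
from typing import Dict, Optional

# Sentinel marking "parameter omitted", as distinct from an explicit null (None).
_OMITTED = object()

# Hypothetical in-memory stand-in for stored entity aspects (not real GMS storage).
ENTITY_ASPECTS: Dict[str, Dict[str, dict]] = {
    "urn:li:dataset:example": {
        "ownership": {"owners": ["urn:li:corpuser:jdoe"]},
        "status": {"removed": False},
    }
}


def get_entity(urn: str, aspect_names=_OMITTED) -> Optional[dict]:
    """Return the entity with the requested aspects, or None if it does not exist."""
    aspects = ENTITY_ASPECTS.get(urn)
    if aspects is None:
        return None
    if aspect_names is _OMITTED:
        # Param omitted: treated as a request for all aspects.
        selected = dict(aspects)
    elif aspect_names is None:
        # Param explicitly null: return no aspects, only the URN.
        selected = {}
    else:
        # Explicit list: return only the named aspects that exist.
        selected = {name: aspects[name] for name in aspect_names if name in aspects}
    return {"urn": urn, "aspects": selected}
```

Under these assumptions, `get_entity(urn, None)` serves as a lightweight existence check, while `get_entity(urn)` returns every stored aspect.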

Changed

  • #1777 Add docker files for development

Removed

  • #1748 Remove unused model
  • #1788 Remove unused model
  • #1789 Remove unused model

Fixed

  • #1808 Clear dataset description from search index when cleared in source