Releases: datahub-project/datahub
DataHub v0.8.12
Release Highlights
- RBAC Phase 1: Added abilities to control access through policies in the UI and backend
- Dataset page refresh!!! + improved home page, search and browse screens
- Added the ability to monitor DataHub through Prometheus and provided example Grafana dashboards
- GraphQL API browser hosted on /api/graphql endpoint.
- Support for Business Glossary ingestion through yml file
- Support for Azure AD ingestion source
Notable Changes
- Fixed unicode rendering bug introduced in v0.8.11
- Added the ability to search by properties in the customProperties bag: supports case-insensitive matches of the form ‘key=value’
  - For instance, query “encoding=utf-8” will return entities with “encoding”: “utf-8” in the property bag
- Full changelog below
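The customProperties search described above can be pictured with a small sketch. This is not DataHub's search implementation; it only illustrates the stated matching rule. The function names `parse_property_query` and `matches_property_query` are hypothetical, and the sketch assumes that both the key and the value are compared case-insensitively.

```python
from typing import Dict, Tuple


def parse_property_query(query: str) -> Tuple[str, str]:
    """Split a 'key=value' query on the first '=' (hypothetical helper)."""
    key, sep, value = query.partition("=")
    if not sep or not key:
        raise ValueError(f"expected 'key=value', got {query!r}")
    return key.strip(), value.strip()


def matches_property_query(query: str, custom_properties: Dict[str, str]) -> bool:
    """Return True if the property bag holds the queried pair.

    Assumption: key and value are both matched case-insensitively.
    """
    key, value = parse_property_query(query)
    for prop_key, prop_value in custom_properties.items():
        if prop_key.casefold() == key.casefold() and prop_value.casefold() == value.casefold():
            return True
    return False


# Illustrative property bag, modelled on the example in the release note.
entity_props = {"encoding": "utf-8", "owner_team": "Data-Platform"}

print(matches_property_query("encoding=utf-8", entity_props))    # True
print(matches_property_query("ENCODING=UTF-8", entity_props))    # True
print(matches_property_query("encoding=latin-1", entity_props))  # False
```

Splitting on the first '=' only means a value that itself contains '=' is kept intact.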
Changelog
- #3214 @dexter-mh-lee fix(docker): pin setuptools version in docker ingestion build
- #3212 @gabe-lyons fix(metadata-ingestion): fixing lint issues
- #3196 @abdvl fix(react): safely access caught Error properties
- #3195 @dexter-mh-lee feat(perf): Add perf testing and monitoring framework
- #3136 @dexter-mh-lee feat(search): Add searchable annotation to maps
- #3158 @karoliskascenas feat(ingest): optionally ingest deleted looker dashboards
- #3210 @gabe-lyons fix(admin): moving admin links to header
- #3211 @dexter-mh-lee fix(build): specify setuptools version for dev install
- #3208 @dexter-mh-lee fix(search): Move filters to query instead of post query
- #3209 @gabe-lyons fix(react): fix tag schema search on tag profile
- #3190 @jjoyce0510 fix(graphql): fix ml model properties resolver
- #3200 @jjoyce0510 fix(bootstrap): making bootstrap manager run once
- #3197 @jjoyce0510 feat(access control): Adding "authorizedActors" method to AuthorizationManager
- #3201 @EnricoMi ci: upload test reports
- #3199 @jjoyce0510 Fix GraphQL Variables
- #3193 @abdvl refactor(test): remove the datahub-frontend.graphql
- #3198 @dexter-mh-lee fix(platform): fix kafka env name for MCL_timeseries
- #3194 @jjoyce0510 fix(react): fix add links
- #3192 @gabe-lyons fix(react): fixing format of search snippets
- #3191 @jjoyce0510 fix(react): pin the control center menu icon
- #3189 @jjoyce0510 fix(404): Fix 404 Exit Error.
- #3182 @jjoyce0510 feat(access control): Fine-Grained Access Control M1
- #3187 @gabe-lyons fix(react): Fix the fieldPath grouping logic in the front-end
- #3188 @nickwu241 docs: fix "data platforms" link in dbt.md
- #3184 @dexter-mh-lee fix(kafka): Change env variable name for MCL_versioned to be consistent
- #3185 @gabe-lyons fix(react): removing preview artifact from platform logo
- #3183 @chinmay-bhat fix(business_glossary): added init.py
- #3181 @chinmay-bhat refactor(ingest): rename azure source to azure_ad
- #3159 @sgomezvillamor feat(ingest): add optional config for ownership type in ownership transformers
- #3179 @remisalmon fix(dbt): use_identifiers option and avoid duplicate descriptions
- #3164 @shirshanka feat(ingest): Add a business glossary source
- #3178 @gabe-lyons fix(react): show schema-attached description
- #3177 @dexter-mh-lee Revert "fix(search): move filters to query instead of postFilter (#3112)"
- #3173 @dexter-mh-lee fix(docs): Add documentation for AWS MSK
- #3176 @dexter-mh-lee feat(airflow): add example docker setup for airflow
- #3175 @gabe-lyons fix(dataflow): optimize topological sort logic
- #3170 @chinmay-bhat docs(ingestion): updated hive ingestion docs with Databricks recipe
- #3171 @chinmay-bhat fix(doc): add use_odbc to mssql doc example
- #3169 @gabe-lyons feat(react): Dataset page refresh + improved homepage, search and browse screens
- #3168 @gabe-lyons fix(frontend): fix utf8 encoding bug
- #3167 @shirshanka docs: update Aug townhall details and announce Sep townhall
- #3112 @dexter-mh-lee fix(search): move filters to query instead of postFilter
- #3148 @frsann feat(ingest): Minor Kafka Connect source improvements
- #3161 @chinmay-bhat feat(ingest): Adding Azure Source integration to ingest users, groups and group memberships
- #3165 @jjoyce0510 feat(graphql): add GraphQL Explorer (GraphiQL)
DataHub v0.8.11
Release Highlights
- Business Glossary: Phase 1 is feature complete. Full support for UI viewing and API-based edits, no support for UI edits.
- Users and Groups: Just-in-time User and Group provisioning on login (SSO/OIDC), basic Group pages with membership information
- New Integrations: Redash
Notable Changes
- GraphQL and REST APIs are now both served by datahub-metadata-service (the new name for gms). The frontend is now a proxy. Container names are unchanged.
- Kafka source will no longer tokenize on "." in the topic name. This will result in a flat browse experience in UI.
- Airflow lineage emission will only populate specific properties of Tasks and DAGs to limit bloat and avoid leaking environment variables.
- Schema history feature turned off in UI based on feedback from the community. Will re-emerge in a future release!
- Mongodb collections with extremely wide schemas will have schema fields sampled to keep UI responsive.
- Full changelog below.
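The Kafka browse change noted above can be pictured as follows. This is a rough sketch, not the ingestion source's code. Both function names are hypothetical, the topic name is made up, and the real browse paths carry additional prefix segments that are left out here.

```python
from typing import List


def browse_segments_tokenized(topic: str) -> List[str]:
    """Earlier behaviour as described: every '.'-separated part becomes a browse level."""
    return topic.split(".")


def browse_segments_flat(topic: str) -> List[str]:
    """Behaviour from this release as described: the whole topic name is one browse level."""
    return [topic]


topic = "payments.orders.created"  # illustrative topic name
print(browse_segments_tokenized(topic))  # ['payments', 'orders', 'created']
print(browse_segments_flat(topic))       # ['payments.orders.created']
```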
ChangeLog
- #3156 @swaroopjagadish fix(frontend): replacing broken link for default avatar
- #3154 @swaroopjagadish fix(frontend): fixing broken link to default avatar
- #3153 @swaroopjagadish feat(ingest): adding maxSchemaSize to mongodb source
- #3150 @saxo-lalrishav fix(business-glossary): business glossary visual changes
- #3142 @greysond fix(metadata-service): actually load keys from keystore for elastic connections
- #3110 @frsann feat(ingestion): bring your own SQL parser
- #3146 @jjoyce0510 fix(react): refactoring hasKeySchema computation
- #3145 @swaroopjagadish deps(ingest): upgrade to pick up acryl-pyhive changes
- #3144 @sgomezvillamor fix(profiles): prevent NoneType exception when profiling empty datasets
- #3140 @swaroopjagadish fix(glossary): Make terms searchable and browseable
- #3139 @swaroopjagadish fix(deps): Adding min version to python-dateutil to guard against isoparse failures
- #3135 @dexter-mh-lee fix(kafka): Change consumer id of mae/mce processor
- #3137 @swaroopjagadish fix(airflow): only emit specific keys for airflow lineage properties
- #3131 @jjoyce0510 feat(graphql): migrating GraphQL API to metadata-service (nee GMS)
- #3082 @jjoyce0510 feat(sso): Just-In-Time User & Group Provisioning on SSO Login (oidc)
- #3129 @saxo-lalrishav feat(business-glossary): Business glossary relationship UI
- #3113 @dexter-mh-lee feat(ingest): Add custom browse paths for kafka sources and remove browse lowercase filter
- #2918 @taufiqibrahim feat(ingest): adding redash source
- #3103 @saxo-lalrishav feat(business-glossary): glossary term relationship graphql changes
- #3015 @jjoyce0510 refactor: remove unused gms code, frontend endpoints part 2/4
- #3094 @jjoyce0510 feat(group ui): Basic group search membership in UI
- #3012 @Shikha-Trivedi-Saxo feat(business-glossary): Glossary term relationship backend
- #3049 @neojunjie feat(frontend): logout with oidc
- #3099 @gabe-lyons fix(schema-version): temporarily hide schema version tab
- #3048 @saxo-lalrishav feat(business-glossary): added field level glossary terms
- #3095 @shirshanka fix(ingest): increasing default ingestion REST timeout to 30 seconds
- #3096 @dexter-mh-lee fix(upgrade): Fix MAE consumer and upgrade's dependency issue
- #3092 @jensenity fix(postgres): fix postgres setup to handle existing database
DataHub v0.8.10
Release Highlights
Bugfix release for 0.8.9
- [#3096] Fix dependency injection issue introduced by this PR
- Increase REST emitter timeout to 30 seconds by default
ChangeLog
- #3095 @shirshanka fix(ingest): increasing default ingestion REST timeout to 30 seconds
- #3096 @dexter-mh-lee fix(upgrade): Fix MAE consumer and upgrade's dependency issue
- #3092 @jensenity fix(postgres): fix postgres setup to handle existing database
DataHub v0.8.9
Release Highlights
- Support for nested structs, union types and key-value schemas in Kafka
- Support for JDBC Connector based sources in Kafka Connect
- Support for Okta as a source for User and Group metadata
- Support for using AWS Glue schema registry
Breaking Changes
- [#3079] : Introduces a change to fieldPath encoding in schema metadata. Note: This is a backwards compatible change for the storage layer. Old fieldPaths will still be rendered correctly. At read time, fieldPaths in the new encoding will be translated to the old encoding to discover tags written before this change. Tags and Descriptions applied to fields earlier (which were being stored in the old format) will be migrated on applying new tags or editing descriptions.
Important Bug Fixes
- [#3070] Charts and Dataset lineage was broken in release 0.8.8. This has been fixed via [gma-125]
ChangeLog
- #3093 @gabe-lyons fix: fixing key-value after adding version
- #3088 @dexter-mh-lee fix(mysql-setup): Change default charset to utf8mb4
- #3091 @gabe-lyons feat: Adding clarity around qualified unions and removing extra lines for structs
- #3090 @dexter-mh-lee feat(workflow): Add mysql/postgres setup workflow
- #3083 @dexter-mh-lee feat: add support for AWS glue schema registry
- #3043 @jjoyce0510 feat(ingest): Adding an Okta Integration to extract Users, Groups, Group Membership
- #3080 @gabe-lyons fix(react): bolding field name if single token
- #3081 @shirshanka docs: Update Aug town-hall dates and previous town-hall details
- #3079 @rslanka feat: Adding support for nested schemas in ingestion and visualization
- #3078 @kevinhu fix(ingest): sqlalchemy-snowflake add constraints to make sure we don't pull in 1.2.5
- #2987 @jjoyce0510 docs(deploy): Adding confluent cloud doc
- #3076 @chinmay-bhat feat(ingest): add support for jdbc connector to kafka-connect source
- #3071 @kevinhu feat(docs): link to SQL profiling docs from each SQL source
- #3074 @jjoyce0510 refactor(build): Remove unnecessary ext modules.
- #3073 @dexter-mh-lee fix(docker): remove unnecessary components from docker-compose
- #3072 @dexter-mh-lee fix(docker): upgrade base image version
- #3068 @kevinhu feat(docs): add overrides for sidebar labels and S3 guide in sources dropdown
- #3069 @kevinhu fix(ingest): remove tags from bootstrap_mce since that is deprecated
- #3070 @gabe-lyons chore: upgrading gma to 0.2.80
- #2990 @rahulbsw feat(ingest): Added support for "add dataset ownership by regex match"
- #3067 @kevinhu fix(ingest): apply case insensitive regex matching by default
- #3066 @aseembansal-gogo docs(ingest): fix typos
- #3041 @kevinhu feat(docs): reorder and restyle navbar
- #3062 @gabe-lyons feat(cli): datahub init & docs for it
- #3064 @dexter-mh-lee fix(quickstart): remove mem_limit for datahub containers
- #3059 @kevinhu feat(docs): link to ingestion quickstart under ingestion section
- #3058 @kevinhu docs(ingest): link to docs from recipes
- #3014 @jjoyce0510 chore(frontend): Remove unused files 1/4
- #2986 @aseembansal-gogo feat(ingest): add transformers to clear dataset ownership, mark status, add browse paths
- #3056 @dexter-mh-lee fix(ingest): stop looker source from unnecessarily filling out owners
- #3055 @dexter-mh-lee fix(ingest): add default configurable timeout for rest emitter
- #3039 @kevinhu fix(docs): update metadata ingestion dev guide
- #3031 @kevinhu feat(docs): refactor source and sink ingestion docs
- #3054 @gabe-lyons fix(frontend): fixing homepage jitter
- #3051 @gabe-lyons feat(quickstart): linking to slack from quickstart
- #3053 @jjoyce0510 docs: Add Exact match search CURL example.
- #3033 @kevinhu feat(ingest): replace and warn against relative imports
- #3035 @aseembansal-gogo feat(ingest): add underlying platform for glue
- #3052 @jjoyce0510 docs: Add tags GMS API documentation
- #2973 @saxo-lalrishav fix(Business Glossary): updated glossary term search strategy
- #2996 @jensenity feat(postgres): add postgres setup docker image
- #3044 @gabe-lyons fix(frontend): fixing external url logic in charts and dashboard mapper
- #3045 @gabe-lyons fix(frontend): hide dashboard date when null
- #3022 @jsotelo fix(frontend): add support for SASL_KERBEROS_SERVICE_NAME & SASL_PLAINTEXT
- #3036 @jjoyce0510 fix(frontend): Fix exception casting in EntityClient
- #3037 @kevinhu fix(ingest): detect malformed Glue S3 script paths
- #3038 @kevinhu fix(ingest): replace backticks for lookml
- #3040 @kevinhu fix(ingest): add bigquery type mappings
- #3042 @gabe-lyons fix(frontend): fixing lineage tokenization
- #3034 @jjoyce0510 fix(gms): Validate unrecognized model fields.
DataHub v0.8.8
Release Notes
- Bugfix release for release 0.8.7
- Fixes issues with Airflow emitters, Glue default dependencies and handling system_metadata column correctly
- Adds feature to handle redirects for non-logged in users
Changelog
- #3032 @kevinhu fix(ingest): glue import type stubs only for testing
- #3030 @gabe-lyons fix(gms): handling partial system metadata in gms
- #3026 @jjoyce0510 feat(frontend): encode Original URI in Authentication Redirect
- #3029 @gabe-lyons fix(restore-indices): add system metadata restoration to restore-indices
DataHub v0.8.7
Release Stability
- Several bugs reported against this release are fixed in 0.8.8. Users are strongly advised to skip this release!
Release Highlights
- Dataset Profiling and support for time-series metadata
- UI for ML Models, Features; support for AWS SageMaker and Feast
- CLI: support for rollback operations after ingestion
- Integration fixes for Looker, dbt, and many more.
- Demos for all these features are available in our July Townhall video
ChangeLog
- #3021 @kevinhu feat(ingest): extract dbt versions into custom properties
- #3020 @gabe-lyons fix(caching): refetch query on update
- #3019 @kevinhu fix(ingest): don't assume Glue job description always exists
- #3000 @topwebtek7 fix(react): fix weird 0 rendering possible bugs
- #3018 @dexter-mh-lee feat(ingest): add kafka emitters for MetadataChangeProposal format
- #2999 @jjoyce0510 fix(gms): Adding Rest.li Write-Time Model Validation
- #3009 @jjoyce0510 fix(quickstart): Bumping Default Memory for GMS and Frontend
- #3007 @jjoyce0510 fix(gms): better logging on failed MCL / MAE
- #3008 @gabe-lyons fix(blank pages): removing apollo caching
- #3006 @jjoyce0510 fix(ci): using AspectExtractor instead of removed SnapshotToAspectMap
- #2998 @gabe-lyons fix(graphql): fetching data platforms using standard procedure
- #2944 @EnricoMi refactor(test): Refactor GraphService tests
- #2972 @jameslamb fix(ingest): map all LookML dimension types to corresponding avro types
- #3005 @dexter-mh-lee fix(ingestion): Safeguard against empty values for profile ingestion
- #3002 @dexter-mh-lee fix(datahub-upgrade) add config registry to datahub upgrade container
- #3003 @jjoyce0510 fix(dataset stats): Fix checks for existence of row and column counts
- #2997 @topwebtek7 feat(react): update dataset documents tab with a merged document column
- #2991 @topwebtek7 feat(react): update search result has result counts for each entities that has result
- #2983 @jjoyce0510 Introducing TimeSeries Aspects + Dataset Profile (Stats) Aspect
- #2984 @dexter-mh-lee fix(browse): Fix browse pagination and multi-browse path issue
- #2995 @aseembansal-gogo docs(ingest): Add instructions to install required dependency
- #2960 @gabe-lyons feat(deletes): add run commands (list, show, rollback) to datahub ingest
- #2994 @chinmay-bhat docs(ingest): fixed Snowflake recipe to escape dollar-sign
- #2981 @hsheth2 docs: remove a few outdated docs
- #2988 @jjoyce0510 docs: add docs on extracting container logs
- #2963 @hsheth2 test(ingestion): run full tests on both python versions
- #2967 @jameslamb fix(ingest): add more debug logging to LookML metadata ingestion
- #2966 @jameslamb fix(ingest): ensure that LookML files are always parsed in the same order
- #2965 @jameslamb fix(ingest): ensure workunits are created for all LookML views
- #2982 @gabe-lyons fix(tags): fixing tag applied to module for tags w/ colons in the name
- #2961 @gabe-lyons feat(ml-model): adding ml models and ml model groups
- #2975 @kevinhu feat(ingest): type stubs for boto3
- #2979 @jameslamb perf(ingest): remove unused variable in Looker ingestion
- #2980 @hsheth2 fix(ingest): infer bigquery project identifier
- #2978 @chinmay-bhat fix(ingest): fix hive ingestion to respect database configuration
- #2976 @hsheth2 feat(ingest): stricter deserialization for MCE JSONs
- #2959 @kevinhu feat(docs): tutorial for writing a custom transformer
- #2977 @hsheth2 fix(ingestion): isolate dependency requirements of airflow hooks
- #2962 @hsheth2 feat(ingest): add timezone validation to bigquery usage
- #2974 @dexter-mh-lee fix(elasticsearch-setup): fix elasticsearch setup for aws
- #2952 @hsheth2 text(ingestion): test multiple python versions in CI
- #2958 @hsheth2 feat(ingest): add Airflow TaskFlow example
- #2950 @kevinhu fix(ingest): patch lookml types and refactor ingestion sources layout
- #2957 @jameslamb fix(ingest): match nested LookML files mentioned in 'include' statements
- #2956 @gabe-lyons Revert "fix(gql): removing data platform caching in gql (#2947)"
- #2955 @kevinhu feat(ingest): ingest descriptions from dbt models
- #2948 @hsheth2 fix(ingestion): add more mypy annotations
- #2946 @hsheth2 feat(ingestion): test GMS connections before ingestion
- #2947 @gabe-lyons fix(gql): removing data platform caching in gql
- #2949 @hsheth2 test(ingestion): fix flaky package discovery test
- #2951 @kevinhu feat(docs): update videos and integration logos
- #2953 @hsheth2 fix(ingestion): resolve test bugs for 3.6
- #2943 @kevinhu feat(ingest): add logo and platform entry for Glue
- #2940 @hsheth2 fix(ingest): handle quotes in lookml properly
- #2938 @kevinhu feat(models): remove versions from metrics and hyperparams
- #2942 @hsheth2 fix(ingestion): make snowflake database names lowercase
- #2939 @hsheth2 feat(ingest): use urn builders in looker and validate data platforms
- #2941 @aseembansal-gogo refactor(ingest): make code pythonic
- #2937 @kevinhu fix(ingest): allow custom Glue scripts
- #2921 @kafkahw refactor(datahub-web): removing frontend Ember app (i.e. datahub-web folder)
- #2913 @hsheth2 fix(ingest): refactor + fix recursion in lookml file loading logic
- #2925 @hsheth2 feat(ingest): improve bigquery-usage robustness and docs
- #2931 @aseembansal-gogo fix(ingest): fix workunit name to be consistent with other sources
- #2935 @kevinhu fix(ingest): fix browsepaths and ownership urns
- #2930 @aseembansal-gogo fix(ingest): glue add support for mapping varchar, decimal types
- #2929 @kevinhu feat(ingest): refactor mlModel grouping and add browsepaths
- #2934 @hsheth2 docs(ingest): update looker + docker script docs
- #2926 @hsheth2 feat(ingest): add make_data_platform_urn method to builder
- #2932 @topwebtek7 feat(react): surface edited descriptions on search preview for dataset, datajob, dataflow, chart, dashboard
- #2911 @hsheth2 fix(ingest): add quotes to secured kafka yaml config example
- #2927 @kevinhu feat(ingest): dbt aliases
- #2806 @saxo-lalrishav fix(react): enable relation between glossary term and datasets searchable
- #2910 @kevinhu feat(ingest): extract SageMaker metrics, hyperparameters, and external URLs
- #2915 @aseembansal-gogo docs: update docs for consistency in naming
- #2922 @kevinhu feat(ingest): test dbt ingestion with and without schemas
- #2924 @hsheth2 fix(ingest): note that views are not supported for Athena
- #2920 @hsheth2 feat(ingestion): support multiple project IDs in bigquery usage stats
- #2923 @hsheth2 fix(ingest): pin snowflake sqlalchemy connector
- #2909 @hsheth2 feat(ingest): add support for Oracle spatial types
- #2917 @kevinhu docs(ingest): update sample recipe and test input for dbt
- #2887 @topwebtek7 feat(mlFeatureTable): add graphql, ui/ux for mlFeatureTable, mlFeature, mlPrimaryKey entities
- #2916 @kevinhu fix(ingest): stringify all dbt custom props
- #2898 @aseembansal-gogo feat(ingest): Add option to change name of database for postgres
- #2912 @hsheth2 fix(ingest): issue a warning if the column list is empty
- #2894 @kevinhu feat(ingest): lineage for SageMaker model endpoints and groups
- #2905 @hsheth2 feat(ingest): add can_add_aspect method for MCEs
- #2906 @hsheth2 test(ingest): update tox test configurations and test airflow 2.x by default
- #2904 @jjoyce0510 fix(frontend): Don't use Apollo Cache for IsAnalyticsEnabled query.
- #2877 @remisalmon feat(ingest): use node comment as description if existing else default to key
- #2889 @hsheth2 fix(react): avoid displaying "0" for ignored timestamps
- #2890 @gabe-lyons fix(search): fixing case where someone issues a null query
- #2893 @hsheth2 fix(ingest): use logger.warning instead of logger.warn
- #2888 @jameslamb fix(ingest): change LookMLSource._get_upsteam_lineage() to _get_upstream_lineage()
- #2901 @topwebtek7 feat(react): update schema history visualizing, truncate long type, original desc bug
- #2891 @hsheth2 fix(ingest): correct globs in lookml model discovery
- #2902 @kevinhu feat(ingest): add connectivity check for Looker
- #2597 @wan54 feat(react): configure Cypress + MirageJS + GraphQL mock for functional testing plus a couple of example tests
- #2903 @shirshanka docs: update docs for July townhall
- #2900 @kevinhu fix(ingest): string-ify dbt custom props
- #2899 @jjoyce0510 fix(docs): fixing miscellaneous docs
- #2788 @saxo-lalrishav fix(glossary):default browse path for glossary term
- #2868 @kevinhu feat(ingest): extract lineage between SageMaker jobs and models
- #2884 @dexter-mh-lee fix(search): Fix index builder
- #2883 @hsheth2 docs: revamp adoption section
- #2882 @hsheth2 fix(ingest): fix druid misconfiguration bug
- #2881 @hsheth2 fix(ingest): default to unlimited query log delay in bigquery-usage
- #2790 @saxo-lalrishav fix(search): enable search on business glossary terms
- #2872 @hsheth2 build(ingest): reduce dependencies for dev install
- #2874 @topwebtek7 fix(react): fix bug in description update modal
- #2866 @hsheth2 build(ingestion): add version prompt to release script
- #2869 @kevinhu ...
DataHub v0.8.6
Release Highlights
- Fix issue when using Elasticsearch as graph database in certain configurations
- Fix caching issues in React UI
- Efficiency improvement for schema aspect storage
- Improvements and fixes to various ingestion sources
Changelog
- #2861 @gabe-lyons fix: fixing lint issue
- #2860 @gabe-lyons fix(frontend): handling null aspects in AspectType
- #2857 @kevinhu feat(docs): carousels for videos and articles
- #2855 @topwebtek7 feat(react): implement visualizing historical versions of dataset schema
- #2856 @topwebtek7 fix(react): fix searchbar width issue in small screen
- #2848 @dexter-mh-lee feat(k8s): Extract helm charts into a separate repo
- #2844 @hsheth2 fix(ingest): handle 'fields' list missing in bigquery-usage
- #2852 @hsheth2 fix(ingest): delete pycache files when running clean
- #2853 @hsheth2 docs(ingest): remove hanging sentence from docs
- #2830 @kevinhu feat(ingest): SageMaker jobs and models
- #2847 @hsheth2 feat(docs): throttle and retry requests in doc generation
- #2842 @kevinhu fix(ingest): check for dbt materialization before proceeding
- #2840 @dexter-mh-lee fix(analytics): Fix SSL issue with analytics on frontend
- #2845 @hsheth2 feat(ingest): prettify stack traces in CLI
- #2851 @frsann fix(ingest): Fix glob pattern and handle possible recursion in lookml
- #2846 @hsheth2 fix(docs): fix some broken links in the docs
- #2843 @hsheth2 fix(ingest): avoid setting timestamps unless source system provides it
- #2841 @topwebtek7 feat(react): fix apollo client cache issues in entities profile pages
- #2832 @datascienceChris feat (helm datahub-gms): add ingress template to datahub-gms helm chart
- #2833 @kevinhu fix(ingest): do not fail dbt ingestion when encountering missing nodes
- #2837 @hsheth2 docs(ingest): clarify that the Kafka options are pass-through
- #2836 @hsheth2 fix(ingest): various BigQuery source fixes
- #2835 @hsheth2 fix(ingest): mask password in info-level logs
- #2834 @hsheth2 docs(ingest): update links to Kafka docs
- #2829 @shirshanka docs: update H2 2021 roadmap
- #2824 @gabe-lyons fix(mae-consumer): support standalone mae consumer without neo4j
- #2825 @hsheth2 fix(cli): change docker nuke to also remove stopped containers
- #2828 @hsheth2 feat(ci): separate metadata-ingestion into a separate workflow
- #2827 @hsheth2 fix(ingest): convert superset timestamps to micros
- #2826 @gabe-lyons docs(elastic-for-graph): Add migrating from neo4j to elastic instructions
- #2822 @hsheth2 docs(docker): fix check command reference
- #2823 @jjoyce0510 fix(docs): update OIDC docs
- #2817 @hsheth2 docs(ingest): add extra info for Redshift behind a proxy
- #2816 @hsheth2 fix(ingest): view handling resilience for redshift
- #2818 @jjoyce0510 feat(datahub-frontend): Adding basic file-based authentication to datahub-frontend
- #2814 @dexter-mh-lee chore(helm): upgrade version to v0.8.5
DataHub v0.8.5
Release Highlights
- Various stability fixes for v0.8.4
- Address docker image vulnerabilities
- New integrations: AWS SageMaker
- Support for restoring indexes with how-to
- Ingestion improvements: mongodb, looker, hive, snowflake
Changelog
- #2813 @dexter-mh-lee fix(datahub-upgrade): fix vulnerabilities
- #2804 @hsheth2 feat(ingest): basic support for complex hive types
- #2810 @dexter-mh-lee fix(k8s): upgrade helm version
- #2812 @hsheth2 feat(ingest): refactor mce comparison and add pytest update golden files option
- #2809 @hsheth2 docs(website): hide outdated FAQs page
- #2779 @dexter-mh-lee feat(backup): Add restore indices and restore backup tasks
- #2793 @hsheth2 feat(ingest): support ingesting from multiple snowflake dbs
- #2811 @jjoyce0510 fix(quickstart): Fixing manual mysql quickstart
- #2786 @hsheth2 refactor(ingest): extract sqlalchemy uri generation logic
- #2803 @dexter-mh-lee fix(usage-stats): Add usage stats factory to mae processor config
- #2800 @hsheth2 fix(ingest): better warnings and error handling for rest sink
- #2758 @kevinhu feat(ingest): SageMaker feature store ingestion
- #2770 @remisalmon feat(ingest): Improve lookml sql derived tables detection, add cascading derived tables to lineage
- #2781 @dexter-mh-lee fix(search): Filter out "removed" entities from autocomplete and analytics
- #2799 @jjoyce0510 fix(datahub-upgrade): add runtime dependency on logbackClassic
- #2801 @hsheth2 fix(ingest): quote table names in hive
- #2796 @hsheth2 fix(ingest): handle case when view definition handler is not implemented
- #2798 @dexter-mh-lee fix(usage-stats): add indices target for usage stats query
- #2797 @hsheth2 refactor(ingest): remove deprecated methods and warn on deprecated import
- #2776 @hsheth2 docs(website): add releases page
- #2794 @jjoyce0510 Update quickstart to include system prune
- #2783 @kevinhu fix(ingest): use correct platform for MongoDB ingestion
- #2785 @jjoyce0510 fix(frontend): making nested SSL configs optional
- #2792 @dexter-mh-lee feat(k8s): Upgrade to v0.8.4
- #2780 @kevinhu fix(docs): links to Feast entities
- #2782 @hsheth2 refactor(ingest): use common get_sys_time method
- #2784 @dexter-mh-lee fix(docker): Upgrade to 4.1.44 netty-all library
- #2778 @kevinhu feat(ingest): add non-random sampling for mongo
- #2774 @hsheth2 docs: upgrade docusaurus, minor ingestion updates
DataHub v0.8.4
Release Highlights
- Dataset Popularity, Recent Queries powered by Usage logs (support for Snowflake, BigQuery)
- Markdown descriptions and editing
- New Integrations : Glue Jobs, Feast
- Versioned API for metadata GETs
- No neo4j requirement, Elastic for Graph
- Docker image hardening
- Improved logging
- GCP Deployment Guide
Changelog
- #2773 @jjoyce0510 feat(logs): add thresholding, misc cleanup
- #2771 @topwebtek7 fix(react): update platform text in dataset profile header
- #2772 @dexter-mh-lee fix(nocode): removing service PDL
- #2761 @jjoyce0510 feat(logs): improve logging in GMS and datahub-frontend
- #2769 @topwebtek7 fix(react): fix graphql apollo cache update issue cause of usagestats
- #2768 @dexter-mh-lee feat(k8s): add GCP deploy recipe
- #2767 @hsheth2 fix(react): update the query frequency text label
- #2766 @hsheth2 docs(ingest): move usage stats docs into the "sources" section
- #2728 @jjoyce0510 fix(datahub-upgrade): removing the CleanupStep from datahub upgrade
- #2763 @dexter-mh-lee fix(docker): Fix dependency vulnerability
- #2765 @hsheth2 fix(react): move percent sign after number and update meta tag
- #2695 @thomasplarsson fix(frontend): auth session ttl specified in hours instead of days. M…
- #2764 @hsheth2 docs: update for Jun townhall
- #2762 @hsheth2 feat: usage stats (part 2)
- #2750 @hsheth2 feat: usage stats (part 1)
- #2759 @topwebtek7 fix(react): reverse result of topological sort, autofocus in add tag modal
- #2753 @gabe-lyons feat(elastic-as-graph): defaulting to elastic in quickstart
- #2760 @hsheth2 fix(docker): use head tag for datahub-ingestion
- #2754 @gabe-lyons Revert "fix(gms): add rest.li validation in gms (#2745)"
- #2751 @dexter-mh-lee fix(browse): sort by doc count descending
- #2752 @gabe-lyons fix(elastic-as-graph): adding elasticsearch setup back in
- #2739 @jjoyce0510 fix(gms): make get return entity type
- #2749 @remisalmon feat(ingest): add option to specify source platform database in lookml ingestion
- #2729 @kevinhu feat(ingest): ingest last-modified from dbt sources.json
- #2746 @dexter-mh-lee fix(docker): modernize docker images and fix vulnerabilities
- #2740 @jjoyce0510 fix(datahub-upgrade): add support for postgres migration
- #2745 @jjoyce0510 fix(gms): add rest.li validation in gms
- #2748 @jjoyce0510 feat(quickstart): remove orphaned docker containers on quickstart through cli
- #2744 @jjoyce0510 feat(docker): reduce quickstart footprint
- #2747 @kevinhu fix(ci): increase wait-for-it timeout to fix flaky feast test
- #2743 @kevinhu feat(ingest): print docker logs on timeout
- #2726 @gabe-lyons feat(graph): support using elasticsearch as graph backend.
- #2723 @gabe-lyons fix(gms): fixes for version aspect fetching
- #2741 @hsheth2 fix(ingest): update lookml test
- #2742 @kevinhu fix(ingest): fix lookml platform URN
- #2687 @kevinhu feat(ingest): add support for Glue ETL jobs
- #2716 @kevinhu fix(ingest): types for dbt
- #2722 @kevinhu fix(ci): increase Feast docker setup timeout
- #2737 @remisalmon fix(looker): fix invalid URN syntax error
- #2730 @dexter-mh-lee fix(docker): update default tags to head
- #2727 @gabe-lyons fix(docs): update extending-the-metadata-model.md
- #2721 @gabe-lyons fixing docs sidebar
- #2719 @dexter-mh-lee Update version for helm
DataHub v0.8.3
Release Notes
Bug fix release that fixes editable descriptions bug from previous release.
Previous version release notes: https://github.com/linkedin/datahub/releases/tag/v0.8.2
Changelog
- #2718 @topwebtek7 fix(react): update schema description edit behavior