Add DABs support for Unity Catalog volumes #1762

Merged on Dec 2, 2024 (78 commits)

@shreyas-goenka shreyas-goenka commented Sep 9, 2024

Changes

This PR adds support for UC volumes to DABs.

Can I use a UC volume managed by DABs in artifact_path?

Yes, but the volume must exist before it is referenced in artifact_path. Otherwise, you'll see an error that the volume does not exist. For this case, this PR also adds a warning when we detect that the UC volume is defined in the DAB itself, telling the user to deploy the UC volume in a separate deployment before using it in artifact_path.

We cannot create the UC volume and then upload the artifacts to it in the same bundle deploy because bundle deploy always uploads the artifacts to artifact_path before materializing any resources defined in the bundle. Supporting this in a single deployment requires us to migrate away from our dependency on the Databricks Terraform provider to manage the CRUD lifecycle of DABs resources.
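The warning described above depends on matching the artifact_path against the volumes declared in the bundle. The following is a minimal sketch of that idea, not the code from this PR: the `volume` type and the `parseVolumePath` and `definedInBundle` helpers are hypothetical names chosen for illustration, and the only assumption taken from the PR is the `/Volumes/<catalog>/<schema>/<volume>` path layout seen in the test output.

```go
package main

import (
	"fmt"
	"strings"
)

// volume is a hypothetical stand-in for a volume resource defined in a bundle.
type volume struct {
	Catalog, Schema, Name string
}

// parseVolumePath extracts the catalog, schema, and volume name from a path of
// the form /Volumes/<catalog>/<schema>/<volume>[/...].
func parseVolumePath(p string) (catalog, schema, name string, err error) {
	parts := strings.Split(strings.TrimPrefix(p, "/"), "/")
	if len(parts) < 4 || parts[0] != "Volumes" {
		return "", "", "", fmt.Errorf("expected /Volumes/<catalog>/<schema>/<volume>, got %q", p)
	}
	for _, s := range parts[1:4] {
		if s == "" {
			return "", "", "", fmt.Errorf("empty path segment in %q", p)
		}
	}
	return parts[1], parts[2], parts[3], nil
}

// definedInBundle returns the resource key of the bundle-defined volume that
// artifactPath points into, if there is one.
func definedInBundle(artifactPath string, volumes map[string]volume) (string, bool) {
	catalog, schema, name, err := parseVolumePath(artifactPath)
	if err != nil {
		return "", false
	}
	for key, v := range volumes {
		if v.Catalog == catalog && v.Schema == schema && v.Name == name {
			return key, true
		}
	}
	return "", false
}

func main() {
	volumes := map[string]volume{
		"bar": {Catalog: "main", Schema: "caps", Name: "managed_by_dab"},
	}
	if key, ok := definedInBundle("/Volumes/main/caps/managed_by_dab", volumes); ok {
		fmt.Printf("Warning: artifact_path uses a volume managed by this bundle (resources.volumes.%s); deploy it first.\n", key)
	}
}
```

A check like this can only produce a hint, because a matching definition does not prove the volume is missing from the workspace. That is presumably why the PR reports it as a warning next to the actual "Not Found" error.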

Why do we not support presets.name_prefix for UC volumes?

UC volumes will not have a dev_shreyas_goenka prefix added in mode: development. Configuring presets.name_prefix will be a no-op for UC volumes. We have decided not to support prefixing for UC resources. This is because:

  1. UC provides its own namespace hierarchy that is independent of DABs.
  2. Users can always use ${workspace.current_user.short_name} to configure prefixes manually.

Customers often manually set up a UC hierarchy for dev and prod, including a schema or catalog per developer. Thus, it's often unnecessary for us to add prefixing in mode: development by default for UC resources.

In retrospect, supporting prefixing for UC schemas and registered models was a mistake and will be removed in a future release of DABs.
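To illustrate the no-op behavior described above, here is a minimal sketch of a prefixing pass that skips UC volumes. The `resource` type and `applyNamePrefix` function are hypothetical and heavily simplified; they are not the CLI's actual mutator, and the prefix string is only an example.

```go
package main

import "fmt"

// resource is a hypothetical, simplified view of a bundle resource.
type resource struct {
	Kind string // for example "job", "pipeline", or "volume"
	Name string
}

// applyNamePrefix prepends prefix to every resource name except UC volumes,
// which are returned unchanged.
func applyNamePrefix(prefix string, resources []resource) []resource {
	out := make([]resource, len(resources))
	for i, r := range resources {
		if r.Kind != "volume" {
			r.Name = prefix + r.Name
		}
		out[i] = r
	}
	return out
}

func main() {
	resources := []resource{
		{Kind: "job", Name: "etl"},
		{Kind: "volume", Name: "my_volume"},
	}
	for _, r := range applyNamePrefix("dev_shreyas_goenka_", resources) {
		fmt.Printf("%s: %s\n", r.Kind, r.Name)
	}
}
```

In this sketch the job is renamed while the volume keeps its configured name, so any per-developer separation for volumes has to come from the catalog, the schema, or a name the user builds explicitly.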

Tests

Unit tests, integration tests, and manual testing.

Manual Testing cases:

  1. UC volume does not exist:
➜  bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/my_volume that is configured in the artifact_path: Not Found
  2. UC volume does not exist, but is defined in the DAB:
➜  bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/managed_by_dab that is configured in the artifact_path: Not Found

Warning: You might be using a UC volume in your artifact_path that is managed by this bundle but which has not been deployed yet. Please deploy the UC volume in a separate bundle deploy before using it in the artifact_path.
  at resources.volumes.bar
  in databricks.yml:24:7

@shreyas-goenka shreyas-goenka marked this pull request as ready for review September 16, 2024 02:10
bundle/config/resources/volume.go (resolved)

 // We don't need to display any prompts in this case.
-if len(dltActions) == 0 && len(schemaActions) == 0 {
+if len(schemaActions) == 0 && len(dltActions) == 0 && len(volumeActions) == 0 {
Contributor Author

This PR introduces yet another interactive flow to DABs. At this point, I think it's worth investing some time in adding proper end-to-end tests for this. I'd like to pick it up in the next ~month.

Contributor Author

Let's pick this up next quarter as part of the terminal UX improvements.
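The condition changed in this thread decides whether the interactive prompt is shown at all. A minimal sketch of that decision follows, assuming each action list holds the destructive changes planned for one resource type. The `action` type and `needsApproval` helper are hypothetical names and do not come from the CLI.

```go
package main

import "fmt"

// action is a hypothetical description of a destructive change in a deployment plan.
type action struct {
	Verb string // for example "delete" or "recreate"
	Name string
}

// needsApproval reports whether any of the given action lists is non-empty,
// in which case the user should be asked to confirm the deployment.
func needsApproval(groups ...[]action) bool {
	for _, g := range groups {
		if len(g) > 0 {
			return true
		}
	}
	return false
}

func main() {
	var schemaActions, dltActions []action
	volumeActions := []action{{Verb: "recreate", Name: "my_volume"}}

	if !needsApproval(schemaActions, dltActions, volumeActions) {
		fmt.Println("No destructive actions planned; no prompt needed.")
		return
	}
	for _, a := range volumeActions {
		fmt.Printf("  %s volume %s\n", a.Verb, a.Name)
	}
}
```

Keeping the decision in a small pure function like this is one way to unit-test it without a terminal, which may help with the end-to-end testing concern raised above.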

@shreyas-goenka
Contributor Author

Nightlies passed once. Triggered another round after some minor updates.

Contributor

@pietern pietern left a comment

One more comment on consistency in messages.

Approving to unblock merge once that is addressed.

bundle/libraries/filer_volume.go (outdated, resolved)
if errors.Is(err, apierr.ErrNotFound) {
	// Since the API returned a 404, the volume does not exist in the workspace.
	// Modify the error message to provide more context.
	baseErr.Summary = fmt.Sprintf("volume %s does not exist: %s", volumePath, err)
Contributor

This results in:

Error: volume /Volumes/my/artifact/path does not exist: Not Found

The suffix is superfluous.

Contributor

Discussed this on DM: error conditions need more work but will be done in a separate PR.


github-actions bot commented Dec 2, 2024

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 1762
  • Commit SHA: d460bd68591f016748d1b06f635e648b82e077e0

Checks will be approved automatically on success.

@eng-dev-ecosystem-bot
Collaborator

Test Details: go/deco-tests/12122596769

@shreyas-goenka shreyas-goenka added this pull request to the merge queue Dec 2, 2024
Merged via the queue into main with commit 2847533 Dec 2, 2024
10 checks passed
@shreyas-goenka shreyas-goenka deleted the feature/uc-volumes branch December 2, 2024 21:23
pietern added a commit that referenced this pull request Dec 5, 2024
**New features for Databricks Asset Bundles:**

This release adds support for managing Unity Catalog volumes as part of your bundle configuration.

Bundles:
 * Add DABs support for Unity Catalog volumes ([#1762](#1762)).
 * Support lookup by name of notification destinations ([#1922](#1922)).
 * Extend "notebook not found" error to warn about missing extension ([#1920](#1920)).
 * Skip sync warning if no sync paths are defined ([#1926](#1926)).
 * Add validation for single node clusters ([#1909](#1909)).
 * Fix segfault in bundle summary command ([#1937](#1937)).
 * Add the `bundle_uuid` helper function for templates ([#1947](#1947)).
 * Add default value for `volume_type` for DABs ([#1952](#1952)).
 * Properly read Git metadata when running inside workspace ([#1945](#1945)).
 * Upgrade TF provider to 1.59.0 ([#1960](#1960)).

Internal:
 * Breakout variable lookup into separate files and tests ([#1921](#1921)).
 * Add golangci-lint v1.62.2 ([#1953](#1953)).

Dependency updates:
 * Bump golang.org/x/term from 0.25.0 to 0.26.0 ([#1907](#1907)).
 * Bump github.com/Masterminds/semver/v3 from 3.3.0 to 3.3.1 ([#1930](#1930)).
 * Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 ([#1932](#1932)).
 * Bump github.com/databricks/databricks-sdk-go from 0.51.0 to 0.52.0 ([#1931](#1931)).
github-merge-queue bot pushed a commit that referenced this pull request Dec 5, 2024