Encourage the use of root_path in production to ensure single deployment #1712

Open
wants to merge 9 commits into
base: main
3 changes: 3 additions & 0 deletions bundle/bundle.go
@@ -56,6 +56,9 @@ type Bundle struct {
// It is loaded from the bundle configuration files and mutators may update it.
Config config.Root

+ // Target stores a snapshot of the Root.Bundle.Target configuration when it was selected by SelectTarget.
+ Target *config.Target `json:"target_config,omitempty" bundle:"internal"`
+
// Metadata about the bundle deployment. This is the interface Databricks services
// rely on to integrate with bundles when they need additional information about
// a bundle deployment.
22 changes: 20 additions & 2 deletions bundle/config/mutator/process_target_mode.go
@@ -2,6 +2,7 @@ package mutator

import (
"context"
+ "fmt"
"strings"

"github.com/databricks/cli/bundle"
@@ -155,8 +156,21 @@ func validateProductionMode(ctx context.Context, b *bundle.Bundle, isPrincipalUs
}
}

- if !isPrincipalUsed && !isRunAsSet(r) {
- return diag.Errorf("'run_as' must be set for all jobs when using 'mode: production'")
+ // We need to verify that there is only a single deployment of the current target.
+ // The best way to enforce this is to explicitly set root_path.
+ advice := fmt.Sprintf(
+ "set 'workspace.root_path' to make sure only one copy is deployed. A common practice is to use a username or principal name in this path, i.e. root_path: /Workspace/Users/%s/.bundle/${bundle.name}/${bundle.target}",
+ b.Config.Workspace.CurrentUser.UserName,
+ )
+ if !isExplicitRootSet(b) {
+ if isRunAsSet(r) || isPrincipalUsed {
+ // Just setting run_as is not enough to guarantee a single deployment,
+ // and neither is setting a principal.
+ // We only show a warning for these cases since we didn't historically
+ // report an error for them.
Contributor

So this is a breaking change if and only if:

  1. You're a regular user
  2. You don't have run_as set (i.e. it runs as self)

?

Contributor Author

We previously showed an error under those conditions ('run_as' must be set). We now show an error about root_path.

Contributor

Do you think we should continue to recommend setting run_as, even if root_path is configured?

If a user looks at the defaults and sets root_path to ~/some/path, there will still be multiple deployments when multiple people deploy the bundle, even though the warning is silenced.
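
To make the concern concrete, here is a minimal sketch. It assumes that a leading `~/` in root_path resolves to the deploying user's workspace home folder (`/Workspace/Users/<user name>`); the helper `expandRoot` and the user names are hypothetical and are not taken from the CLI.

```go
package main

import (
	"fmt"
	"strings"
)

// expandRoot is a hypothetical helper. It assumes a leading "~/" resolves to
// the deploying user's workspace home folder; any other path is returned as-is.
func expandRoot(rootPath, userName string) string {
	if strings.HasPrefix(rootPath, "~/") {
		return "/Workspace/Users/" + userName + "/" + strings.TrimPrefix(rootPath, "~/")
	}
	return rootPath
}

func main() {
	for _, user := range []string{"alice@example.com", "bob@example.com"} {
		// A home-relative root_path differs per deployer, so each person
		// who deploys ends up with a separate copy.
		fmt.Println(expandRoot("~/some/path", user))
		// An absolute root_path is identical for every deployer.
		fmt.Println(expandRoot("/Workspace/Shared/.bundle/prod", user))
	}
}
```

Under that assumption, an explicitly configured root_path only guarantees a single deployment when it does not depend on who runs the deploy.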

Contributor Author

I do think it's a good practice to set run_as despite its limitations and it seems okay to set it in the templates. Providing a warning when you only set run_as and don't set root_path seems like a sweet spot to me. Providing an error would be a breaking change and doesn't really seem warranted.

Contributor

Do you think we should de-emphasize run_as in general?

Contributor Author

run_as still seems helpful to me and it will become much more usable once we add support for it in pipelines.

+ return diag.Warningf("target with 'mode: production' should " + advice)
+ }
+ return diag.Errorf("target with 'mode: production' must " + advice)
}
return nil
}
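
The outcome of the new root_path check can be summarized as a small decision function. This is a sketch that covers only the branch added in this hunk, not the other production-mode checks earlier in the function; the `severity` type and `checkProduction` are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// severity is a hypothetical stand-in for the diagnostic levels in the diff.
type severity string

const (
	ok      severity = "ok"
	warning severity = "warning"
	failure severity = "error"
)

// checkProduction mirrors the branch structure of the root_path check:
// an explicit root_path passes; without one, run_as or a service principal
// downgrades the error to a warning.
func checkProduction(explicitRoot, runAsSet, principalUsed bool) severity {
	if explicitRoot {
		return ok
	}
	if runAsSet || principalUsed {
		return warning
	}
	return failure
}

func main() {
	cases := []struct{ explicitRoot, runAsSet, principalUsed bool }{
		{true, false, false},
		{false, true, false},
		{false, false, true},
		{false, false, false},
	}
	for _, c := range cases {
		fmt.Printf("root_path=%t run_as=%t principal=%t -> %s\n",
			c.explicitRoot, c.runAsSet, c.principalUsed,
			checkProduction(c.explicitRoot, c.runAsSet, c.principalUsed))
	}
}
```

Only the last case, a regular user with neither root_path nor run_as, produces an error, which matches the condition that already failed before this change.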
@@ -173,6 +187,10 @@ func isRunAsSet(r config.Resources) bool {
return true
}

+ func isExplicitRootSet(b *bundle.Bundle) bool {
+ return b.Target != nil && b.Target.Workspace != nil && b.Target.Workspace.RootPath != ""
+ }
+
func (m *processTargetMode) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
switch b.Config.Bundle.Mode {
case config.Development:
21 changes: 19 additions & 2 deletions bundle/config/mutator/process_target_mode_test.go
@@ -323,15 +323,15 @@ func TestProcessTargetModeProduction(t *testing.T) {
b := mockBundle(config.Production)

diags := validateProductionMode(context.Background(), b, false)
- require.ErrorContains(t, diags.Error(), "run_as")
+ require.ErrorContains(t, diags.Error(), "target with 'mode: production' must set 'workspace.root_path' to make sure only one copy is deployed. A common practice is to use a username or principal name in this path, i.e. root_path: /Workspace/Users/[email protected]/.bundle/${bundle.name}/${bundle.target}")

b.Config.Workspace.StatePath = "/Shared/.bundle/x/y/state"
b.Config.Workspace.ArtifactPath = "/Shared/.bundle/x/y/artifacts"
b.Config.Workspace.FilePath = "/Shared/.bundle/x/y/files"
b.Config.Workspace.ResourcePath = "/Shared/.bundle/x/y/resources"

diags = validateProductionMode(context.Background(), b, false)
- require.ErrorContains(t, diags.Error(), "production")
+ require.ErrorContains(t, diags.Error(), "target with 'mode: production' must set 'workspace.root_path' to make sure only one copy is deployed. A common practice is to use a username or principal name in this path, i.e. root_path: /Workspace/Users/[email protected]/.bundle/${bundle.name}/${bundle.target}")

permissions := []resources.Permission{
{
@@ -377,6 +377,23 @@ func TestProcessTargetModeProductionOkForPrincipal(t *testing.T) {
require.NoError(t, diags.Error())
}

+ func TestProcessTargetModeProductionOkWithRootPath(t *testing.T) {
+ b := mockBundle(config.Production)
+
+ // Our target has all kinds of problems when not using service principals ...
+ diags := validateProductionMode(context.Background(), b, false)
+ require.Error(t, diags.Error())
+
+ // ... but we're okay if we specify a root path
+ b.Target = &config.Target{
+ Workspace: &config.Workspace{
+ RootPath: "some-root-path",
+ },
+ }
+ diags = validateProductionMode(context.Background(), b, false)
+ require.NoError(t, diags.Error())
+ }
+
// Make sure that we have test coverage for all resource types
func TestAllResourcesMocked(t *testing.T) {
b := mockBundle(config.Development)
7 changes: 5 additions & 2 deletions bundle/config/mutator/select_target.go
@@ -15,6 +15,7 @@ type selectTarget struct {
}

// SelectTarget merges the specified target into the root configuration.
+ // After merging, it removes the 'Targets' section from the configuration.
func SelectTarget(name string) bundle.Mutator {
return &selectTarget{
name: name,
@@ -31,7 +32,7 @@ func (m *selectTarget) Apply(_ context.Context, b *bundle.Bundle) diag.Diagnosti
}

// Get specified target
- _, ok := b.Config.Targets[m.name]
+ target, ok := b.Config.Targets[m.name]
if !ok {
return diag.Errorf("%s: no such target. Available targets: %s", m.name, strings.Join(maps.Keys(b.Config.Targets), ", "))
}
@@ -43,13 +44,15 @@ func (m *selectTarget) Apply(_ context.Context, b *bundle.Bundle) diag.Diagnosti
}

// Store specified target in configuration for reference.
+ b.Target = target
b.Config.Bundle.Target = m.name

// We do this for backward compatibility.
// TODO: remove when Environments section is not supported anymore.
b.Config.Bundle.Environment = b.Config.Bundle.Target

- // Clear targets after loading.
+ // Cleanup the original targets and environments sections since they
+ // show up in the JSON output of the 'summary' and 'validate' commands.
b.Config.Targets = nil
b.Config.Environments = nil

Contributor

Instead of keeping these around here, could you break out a field on the bundle struct where we can keep a snapshot of the selected target? Then you can interrogate it and there's no risk of other mutators changing it after selection. The targets in the configuration have no significance beyond this point.

E.g.

// Target stores a snapshot of the target configuration when it was selected.
Target *config.Target

Contributor Author

@lennartkats-db lennartkats-db Sep 9, 2024

Isn't it cleaner to just remove the side effect from select_target? Instead of just recording which target is selected, the mutator removes fields, which is a bit hard to discover and not really motivated in the code. It's a bit surprising if you want to build a new mutator that consumes this value. Based on your comments, the motivation is to clean things up in order for consumption by summary/validate; shouldn't we just make that a separate step?

Contributor

What I see as a risk is that keeping them around means another location that new mutators can go and look at, even though everything under targets no longer has any effect. Variable interpolation won't run either, so values under it shouldn't be used.

I see how this is most convenient though. @andrewnester What do you think?

Contributor Author

It's just that I ran into this problem twice over the past year. You need a step-through debugger to find where this property is silently deleted. And the code that deletes it includes no rationale and is just meant to select the default target.

Contributor Author

@lennartkats-db lennartkats-db Oct 12, 2024

Alright, I don't want to leave this PR open just because we can't make a decision here. I added 6ea5306 which merges the cleanup behavior back into SelectTarget and adds a few comments about the behavior for maintainers.
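
The resulting behavior, a snapshot recorded on the bundle followed by cleanup of the targets section, can be sketched as follows. The types are simplified, hypothetical stand-ins for the ones in the diff, and the merge of the target into the root configuration is omitted.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the config types in the diff.
type Workspace struct{ RootPath string }
type Target struct{ Workspace *Workspace }
type Root struct{ Targets map[string]*Target }

type Bundle struct {
	Config Root
	Target *Target // the selected target, recorded at selection time
}

// selectTarget records the chosen target on the bundle and then clears the
// targets map, so later steps have only one place to look.
func selectTarget(b *Bundle, name string) error {
	target, ok := b.Config.Targets[name]
	if !ok {
		return fmt.Errorf("%s: no such target", name)
	}
	b.Target = target
	b.Config.Targets = nil
	return nil
}

// isExplicitRootSet checks every pointer on the way to RootPath, so it is
// safe to call before a target is selected or when no workspace is set.
func isExplicitRootSet(b *Bundle) bool {
	return b.Target != nil && b.Target.Workspace != nil && b.Target.Workspace.RootPath != ""
}

func main() {
	b := &Bundle{Config: Root{Targets: map[string]*Target{
		"prod": {Workspace: &Workspace{RootPath: "/Workspace/Shared/.bundle/prod"}},
		"dev":  {},
	}}}
	if err := selectTarget(b, "prod"); err != nil {
		panic(err)
	}
	fmt.Println("targets cleared:", b.Config.Targets == nil)
	fmt.Println("explicit root_path:", isExplicitRootSet(b))
}
```

Note that in this sketch the snapshot holds the same pointer that was stored in the map rather than a deep copy; it stays reachable only because the bundle keeps its own reference after the map is cleared.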

4 changes: 2 additions & 2 deletions bundle/config/root.go
@@ -47,8 +47,8 @@ type Root struct {

// Targets can be used to differentiate settings and resources between
// bundle deployment targets (e.g. development, staging, production).
- // If not specified, the code below initializes this field with a
- // single default-initialized target called "default".
+ // Note that this field is set to 'nil' by the SelectTarget mutator;
+ // use bundle.Bundle.Target to access the selected target configuration.
Targets map[string]*Target `json:"targets,omitempty"`

// DEPRECATED. Left for backward compatibility with Targets