Superset deployment #3715
Conversation
… database connection yet
These changes have not been applied! I couldn't figure out how to mount the bucket as a volume in Cloud Run using tf, so I used the GUI. Now, when I run `terraform plan`, tf wants to change a bunch of attributes for the Cloud Run instance. This commit also changes the default superset registration user.
Driveby comments since I was looking at a bunch of tickets. Glad you got that role stuff sorted out - ping me whenever you want a more in-depth review!
superset/Dockerfile (Outdated)
@@ -0,0 +1,14 @@
# hadolint ignore=DL3006
FROM apache/superset
@jdangerx should we pin the base image version? If so, can we set up Dependabot to keep the base image and the `requirements.txt` file up to date?
Yes, we should pin the base image version since a superset upgrade might require a DB migration as well: https://superset.apache.org/docs/installation/upgrading-superset/
I think we can set up Dependabot to make PRs to keep the Docker image up to date by adding a `docker` block to the `updates` in `dependabot.yml`, and then labeling the Dockerfile source: https://github.com/dependabot-fixtures/docker-with-source/blob/main/Dockerfile
And I think we can add a `pip` block for `requirements.txt`.
I think both of those can be in a follow-up PR, though, and we should set some sort of reminder later to check to see if the dependabot config actually worked...
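A minimal `dependabot.yml` along those lines might look like this (the `/superset` directories are an assumption based on where the Dockerfile lives in this PR; the schedule is illustrative):

```yaml
# Sketch only - keeps the pinned Superset base image and requirements.txt
# dependencies up to date via Dependabot PRs.
version: 2
updates:
  - package-ecosystem: "docker"
    directory: "/superset" # where the Dockerfile lives
    schedule:
      interval: "weekly"
  - package-ecosystem: "pip"
    directory: "/superset" # assumed location of requirements.txt
    schedule:
      interval: "weekly"
```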
If this is the first time running superset locally or you recently ran `docker compose down`, you'll need to run the commands in `setup.sh`.

## Making changes to the production deployment
TODO: instructions on how to connect to Cloud SQL
I need to figure out how to use the Cloud SQL Auth Proxy to make changes to the production database. Also, how can we protect against people making changes to the production database when they are just experimenting with the local deployment?
How did you make changes to the Cloud SQL database for the initial deploy?
I think we can use the Docker version of the Cloud SQL Auth Proxy in a `mutate-prod` docker compose file. That would just be the same as our dev docker-compose, but with the Cloud SQL proxy standing in for the postgres service. Then executing `superset db-upgrade` or whatever, within the `pudl-superset` docker-compose service, would point at prod Cloud SQL.
We'd need to make a new service account to give that Cloud SQL Auth Proxy. To stop people from accidentally changing the prod DB, we could restrict the ability to create a key for that SA to only a subset of Catalyst. So it'd require a fair amount of effort/oversight to be able to make a change to prod DB.
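A sketch of what that override could look like (image tag, key path, and the instance connection name are all placeholders, not values from this PR):

```yaml
# mutate-prod.yml - the Cloud SQL Auth Proxy stands in for the local postgres
# service, so the superset container talks to prod without config changes.
services:
  postgres:
    image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.11.0
    command:
      - "--address=0.0.0.0"
      - "--port=5432"
      - "--credentials-file=/secrets/superset-sa-key.json" # key for the new SA
      - "my-project:us-central1:my-instance" # placeholder connection name
    volumes:
      - ./superset-sa-key.json:/secrets/superset-sa-key.json:ro
```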
## Making changes to the production deployment
TODO: instructions on how to connect to Cloud SQL

## Deploy to Cloud Run
What should the deployment flow be for superset? I can think of a few reasons to trigger a new deployment:

- a package in `requirements.txt` is updated
- the base image is updated
- we make a change to `superset_config.py`
- there is new data - I think we'll probably want to redeploy superset then because changes to tables might break shared dashboards

Also, should superset point at `nightly` or `stable`?
I think redeploying Superset for nightly builds is a good way to test that our deployment infrastructure still works, and has the benefit of catching all these other changes too - so long as the cloud build doesn't cost a ton I think that's the easiest way forward.
I also think that eventually we probably want both `nightly` and `stable` to be on Superset - we can default to using `stable`, but have the option to connect to `nightly` in the database selector if people want.
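If we do tie redeploys to the nightly build, the final step of the build could be as simple as this (purely illustrative - service name, image path, and region are placeholders):

```sh
# Hypothetical redeploy step at the end of a nightly build.
gcloud run deploy pudl-superset \
  --image=us-docker.pkg.dev/PROJECT_ID/superset/pudl-superset:nightly \
  --region=us-central1
```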
- "8080:8088" | ||
volumes: | ||
# - {path to your pudl.duckdb file}:/app/pudl.duckdb | ||
- ./roles.json:/app/roles.json |
Do we want to try to track the role definitions in git so our local deployments can use the roles we're using in production?
Yeah, that makes a ton of sense to me :)
def FLASK_APP_MUTATOR(app: Flask) -> None:  # noqa: N802
    """Superset function that allows you to configure the Flask app.

    Args:
        app: The Flask app instance
    """
    app.config.update(
        PREFERRED_URL_SCHEME="https",
    )
Unfortunately, this didn't resolve the HTTP redirect issue. I think the issue might be related to the Auth0 OAuth development tokens we're using.
Hmm, nothing in that list looks suspicious to me, I'd love to see the authentication logs and see if there are specific errors showing up, or what the callback URL is being parsed as, etc.
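One other avenue worth checking (an assumption, not something verified in this PR): when Flask runs behind a proxy like Cloud Run, `PREFERRED_URL_SCHEME` alone often isn't enough, and the usual workaround is werkzeug's `ProxyFix` so the app trusts the `X-Forwarded-Proto` header:

```python
from flask import Flask
from werkzeug.middleware.proxy_fix import ProxyFix


def FLASK_APP_MUTATOR(app: Flask) -> None:  # noqa: N802
    """Trust the proxy's forwarded headers so generated URLs use https."""
    app.wsgi_app = ProxyFix(app.wsgi_app, x_proto=1, x_host=1)
```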
@@ -102,3 +102,297 @@ resource "google_storage_bucket_iam_binding" "binding" {
    "user:[email protected]",
  ]
}
I'd like to learn more about how to structure terraform projects. It'd be helpful to isolate the superset setup in a separate file or something so we can easily reuse it for other projects like the pudl usage metrics repo.
It's pretty straightforward - terraform mashes all the *.tf files into one thing before parsing, so you can split however you want. Check out this article: https://build5nines.com/terraform-split-main-tf-into-seperate-files/
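For example (file names illustrative), the split could be as simple as:

```
terraform/
├── main.tf      # provider config and shared resources
└── superset.tf  # everything Superset-specific
```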
Thanks for wrangling all this! There's definitely bits to change but also it seems like it works.
The main blocking thing is just pinning the docker image - once that's done I'm happy to merge to `main`.
It's probably also a good idea to get people to run their own Auth0 apps instead of using the production one for development.
Some improvements/follow-ups that could be separate PRs:

- triggering redeploys on nightly build
- define a new SA specifically for Superset and give that all the accesses it needs - probably refactoring to use `for_each` along the way
- get a docker-compose setup for pointing at production
  }
}

resource "google_sql_database_instance" "postgres_pvp_instance_name" {
What does `pvp` mean here?
I grabbed this terraform code from the google docs and forgot to rename the terraform resource. Is it too late to rename it?
If you rename it, it will delete the resource and create a new one. So it sort of depends on how hard it is to re-initialize the state.
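(One caveat worth flagging, not discussed above: `terraform state mv` can rename a resource in state without destroying it - the new address below is illustrative:)

```sh
# Rename the resource in state so a config rename doesn't destroy/recreate it.
terraform state mv \
  google_sql_database_instance.postgres_pvp_instance_name \
  google_sql_database_instance.superset
```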
  deletion_protection = true
}

resource "google_secret_manager_secret" "superset_database_username" {
Hooray! I love that you're using the secret manager to manage secrets, it makes my shriveled little security heart sing.
It's a little heavyweight to define all this boilerplate just to say "there are these secrets, here are their names, and this service account has access to them." We can use the `for_each` syntax of Terraform to drastically reduce that.
Note that if you use `for_each` it will change all the resource names in Terraform, so it will want to recreate everything -_- which means you'll have to manually re-populate the secrets.
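A minimal sketch of that `for_each` pattern (secret names and the service-account reference are illustrative, and the `replication` block syntax depends on the provider version):

```hcl
locals {
  superset_secrets = toset([
    "superset-database-username",
    "superset-database-password",
  ])
}

resource "google_secret_manager_secret" "superset" {
  for_each  = local.superset_secrets
  secret_id = each.key

  replication {
    auto {}
  }
}

# Grant the (hypothetical) dedicated Superset SA access to every secret above.
resource "google_secret_manager_secret_iam_member" "superset_accessor" {
  for_each  = google_secret_manager_secret.superset
  secret_id = each.value.id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${google_service_account.superset.email}"
}
```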
Ah good to know! I'll tackle it when I create a new service account.
resource "google_secret_manager_secret_iam_member" "superset_database_username_compute_iam" { | ||
secret_id = google_secret_manager_secret.superset_database_username.id | ||
role = "roles/secretmanager.secretAccessor" | ||
member = "serviceAccount:[email protected]" |
I think it makes sense to make a new service account that's only for Superset instead of using the default Compute Engine SA. Then it's easier to restrict access to the prod database, which makes it harder to accidentally blow it up.
Ah yes it sure does. I'll create an issue for this.
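The dedicated service account itself is only a couple of lines (the `account_id` and `display_name` below are illustrative):

```hcl
resource "google_service_account" "superset" {
  account_id   = "superset"
  display_name = "Superset Cloud Run service account"
}
```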
Co-authored-by: Dazhong Xia <[email protected]>
…ker compose env vars
Thanks for all the great feedback! I pinned the docker base image version, updated the auth0 env var instructions, and set docker compose env var defaults. I also created some draft issues that I'll flesh out tomorrow.
You're hitting a bunch of this error in CI (error output omitted), which is related to dagster / Python version incompatibilities: dagster-io/dagster#22985
Overview
This PR contains superset configuration and cloud deployment changes for our data exploration tool.
Testing
How did you make sure this worked? How can a reviewer verify this?
To-do list