Task - Create Clear Workflow to Populate Dev Environment with Example Data #23

Open
adambechtold opened this issue Feb 11, 2024 · 5 comments
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@adambechtold
Owner

Background

We're opening this project up to more collaborators.

Frictions

  • It's not clear how to add example data
  • Adding data requires collaborators to have their own last.fm and Spotify developer apps

Goals

  • 🎯 Goal - It is very easy to spin up Taste Explorer with example data
  • 🎯 Goal - Maintain Security of Production Data
  • 🎯 Goal - Keep costs low
@adambechtold adambechtold added the documentation Improvements or additions to documentation label Feb 11, 2024
@adambechtold adambechtold self-assigned this Feb 11, 2024
@adambechtold
Owner Author

Approach - Write Instructions for How to Add Users from Scratch

  1. Create database
  2. Create API keys for last.fm and Spotify
  3. Add users using admin api
  4. Run cron jobs to backfill listening history

▼ Con - Requires collaborators to have their own last.fm account and spotify developer account
▼ Con - Complicated and time-consuming

@adambechtold
Owner Author

Approach - Host a Small, Read-only Database

Create a very small database on AWS with example data and make read-only access to it public.

▲ Pro - Very easy for collaborators to get started
▼ Con - Performance could degrade as more people use it
▼ Con - Cost - This isn't free
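For reference, a minimal sketch of what "read-only" could look like, assuming the hosted database is PostgreSQL (this thread does not say which engine is used). The function name, role name, and database name are hypothetical; the helper only builds the SQL statements so they can be reviewed before anyone runs them.

```python
def read_only_role_sql(role: str, database: str, schema: str = "public") -> list[str]:
    """Build PostgreSQL statements for a SELECT-only role.

    Assumption: the example database is PostgreSQL. All names passed in
    are placeholders, not names used by this project.
    """
    # Reject anything that is not a plain identifier, since the names are
    # interpolated directly into SQL text.
    for name in (role, database, schema):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    return [
        f"CREATE ROLE {role} LOGIN;",
        f"GRANT CONNECT ON DATABASE {database} TO {role};",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
        # Also cover tables created in this schema later on.
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT SELECT ON TABLES TO {role};",
    ]


if __name__ == "__main__":
    for statement in read_only_role_sql("example_reader", "taste_explorer_example"):
        print(statement)
```

The role would still need a password (or another auth method) set separately before its credentials are shared.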

Variant - Create separate database on existing prod infrastructure

▲ Pro - Free
▼ Con - Use by collaborators could degrade prod's performance

@adambechtold
Owner Author

Approach - Provide Database Dump(s)

Provide .dump files with example data.

▲ Pro - Collaborators can choose their favorite database platform/approach
▲ Pro - Easy to provide various dataset sizes
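As a rough sketch of what restoring a dump could look like on the collaborator side: this assumes the .dump files are PostgreSQL custom-format archives (as produced by `pg_dump -Fc`), which this thread does not confirm. The function name, file name, and connection URL are hypothetical.

```python
import shlex


def restore_command(dump_path: str, database_url: str, jobs: int = 1) -> list[str]:
    """Build a pg_restore invocation for a custom-format dump.

    Assumption: the dump is a PostgreSQL custom-format archive.
    """
    cmd = [
        "pg_restore",
        "--no-owner",        # do not try to recreate the original object owners
        "--no-privileges",   # skip GRANT/REVOKE statements from the source database
        "--clean",           # drop existing objects before recreating them
        "--if-exists",       # avoid errors when those objects do not exist yet
        f"--dbname={database_url}",
    ]
    if jobs > 1:
        cmd.append(f"--jobs={jobs}")  # parallel restore
    cmd.append(dump_path)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(restore_command("example-small.dump", "postgresql://localhost/taste_explorer")))
```

Returning the argument list rather than running it keeps the sketch safe to try; it could be passed to `subprocess.run(cmd, check=True)` once the assumptions are confirmed.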

Variant - Host on GitHub

▲ Pro - Version control
▼ Con - Some of the dumps could be very large, making the repo unnecessarily large

Variant - Host on S3

▲ Pro - Dumps can be very large
▼ Con - Version control is harder

@adambechtold
Owner Author

Approach - Host Docker Container Instance

Collaborators could simply pull down this repo, spin it up, and start going.

▲ Pro - Very easy to get started
▼ Con - More work to create than the .dump option

Variant - Have the container download a database .dump during start up

Host a .dump file on S3. Host a container on AWS container registry. Give the container an entrypoint script that checks if the database is already populated and, if not, downloads the dump and populates the database.

(See chat)
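The entrypoint logic described in this variant could be sketched roughly as below. This is not the actual script: the function and parameter names are hypothetical, and the three steps (check, download, restore) are passed in as callables so the control flow can be exercised without a database or S3 access.

```python
from typing import Callable


def populate_if_empty(
    count_rows: Callable[[], int],
    download_dump: Callable[[], str],
    restore_dump: Callable[[str], None],
) -> bool:
    """Restore example data only when the database looks empty.

    Returns True if a restore was performed, False if it was skipped.
    All three callables are placeholders for project-specific code.
    """
    if count_rows() > 0:
        # Already populated (e.g. the container restarted): do nothing.
        return False
    dump_path = download_dump()  # e.g. fetch the .dump file from S3
    restore_dump(dump_path)      # e.g. load the dump into the database
    return True


if __name__ == "__main__":
    # Dry run with stand-in callables.
    restored = populate_if_empty(
        count_rows=lambda: 0,
        download_dump=lambda: "/tmp/example.dump",
        restore_dump=lambda path: print(f"would restore {path}"),
    )
    print("restored" if restored else "skipped")
```

Making the check the first step means the container can be restarted safely: a populated database is left untouched.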

@adambechtold adambechtold changed the title from "Task - Create Clear Workflow to _Populate Dev Environment with Example Data_" to "Task - Create Clear Workflow to Populate Dev Environment with Example Data" Feb 11, 2024
@adambechtold
Owner Author

adambechtold commented Feb 11, 2024

Update - .dump files are available on S3

I was able to create dump files and put them on S3, but I'm running into some issues creating a Docker container that can download them and pre-populate the database.

I'll keep working on it, but here are the .dump files in the meantime:

cc:
@tusharwebd
