Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate Superset persistence from sqlLite to Postgres #36

Open
himeshr opened this issue Jul 22, 2024 · 3 comments
Open

Migrate Superset persistence from sqlLite to Postgres #36

himeshr opened this issue Jul 22, 2024 · 3 comments
Assignees

Comments

@himeshr
Copy link
Contributor

himeshr commented Jul 22, 2024

Migrate Superset persistence from sqlLite to Postgres DB on Prod RDS.

Options to migrate:

  1. Setup empty psql db for version 4.0.1 and Copy over data in CSV format, seems most feasible. Refer https://medium.com/@aaronbannin/migrating-superset-to-postgres-63d2c96c5102
  2. Convert sqlLite file to psql. Is tedious and its not guaranteed that the output superset DB would actually be usable by superset app correctly at all times. Refer https://stackoverflow.com/questions/4581727/how-to-convert-sqlite-sql-dump-file-to-postgresql
  3. Using import export Reference
@himeshr himeshr changed the title Migrate Superset persistence from sqlLite to Postgres DB on Prod RDS Migrate Superset persistence from sqlLite to Postgres Jul 22, 2024
@vedfordev
Copy link
Contributor

As discussed with @himeshr Followed 3rd approach

Currently working on assets export

  1. go to api file
cd /home/superset/superset/lib/python3.8/site-packages/superset/importexport
  1. change api.py file. comment attachment_filename variable and download_name=filename, in send_file function
  2. call /api/v1/assets/export/ and get data

Outcome and further process

  1. we are getting (databases, datasets, charts, dashboards, saved queries) in export
  2. need to find way to get users, roles & row level security

@vedfordev
Copy link
Contributor

vedfordev commented Aug 16, 2024

Steps to do :

  1. create db with superset 2.0.1
  2. migrate data with pgloader
  3. create docker for superset 4.0.1

Issue identify in pgloader

  1. giving error to convert binary(14) to uuid
    solution:
  • go to sql lite db and find the tables which have binary(14) uuid
    -- to get column which is used in uuid

  SELECT tbl_name
  FROM sqlite_master
  WHERE type='table'
    AND EXISTS (
      SELECT 1
      FROM pragma_table_info(tbl_name)
      WHERE name = 'uuid'
  );
  • created below config file
LOAD DATABASE
   FROM sqlite://superset.db
   INTO postgresql://superset_user:superset_password@localhost:5435/superset2

WITH data only,
   reset sequences,
   truncate,
   concurrency = 1,
   batch rows = 500,
   prefetch rows = 1000



CAST column sql_metrics.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column clusters.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column columns.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column metrics.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column dashboard_email_schedules.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column slice_email_schedules.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column saved_query.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column report_execution_log.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column table_columns.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column dashboards.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column datasources.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column dbs.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column slices.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column tables.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column embedded_dashboards.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column sl_datasets.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column sl_tables.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column sl_columns.uuid to uuid drop typemod using byte-vector-to-hexstring,
   column key_value.uuid to uuid drop typemod using byte-vector-to-hexstring

SET work_mem TO '128MB',
  maintenance_work_mem TO '512MB';


  1. create network for superset and postgres and run postgres
   docker network create superset_network
   docker run -d \
    -p 5435:5432 \
    --name postgres_db \
    --network superset_network \
    -e POSTGRES_USER=superset_user \
    -e POSTGRES_PASSWORD=superset_password \
    -e POSTGRES_DB=superset_db \
    postgres

  1. create db superset2

  2. create container of superset 2.0.1 and with below command

  docker run -d -p 8088:8088  --network superset_network  -e "SUPERSET_SECRET_KEY=abc" --name superset_2.0.1 apache/superset:2.0.1
  1. go to container and add path to postgres db and restart the container
docker exec -it  -u root superset_2.0.1 bash
echo "SQLALCHEMY_DATABASE_URI = 'postgresql+psycopg2://superset_user:superset_password@postgres_db:5432/superset2'" > superset_config.py
SUPERSET_CONFIG_PATH=/app/superset_config.py
  1. go to superset container and run below command
superset db upgrade
superset init

6.run command for pgloader and check the error

pgloader pgloaderdataonly.conf > error.txt

@vedfordev
Copy link
Contributor

move card to qa ready. implementation team will do sanity test. @himeshr @pkundu

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: QA Ready
Development

No branches or pull requests

2 participants