- Python
- PDM
- Python Fire
$ pdm install
# Where `...` appears, see the examples below for the full command prefix.
$ ... metabase-serialization-cli.py ORIGINAL_EXPORT_ALL_COLLECTIONS.tgz change_list.yml [--output_path ./OUTPUT_TARGET_PATH]
ORIGINAL_EXPORT_ALL_COLLECTIONS.tgz
- MUST EXPORT ALL COLLECTIONS.
- Failure to do so may cause naming collisions or overwrite your data when you import the results.
change_list.yml
- Follows the `change_list.yml` format described below.
OUTPUT_TARGET_PATH
- Optional; defaults to the current working directory.
- Directory must exist.
- A new sub-directory with a timestamp suffix is created on each run.
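The output-path rules above can be sketched as follows. This is an assumption about the behavior, not the CLI's actual code, and the `serialization-run-` directory prefix is hypothetical:

```python
# Sketch of the OUTPUT_TARGET_PATH rules: the directory must already exist,
# and each run writes into a fresh timestamp-suffixed subdirectory.
from datetime import datetime
from pathlib import Path

def make_run_dir(output_path: str = ".") -> Path:
    target = Path(output_path)
    if not target.is_dir():
        raise FileNotFoundError(f"output path does not exist: {target}")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    run_dir = target / f"serialization-run-{stamp}"  # hypothetical prefix
    run_dir.mkdir()  # one new subdirectory per run
    return run_dir
```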
# Using pdm run
$ pdm run metabase-serialization-cli.py export-12345.tgz change_list.yml --output_path ./output
# Create and activate a virtual environment, then run (bash/csh/zsh)
$ pdm venv create serialization-environment
$ eval $(pdm venv activate serialization-environment)
(serialization-environment) $ ./src/metabase-serialization-cli.py export-12345.tgz change_list.yml --output_path ./output
# change_list.yml
# TODO: Add JSON Schema spec for this YAML
# TODO: handle copy model
# TODO: handle copy dashboard
# TODO: handle copy collection
# TODO: handle copy dependent content
# TODO: handle archive collection contents before update
# TODO: handle replace card
# TODO: handle replace model
# TODO: handle replace dashboard
# TODO: handle replace collection
# TODO: handle update database
# TODO: handle update schema
# TODO: handle update table
# TODO: handle update field
# TODO: handle update card
# TODO: handle update model
# TODO: handle update dashboard
changes:
# use a `create` clause if the target collection does not exist
# you can leave `to` clauses blank if you want them to be generated
# naming collisions are checked and any collision is a fatal error; nothing is overwritten
- create: # EXAMPLE: copying a collection
source:
entity_id: collection_entity_id_1234
changes:
collection_id: collection_entity_id_5678
database_id: 'Sample Database 2'
dataset_query:
database: 'Sample Database 2'
- create: # EXAMPLE: copying a card
source:
entity_id: card_entity_id_12345
changes:
collection_id: collection_entity_id_5678
database_id: 'Sample Database 2'
dataset_query:
database: 'Sample Database 2'
# use a `replace` clause if the target collection already exists
# unverified: existing contents in the collection are left unaltered unless they are included in the export and change list
# a `to` clause must be specified for every object
# naming collisions detected in the `to` clause produce a warning, but existing objects ARE overwritten
- replace:
target:
entity_id: collection_entity_id_5678
source:
entity_id: collection_entity_id_1234
changes:
authority_level: official
name: 'zendesk issues - original'
slug: zendesk_issues_original
options:
archive_existing_content: True # archives any content in target that is not included in source
# TODO: handle copy card
- update: # Example: archiving content
target:
entity_id: question_entity_id_12345
changes:
archived: True
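A minimal validation pass over a parsed change list might look like the sketch below. It assumes the YAML has already been loaded into a dict (e.g. with PyYAML); the clause names and required keys are taken from the example above, and everything beyond that is an assumption:

```python
# Validate the top-level shape of a parsed change_list.yml: each entry in
# `changes` is a single-key mapping whose key is the clause name.
VALID_CLAUSES = {"create", "replace", "update"}

def validate_change_list(doc: dict) -> list[str]:
    errors = []
    for i, entry in enumerate(doc.get("changes", [])):
        clause = next(iter(entry))
        if clause not in VALID_CLAUSES:
            errors.append(f"changes[{i}]: unknown clause {clause!r}")
            continue
        body = entry[clause] or {}
        # per the examples: create needs a source; replace/update need a target
        if clause == "create" and "source" not in body:
            errors.append(f"changes[{i}]: create requires a source")
        if clause in ("replace", "update") and "target" not in body:
            errors.append(f"changes[{i}]: {clause} requires a target")
    return errors
```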
- Export all collections via API to tgz file
- Process changes
- Untar/gzip
- Verify input files exist
- tar/gzip export
- change_list.yml
- Target output folder
- Index all entity_ids and names
- Find relevant files based on change_list.yml
- Examine the change list and trace dependencies
- Print found dependencies to STDOUT
- Trace dependencies
- Start at highest level
- Dashboard
- Question/Model
- Can have multiple levels of nesting
- Model
- Field
- Table
- Schema
- Database
- Exceptions if not found
- Update files
- Update file sections
- Find relevant sections
- Exceptions if not found
- Update section
- Save new file/overwrite
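The dependency trace described above (start at the highest level, raise an exception if a dependency is missing) could be sketched like this. The field names `dashcards`, `card_id`, and `dataset_query.database` are assumptions about the export format, not confirmed:

```python
# Walk dependencies top-down (dashboard -> card -> database) over an index
# mapping entity_id -> parsed object; a missing dependency raises, matching
# the "Exceptions if not found" step above.
def trace(entity_id, index, seen=None):
    seen = seen if seen is not None else []
    if entity_id in seen:
        return seen
    if entity_id not in index:
        raise KeyError(f"dependency not found in export: {entity_id}")
    seen.append(entity_id)
    obj = index[entity_id]
    for dashcard in obj.get("dashcards", []):  # dashboard -> its cards
        trace(dashcard["card_id"], index, seen)
    query = obj.get("dataset_query", {})
    if "database" in query:                    # card -> its database
        trace(query["database"], index, seen)
    return seen
```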
- Export serialization files via the API (curl) from a localhost test environment
- Python script to load the `change_list_path.yml` file with the list of changes to be made
- Python script to apply `change_list_path.yml` changes to the given `EXPORT_PATH`
- Python script create output file (check for overwrite with --force)
- Python script create output folders (check for overwrite with --force)
- Add a flag to `change_list.yml` to clear the destination's contents (archive/move to trash) before applying changes
- Deletes/Archives/Moves to Trash
- Objects cannot be deleted
- Objects can be archived (in Metabase v49 and lower)
- Objects can be moved to trash (in Metabase v50 and higher)
- If a source object is archived/moved to trash, it will be archived/moved to trash in the target (or created in the archive/trash if it does not exist)
- If a source object is created on the target and does not exist in the source, it will be unaffected (assuming there are no identity collisions)
- Archiving an object just changes its `archived` flag to `true`, but does not otherwise alter the object
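Since archiving only flips one flag, applying an `update` clause can be sketched as a nested dict merge over the parsed object. This is an assumption about how changes are applied, not the tool's actual implementation:

```python
# Merge a `changes` mapping over a parsed export object; nested dicts are
# merged recursively, everything else is replaced. Archiving is just the
# special case of setting `archived: true`.
def apply_changes(obj: dict, changes: dict) -> dict:
    merged = dict(obj)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_changes(merged[key], value)
        else:
            merged[key] = value
    return merged

card = {"name": "zendesk issues", "archived": False}
archived_card = apply_changes(card, {"archived": True})
```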
- Entity IDs must be unique
- Must handle nested dependencies
- Segments and Metrics
- Referred to by `entity_id` instead of name, because they live in the Databases folder
- Filter IDs
- Unclear how these are handled, but they do not seem to collide even when they share the same identifier
- Filenames are ignored; objects are identified by entity_ids and names
- Watch out for "duplicated entity_ids" errors
- It is unclear when or why they occur
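A quick scan for the "duplicated entity_ids" case might look like the sketch below, assuming each export object has already been parsed into a dict:

```python
# Count entity_ids across all parsed export objects and report any id that
# appears more than once, since entity_ids must be unique.
from collections import Counter

def find_duplicate_entity_ids(objects) -> list[str]:
    counts = Counter(
        obj["entity_id"] for obj in objects if "entity_id" in obj
    )
    return sorted(eid for eid, n in counts.items() if n > 1)
```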