docs updated
chrisclark committed Aug 8, 2024
1 parent 841549d commit e68c55f
Showing 3 changed files with 16 additions and 14 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -26,3 +26,5 @@ user_dbs/*
tmp2
chinook.sqlite
model_data.json
tst1
tst1-journal
26 changes: 13 additions & 13 deletions docs/features.rst
@@ -176,16 +176,6 @@ File Uploads

Upload CSV or JSON files, or SQLite databases to immediately create connections for querying.

The base name of the file and the ID of the upload are used as the database name, to prevent collisions when multiple
users upload a file with the same name. The base name of the file is also used as the table name (e.g. uploading
customers.csv results in a database file named customers_1.db, with a table named 'customers').

Of interest, you can also append uploaded files to previously uploaded data sources. For example, if you had a
'customers.csv' file and an 'orders.csv' file, you could upload customers.csv and create a new data source. You can
then go back and upload orders.csv with the 'Append' drop-down set to your newly-created customers database, and you
will have a resulting SQLite database connection with both tables available to be queried together. If you were to
upload a new 'orders.csv' and append it to customers, the table 'orders' would be *fully replaced* with the new file.

**How it works**

1. Your file is uploaded to the web server. For CSV files, the first row is assumed to be a header.
@@ -194,14 +184,24 @@ upload a new 'orders.csv' and append it to customers, the table 'orders' would b
4. A custom parser runs type detection on each column for richer type information.
5. The dataframe is coerced to these more accurate types.
6. The dataframe is written to a SQLite file on the server, which is then uploaded to S3.
7. The SQLite database is added as a new connection to SQL Explorer and is available for querying, just like any
7. The SQLite database file will be named <filename>_<userid>.db to prevent conflicts if different users upload files
with the same name.
8. The SQLite database is added as a new connection to SQL Explorer and is available for querying just like any
other data source.
8. If the SQLite file is not available locally, it will be pulled on-demand from S3 when needed.
9. Local SQLite files are periodically cleaned up by a recurring task after (by default) 7 days of inactivity.
9. If the SQLite file is not available locally, it will be pulled on-demand from S3 to the app server when needed.
10. Local SQLite files are periodically cleaned up by a recurring task after (by default) 7 days of inactivity.

Note that if the upload is a SQLite database, steps 2-5 are skipped and the database is simply uploaded to S3 and made
available for querying.
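
The pipeline above is roughly equivalent to the following sketch, assuming a pandas-based implementation. The function
name ``csv_upload_to_sqlite`` and the paths are illustrative rather than SQL Explorer's actual code, and the type
detection of steps 4-5 is simplified to pandas' built-in inference:

.. code-block:: python

    import sqlite3
    from pathlib import Path

    import pandas as pd

    def csv_upload_to_sqlite(csv_path: str, user_id: int, db_dir: str = ".") -> str:
        # Steps 1-2: read the upload; the first row is treated as the header.
        df = pd.read_csv(csv_path)

        # Steps 4-5 (simplified): re-infer richer column types and coerce to them.
        df = df.convert_dtypes()

        # Step 7's naming scheme: <filename>_<userid>.db avoids collisions
        # between different users uploading files with the same base name.
        base = Path(csv_path).stem
        db_path = str(Path(db_dir) / f"{base}_{user_id}.db")

        # Step 6: write the dataframe to a local SQLite file, one table per upload.
        with sqlite3.connect(db_path) as conn:
            df.to_sql(base, conn, if_exists="replace", index=False)

        # The real implementation then uploads db_path to S3 and registers it
        # as a new SQL Explorer connection (steps 6 and 8).
        return db_path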

**Adding tables to uploads**

You can also append uploaded files to previously uploaded data sources. For example, if you had a
'customers.csv' file and an 'orders.csv' file, you could upload customers.csv and create a new data source. You can
then go back and upload orders.csv with the 'Append' drop-down set to your newly-created customers database, and you
will have a resulting SQLite database connection with both tables available to be queried together. If you were to
upload a new 'orders.csv' and append it to customers, the table 'orders' would be *fully replaced* with the new file.
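
A minimal sketch of that append behaviour, in the same illustrative style as above (not SQL Explorer's actual API);
``if_exists="replace"`` is what produces the full-table replacement described:

.. code-block:: python

    import sqlite3
    from pathlib import Path

    import pandas as pd

    def append_csv_to_existing_db(csv_path: str, db_path: str) -> None:
        # Add the new upload as a table named after the file, e.g. orders.csv
        # becomes an 'orders' table inside the existing customers database.
        df = pd.read_csv(csv_path)
        table = Path(csv_path).stem
        with sqlite3.connect(db_path) as conn:
            # If a table with this name already exists it is dropped and
            # recreated, i.e. fully replaced by the contents of the new file.
            df.to_sql(table, conn, if_exists="replace", index=False)

    # Both tables are then queryable over the same connection, e.g.:
    # append_csv_to_existing_db("orders.csv", "customers_1.db")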

**File formats**

- Supports well-formed .csv and .json files. Also supports .json files where each line of the file is a separate json
2 changes: 1 addition & 1 deletion test_project/settings.py
@@ -105,7 +105,7 @@
# Tasks disabled by default, but if you have celery installed
# make sure the broker URL is set correctly
EXPLORER_TASKS_ENABLED = False
CELERY_BROKER_URL = os.environ.get('CELERY_BROKER_URL')
CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL")

EXPLORER_S3_BUCKET = os.environ.get("EXPLORER_S3_BUCKET")
EXPLORER_S3_ACCESS_KEY = os.environ.get("EXPLORER_S3_ACCESS_KEY")
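
For local testing, these settings could be driven by exporting the matching environment variables before Django loads
settings.py; the values below are placeholders, not defaults shipped with the project:

import os

# Placeholder values for local development only (hypothetical, not project defaults);
# a real deployment supplies these via the environment or a secrets manager.
os.environ.setdefault("CELERY_BROKER_URL", "redis://localhost:6379/0")
os.environ.setdefault("EXPLORER_S3_BUCKET", "my-explorer-uploads")
os.environ.setdefault("EXPLORER_S3_ACCESS_KEY", "example-access-key")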
