This repo is no longer maintained. In Altair 5, the recommended way to work with large datasets is via VegaFusion. The documentation contains a section on how to use the VegaFusion renderer in Altair; if you need additional performance, you can use the VegaFusion widget renderer instead.
This is a data transformer plugin for Altair that transparently serves data for Altair charts via a background WSGI server.
Note that charts will only render as long as your Python session is active.
The data server is a good option when you'll be generating multiple charts as part of an exploration of data.
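To illustrate the idea of serving data from a background WSGI server, here is a minimal sketch that uses only the Python standard library. It is not this plugin's implementation: the names serve_in_background and QuietHandler, the /data.json path, and the sample records are all assumptions made for illustration.

```python
import json
import threading
import urllib.request
from wsgiref.simple_server import WSGIRequestHandler, make_server


class QuietHandler(WSGIRequestHandler):
    """Request handler that suppresses per-request log output."""

    def log_message(self, format, *args):
        pass


def serve_in_background(records):
    """Serve `records` as JSON from a daemon thread; return (server, url).

    Hypothetical helper for illustration, not part of altair_data_server.
    """
    body = json.dumps(records).encode("utf-8")

    def app(environ, start_response):
        # Every request receives the same JSON payload in this sketch.
        start_response("200 OK", [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
            # Lets a page served from another origin fetch the data.
            ("Access-Control-Allow-Origin", "*"),
        ])
        return [body]

    # Port 0 asks the operating system for any free port.
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/data.json"
    return server, url


server, url = serve_in_background([{"x": 1, "y": 2}, {"x": 2, "y": 5}])

# A chart specification would reference `url` instead of embedding the rows.
with urllib.request.urlopen(url) as response:
    fetched = json.load(response)

server.shutdown()
server.server_close()
```

Because the server runs in a daemon thread of the Python process, the URL stops responding once that process exits, which is why charts render only while the session is active.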
First install the package and its dependencies:
$ pip install altair_data_server
Next import altair and enable the data server:
import altair as alt
alt.data_transformers.enable('data_server')
Now when you create an Altair chart, the data will be served in the background rather than embedded in the chart specification.
Once you are finished with exploration and want to generate charts that will have their data fully embedded in the notebook, you can restore the default data transformer:
alt.data_transformers.enable('default')
and carry on from there.
Remotely hosted notebooks (such as those on JupyterHub or Binder) usually do not allow the end user to access arbitrary ports. To support users in that setup, make sure jupyter-server-proxy is installed on the Jupyter server, and use the proxied data server transformer:
alt.data_transformers.enable('data_server_proxied')
The urlpath parameter allows you to override the prefix of the proxy URL. By default, it's set to '..', which is currently the only way to make it work for arbitrary users when running inside the classic notebook on Binder. If you intend your notebooks to be run on Binder but inside JupyterLab, change it to '.' instead, which will work provided JupyterLab is in the default workspace.
# for notebooks intended for JupyterLab on Binder
alt.data_transformers.enable('data_server_proxied', urlpath='.')
On a custom JupyterHub instance, a much more robust option is to take advantage of JupyterHub's /user-redirect feature (which is not available on Binder):
# this will work for any JupyterHub user, whether they're using the classic
# notebook, JupyterLab in the default workspace, or JupyterLab in a named
# workspace
alt.data_transformers.enable('data_server_proxied', urlpath='/user-redirect')
If your JupyterHub lives somewhere other than at your server's root, add the appropriate prefix to urlpath.
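To see how the different urlpath values change the address a browser requests, the sketch below joins a prefix with the /proxy/<port>/ route that jupyter-server-proxy exposes. The helper proxied_url, the port 12345, the resource name data.json, and the /jupyter prefix are hypothetical, and the plugin's actual URL construction is assumed here rather than confirmed.

```python
def proxied_url(urlpath, port, resource):
    """Join a urlpath prefix with jupyter-server-proxy's /proxy/<port>/ route.

    Hypothetical helper; the plugin's real logic may differ.
    """
    prefix = urlpath.rstrip("/")
    return f"{prefix}/proxy/{port}/{resource.lstrip('/')}"


# Default: relative to the parent path (classic notebook on Binder).
classic_binder = proxied_url("..", 12345, "data.json")

# JupyterLab on Binder in the default workspace.
lab_binder = proxied_url(".", 12345, "data.json")

# JupyterHub's /user-redirect, resolved by the hub for the current user.
hub = proxied_url("/user-redirect", 12345, "data.json")

# JupyterHub mounted under a hypothetical /jupyter prefix.
prefixed_hub = proxied_url("/jupyter/user-redirect", 12345, "data.json")
```

The relative prefixes depend on the page the notebook is served from, whereas the /user-redirect form is an absolute path, which is consistent with it being the more robust choice on JupyterHub.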
You can see this in action, as well as read some of the motivation for this plugin, in the example notebook: AltairDataServer.ipynb. Click the Binder or Colab links above to try it out in your browser.
Because jupyter-server-proxy requires at least Python 3.5, the methods described in Remote Systems do not work for older versions of Python.