Getting Started
The easiest way to use Google Cloud Datalab is on Google Cloud Platform. Head over to the Datalab site and deploy an instance into a Cloud project. From there you can work with data in other cloud services such as BigQuery and deploy your data pipelines for execution in the cloud.
Datalab is built and packaged as a Docker container, so running it locally requires Docker to be configured and running. If you're on Mac or Windows, the easiest way to get Docker is via the Docker Toolbox. Download and install it, then open the Kitematic app that it installed; Kitematic creates and starts a 'default' VM and starts the Docker server.
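Before building, it can help to confirm that Docker is both installed and reachable. The following is a minimal sketch, not part of the Datalab scripts; the function names `have_cmd` and `check_docker` are hypothetical, and the check relies only on `docker info` failing when the Docker server cannot be contacted.

```shell
# have_cmd NAME: succeeds if NAME resolves to a command in the current shell.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# check_docker: prints one status line and returns non-zero if Docker is unusable.
check_docker() {
  if ! have_cmd docker; then
    echo "docker: not found on PATH"
    return 1
  fi
  # `docker info` talks to the Docker server, so it fails when the server is down.
  if ! docker info >/dev/null 2>&1; then
    echo "docker: installed, but the server is not reachable"
    return 1
  fi
  echo "docker: ok"
}
```

Running `check_docker` in the terminal you plan to build from shows whether that shell can reach the Docker server started by Kitematic.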
Add a port mapping so you can use localhost for Datalab:
VBoxManage modifyvm default --natpf1 "datalab,tcp,,8081,,8081"
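To confirm that the rule was recorded, the VM's settings can be inspected. The sketch below assumes that `VBoxManage showvminfo default --machinereadable` lists NAT rules as lines of the form `Forwarding(0)="datalab,tcp,,8081,,8081"`; the helper name `has_forward` is hypothetical. If `modifyvm` fails because the VM is running, stopping the VM first should help.

```shell
# has_forward NAME: reads `showvminfo --machinereadable` output on stdin and
# succeeds if a NAT port-forwarding rule called NAME is present.
# Assumption: rules appear as lines like Forwarding(0)="NAME,tcp,,8081,,8081".
has_forward() {
  grep -q "^Forwarding([0-9]*)=\"$1,"
}

# Usage (requires VirtualBox and the 'default' VM):
#   VBoxManage showvminfo default --machinereadable | has_forward datalab \
#     && echo "port mapping present"
```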
Clone the Datalab repo, build it and run it:
git clone https://github.com/GoogleCloudPlatform/datalab.git
cd datalab
rm -rf build/
cd sources/
./build.sh
cd ../containers/datalab
./build.sh
./run.sh
Then open your browser to http://localhost:8081.
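The container may take a little while to start serving pages. As a rough sketch, the helper below polls the URL until it answers; it assumes `curl` is installed, and the function name `wait_for_url` and its defaults (30 attempts, 2 seconds apart) are illustrative choices rather than anything defined by Datalab.

```shell
# wait_for_url URL [TRIES] [DELAY]: polls URL until it returns a successful
# HTTP response, giving up after TRIES attempts spaced DELAY seconds apart.
wait_for_url() {
  url=$1
  tries=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl return non-zero on HTTP errors (status 400 and above).
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_url http://localhost:8081 && echo "Datalab is up"
```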
Note that to use any Google Cloud functionality you will need to set the project ID. You can do this by calling:
set_project_id('myproject')
within a cell in your notebook, or by setting an environment variable before running Datalab (in which case the project is used as the default for all notebooks):
PROJECT_ID='myproject' ./run.sh
Replace 'myproject' with an appropriate ID for your use.
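A mistyped project ID tends to surface only later, when a notebook first calls a Cloud service. The sketch below checks the ID's shape before launching; it assumes the standard Google Cloud project ID rules (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen) and does not verify that the project exists. The function name `valid_project_id` is hypothetical.

```shell
# valid_project_id ID: succeeds if ID looks like a standard project ID.
# Assumption: 6-30 chars, [a-z0-9-], starts with a letter, no trailing hyphen.
valid_project_id() {
  printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

# Usage, from containers/datalab:
#   id='myproject'
#   valid_project_id "$id" && PROJECT_ID="$id" ./run.sh
```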