
update workshop sections #1-2 #17

Open · wants to merge 15 commits into base: master
9 changes: 4 additions & 5 deletions 1-new-workspace/1-setup-compute.md
Original file line number Diff line number Diff line change
@@ -6,20 +6,19 @@ To run through this workshop, you will need an Azure subscription and an Azure M
## Creating an AzureML compute cluster
We will perform a number of actions that require a compute target to execute on. We will start by creating a cluster of CPU VMs.

1. Navigate to 'Compute' in the Manage/Compute section and click 'Add' ![](add_compute.png)
1. Navigate to 'Compute' > 'Training Clusters' in the 'Manage' section and click 'New'.

1. Call the cluster 'cpu-cluster' and choose the type 'Machine Learning Compute'.
1. Call the cluster 'cpu-cluster'.
- For machine size choose 'Standard_D2_v2' (that is an inexpensive general purpose VM size at about $0.14/hour).
- Set the 'minimum number of nodes' to 0 and the 'maximum number of nodes' to 10. That way the cluster will automatically scale up to 10 nodes as jobs require them.
- Set the 'Idle seconds before scale down' to 7200. That means that nodes will be kept around for 3 hours before they are spun down. That way, during our workshop, jobs won't have to wait for spin-up. Make sure that number is lower if you are using a more expensive VM size.
- Set the 'Idle seconds before scale down' to 10800. That means that nodes will be kept around for 3 hours before they are spun down. That way, during our workshop, jobs won't have to wait for spin-up. Make sure that number is lower if you are using a more expensive VM size.
![](create_cluster.png)
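The same cluster can also be provisioned from code. The following is a minimal sketch using the AzureML Python SDK (v1), not part of the workshop itself; it assumes a workspace `config.json` is present in the working directory:

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

# Assumes a workspace config.json is available locally.
ws = Workspace.from_config()

try:
    # Reuse the cluster if it already exists in the workspace.
    cpu_cluster = ComputeTarget(workspace=ws, name="cpu-cluster")
except ComputeTargetException:
    # Same settings as in the portal: inexpensive general-purpose VMs,
    # autoscaling 0-10 nodes, 3-hour idle time before scale-down.
    config = AmlCompute.provisioning_configuration(
        vm_size="Standard_D2_v2",
        min_nodes=0,
        max_nodes=10,
        idle_seconds_before_scaledown=10800,
    )
    cpu_cluster = ComputeTarget.create(ws, "cpu-cluster", config)
    cpu_cluster.wait_for_completion(show_output=True)
```

The try/except pattern makes the snippet safe to re-run: it attaches to an existing 'cpu-cluster' rather than failing or duplicating it.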

## Creating an AzureML Notebook VM

Next, we will create a Notebook VM. The Notebook VM is an interactive workstation in the cloud: it acts as a Jupyter server, but also hosts an instance of RStudio Server and can run TensorBoard, Bokeh, Shiny, or other apps used during a data scientist's development work.

1. Navigate to 'Notebook VMs' and click on 'New':
![](new_notebook_vm.png)
1. Navigate to 'Notebook VMs' tab in Compute and click on 'New'.

1. Choose a sufficiently unique name, keep the default VM type (STANDARD_DS3V2 -- a fairly inexpensive machine type costing about $0.27/hour) and click 'Create':
![](create_notebook_vm.png)
15 changes: 10 additions & 5 deletions 1-new-workspace/2-dataset.md
@@ -19,19 +19,24 @@ Datasets enable:
1. Navigate to the left pane of your workspace. Select Datasets under the Assets section.
![](datasets.png)

1. Click on 'Create dataset' and choose 'From local files'.
1. Click on 'Create dataset' and choose 'From local files'. Name the dataset '**IBM-Employee-Attrition**' and then click 'Next'. Make sure to leave the dataset type as Tabular.
![](from_local_files.png)

1. Click 'Browse', choose the file you had downloaded, name the dataset **'IBM-Employee-Attrition'**, and then click 'Done' to complete the creation of the new dataset. Make sure to leave the Type set to Tabular.
1. Click 'Browse', choose the file you had downloaded, and click 'Next' to create the dataset in the workspace's default Blob storage.
![](upload.png)

## Generating a Profile
1. Click 'Next' through the following "Settings and preview" and "Schema" sections to verify that everything looks correct.

1. Finally, in the "Confirm Details" section, select "Profile this dataset after creation" and specify the 'cpu-cluster' that you previously created as the compute to use for profiling.
![](create_dataset.png)
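For reference, the same upload-and-register flow can be scripted with the AzureML Python SDK (v1). This is a sketch, not part of the workshop; the CSV filename below is an assumption, so substitute the file you actually downloaded:

```python
from azureml.core import Workspace, Dataset

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Upload the downloaded CSV to the workspace's default Blob storage.
# Hypothetical filename -- replace with the file you downloaded.
datastore.upload_files(
    files=["./WA_Fn-UseC_-HR-Employee-Attrition.csv"],
    target_path="attrition/",
    overwrite=True,
)

# Create a Tabular dataset from the uploaded file and register it
# under the same name used in the portal steps above.
attrition = Dataset.Tabular.from_delimited_files(
    path=(datastore, "attrition/WA_Fn-UseC_-HR-Employee-Attrition.csv")
)
attrition = attrition.register(workspace=ws, name="IBM-Employee-Attrition")
```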

## Explore the dataset

1. Now, click on the newly created dataset and click 'Explore'. Here you can see the fields of the Tabular dataset.
![](dataset_explore.png)

1. To get more details (in particular for larger datasets), click 'Generate profile', select the cluster you created, and then click 'Generate' to generate profile information for this dataset. This will take a little while, since the cluster needs to spin up a node, so we will move on to the next task and come back to this later.
![](generate_profile.png)
1. To view the profile of the dataset we generated in the previous step, click the "Profile" tab. If you want to regenerate a profile (or you created the dataset without selecting the profile option), you can click "Generate profile" and select a cluster to generate profile information for the dataset.
![](view_profile.png)
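Registered datasets can also be explored from code. A minimal sketch with the AzureML Python SDK (v1), assuming the dataset registered above and a local workspace `config.json`:

```python
from azureml.core import Workspace, Dataset

ws = Workspace.from_config()

# Look up the registered dataset by the name used during creation.
attrition = Dataset.get_by_name(ws, name="IBM-Employee-Attrition")

# Materialize the Tabular dataset as a pandas DataFrame to inspect it.
df = attrition.to_pandas_dataframe()
print(df.shape)
print(df.dtypes.head())
```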


For more information on creating and using Datasets, see the how-to: https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-create-register-datasets
Binary file modified 1-new-workspace/create_cluster.png
Binary file added 1-new-workspace/create_dataset.png
Binary file modified 1-new-workspace/dataset_explore.png
Binary file modified 1-new-workspace/from_local_files.png
Binary file modified 1-new-workspace/upload.png
Binary file added 1-new-workspace/view_profile.png
@@ -429,7 +429,7 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard"
"from interpret_community.widget import ExplanationDashboard"
]
},
{
@@ -438,7 +438,7 @@
"metadata": {},
"outputs": [],
"source": [
"ExplanationDashboard(global_explanation, model, x_test)"
"ExplanationDashboard(global_explanation, model, datasetX=x_test)"
]
},
{
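The hunks above move the dashboard import from the deprecated `azureml.contrib.interpret` namespace to the open-source interpret-community package, and pass the test data via a keyword argument. A sketch of the updated usage, assuming interpret-community is installed (`pip install interpret-community`) and that `model`, `x_test`, and a computed `global_explanation` already exist in the notebook session:

```python
# Updated import: ExplanationDashboard now lives in interpret-community.
from interpret_community.widget import ExplanationDashboard

# The evaluation data is now passed via the datasetX keyword argument.
ExplanationDashboard(global_explanation, model, datasetX=x_test)
```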
8 changes: 4 additions & 4 deletions 2-interpretability/2-explain-model-on-amlcompute.ipynb
@@ -215,7 +215,7 @@
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"# Choose a name for your CPU cluster\n",
"cpu_cluster_name = \"cpu\"\n",
"cpu_cluster_name = \"cpu-cluster\"\n",
"\n",
"cpu_cluster = ws.compute_targets[cpu_cluster_name]"
]
@@ -271,7 +271,7 @@
" compute_target=cpu_cluster,\n",
" entry_script='train_explain.py',\n",
" pip_packages=pip_packages,\n",
" conda_packages=['scikit-learn'],\n",
" conda_packages=['scikit-learn<=0.21.3'], #need to pin scikit-learn version until next release of azureml-interpret to support latest scikit-learn release\n",
" inputs=[ws.datasets['IBM-Employee-Attrition'].as_named_input('attrition')])\n",
"\n",
"run = experiment.submit(estimator)\n",
@@ -431,7 +431,7 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard"
"from interpret_community.widget import ExplanationDashboard"
]
},
{
@@ -440,7 +440,7 @@
"metadata": {},
"outputs": [],
"source": [
"ExplanationDashboard(global_explanation, original_model, x_test)"
"ExplanationDashboard(global_explanation, original_model, datasetX=x_test)"
]
},
{
5 changes: 1 addition & 4 deletions 2-interpretability/README.md
@@ -28,10 +28,7 @@ You are going to work on the Notebook VM you created [earlier](../1-new-workspac

git clone https://github.com/danielsc/azureml-workshop-2019

2. After the clone completes, in the file explorer on the left, navigate to the folder `2-interpretability` and open the notebook `1-simple-feature-transformations-explain-local.ipynb`:
![](notebook.png)

Now follow the instructions in the notebook.
2. After the clone completes, in the file explorer on the left, navigate to the folder `2-interpretability` and run through the three notebooks starting with `0-Setup.ipynb`.



Binary file modified 2-interpretability/notebook.png