Chapter 6 production polish #86

Merged: 49 commits, Jan 18, 2023
49 commits
feb3a44
starting work on ch5+6; categorical type change; remove commented out…
trevorcampbell Dec 31, 2022
a507994
value counts, class name remap, replace in ch5
trevorcampbell Dec 31, 2022
fd9a882
remove warnings
trevorcampbell Dec 31, 2022
8b20e7f
polished ch5+6 up to euclidean dist
trevorcampbell Dec 31, 2022
bd28be9
minor bugfix
trevorcampbell Jan 1, 2023
9499a73
minor bugfix
trevorcampbell Jan 1, 2023
294103a
fixed worksheets link at end of chp
trevorcampbell Jan 1, 2023
1ad6164
fix minor section heading wording in Ch1
trevorcampbell Jan 1, 2023
ee90b8e
added nsmallest + note; better chaining for dist comps; removed comme…
trevorcampbell Jan 1, 2023
ece61a8
initial fit and predict polished; model spec -> model object
trevorcampbell Jan 1, 2023
e874666
polishing preprocessing
trevorcampbell Jan 2, 2023
c5c8769
balancing polished
trevorcampbell Jan 2, 2023
a9deb2e
pipelines
trevorcampbell Jan 2, 2023
d5b8af3
learning objs
trevorcampbell Jan 2, 2023
c1c8151
mute warnings in ch5
trevorcampbell Jan 2, 2023
863ca91
warn mute code; fixed links at end
trevorcampbell Jan 2, 2023
b2df742
restore cls2 to main branch
trevorcampbell Jan 2, 2023
938a1f6
Merge branch 'main' into ch5-6
trevorcampbell Jan 5, 2023
e7b157e
Merge branch 'ch5-6' into ch6
trevorcampbell Jan 8, 2023
c8e3a40
remove caption hack; minor fix to learning objs
trevorcampbell Jan 8, 2023
384ac14
Remove caption hack
trevorcampbell Jan 8, 2023
d80c8c3
initial improved seed explanation
trevorcampbell Jan 8, 2023
14d8825
random seed section polish done
trevorcampbell Jan 8, 2023
acda50d
polished ch6 up to tuning
trevorcampbell Jan 8, 2023
c135649
initial cross val example done
trevorcampbell Jan 8, 2023
0e3b733
in python -> in scikit
trevorcampbell Jan 8, 2023
170e267
working on cross-val
trevorcampbell Jan 9, 2023
3d72b33
polished ch6 up to predictor selection
trevorcampbell Jan 9, 2023
e05b5a8
commented out predictor selection
trevorcampbell Jan 9, 2023
ee81330
done ch6 except final under/overfit plot
trevorcampbell Jan 9, 2023
30001ee
Merge branch 'main' into ch6
trevorcampbell Jan 11, 2023
0639bc1
Merge branch 'main' into ch6
trevorcampbell Jan 11, 2023
46c487a
warnings filter in ch6; remove seed hack cell
trevorcampbell Jan 11, 2023
e191e98
remove reference to random state in train/test split
trevorcampbell Jan 11, 2023
6ca5c56
minor typesetting .method() vs method
trevorcampbell Jan 11, 2023
70dc2b5
put setup.md back in to fix broken links
trevorcampbell Jan 11, 2023
26155a0
Update source/classification2.md
trevorcampbell Jan 18, 2023
71d52b5
Update source/classification2.md
trevorcampbell Jan 18, 2023
36e725f
Update source/classification2.md
trevorcampbell Jan 18, 2023
0b7ecc4
Update source/classification2.md
trevorcampbell Jan 18, 2023
96493e4
Update source/classification2.md
trevorcampbell Jan 18, 2023
1bc8c36
Update source/classification2.md
trevorcampbell Jan 18, 2023
d533ac2
Update source/classification2.md
trevorcampbell Jan 18, 2023
28ceac8
Update source/classification2.md
trevorcampbell Jan 18, 2023
2ef3707
Update source/classification2.md
trevorcampbell Jan 18, 2023
bae52da
values -> to_numpy in randomness section
trevorcampbell Jan 18, 2023
66cf370
Update source/classification2.md
trevorcampbell Jan 18, 2023
4a88ec3
Update source/classification2.md
trevorcampbell Jan 18, 2023
a1f9454
remove code for area plot at the end of ch6
trevorcampbell Jan 18, 2023
2 changes: 1 addition & 1 deletion source/_toc.yml
@@ -9,7 +9,7 @@ parts:
- file: acknowledgements-python.md
- file: authors.md
- file: editors.md
-#- file: setup.md
+- file: setup.md
- caption: Chapters
numbered: 3
chapters:
5 changes: 1 addition & 4 deletions source/classification1.md
@@ -942,7 +942,6 @@ we will discuss how to choose $K$ in the next chapter.
> which weigh each neighbor's vote differently, can be found on
> [the `scikit-learn` website](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html?highlight=kneighborsclassifier#sklearn.neighbors.KNeighborsClassifier).


```{code-cell} ipython3
knn = KNeighborsClassifier(n_neighbors=5)
knn
@@ -1048,7 +1047,6 @@ unscaled_cancer['Class'] = unscaled_cancer['Class'].replace({
'B' : 'Benign'
}).astype('category')
unscaled_cancer
-unscaled_cancer
```

Looking at the unscaled and uncentered data above, you can see that the differences
@@ -1146,7 +1144,7 @@ is to *drop* the remaining columns. This default behavior works well with the re
in the {ref}`08:puttingittogetherworkflow` section), but for visualizing the result of preprocessing it can be useful to keep the other columns
in our original data frame, such as the `Class` variable here.
To keep other columns, we need to set the `remainder` argument to `'passthrough'` in the `make_column_transformer` function.
-Furthermore, you can see that the new column names---{glue:}`scaled-cancer-column-0`
+Furthermore, you can see that the new column names---{glue:}`scaled-cancer-column-0`
and {glue:}`scaled-cancer-column-1`---include the name
of the preprocessing step separated by underscores. This default behavior is useful in `sklearn` because we sometimes want to apply
multiple different preprocessing steps to the same columns; but again, for visualization it can be useful to preserve
@@ -1742,7 +1740,6 @@ unscaled_cancer['Class'] = unscaled_cancer['Class'].replace({
}).astype('category')
unscaled_cancer


# create the KNN model
knn = KNeighborsClassifier(n_neighbors=7)
