
## 6.3 Decision trees

Slides

## Notes

Decision trees are powerful algorithms, capable of fitting complex datasets. A decision tree makes predictions based on a series of if/else conditions, splitting each node into two or more sub-nodes.
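
Conceptually, a fitted tree is just a set of nested if/else rules. A minimal sketch of what such learned logic might look like, using hypothetical features (`income`, `debt`), thresholds, and labels that are not from the lesson's dataset:

```python
# Hypothetical rules a small credit-risk tree might learn; the features,
# thresholds, and labels below are made up for illustration.
def predict_default(client):
    # each if/else corresponds to one split (internal node) of the tree
    if client["income"] <= 30_000:
        if client["debt"] > 10_000:
            return "default"
        return "ok"
    return "ok"

print(predict_default({"income": 25_000, "debt": 12_000}))  # -> "default"
```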

Despite their versatility, decision trees are also prone to overfitting. One common reason is their depth: a deep tree tends to memorize the patterns in the training data but struggles to perform well on unseen data (the validation or test set).

To overcome overfitting, we can reduce the complexity of the model by limiting the tree's depth.

A decision tree with a depth of one is called a decision stump; it has only one split from the root.
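
The sketch below (assuming synthetic data from `make_classification` in place of the lesson's dataset) illustrates this: comparing train and validation AUC for several values of `max_depth` typically shows the unrestricted tree overfitting while a shallower tree generalizes better.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

# synthetic data standing in for the lesson's dataset
X, y = make_classification(n_samples=2000, n_features=10, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=1)

for depth in [None, 1, 3, 6]:  # None = unrestricted, 1 = a decision stump
    dt = DecisionTreeClassifier(max_depth=depth, random_state=1)
    dt.fit(X_train, y_train)
    auc_train = roc_auc_score(y_train, dt.predict_proba(X_train)[:, 1])
    auc_val = roc_auc_score(y_val, dt.predict_proba(X_val)[:, 1])
    print(f"max_depth={depth}: train AUC={auc_train:.3f}, val AUC={auc_val:.3f}")
```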

Classes, functions, and methods:

* `DecisionTreeClassifier`: classification model from the `sklearn.tree` module.
* `max_depth`: hyperparameter that controls the maximum depth of the decision tree.
* `export_text`: function from `sklearn.tree` that displays a text report of the rules learned by a decision tree (see the sketch below).
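
A minimal sketch of `export_text`, using the iris dataset as a stand-in for the lesson's data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
dt = DecisionTreeClassifier(max_depth=2, random_state=1)
dt.fit(iris.data, iris.target)

# print the learned splits as human-readable if/else rules
print(export_text(dt, feature_names=list(iris.feature_names)))
```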

Note: we already covered `DictVectorizer` in session 3 and `roc_auc_score` in session 4.

Add notes from the video (PRs are welcome)

⚠️ The notes are written by the community.
If you see an error here, please create a PR with a fix.

## Navigation