Efficient, scikit-learn-compliant implementations of honest decision trees and forests.
Honest trees and forests use sample splitting to unbias the estimates made in leaves: one subsample determines the tree structure, and a separate subsample is used to estimate the values in each leaf. This leads to asymptotic convergence guarantees and empirically better calibration (e.g. more accurate posterior probabilities, see our paper here).
An example can be seen here, comparing an honest forest to the traditional random forest and two other ad-hoc calibration approaches.
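The sample-splitting idea can be illustrated with a minimal sketch. This is not the package's implementation: the helper names `fit_honest_stump` and `predict_proba` are hypothetical, and a single-threshold stump on one feature stands in for a full tree. One half of the data chooses the split, and only the other half is used to estimate the class-1 probability in each leaf.

```python
import random


def fit_honest_stump(x, y, seed=0):
    """Fit a one-split 'honest' stump on 1-D data with binary labels.

    Hypothetical illustration only, not the honest-forests API.
    """
    rng = random.Random(seed)
    idx = list(range(len(x)))
    rng.shuffle(idx)
    half = len(idx) // 2
    structure, estimation = idx[:half], idx[half:]

    # Step 1: choose the threshold using ONLY the structure subsample.
    def misclassified(t):
        left = [y[i] for i in structure if x[i] <= t]
        right = [y[i] for i in structure if x[i] > t]
        # Errors made when each side predicts its own majority class.
        return sum(min(sum(s), len(s) - sum(s)) for s in (left, right))

    threshold = min(sorted({x[i] for i in structure}), key=misclassified)

    # Step 2: estimate leaf posteriors using ONLY the estimation subsample.
    def posterior(labels):
        # Fall back to 0.5 if no estimation sample reached this leaf.
        return sum(labels) / len(labels) if labels else 0.5

    p_left = posterior([y[i] for i in estimation if x[i] <= threshold])
    p_right = posterior([y[i] for i in estimation if x[i] > threshold])
    return threshold, p_left, p_right


def predict_proba(stump, value):
    """Return the estimated P(y = 1) for a single input value."""
    threshold, p_left, p_right = stump
    return p_left if value <= threshold else p_right


# Toy data: the label is 1 when x > 0.5, with no label noise.
data_rng = random.Random(1)
x = [data_rng.random() for _ in range(200)]
y = [int(v > 0.5) for v in x]
stump = fit_honest_stump(x, y)
```

Because the leaf estimates never see the samples that picked the threshold, they are not biased toward the split that happened to look best on those samples. A traditional tree would reuse the same samples for both steps.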
git clone https://github.com/neurodata/honest-forests.git
cd honest-forests
pip install -e .
The preferred workflow for contributing to honest-forests is to fork the main repository on GitHub, clone it, and develop on a branch. Steps:
- Fork the project repository by clicking the ‘Fork’ button near the top right of the page. This creates a copy of the code under your GitHub user account. For more details on how to fork a repository, see this guide.
- Clone your fork of the honest-forests repo from your GitHub account to your local disk:
    git clone git@github.com:YourGithubAccount/honest-forests.git
    cd honest-forests
- Create a feature branch to hold your development changes:
    git checkout -b my-feature
  Always use a feature branch. Pull requests made directly to either dev or main will be rejected until you create a feature branch based on dev.
- Develop the feature on your feature branch. Add changed files using git add, then commit them with git commit:
    git add modified_files
    git commit
  After making all local changes, push them to your fork:
    git push -u origin my-feature
We recommend that your contribution comply with the following rules before you submit a pull request:
- Follow the coding-guidelines.
- Give your pull request a helpful title that summarizes what your contribution does.
- Link your pull request to the issue (see: closing keywords for an easy way of linking your issue).
- All public methods should have informative docstrings with sample usage presented as doctests when appropriate.
- At least one paragraph of narrative documentation with links to references in the literature (with PDF links when possible) and the example.
- If your feature is complex enough that a doctest is insufficient to fully showcase the utility, consider creating a Jupyter notebook to illustrate use instead.
- All functions and classes must have unit tests. These should include, at the very least, type checking and ensuring correct computation/outputs.
- All code should be automatically formatted by black. You can run this formatter by calling:
    pip install black
    black path/to/your_module.py
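As an illustration of the unit-test rule in the list above, here is a minimal sketch of a test that covers both type checking and correct output. The function `split_sizes` and its parameter `structure_fraction` are hypothetical and are not part of the package. The sketch uses the standard-library `unittest` module, which is an assumption; the project's own test runner may differ.

```python
import unittest


def split_sizes(n_samples, structure_fraction=0.5):
    """Hypothetical helper: sizes of the structure and estimation subsamples."""
    if not isinstance(n_samples, int) or isinstance(n_samples, bool):
        raise TypeError("n_samples must be an int")
    if not 0.0 < structure_fraction < 1.0:
        raise ValueError("structure_fraction must lie strictly between 0 and 1")
    n_structure = int(n_samples * structure_fraction)
    return n_structure, n_samples - n_structure


class TestSplitSizes(unittest.TestCase):
    def test_rejects_wrong_type(self):
        # Type checking: a float sample count is not accepted.
        with self.assertRaises(TypeError):
            split_sizes(10.0)

    def test_rejects_invalid_fraction(self):
        with self.assertRaises(ValueError):
            split_sizes(10, structure_fraction=1.5)

    def test_correct_output(self):
        # The two subsample sizes must add up to n_samples.
        self.assertEqual(split_sizes(10), (5, 5))
        self.assertEqual(split_sizes(9), (4, 5))
```

Saved in a test module, a class like this can be run with `python -m unittest`.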
Uniformly formatted code makes it easier to share code ownership. The honest-forests package closely follows the official Python guidelines in PEP8, which detail how code should be formatted and indented. Please read and follow them.
Properly formatted docstrings are required for documentation generation by Sphinx. The honest-forests package closely follows the numpydoc guidelines. Please read and follow them, and refer to the example.py provided by numpydoc.
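For orientation, a docstring in the numpydoc layout with a doctest might look like the sketch below. The function `leaf_posterior` is a hypothetical example and is not taken from the package. The section names (Parameters, Returns, Examples) follow the numpydoc convention.

```python
def leaf_posterior(labels):
    """Estimate the positive-class probability in a leaf.

    Parameters
    ----------
    labels : list of int
        Binary labels (0 or 1) of the samples that fall in the leaf.

    Returns
    -------
    posterior : float
        Fraction of labels equal to 1.

    Examples
    --------
    >>> leaf_posterior([1, 0, 1, 1])
    0.75
    """
    return sum(labels) / len(labels)
```

The Examples section doubles as a doctest, so it can be checked with `python -m doctest path/to/your_module.py`.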