This repo aims to provide NumPy implementations of the algorithms in the book Statistical Learning Methods (统计学习方法). There are already many similar implementations on GitHub; this one has the following advantages:
- Quick to run: everything runs directly in Google Colab, with no need to download or clone the repo locally.
- Efficient: vectorized built-in NumPy functions replace plain Python loops and similar operations, which gives better performance.
- Real datasets: uses practical datasets from libsvm instead of synthetic data.
- No manual dataset download: datasets are loaded online, directly from the libsvm website, with no extra manual step.
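
As a rough illustration of the vectorization point above, the sketch below computes pairwise squared Euclidean distances (the kind of step a k-nearest-neighbor classifier needs) twice: once with plain Python loops and once with NumPy broadcasting. It is not code from this repo; the function names `pairwise_sq_dists_loop` and `pairwise_sq_dists_vec` and the random inputs are hypothetical, chosen only for this example.

```python
import numpy as np


def pairwise_sq_dists_loop(A, B):
    """Squared Euclidean distances between rows of A and B, using Python loops."""
    out = np.zeros((len(A), len(B)))
    for i in range(len(A)):
        for j in range(len(B)):
            diff = A[i] - B[j]
            out[i, j] = np.dot(diff, diff)
    return out


def pairwise_sq_dists_vec(A, B):
    """Same result without loops, via ||a||^2 - 2 a.b + ||b||^2 and broadcasting."""
    a2 = np.sum(A * A, axis=1)[:, None]  # shape (n, 1)
    b2 = np.sum(B * B, axis=1)[None, :]  # shape (1, m)
    return a2 - 2.0 * (A @ B.T) + b2     # broadcasts to shape (n, m)


# Hypothetical inputs: 5 and 4 points in 3 dimensions.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
B = rng.normal(size=(4, 3))

# Both versions should agree up to floating-point rounding.
assert np.allclose(pairwise_sq_dists_loop(A, B), pairwise_sq_dists_vec(A, B))
```

The vectorized version pushes the inner work into a single matrix product, which is where the speedup on larger inputs would typically come from.
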
This repo is not:
- An explanation of the algorithms: it focuses on the code, not on the theory behind each algorithm.
- A general-purpose algorithm library: the implementations are only meant to help you understand the algorithms quickly. They are not packaged for reuse and do not handle the various situations that arise in real-world use.