Haskell Data Science Kit

The Haskell Data Science Kit (HDSK) project is an attempt to create a well-documented, well-tested, and performant data science library implemented in the Haskell language.

Sources suggest that, despite Haskell's large potential for performance gains over current de facto methods [1], its adoption in the data science community lags for a variety of reasons. The greatest of these seems to be the dearth [2] of easy-to-use data science libraries: searching for "data science" on GitHub yields 14 Haskell-language repositories and 5,807 Python-language repositories [3]. This project seeks to remedy that issue by presenting a unified (though modular) library of data science utilities that support the entire life cycle of a data science project.

Disclaimer: At the time of writing, I am still a beginner in Haskell, and this project is as much about the goal stated above as it is about learning and practicing Haskell itself and the software development ecosystem around it. So, I make no guarantees that any given function has the optimal or most idiomatic implementation (and where it doesn't, pull requests are gladly welcomed!).

Installation

To use HDSK within your stack project, you must add this repository to the extra-deps list in stack.yaml. NOTE: this step will change once HDSK is released on Hackage.

extra-deps:
- git: git@github.com:wbadart/hdsk.git
  commit: a52bed4216f607628e71594256dafd550ffe2d3e

The commit hash listed above is the most recent commit at the time of this writing. Be sure that the value you use is recent enough to contain the features you need.

The cabal file generated by stack has been checked in, so if you use cabal without stack, you can install the library from a fresh clone of the repository.
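
For cabal users who would rather not clone the repository by hand, one possible alternative is a `source-repository-package` stanza in `cabal.project`, which asks cabal to fetch a pinned commit itself. This is a minimal sketch, not a method described by this project: the HTTPS clone URL is an assumption, and the commit hash is the one shown in the stack example above.

```
-- cabal.project (sketch; the location URL below is an assumption)
packages: .

source-repository-package
  type: git
  location: https://github.com/wbadart/hdsk.git
  -- Pin to a specific commit, as in the stack.yaml example.
  tag: a52bed4216f607628e71594256dafd550ffe2d3e
```

You would still need to list the package in the `build-depends` field of your own `.cabal` file. The package name `hdsk` is assumed here from the repository name.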

Usage

Please see willbadart.com/hdsk for library documentation. Further project information, such as planned features, is available on the wiki.

License

You'll notice that a key theme in this document has been promoting adoption. As such, I'm developing, and will eventually release, this project under the BSD-3-Clause [11] license because of its general permissiveness. This is also one of the more popular licenses in the Haskell community [12].

Please see LICENSE for the full text.
