data_science.md

File metadata and controls

67 lines (46 loc) · 1.54 KB

Python is the de facto standard for data science.

What do we need?

- Tabular data framework
- Charting support
- Linear algebra
- Statistics
- ML
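To make the first and fourth requirements concrete, here is a minimal sketch in plain Java of what a tabular data framework has to provide at the very least: column-oriented storage, a summary statistic, and a group-by. The class `TinyTable` and its methods are hypothetical names made up for this note; they are not the API of Tablesaw, Morpheus, or any other library listed below.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not the API of any library mentioned in these notes.
final class TinyTable {
    private final String[] keys;    // categorical column
    private final double[] values;  // numeric column, same length as keys

    TinyTable(String[] keys, double[] values) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("columns must have equal length");
        }
        this.keys = keys.clone();
        this.values = values.clone();
    }

    // Arithmetic mean of the numeric column.
    double mean() {
        if (values.length == 0) {
            throw new IllegalStateException("empty table");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Group rows by the key column and return the mean of the value column per group.
    Map<String, Double> meanByKey() {
        Map<String, double[]> acc = new LinkedHashMap<>(); // key -> {sum, count}
        for (int i = 0; i < keys.length; i++) {
            double[] a = acc.computeIfAbsent(keys[i], k -> new double[2]);
            a[0] += values[i];
            a[1] += 1;
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            out.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        TinyTable t = new TinyTable(
                new String[] {"a", "b", "a", "b"},
                new double[] {1.0, 2.0, 3.0, 6.0});
        System.out.println(t.mean());
        System.out.println(t.meanByKey());
    }
}
```

A real framework adds typed columns, missing-value handling, joins, and I/O on top of this, which is why the options below are worth evaluating rather than hand-rolling.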

What are the alternatives to Python/R/Julia/MATLAB?

Java:

Options: Tablesaw (jtablesaw)

Morpheus: formerly at https://github.com/zavtech/morpheus-core, now at https://github.com/d3xsystems/d3x-morpheus

Deeplearning4j, part of the Eclipse Foundation:

- https://deeplearning4j.org/docs/latest/deeplearning4j-quickstart
- https://github.com/eclipse/deeplearning4j
- https://github.com/eclipse/deeplearning4j/tree/master/nd4j

- Not pure Java: uses special NDArray data structures and wrappers around C++
- Wrappers around other C++ libraries
- Backed by Skymind: https://docs.skymind.ai/docs
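As a rough idea of what an "NDArray data structure" means, the sketch below shows a 2-D strided view over a flat buffer in plain Java. It is an assumption-laden illustration, not ND4J's implementation: the class name `NdArraySketch` is hypothetical, the buffer is an ordinary on-heap `double[]` (a native-backed library would typically hand this storage to C++ instead), and only two dimensions are supported.

```java
// Hypothetical sketch: a 2-D strided view over a flat buffer.
// Not ND4J's API or implementation; names are invented for illustration.
final class NdArraySketch {
    private final double[] data;   // flat storage shared between views
    private final int rows;
    private final int cols;
    private final int rowStride;   // elements to skip to move one row down
    private final int colStride;   // elements to skip to move one column right

    private NdArraySketch(double[] data, int rows, int cols, int rowStride, int colStride) {
        this.data = data;
        this.rows = rows;
        this.cols = cols;
        this.rowStride = rowStride;
        this.colStride = colStride;
    }

    // Wrap a flat array as a row-major rows x cols matrix.
    static NdArraySketch of(double[] data, int rows, int cols) {
        if (rows < 0 || cols < 0 || data.length != rows * cols) {
            throw new IllegalArgumentException("shape does not match buffer length");
        }
        return new NdArraySketch(data, rows, cols, cols, 1);
    }

    int rows() { return rows; }

    int cols() { return cols; }

    // Element lookup: the index into the flat buffer is a dot product of indices and strides.
    double get(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return data[i * rowStride + j * colStride];
    }

    // Transpose without copying: swap the shape and the strides, share the buffer.
    NdArraySketch transpose() {
        return new NdArraySketch(data, cols, rows, colStride, rowStride);
    }

    public static void main(String[] args) {
        NdArraySketch a = NdArraySketch.of(new double[] {1, 2, 3, 4, 5, 6}, 2, 3);
        NdArraySketch t = a.transpose();
        System.out.println(a.get(1, 0) + " " + t.get(0, 1));
    }
}
```

The point of the design is that reshaping operations such as transpose only change metadata, so the heavy numeric work can be delegated to code that operates on the flat buffer.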

https://github.com/jtablesaw/tablesaw

Part of BeakerX: http://beakerx.com/, https://github.com/twosigma/beakerx

Scala

- DeepLearning.scala (https://github.com/ThoughtWorksInc/DeepLearning.scala): provided by ThoughtWorks; uses the Java ND4J library underneath
- BeakerX (http://beakerx.com/) also supports Scala

Spire (https://typelevel.org/spire/) is a numeric library for Scala which is intended to be generic, fast, and precise.

Spark ecosystem:

- https://spark.apache.org/mllib/
- https://github.com/apache/spark/tree/master/mllib/src/main/scala/org/apache/spark/mllib/linalg

C++

- Armadillo, a C++ library for linear algebra & scientific computing: http://arma.sourceforge.net/, https://gitlab.com/conradsnicta/armadillo-code
- mlpack, a fast, flexible C++ machine learning library: https://www.mlpack.org, https://github.com/mlpack/mlpack/

- Whole-program optimization: we want everything in one place
- The Python and Java ecosystems are fragmented