This example generates a distribution in Python, then runs a k-means clustering algorithm on it and plots the result in R, demonstrating how scripts from multiple languages can be combined in Cocoon.
npx github-download-directory aengl/cocoon examples/interop
cd examples/interop
npm install
npm run editor
To run all nodes in this example, Python 3 and R have to be installed, as well as the jsonlite and ggplot2 packages for R:
$ R
> install.packages("jsonlite")
> install.packages("ggplot2")
Cocoon does not try to compete with excellent data science tools such as RStudio, MATLAB, or even Excel. Rather, the idea is to provide a bridge between various languages and tools. Fortunately, interoperability in Cocoon is quite easy.
All that's needed is a Pipe node, which borrows its name from the Unix pipeline mechanism, since that is what we're using under the hood.
GenerateInPython:
  in:
    command: ./generator.py
    data:
      - num_points: 1000
        mu:
          x: 0
          y: 0
        sigma: 0.5
    deserialise: JSON.parse
    serialise: JSON.stringify
  type: Pipe
This node essentially executes a shell script and passes arbitrary data via stdin. By default, plain strings are passed back and forth, but those values can be serialised and deserialised differently, if necessary.
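The mechanism can be sketched in a few lines of Python. This is only an illustration of the idea, not Cocoon's implementation: the helper name `pipe` is hypothetical, and `json.dumps` and `json.loads` stand in for the `JSON.stringify` and `JSON.parse` settings of the node.

```python
import json
import subprocess
import sys


def pipe(command, data, serialise=json.dumps, deserialise=json.loads):
    """Run `command`, write serialised `data` to its stdin, parse its stdout.

    Hypothetical helper for illustration; not part of Cocoon.
    """
    result = subprocess.run(
        command,
        input=serialise(data),
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with a non-zero status
    )
    return deserialise(result.stdout)


if __name__ == "__main__":
    # A stand-in child process that echoes its stdin back unchanged.
    echo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
    print(pipe(echo, [{"num_points": 1000, "sigma": 0.5}]))
```

Passing `serialise=str` and `deserialise=str` would roughly correspond to exchanging plain strings instead of JSON.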
In this example, we pass a list of configurations into a Python script that generates data points from Gaussian distributions, each with its own values for mu (the cluster center) and sigma (a lower sigma results in a higher density).
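A script on the receiving end might look roughly like the following. This is a minimal sketch, not the `generator.py` shipped with the example: it assumes the configuration arrives on stdin as a JSON list shaped like the `data` attribute of the node, that the result is written to stdout as a JSON list of points, and that each point uses the field names `x` and `y`, which are made up for illustration.

```python
#!/usr/bin/env python3
import json
import random
import sys


def generate(configs, seed=None):
    """Draw num_points Gaussian samples around (mu.x, mu.y) for each config."""
    rng = random.Random(seed)
    points = []
    for config in configs:
        mu, sigma = config["mu"], config["sigma"]
        for _ in range(config["num_points"]):
            points.append({
                "x": rng.gauss(mu["x"], sigma),
                "y": rng.gauss(mu["y"], sigma),
            })
    return points


if __name__ == "__main__" and not sys.stdin.isatty():
    raw = sys.stdin.read()
    if raw.strip():  # do nothing when no configuration was piped in
        # Deserialise the configuration, serialise the points back to stdout.
        json.dump(generate(json.loads(raw)), sys.stdout)
```

Because the script only talks to stdin and stdout, it can be tried without Cocoon by piping a JSON configuration into it from a shell.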
VisualiseInR:
  in:
    command: ./plot.r
    data: 'cocoon://GenerateInPython/out/data'
    serialise: JSON.stringify
  out:
    src: plot.png
  type: Pipe
  view: Image
We then pipe this data into an R script which renders a plot. There's no need to write a complicated custom view: we can simply have the R script render the plot into an image, which we then show in an Image view provided by the @cocoon/plugin-views package.
Despite using external scripts, all of Cocoon's mechanisms still work as expected: as we change the parameters of the Gaussians, our R plot will update in real time.