A collection of projects to study text clustering and classification.

3pillarlabs/text-clustering-classification

About

This is a collection of Scala projects to study text clustering and classification. The projects are:

  • batch-cluster: A Scala executable that submits a job to a Spark installation.
  • infop-expo: A web application built on the Play Framework that resolves multiple data sets with the batch cluster.

Setup

Prerequisites

  1. VirtualBox
  2. Vagrant
  3. Chef Development Kit (DK)
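
Before starting, it can help to confirm that the prerequisites are on your PATH. The following is a minimal sketch, not part of the project: `missing_tools` is a hypothetical helper, and the command names assume the usual CLIs shipped with each tool (`VBoxManage` for VirtualBox, `vagrant` for Vagrant, `chef` for the Chef DK).

```shell
# Hypothetical helper: print the name of every command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed command names for the three prerequisites.
missing_tools VBoxManage vagrant chef
```

If the last line prints nothing, all three commands were found. Any name it prints is a tool that still needs to be installed or added to PATH.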

Steps

VirtualBox

vagrant up dev

This installs all the dependencies and data files. It might take a couple of hours, so get some coffee and something to read on the side.

Once the VM is up and running, restart it:

vagrant reload dev

AWS

You will need an AWS private key for this. Either place your key in the project root and rename the key file to infop.pem, or edit these lines in the Vagrantfile to point at your own key:

aws.keypair_name = "your_key_name"
override.ssh.private_key_path = "path/to/your/key.pem"
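
If you take the first option, copying the key into place could look like the sketch below. `install_key` is a hypothetical helper and the source path in the usage line is a placeholder. The `chmod 400` step is an assumption based on OpenSSH rejecting private keys that other users can read, not something the project documents.

```shell
# Hypothetical helper: copy a private key into a directory as infop.pem.
# $1 = path to your existing key, $2 = target directory (defaults to ".").
install_key() {
  src="$1"
  dest_dir="${2:-.}"
  cp "$src" "$dest_dir/infop.pem"
  # Restrict the copy to owner read-only (assumed requirement, see above).
  chmod 400 "$dest_dir/infop.pem"
}
```

Run from the project root, usage would be along the lines of `install_key path/to/your/key.pem`.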

Next, create the AWS instance with:

vagrant up awsdemo --provider aws

Once it is running, restart it:

vagrant reload awsdemo
