$ git clone https://github.com/mcrts/nuage-learning nuage-learning
$ pip install -e nuage-learning
- A Kafka server must be running on localhost:9092
- Two topics must exist: 'server' and 'clients'
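Before running the example, it can help to confirm that something is actually listening on localhost:9092. The snippet below is a minimal sketch using only the Python standard library; the helper name `broker_reachable` is hypothetical and not part of nuage-learning. It only checks that the TCP port accepts connections; it does not verify that the 'server' and 'clients' topics exist.

```python
import socket


def broker_reachable(host="localhost", port=9092, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    status = "reachable" if broker_reachable() else "not reachable"
    print(f"Kafka broker on localhost:9092 is {status}")
```

If the port is not reachable, start ZooKeeper and Kafka first, then create the two topics.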
$ python nuage-learning/example_02_kafkathreaded.py
This example trains an SGDClassifier on the Iris dataset, spread across 100 worker nodes, for 10 loops.
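To give an idea of what such a training loop looks like, here is a rough, self-contained sketch of federated averaging in plain Python. It is an illustration under assumptions, not the code used by FederatedSGDClassifier: it uses a hand-written logistic-regression SGD step, synthetic data instead of Iris, and in-process function calls instead of Kafka messages. All function names (`local_sgd`, `average`, `federated_fit`) are hypothetical.

```python
import math
import random


def local_sgd(weights, shard, lr=0.1):
    """One pass of logistic-regression SGD over a worker's local shard."""
    w = list(weights)
    for x, y in shard:
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        # Gradient of the log-loss for one sample is (p - y) * x.
        w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    return w


def average(weight_sets):
    """Server step: element-wise mean of the workers' weight vectors."""
    n = len(weight_sets)
    return [sum(col) / n for col in zip(*weight_sets)]


def federated_fit(shards, n_features, loops=10):
    """Each loop: every worker trains locally, then the server averages."""
    weights = [0.0] * n_features
    for _ in range(loops):
        updates = [local_sgd(weights, shard) for shard in shards]
        weights = average(updates)
    return weights


def accuracy(weights, data):
    """Fraction of samples whose predicted class matches the label."""
    hits = 0
    for x, y in data:
        score = sum(wi * xi for wi, xi in zip(weights, x))
        hits += (score > 0) == (y == 1)
    return hits / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic toy data (not Iris): label is 1 when x0 > x1.
    data = []
    for _ in range(400):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(((x0, x1, 1.0), 1 if x0 > x1 else 0))  # trailing 1.0 is the bias input
    shards = [data[i::100] for i in range(100)]  # 100 workers, 4 samples each
    w = federated_fit(shards, n_features=3, loops=10)
    print("accuracy:", accuracy(w, data))
```

In the actual example the workers and the server exchange their messages through the 'clients' and 'server' Kafka topics rather than through direct function calls.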
- example_02_kafkathreaded.py : execution log
- example_02_kafkathreaded.py : client message
- example_02_kafkathreaded.py : server message
- example_02_kafkathreaded.py : training analytics
- Yannick Bouillard
- Paul Andrey
- Alexandre Filiot
- Martin Courtois
- Make an example with a Kafka cloud instance (will probably need some work)
- CLI design
- Tweak FederatedSGDClassifier's API
Installing, configuring and running Apache-Kafka
Java 8 is required
apt install openjdk-8-jdk
Download and install Apache-Kafka
wget https://downloads.apache.org/kafka/2.8.0/kafka_2.12-2.8.0.tgz
tar -xvf kafka_2.12-2.8.0.tgz -C /opt
Sanity-check
/opt/kafka_2.12-2.8.0/bin/kafka-topics.sh
Add /opt/kafka_2.12-2.8.0/bin to your $PATH in ~/.bashrc
export PATH=$PATH:/opt/kafka_2.12-2.8.0/bin
Set up log directories
mkdir /opt/kafka_2.12-2.8.0/data
mkdir /opt/kafka_2.12-2.8.0/data/zookeeper
mkdir /opt/kafka_2.12-2.8.0/data/kafka
Update /opt/kafka_2.12-2.8.0/config/zookeeper.properties
dataDir = /opt/kafka_2.12-2.8.0/data/zookeeper
Update /opt/kafka_2.12-2.8.0/config/server.properties
log.dirs = /opt/kafka_2.12-2.8.0/data/kafka
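The two configuration edits above can also be scripted. Below is a small sketch of a helper that sets a key in the text of a .properties file; the function name `set_property` is hypothetical, and the sample content in the demo is illustrative rather than a copy of the files shipped with Kafka. It replaces the first uncommented line for the key, or appends a new line if the key is absent.

```python
import re


def set_property(text, key, value):
    """Return properties text with `key` set to `value`.

    Replaces the first uncommented `key=...` line, or appends one if absent.
    """
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*[=:].*$", re.MULTILINE)
    line = f"{key}={value}"
    if pattern.search(text):
        # A function replacement avoids backslash handling in `value`.
        return pattern.sub(lambda _m: line, text, count=1)
    sep = "" if text.endswith("\n") or not text else "\n"
    return f"{text}{sep}{line}\n"


if __name__ == "__main__":
    # Illustrative sample content, not the real file.
    sample = "# zookeeper config\ndataDir=/tmp/zookeeper\nclientPort=2181\n"
    print(set_property(sample, "dataDir", "/opt/kafka_2.12-2.8.0/data/zookeeper"))
```

To apply it to the real files, read each file's text, pass it through `set_property` with `dataDir` (zookeeper.properties) or `log.dirs` (server.properties), and write the result back. Keeping a backup copy of each file first is a sensible precaution.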
zookeeper-server-start.sh /opt/kafka_2.12-2.8.0/config/zookeeper.properties
kafka-server-start.sh /opt/kafka_2.12-2.8.0/config/server.properties
wget https://releases.conduktor.io/linux-deb
dpkg -i Conduktor-2.13.1.deb