This project aims to demonstrate an effective implementation of the Gaussian Mixture Model (GMM), trained with the Expectation-Maximization (EM) algorithm, under the vanilla federated learning paradigm described in the paper "Communication-Efficient Learning of Deep Networks from Decentralized Data".
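
As an illustration of the aggregation step in that paradigm, the sketch below shows the sample-weighted average used by Federated Averaging in the cited paper. The `fedavg` name, the flat parameter lists, and the two example clients are assumptions made for illustration; this repository may aggregate GMM weights, means, and covariances differently.

```python
from typing import List, Sequence, Tuple


def fedavg(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Average client parameter vectors, weighted by local sample count.

    `updates` holds hypothetical (parameters, n_samples) pairs,
    one per client selected in the current round.
    """
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("selected clients reported no samples")
    dim = len(updates[0][0])
    averaged = [0.0] * dim
    for params, n in updates:
        if len(params) != dim:
            raise ValueError("all clients must report vectors of equal length")
        for i, value in enumerate(params):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += value * n / total
    return averaged


# Two hypothetical clients reporting the mean of one 2-D Gaussian component.
global_mean = fedavg([([0.0, 2.0], 100), ([4.0, 6.0], 300)])
```

Because the second client holds three times as many samples, the aggregated mean lands three quarters of the way towards its estimate.
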
The Gaussian Mixture Model is used in unsupervised learning problems, especially clustering tasks. This repository lets you run either a baseline local version of GMM or a federated, distributed implementation of the same model, so that their performance can be compared.
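
For readers unfamiliar with EM, here is a minimal sketch of the E-step and M-step for a one-dimensional GMM in plain Python. It is not this repository's implementation: the `em_step` helper, the 1-D restriction, the variance floor, and the synthetic data are all assumptions chosen to keep the example short.

```python
import math
import random
from typing import List, Tuple


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_step(
    data: List[float], weights: List[float], means: List[float], variances: List[float]
) -> Tuple[List[float], List[float], List[float]]:
    """Run one EM iteration and return updated (weights, means, variances)."""
    k = len(weights)

    # E-step: responsibility of each component for each sample.
    resp = []
    for x in data:
        likes = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(likes) or 1e-300  # guard against underflow to zero
        resp.append([value / total for value in likes])

    # M-step: re-estimate parameters from the responsibility-weighted samples.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = max(sum(r[j] for r in resp), 1e-12)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # variance floor (an assumption) avoids collapse
    return new_w, new_m, new_v


# Synthetic data: two well-separated 1-D clusters centred at -5 and +5.
rng = random.Random(0)
data = [rng.gauss(-5.0, 1.0) for _ in range(300)] + [rng.gauss(5.0, 1.0) for _ in range(300)]

weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
for _ in range(30):
    weights, means, variances = em_step(data, weights, means, variances)
```

In a federated setting, each client would run a few such iterations on its own data before the server combines the resulting parameters.
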

Name | Description | Default | Baseline | Federated |
---|---|---|---|---|
`--dataset` | Name of the dataset. | blob | X | X |
`--components` | Number of Gaussians to fit. | 3 | X | X |
`--init` | Model initialization method: random or kmeans (computed over a 0.5% fraction of the dataset). | random | X | X |
`--seed` | Random seed for reproducible results across executions. | None | X | X |
`--samples` | Number of samples to generate. | 10000 | X | X |
`--features` | Number of features for each generated sample. | 2 | X | X |
`--soft` | Whether cluster bounds are soft or hard. | True | X | X |
`--plots_3d` | Whether plots are drawn in 3D or 2D. | False | X | X |
`--plots_step` | Number of rounds or epochs between saved plots. | 1 | X | X |
`--epochs` | Number of training epochs. | 100 | X | |
`--rounds` | Number of training rounds. | 100 | | X |
`--local_epochs` | Number of local epochs for each client in every round. | 10 | | X |
`--K` | Total number of clients. | 100 | | X |
`--C` | Fraction of clients used in each round, from 0 to 1. | 0.1 | | X |
`--S` | Number of shards for each client. If None, data are assumed to be IID; otherwise they are non-IID. | None | | X |

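
To make `--K`, `--C`, and `--S` more concrete, the following sketch shows one way these options could drive client selection and data partitioning, loosely following the sharding scheme of the cited paper. The `select_clients` and `partition` helpers are hypothetical, the rounding of `C * K` is an assumption, and the non-IID split assumes that a cluster label is available for every generated sample.

```python
import random
from typing import Dict, List, Optional, Sequence


def select_clients(K: int, C: float, rng: random.Random) -> List[int]:
    """Pick the clients taking part in one round: at least one, about C * K."""
    m = max(1, round(C * K))
    return sorted(rng.sample(range(K), m))


def partition(
    labels: Sequence[int], K: int, S: Optional[int], rng: random.Random
) -> Dict[int, List[int]]:
    """Map each client id to the sample indices it holds."""
    indices = list(range(len(labels)))
    if S is None:
        # IID: shuffle, then deal the samples round-robin to the K clients.
        rng.shuffle(indices)
        return {k: indices[k::K] for k in range(K)}

    # Non-IID: sort by label, cut into K * S contiguous shards,
    # then hand S randomly chosen shards to each client.
    indices.sort(key=lambda i: labels[i])
    n_shards = K * S
    shard_size = len(indices) // n_shards  # leftover samples are dropped
    if shard_size == 0:
        raise ValueError("not enough samples for K * S shards")
    shards = [indices[s * shard_size:(s + 1) * shard_size] for s in range(n_shards)]
    rng.shuffle(shards)
    return {
        k: [i for shard in shards[k * S:(k + 1) * S] for i in shard]
        for k in range(K)
    }


rng = random.Random(0)
selected = select_clients(K=100, C=0.1, rng=rng)

labels = [i % 3 for i in range(600)]  # toy labels for three clusters
iid_split = partition(labels, K=10, S=None, rng=rng)
non_iid_split = partition(labels, K=10, S=2, rng=rng)
```

With a small `S`, each client sees samples from only a few clusters, which is what makes the federated setting harder than the IID case.
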
License: MIT