* add forward forward app
* implement comments
* import nebullvm as a dependency in apps
* add readme
* Add figs
* add architecture description
* rename title
* Add links and change image format
* Updated readme and added image
* rename matrixmaster and modify readme

Co-authored-by: diegofiori <[email protected]>
Co-authored-by: Nebuly <[email protected]>
1 parent 8bbe607 · commit 5fb48f6
Showing 20 changed files with 1,685 additions and 14 deletions.
@@ -27,10 +27,11 @@ Achieve sub-10ms response time for any AI application, including generative and

- [x] [Speedster](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/speedster): Automatically apply SOTA optimization techniques to achieve the maximum inference speed-up on your hardware.
- [ ] [OptiMate](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/optimate): Interactive tool guiding savvy users in achieving the best inference performance out of a given model / hardware setup.
- [x] [Forward-Forward](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/forward_forward): Test the performance of the Forward-Forward algorithm in PyTorch.
- [ ] [OpenAlphaTensor](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/open_alpha_tensor): Boost your DL model's performance with OpenAlphaTensor's custom-generated matrix multiplication algorithms (AlphaTensor open-source).
- [ ] [LargeSpeedster](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/large_speedster): Automatically apply SOTA optimization techniques on large AI models to achieve the maximum acceleration on your hardware.
- [ ] [CloudSurfer](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/cloud_surfer): Discover the optimal inference hardware and cloud platform to run an optimized version of your AI model.
- [ ] [MatrixMaster](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/matrix_master): Boost your DL model's performance with MatrixMaster's custom-generated matrix multiplication algorithms (AlphaTensor open-source).
- [ ] [OptiMate](https://github.com/nebuly-ai/nebullvm/blob/main/apps/accelerate/optimate): Interactive tool guiding savvy users in achieving the best inference performance out of a given model / hardware setup.

## Maximize Apps
Make your Kubernetes GPU infrastructure efficient. Simplify cluster management, maximize hardware utilization and minimize costs.

@@ -45,7 +46,7 @@ Don’t settle on generic AI-models. Extract domain-specific knowledge from larg

## Simulate Apps
The time for trial and error is over. Simulate the performances of large models on different computing architectures to reduce time-to-market, maximize accuracy and minimize costs.
- [ ] [Simulinf](https://github.com/nebuly-ai/nebullvm/blob/main/apps/simulate/simulinf): Simulate inference performances of your AI model on different hardware and cloud platforms.
- [ ] [TrainingSim](https://github.com/nebuly-ai/nebullvm/blob/main/apps/simulate/training_sim): Easily simulate and optimize the training of large AI models on a distributed infrastructure.

Couldn't find the optimization app you were looking for? Please open an issue or contact us at [email protected] and we will be happy to develop it together.
@@ -0,0 +1,80 @@

# Forward-Forward Algorithm App

This app implements a complete open-source version of [Geoffrey Hinton's Forward-Forward](https://www.cs.toronto.edu/~hinton/FFA13.pdf) algorithm, an alternative approach to backpropagation.

The Forward-Forward algorithm is a method for training deep neural networks that replaces backpropagation's forward and backward passes with two forward passes: one with positive (i.e., real) data and the other with negative data that could be generated by the network itself.

Unlike backpropagation, Forward-Forward does not require computing the gradient of the loss function with respect to all the network parameters. Instead, each optimization step is performed locally, and the weights of each layer can be updated immediately after that layer has performed its forward pass.
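As a concrete illustration, the snippet below is a minimal sketch of how a single fully connected layer could be trained with such a local step. It assumes the "goodness" of a layer is the sum of squared activations compared against a threshold `theta`, as in Hinton's paper; the class, loss, and names used here are illustrative and do not mirror this app's internal modules.

```python
import torch
import torch.nn.functional as F

# Illustrative only: one layer trained with a local Forward-Forward step.
# "Goodness" is the sum of squared activations; positive samples should
# score above the threshold theta, negative samples below it.
class FFLayer(torch.nn.Module):
    def __init__(self, in_features: int, out_features: int, lr: float = 0.03, theta: float = 2.0):
        super().__init__()
        self.linear = torch.nn.Linear(in_features, out_features)
        self.opt = torch.optim.Adam(self.linear.parameters(), lr=lr)
        self.theta = theta

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize the input so only the direction of the activity vector
        # is passed on, as suggested in the paper.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-4)
        return F.relu(self.linear(x))

    def local_update(self, x_pos: torch.Tensor, x_neg: torch.Tensor):
        # One local optimization step: no gradients flow to other layers.
        h_pos, h_neg = self.forward(x_pos), self.forward(x_neg)
        good_pos = h_pos.pow(2).sum(dim=1)
        good_neg = h_neg.pow(2).sum(dim=1)
        loss = F.softplus(torch.cat([self.theta - good_pos, good_neg - self.theta])).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        # Detach so the next layer trains on fixed inputs.
        return h_pos.detach(), h_neg.detach()
```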
If you appreciate the project, show it by [leaving a star ⭐](https://github.com/nebuly-ai/nebullvm/stargazers)

<img width="1012" alt="Screenshot 2022-12-20 at 14 45 22" src="https://user-images.githubusercontent.com/83510798/208681462-2d8fc8f8-b24e-41a3-978a-72101f7f6392.png">

## Installation

The forward-forward app is built on top of nebullvm, a framework for efficiency-based apps. The app can easily be installed from source. First, clone the repository and navigate to the app directory:

```bash
git clone https://github.com/nebuly-ai/nebullvm.git
cd nebullvm/apps/accelerate/forward_forward
```

Then install the app:

```bash
pip install .
```

This installs only the minimum requirements for running the app. If you want to run the app on a GPU, you also need to install the CUDA version of PyTorch; you can find the instructions on the official PyTorch website.
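For example, a CUDA-enabled build can typically be installed with a command along these lines (the CUDA version tag, here `cu118`, depends on your setup; check the PyTorch website for the exact command for your system):

```bash
pip install torch --index-url https://download.pytorch.org/whl/cu118
```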
## Usage
At the current stage, this implementation supports the main architectures discussed by Hinton in his paper. Each architecture can be trained with the following command:

```python
from forward_forward import train_with_forward_forward_algorithm


trained_model = train_with_forward_forward_algorithm(
    model_type="progressive",
    n_layers=3,
    hidden_size=2000,
    lr=0.03,
    device="cuda",
    epochs=100,
    batch_size=5000,
    theta=2.,
)
```

Three architectures are currently supported:
* `progressive`: the simplest architecture described in the paper. It has a pipeline-like structure, and each layer can be trained independently of the following ones. Our implementation differs from the original one in that the labels are injected into the image by concatenating them to the flattened tensor, instead of replacing the first `n_classes` pixel values with a one-hot representation of the label (see the sketch after this list).

* `recurrent`: the recurrent architecture described in the paper. It has a recurrent-like structure and is based on the `GLOM` architecture proposed by Hinton.

* `nlp`: a simple network which can be used as a language model.
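To make the label-injection scheme concrete, the sketch below shows one way positive and negative MNIST examples could be built in the progressive setting: the flattened image is concatenated with a one-hot label vector, using the true label for positive data and a randomly chosen wrong label for negative data. The function and variable names are illustrative, not the app's internals.

```python
import torch

def make_ff_pairs(images: torch.Tensor, labels: torch.Tensor, n_classes: int = 10):
    """Build positive/negative inputs by concatenating a one-hot label
    to each flattened image (illustrative sketch, not the app's code)."""
    flat = images.view(images.shape[0], -1)  # (batch, 28 * 28)

    # Positive data: image paired with its true label.
    pos = torch.cat([flat, torch.nn.functional.one_hot(labels, n_classes).float()], dim=1)

    # Negative data: image paired with a deliberately wrong label.
    wrong = (labels + torch.randint(1, n_classes, labels.shape)) % n_classes
    neg = torch.cat([flat, torch.nn.functional.one_hot(wrong, n_classes).float()], dim=1)
    return pos, neg
```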
The `recurrent` and `nlp` network architectures are explained in more detail below.
## Recurrent Architecture
The recurrent architecture is based on the `GLOM` architecture for videos, proposed by Hinton in the paper [How to represent part-whole hierarchies in a neural network](https://arxiv.org/pdf/2102.12627.pdf). Its application to the forward-forward algorithm aims at enabling each layer to learn not just from the previous layer's output, but from the following layers as well. This is done by concatenating the output of the previous layer with the outputs of the following layers computed at the previous time step. A learned representation of the label (positive or negative) is given as input to the last layer. The following figure shows the structure of the network:

<p align="center">
<img width="500" alt="recurrent_net" src="https://user-images.githubusercontent.com/38586138/208651417-498c4bd4-f2dc-4613-a376-0b69317c73d4.png">
</p>
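One plausible reading of this update rule is sketched below: at every time step each hidden layer receives the previous layer's activity and the following layer's activity from the previous time step, concatenated together, with the bottom layer fed the input and the top layer fed the label representation. This is an assumption-laden illustration of the idea, not the app's implementation.

```python
import torch

def recurrent_step(layers, prev_acts, x, label_embedding):
    # layers: list of torch.nn.Linear whose input sizes match the concatenated
    # activities; prev_acts: layer activities from the previous time step.
    new_acts = []
    for i, layer in enumerate(layers):
        below = x if i == 0 else prev_acts[i - 1]
        above = label_embedding if i == len(layers) - 1 else prev_acts[i + 1]
        new_acts.append(torch.relu(layer(torch.cat([below, above], dim=1))))
    return new_acts
```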
## NLP Architecture
The forward-forward architecture developed for NLP is a simple network which can be used as a language model. The network is composed of a few normalized fully connected layers followed by a ReLU activation. All hidden representations are then concatenated and fed to a softmax layer that predicts the next token. The network can be trained in a progressive way, i.e., each layer can be trained sequentially, separately from the following ones. The following figure shows the structure of the network:

<p align="center">
<img width="500" class="center" alt="nlp_net" src="https://user-images.githubusercontent.com/38586138/208651624-c159b230-f903-4e13-aaa7-b39a0d1c52fc.png">
</p>
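For reference, training the NLP variant through the same entry point requires the additional `predicted_tokens` argument (the functional API asserts that it is set for the NLP model). The values below, including `predicted_tokens=1`, are illustrative assumptions rather than recommended settings:

```python
from forward_forward import train_with_forward_forward_algorithm

# Illustrative call: predicted_tokens is required for the NLP model;
# the specific values here are assumptions, not tuned defaults.
trained_lm = train_with_forward_forward_algorithm(
    model_type="nlp",
    n_layers=2,
    hidden_size=2000,
    lr=0.03,
    device="cpu",
    epochs=10,
    batch_size=1024,
    theta=2.0,
    predicted_tokens=1,
)
```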

## What is missing
This app implements the main architectures described by Hinton in his paper. However, some features are not implemented yet. In particular, the following are missing:

* [ ] Implementation of unsupervised training.
* [ ] Implementation of the `progressive` architecture using local receptive fields instead of fully connected layers.
* [ ] Training on CIFAR-10 for CV-based architectures.

And don't forget to [leave a star ⭐](https://github.com/nebuly-ai/nebullvm/stargazers) if you appreciate the project!
If you have any questions about the implementation, [open an issue](https://github.com/nebuly-ai/nebullvm/issues) or contact us in the [community chat](https://discord.gg/RbeQMu886J).
@@ -0,0 +1,3 @@

from forward_forward.api.functions import (  # noqa F401
    train_with_forward_forward_algorithm,
)
File renamed without changes.
apps/accelerate/forward_forward/forward_forward/api/functions.py: 52 additions & 0 deletions
@@ -0,0 +1,52 @@

from torchvision import datasets

from forward_forward.root_op import (
    ForwardForwardRootOp,
    ForwardForwardModelType,
)


def train_with_forward_forward_algorithm(
    n_layers: int = 2,
    model_type: str = "progressive",
    device: str = "cpu",
    hidden_size: int = 2000,
    lr: float = 0.03,
    epochs: int = 100,
    batch_size: int = 5000,
    theta: float = 2.0,
    shuffle: bool = True,
    **kwargs,
):
    model_type = ForwardForwardModelType(model_type)
    root_op = ForwardForwardRootOp(model_type)

    output_size = None
    if model_type is ForwardForwardModelType.PROGRESSIVE:
        # Flattened 28x28 MNIST image plus the concatenated label vector.
        input_size = 28 * 28 + len(datasets.MNIST.classes)
    elif model_type is ForwardForwardModelType.RECURRENT:
        input_size = 28 * 28
        output_size = len(datasets.MNIST.classes)
    else:  # model_type is ForwardForwardModelType.NLP
        input_size = 10  # number of characters
        output_size = 30  # length of vocabulary
        assert (
            kwargs.get("predicted_tokens") is not None
        ), "predicted_tokens must be specified for NLP model"

    root_op.execute(
        input_size=input_size,
        n_layers=n_layers,
        hidden_size=hidden_size,
        optimizer_name="Adam",
        optimizer_params={"lr": lr},
        loss_fn_name="alternative_loss_fn",
        batch_size=batch_size,
        epochs=epochs,
        device=device,
        shuffle=shuffle,
        theta=theta,
        output_size=output_size,
    )

    return root_op.get_result()
@@ -0,0 +1,12 @@

from nebullvm.apps.base import App

from forward_forward.root_op import ForwardForwardRootOp


class ForwardForwardApp(App):
    def __init__(self):
        super().__init__()
        self.root_op = ForwardForwardRootOp()

    def execute(self, *args, **kwargs):
        return self.root_op.execute(*args, **kwargs)
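For completeness, a hypothetical invocation of the app class might look like the sketch below. The import path and the assumption that the root operation's `execute` accepts the same keyword arguments the functional API forwards to it are both unverified assumptions:

```python
# Hypothetical usage; the module path and argument set are assumptions
# based on the functions.py call to ForwardForwardRootOp.execute above.
from forward_forward.app import ForwardForwardApp

app = ForwardForwardApp()
app.execute(
    input_size=28 * 28 + 10,
    n_layers=2,
    hidden_size=2000,
    optimizer_name="Adam",
    optimizer_params={"lr": 0.03},
    loss_fn_name="alternative_loss_fn",
    batch_size=5000,
    epochs=100,
    device="cpu",
    shuffle=True,
    theta=2.0,
    output_size=None,
)
```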
Empty file.
apps/accelerate/forward_forward/forward_forward/operations/build_models.py: 114 additions & 0 deletions
@@ -0,0 +1,114 @@

from abc import ABC, abstractmethod

import torch

from nebullvm.operations.base import Operation

from forward_forward.utils.modules import (
    FCNetFFProgressive,
    RecurrentFCNetFF,
    LMFFNet,
)


class BaseModelBuildOperation(Operation, ABC):
    def __init__(self):
        super().__init__()
        self.model = None

    @abstractmethod
    def execute(
        self,
        input_size: int,
        n_layers: int,
        hidden_size: int,
        optimizer_name: str,
        optimizer_params: dict,
        loss_fn_name: str,
        output_size: int = None,
    ):
        raise NotImplementedError

    def get_result(self):
        return self.model


class FCNetFFProgressiveBuildOperation(BaseModelBuildOperation):
    def __init__(self):
        super().__init__()

    def execute(
        self,
        input_size: int,
        n_layers: int,
        hidden_size: int,
        optimizer_name: str,
        optimizer_params: dict,
        loss_fn_name: str,
        output_size: int = None,
    ):
        # Input layer followed by n_layers hidden layers of equal width.
        layer_sizes = [input_size] + [hidden_size] * n_layers
        model = FCNetFFProgressive(
            layer_sizes=layer_sizes,
            optimizer_name=optimizer_name,
            optimizer_kwargs=optimizer_params,
            loss_fn_name=loss_fn_name,
            epochs=-1,
        )
        if output_size is not None:
            # Attach a final linear readout when an explicit output size is given.
            output_layer = torch.nn.Linear(layer_sizes[-1], output_size)
            model = torch.nn.Sequential(model, output_layer)

        self.model = model


class RecurrentFCNetFFBuildOperation(BaseModelBuildOperation):
    def __init__(self):
        super().__init__()

    def execute(
        self,
        input_size: int,
        n_layers: int,
        hidden_size: int,
        optimizer_name: str,
        optimizer_params: dict,
        loss_fn_name: str,
        output_size: int = None,
    ):
        layer_sizes = [input_size] + [hidden_size] * n_layers + [output_size]
        model = RecurrentFCNetFF(
            layer_sizes=layer_sizes,
            optimizer_name=optimizer_name,
            optimizer_kwargs=optimizer_params,
            loss_fn_name=loss_fn_name,
        )
        self.model = model


class LMFFNetBuildOperation(BaseModelBuildOperation):
    def __init__(self):
        super().__init__()

    def execute(
        self,
        input_size: int,
        n_layers: int,
        hidden_size: int,
        optimizer_name: str,
        optimizer_params: dict,
        loss_fn_name: str,
        output_size: int = None,
    ):
        model = LMFFNet(
            token_num=output_size,
            hidden_size=hidden_size,
            n_layers=n_layers,
            seq_len=input_size,
            optimizer_name=optimizer_name,
            optimizer_kwargs=optimizer_params,
            loss_fn_name=loss_fn_name,
            epochs=-1,
            predicted_tokens=-1,
        )
        self.model = model
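As a usage illustration, a build operation can presumably be driven directly as below. The import path is derived from the file path above, and the parameter values mirror those the functional API passes to the root operation; both are assumptions for demonstration only:

```python
from forward_forward.operations.build_models import (
    FCNetFFProgressiveBuildOperation,
)

# Illustrative: build a progressive FF network for MNIST-sized inputs
# (28*28 pixels plus 10 concatenated label slots), mirroring the values
# used by train_with_forward_forward_algorithm.
build_op = FCNetFFProgressiveBuildOperation()
build_op.execute(
    input_size=28 * 28 + 10,
    n_layers=3,
    hidden_size=2000,
    optimizer_name="Adam",
    optimizer_params={"lr": 0.03},
    loss_fn_name="alternative_loss_fn",
)
model = build_op.get_result()
```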