Multi-Stream Transformer

1. Introduction

This project implements a novel transformer architecture that processes text through multiple parallel streams, each trained on a different objective. The core idea is to improve the model's understanding and generation capabilities by combining different types of pattern recognition. The parallel streams are fused into a common residual stream that is optimized for next-token prediction.
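The fusion idea described above can be illustrated with a minimal PyTorch sketch. It is not the code in the `mst` package: the class name `MultiStreamSketch`, the layer sizes, and the concatenate-then-project fusion are all assumptions made for illustration. The per-stream training objectives are omitted, and only the shared next-token loss is shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiStreamSketch(nn.Module):
    """Hypothetical sketch: parallel streams fused into one residual stream."""

    def __init__(self, vocab_size=100, d_model=32, n_streams=2, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # One transformer layer per stream (assumption: the real streams may
        # be deeper and differ in structure). Positional encoding is omitted.
        self.streams = nn.ModuleList([
            nn.TransformerEncoderLayer(
                d_model, n_heads, dim_feedforward=4 * d_model,
                dropout=0.0, batch_first=True,
            )
            for _ in range(n_streams)
        ])
        # Assumed fusion: concatenate the stream outputs and project back.
        self.fuse = nn.Linear(n_streams * d_model, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)  # (batch, seq, d_model)
        seq_len = tokens.size(1)
        # Causal mask: -inf above the diagonal blocks attention to the future.
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
            diagonal=1,
        )
        outputs = [stream(x, src_mask=causal) for stream in self.streams]
        # Add the fused stream outputs to the common residual stream.
        residual = x + self.fuse(torch.cat(outputs, dim=-1))
        return self.lm_head(residual)  # (batch, seq, vocab_size)


def next_token_loss(logits, tokens):
    # Predict token t+1 from position t.
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
    )


model = MultiStreamSketch()
tokens = torch.randint(0, 100, (2, 5))
logits = model(tokens)
print(logits.shape, next_token_loss(logits, tokens).item())
```

In a design like this, each stream could additionally receive its own auxiliary loss, while the fused residual stream is trained with the next-token loss.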

2. Setup

Install the required libraries from the requirements file:

pip install -r requirements.txt

Then install the PyTorch build that is compatible with CUDA 11.6:

pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
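To confirm that the CUDA build was picked up, a short check like the following may help. The helper name `describe_torch_install` is hypothetical and not part of this repository. With the versions above, it should report a `1.13.1+cu116` build, and it falls back to a plain message if PyTorch is missing.

```python
import importlib.util


def describe_torch_install():
    """Return a one-line summary of the local PyTorch installation."""
    # Check for the package first so the script also runs without PyTorch.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    return (
        f"torch {torch.__version__}, "
        f"CUDA build: {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


print(describe_torch_install())
```

If the GPU is reported as unavailable even though the CUDA build is installed, the NVIDIA driver is a likely place to look first.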

Mathiasotnes/Multi-Stream-Transformer
