
The goal of this work is to design an architecture for autoregressive modelling with an inductive bias towards learning temporally compressed representations: one that retains the benefits of Transformers while preserving long-range interactions.

Perceptual Module (Fast Stream)

The fast stream has a short-term, high-capacity memory that reacts quickly to sensory input. It is modelled with Transformers.
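As a minimal sketch, the fast stream can be written as a standard Transformer block that self-attends only within a chunk. The layer name and dimensions below (`FastStreamBlock`, `embed_dim`, `num_heads`) are illustrative assumptions, not the reference implementation.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class FastStreamBlock(layers.Layer):
    """Self-attention over the tokens of a single chunk (short-term memory)."""

    def __init__(self, embed_dim=64, num_heads=4, **kwargs):
        super().__init__(**kwargs)
        self.attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.norm1 = layers.LayerNormalization()
        self.ffn = keras.Sequential(
            [layers.Dense(embed_dim * 2, activation="relu"), layers.Dense(embed_dim)]
        )
        self.norm2 = layers.LayerNormalization()

    def call(self, chunk_tokens):
        # Tokens attend only within their own chunk, so the fast stream
        # reacts quickly to local input but has no long-range view.
        x = self.norm1(chunk_tokens + self.attn(chunk_tokens, chunk_tokens))
        return self.norm2(x + self.ffn(x))
```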

Temporal Latent Bottleneck (Slow Stream)

The slow stream has a long-term memory that updates at a slower rate and summarizes the most important information in the input sequence.
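A minimal sketch of the slow stream, assuming its state is a small set of latent vectors that is updated once per chunk by cross-attending to the fast stream's output; the names (`TemporalLatentBottleneck`, `num_latents`) are illustrative.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class TemporalLatentBottleneck(layers.Layer):
    """Updates the slow-stream latents once per chunk via cross-attention."""

    def __init__(self, embed_dim=64, num_heads=4, **kwargs):
        super().__init__(**kwargs)
        self.cross_attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.norm = layers.LayerNormalization()

    def call(self, latents, chunk_repr):
        # latents:    (batch, num_latents, embed_dim) -- long-term memory.
        # chunk_repr: (batch, chunk_size, embed_dim)  -- fast-stream output.
        update = self.cross_attn(query=latents, value=chunk_repr, key=chunk_repr)
        # Additive residual update: the state changes slowly, acting as a
        # bottleneck that keeps only a compressed summary of each chunk.
        return self.norm(latents + update)
```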

Implementation

  • Divide the input into fixed-size chunks.
  • The fast stream operates within each chunk.
  • The slow stream consolidates and aggregates information across chunks (the sketch after this list ties the two streams together).
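A minimal end-to-end sketch of the chunked two-stream loop, reusing the `FastStreamBlock` and `TemporalLatentBottleneck` layers sketched above; `two_stream_forward`, `chunk_size`, and the shapes are illustrative assumptions.

```python
import tensorflow as tf


def two_stream_forward(tokens, fast_block, tlb, latents, chunk_size=8):
    # tokens: (batch, seq_len, embed_dim); seq_len divisible by chunk_size.
    chunks = tf.split(tokens, num_or_size_splits=tokens.shape[1] // chunk_size, axis=1)
    outputs = []
    for chunk in chunks:
        fast_out = fast_block(chunk)      # fast stream: within-chunk attention
        latents = tlb(latents, fast_out)  # slow stream: once-per-chunk update
        outputs.append(fast_out)
    return tf.concat(outputs, axis=1), latents


# Example usage with toy shapes.
batch, seq_len, embed_dim = 2, 32, 64
tokens = tf.random.normal((batch, seq_len, embed_dim))
latents = tf.zeros((batch, 8, embed_dim))  # 8 latent slots for the slow stream
out, latents = two_stream_forward(
    tokens, FastStreamBlock(), TemporalLatentBottleneck(), latents
)
```

Because the latents are few and updated only once per chunk, they act as a compressed long-range memory, while the per-chunk attention keeps the quadratic cost of the Transformer confined to short windows.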