Flax implementation of gMLP from "Pay Attention to MLPs" #1410
-
It's no news that transformers have dominated the field of deep learning ever since 2017. But Hanxiao Liu, Zihang Dai, David R. So and Quoc V. Le, in their recent work titled "Pay Attention to MLPs", propose a new architecture, gMLP (essentially MLPs with gating), that performs as well as Transformers in key language and vision applications. Based on the comparisons in the paper, the authors argue that self-attention is not critical for Vision Transformers, since gMLP can reach comparable accuracy, which calls into question how essential attention really is. My repository includes an implementation of gMLP written in Flax. Most of the codebase is inspired by Phil Wang's implementations in PyTorch and Haiku.
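For anyone curious what the core of the architecture looks like, here is a minimal sketch of a gMLP block with its Spatial Gating Unit in Flax. The class and parameter names below are illustrative and may not match what is in the repository:

```python
import jax.numpy as jnp
import flax.linen as nn


class SpatialGatingUnit(nn.Module):
    """Splits channels in half and gates one half with a learned
    projection along the sequence (spatial) dimension."""
    seq_len: int

    @nn.compact
    def __call__(self, x):
        u, v = jnp.split(x, 2, axis=-1)
        v = nn.LayerNorm()(v)
        # Spatial projection: mixes information across token positions.
        # Near-zero weights and unit bias make the unit close to identity
        # at initialization, as suggested in the paper.
        v = jnp.swapaxes(v, -1, -2)                # (batch, channels, seq_len)
        v = nn.Dense(self.seq_len,
                     kernel_init=nn.initializers.normal(stddev=1e-3),
                     bias_init=nn.initializers.ones)(v)
        v = jnp.swapaxes(v, -1, -2)                # back to (batch, seq_len, channels)
        return u * v


class gMLPBlock(nn.Module):
    d_model: int
    d_ffn: int
    seq_len: int

    @nn.compact
    def __call__(self, x):
        shortcut = x
        x = nn.LayerNorm()(x)
        x = nn.Dense(self.d_ffn)(x)
        x = nn.gelu(x)
        x = SpatialGatingUnit(self.seq_len)(x)
        x = nn.Dense(self.d_model)(x)
        return x + shortcut
```

A full model is just a stack of these blocks between an embedding (or patch projection) layer and an output head; no attention layers are involved.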
Replies: 2 comments
-
Awesome, thanks for sharing!
-
Marking as answered