
## Title

MFAS: Multimodal Fusion Architecture Search

## Author

Juan-Manuel Pérez-Rúa, Valentin Vielzeuf, Stéphane Pateux, Moez Baccouche, Frédéric Jurie

## Abstract

We tackle the problem of finding good architectures for multimodal classification problems. We propose a novel and generic search space that spans a large number of possible fusion architectures. In order to find an optimal architecture for a given dataset in the proposed search space, we leverage an efficient sequential model-based exploration approach that is tailored for the problem. We demonstrate the value of posing multimodal fusion as a neural architecture search problem by extensive experimentation on a toy dataset and two other real multimodal datasets. We discover fusion architectures that exhibit state-of-the-art performance for problems with different domain and dataset size, including the NTU RGB+D dataset, the largest multimodal action recognition dataset available.
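
The abstract frames multimodal fusion as a search over fusion architectures, explored with a sequential model-based method. Below is a minimal, hypothetical sketch of that idea in Python: the two-modality setup, the layer counts, the activation set, and the naive surrogate are illustrative placeholders, not the paper's actual search space, surrogate predictor, or training pipeline.

```python
import random

# Hypothetical toy search space (illustrative only): each fusion layer is a
# triple (visual_layer_index, audio_layer_index, activation), mirroring the
# idea of searching over which unimodal features to fuse and how.
VISUAL_LAYERS = range(4)            # candidate layers in a visual backbone
AUDIO_LAYERS = range(4)             # candidate layers in an audio backbone
ACTIVATIONS = ["relu", "sigmoid"]   # candidate nonlinearities after fusion

def sample_architecture(depth):
    """Sample one fusion architecture: a sequence of `depth` fusion layers."""
    return tuple(
        (random.choice(VISUAL_LAYERS),
         random.choice(AUDIO_LAYERS),
         random.choice(ACTIVATIONS))
        for _ in range(depth)
    )

def evaluate(arch):
    """Stand-in for training the fused model and measuring validation
    accuracy; here just a deterministic synthetic score per architecture."""
    random.seed(hash(arch) % (2**32))
    return random.random()

def surrogate_score(arch, history):
    """Crude surrogate: mean observed score of architectures that share a
    fusion layer with `arch`. A real sequential model-based loop would fit
    a learned predictor on `history` instead."""
    shared = [score for a, score in history if set(a) & set(arch)]
    return sum(shared) / len(shared) if shared else 0.5

history = []
for depth in range(1, 4):  # explore progressively deeper fusion schemes
    candidates = [sample_architecture(depth) for _ in range(50)]
    # Rank candidates by the cheap surrogate, then pay the full evaluation
    # cost only for the most promising few.
    candidates.sort(key=lambda a: surrogate_score(a, history), reverse=True)
    for arch in candidates[:5]:
        history.append((arch, evaluate(arch)))

best_arch, best_score = max(history, key=lambda item: item[1])
print(f"best architecture: {best_arch} (score {best_score:.3f})")
```

The design point this sketch tries to capture is the trade-off the abstract alludes to: the search space is combinatorially large, so a surrogate model ranks candidates cheaply and only a small subset is trained and evaluated in full.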

## Bib

```bibtex
@inproceedings{8954353,
  author    = {J. {Perez-Rua} and V. {Vielzeuf} and S. {Pateux} and M. {Baccouche} and F. {Jurie}},
  booktitle = {2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  title     = {MFAS: Multimodal Fusion Architecture Search},
  year      = {2019},
  pages     = {6959-6968},
  doi       = {10.1109/CVPR.2019.00713}
}
```