Separable STA network implementation

This is an implementation of the separable spatio-temporal attention (STA) network described in (Das et al. 2019).

This network consists of a spatio-temporal attention block trained using a 3-layer LSTM network that learns from skeletal information. This attention block modulates the convolutional feature output g of an I3D network (see the DAIGroup/i3d repository), both spatially and temporally. The modulated features g_t and g_s are then concatenated and fed to a classifier for the final one-hot output vector. Please see the original paper (Das et al. 2019) for further details. The two required branches are provided in separate repositories:

DAIGroup/i3d: for the I3D branch.
DAIGroup/LSTM_action_recognition: for the LSTM branch.

Please check out these repositories first, and then set the code_path variable in sta_config.py to the common directory where all three projects will reside.
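As a quick sanity check of the layout described above, the following minimal sketch verifies that the three projects sit side by side under one directory. The example path, the `missing_projects` helper and the assumption that each directory is named after its repository are illustrative only and are not taken from sta_config.py.

```python
import os

# Hypothetical value; use the directory you set as code_path in sta_config.py.
code_path = "/home/user/code"

# Assumption: each project directory keeps the name of its repository.
REQUIRED_PROJECTS = ("i3d", "LSTM_action_recognition", "separable_STA")


def missing_projects(base_dir, projects=REQUIRED_PROJECTS):
    """Return the names of projects that are not directories under base_dir."""
    return [p for p in projects if not os.path.isdir(os.path.join(base_dir, p))]


if __name__ == "__main__":
    missing = missing_projects(code_path)
    if missing:
        print("Missing under", code_path, ":", ", ".join(missing))
    else:
        print("All three projects found under", code_path)
```

If any project is reported missing, clone it into the same parent directory before running training or evaluation.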

Following the instructions of Das et al. (2019), we provide the implementation we used to replicate their experiments, as well as our modified version with alternative data preprocessing (Climent-Pérez et al. 2021).
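To make the data flow of the network concrete, here is a minimal NumPy sketch of the modulation step outlined earlier: attention weights scale the I3D feature map g spatially and temporally, and the two modulated results are concatenated and classified. It is not the code in separable_sta.py. The tensor shapes, the global average pooling before concatenation, the single softmax layer and all names (`modulate_and_classify`, `alpha`, `beta`) are assumptions made for illustration; in the real network the attention weights come from the LSTM branch rather than from random numbers.

```python
import numpy as np


def modulate_and_classify(g, alpha, beta, weights, bias):
    """Toy forward pass: modulate g spatially and temporally, then classify.

    g:       (T, H, W, C) convolutional feature map
    alpha:   (H, W) spatial attention weights
    beta:    (T,) temporal attention weights
    weights: (2 * C, num_classes) classifier weights
    bias:    (num_classes,) classifier bias
    """
    # Broadcast the attention weights over the remaining axes of g.
    g_s = g * alpha[np.newaxis, :, :, np.newaxis]           # spatially modulated
    g_t = g * beta[:, np.newaxis, np.newaxis, np.newaxis]   # temporally modulated

    # Assumption: global average pooling reduces each branch to shape (C,).
    pooled = np.concatenate([g_s.mean(axis=(0, 1, 2)), g_t.mean(axis=(0, 1, 2))])

    # Linear classifier followed by a numerically stable softmax.
    logits = pooled @ weights + bias
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


# Random stand-ins for the I3D features and the LSTM-derived attention weights.
rng = np.random.default_rng(0)
T, H, W, C, num_classes = 8, 7, 7, 16, 5
g = rng.standard_normal((T, H, W, C))
alpha = rng.random((H, W))
beta = rng.random(T)
beta /= beta.sum()

probs = modulate_and_classify(
    g, alpha, beta,
    rng.standard_normal((2 * C, num_classes)),
    np.zeros(num_classes),
)
print(probs.shape)
```

The predicted class is the index of the largest entry of `probs`; converting that index to a one-hot vector gives the final output format mentioned above.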

Description of files

sta_config.py is the configuration file. Modify it to change training behaviour, to choose between cross-subject (CS) and cross-view (CV2) experiments, and to tune other parameters (GPUs used, etc.).
toyota_generator.py contains the data generator class that supplies the data for each epoch. Its is_test flag determines whether it acts as a training or a test data generator.
separable_sta.py contains the implementation of the additional layers of the attention block.
sta_train.py is the main file to run for training.
sta_evaluate.py is the main file to run for testing/evaluation.
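The role of a flag such as is_test in a per-epoch data generator can be illustrated with a small, self-contained sketch. The class below is a hypothetical stand-in, not the class in toyota_generator.py. It follows the method names of the Keras Sequence convention (`__len__`, `__getitem__`, `on_epoch_end`), and it assumes that training data is reshuffled after every epoch while test data keeps a fixed order; the actual generator may differ on both points.

```python
import random


class ToyGenerator:
    """Hypothetical stand-in for a batch generator with an is_test flag."""

    def __init__(self, samples, batch_size, is_test=False, seed=0):
        self.samples = list(samples)
        self.batch_size = batch_size
        self.is_test = is_test
        self._rng = random.Random(seed)

    def __len__(self):
        # Number of batches per epoch (the last batch may be smaller).
        return (len(self.samples) + self.batch_size - 1) // self.batch_size

    def __getitem__(self, index):
        # Return the batch at position index within the current epoch.
        start = index * self.batch_size
        return self.samples[start:start + self.batch_size]

    def on_epoch_end(self):
        # Assumed behaviour: reshuffle for training, keep order for testing.
        if not self.is_test:
            self._rng.shuffle(self.samples)
```

A fixed order at test time keeps predictions aligned with their ground-truth labels, which is why shuffling is skipped when is_test is set in this sketch.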

References

(Das et al. 2019) Das, S., Dai, R., Koperski, M., Minciullo, L., Garattoni, L., Bremond, F., & Francesca, G. (2019). Toyota smarthome: Real-world activities of daily living. In Proceedings of the IEEE International Conference on Computer Vision (pp. 833-842).
(Climent-Pérez et al. 2021) Climent-Pérez, P., & Florez-Revuelta, F. (2021). Improved action recognition with Separable spatio-temporal attention using alternative Skeletal and Video pre-processing. Sensors, 21(3), 1005. DOI: https://doi.org/10.3390/s21031005
