
Multimodal Prediction of the Audience's Impression in Political Debates

Source code and data (raw and preprocessed) for training multimodal approaches to predict the impression given by politicians in TV debates.

https://www.ukp.tu-darmstadt.de

https://www.tu-darmstadt.de

Project maintainer: Pedro Santos (@pedrobisp)

This repository contains experimental software and is published for the sole purpose of giving additional background details.

Dataset

The transcripts and impression scores are available by request. If you use this dataset in your work, please cite the following paper in your publication:

Nagel, F., Maurer, M., & Reinemann, C. (2012). Is there a visual dominance in political communication? How verbal, visual, and vocal communication shape viewers’ impressions of political candidates. Journal of Communication, 62, 833-850.

Please contact the project maintainer to obtain these resources: https://www.informatik.tu-darmstadt.de/ukp/ukp_home/staff_ukp/detailseite_mitarbeiter_1_42304.en.jsp

Due to legal issues, we cannot redistribute the debate recording. However, the debate is publicly available on YouTube: https://youtu.be/Hybsgj1MIZ4

Preprocessing

We used Python 3 in our experiments and recommend using it to reproduce them, preferably inside a virtual environment. The required packages can be installed with the following command:

(venv) $ pip3 install -r requirements.txt
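If you have not set up the virtual environment yet, it can be created and activated beforehand with the standard venv module, for example:

$ python3 -m venv venv
$ source venv/bin/activate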

Besides that, we also used third-party open-source software (FFmpeg, openSMILE, and OpenFace) to preprocess the debate video file:

The debate is segmented into turns, where each turn consists of a politician talking without a major interruption by the other politician or one of the journalists. Small interruptions can still occur, though. The first step in preprocessing the debate video file is therefore to segment it into turns. Please make sure that FFmpeg is installed and callable from the command line. Run the following script to segment the debate video:

(venv) $ python preprocessing/turns_segmentation.py resources/turns_segmentation.yaml
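Under the hood, the segmentation relies on FFmpeg to cut the video at the turn boundaries. Conceptually, each turn corresponds to a cut along the following lines (timestamps and file names are purely illustrative; the script takes the actual boundaries from its input files):

ffmpeg -i debate.mp4 -ss 00:12:34 -to 00:13:10 -c copy turns/turn_0001.mp4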

The YAML file "turns_segmentation.yaml" specifies the input video file, the textual transcripts file, the output folder for the segmented turns, and the input content response measurement (CRM) file containing the average impression scores together with other manually annotated information (speaker, type of argumentation being used, and so on). Some turns are shorter than 3 seconds and should not be used. Please run the following script to delete these short turns:

(venv) $ python preprocessing/delete_small_turns.py resources/delete_small_turns.yaml
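If you want to verify which turns fall below the 3-second threshold, the durations can be inspected with ffprobe. This is only a convenience check, not part of the pipeline, and the turns/ folder name is a placeholder for your output folder:

for f in turns/*.mp4; do
    echo -n "$f "
    ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$f"
done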

To extract the Mel-frequency cepstral coefficients (MFCCs), run the following script:

(venv) $ python preprocessing/extract_mfcc.py resources/mfcc.yaml

The openSMILE configuration file is in the resources folder; please use it. In the YAML file, the input folder should be the output folder generated by the turns_segmentation.py script. Do not forget to also specify the path to the openSMILE binary.
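To sanity-check the openSMILE setup independently of the script, a single turn can be processed manually with the SMILExtract binary. This is only a sketch, assuming the turn's audio track has already been extracted to WAV; the config file name and paths below are placeholders:

./SMILExtract -C resources/<mfcc_config>.conf -I turn_0001.wav -O turn_0001_mfcc.csv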

To extract the visual features, run the following script:

(venv) $ python preprocessing/extract_visual_features.py resources/visualfeatures.yaml

In the YAML file, you should specify the local path of the FeatureExtraction binary from OpenFace. As before, the input folder is the output folder generated by the turns_segmentation.py script.
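Likewise, OpenFace's FeatureExtraction binary can be tried on a single turn before running the full script (file and folder names below are illustrative):

./FeatureExtraction -f turn_0001.mp4 -out_dir openface_output/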

We used pretrained fastText word embeddings (https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md). To align the modalities, run the following script:

(venv) $ python preprocessing/multimodal_io.py resources/multimodal_io.yaml

In the YAML file, please specify the path to the pretrained word embeddings, the OOV (out-of-vocabulary) list, the folder containing the segmented turns, and the path for the output pickle file containing the aligned modalities.
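The pretrained vectors themselves are not shipped with this repository; they can be downloaded from the pretrained-vectors page linked above, e.g. with wget. Pick the language matching the transcripts; the URL below is only an example and may change over time:

wget https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.en.vec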

Reproducing the results

To reproduce the results reported in the paper, you first have to create the YAML files for each fold and modality combination. Run the following script to do so:

(venv) $ python preprocessing/create_yaml_files.py resources/${modality}.yaml

The following YAML files are available in the resources folder: text.yaml, speech.yaml, vision.yaml, text_speech.yaml, text_vision.yaml, speech_vision.yaml, all_modalities.yaml.
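For example, the fold-specific YAML files for all modality combinations can be generated in one go with a simple shell loop over the files listed above:

for modality in text speech vision text_speech text_vision speech_vision all_modalities; do
    python preprocessing/create_yaml_files.py resources/${modality}.yaml
done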

After creating the YAML files for each fold, run the following script to perform leave-one-turn-out cross-validation:

(venv) $ sh run_scripts.sh $modality_folder
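Assuming one folder is created per modality combination, all cross-validation runs can be started with a loop analogous to the one above (the folder names are assumptions; adjust them to the actual output of create_yaml_files.py):

for modality_folder in text speech vision text_speech text_vision speech_vision all_modalities; do
    sh run_scripts.sh ${modality_folder}
done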
