DEAP: A Database for Emotion Analysis using Physiological Signals
S. Koelstra, C. Muehl, M. Soleymani, J.-S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, I. Patras. In IEEE Transactions on Affective Computing, Special Issue on Naturalistic Affect Resources for System Building and Evaluation, in press.
@article{Koelstra11_2,
author = {S. Koelstra and C. Muehl and M. Soleymani and J.-S. Lee and A. Yazdani and T. Ebrahimi and T. Pun and A. Nijholt and I. Patras},
journal = {IEEE Transactions on Affective Computing, Special Issue on Naturalistic Affect Resources for System Building and Evaluation},
title = {DEAP: A Database for Emotion Analysis using Physiological Signals},
note = {in press},
abstract={We present a multimodal dataset for the analysis of human affective states. The electroencephalogram (EEG) and peripheral physiological signals of 32 participants were recorded as each watched 40 one-minute long excerpts of music videos. Participants rated each video in terms of the levels of arousal, valence, like/dislike, dominance and familiarity. For 22 of the 32 participants, frontal face video was also recorded. A novel method for stimuli selection is proposed using retrieval by affective tags from the last.fm website, video highlight detection and an online assessment tool. An extensive analysis of the participants' ratings during the experiment is presented. Correlates between the EEG signal frequencies and the participants' ratings are investigated. Methods and results are presented for single-trial classification of arousal, valence and like/dislike ratings using the modalities of EEG, peripheral physiological signals and multimedia content analysis. Finally, decision fusion of the classification results from the different modalities is performed. The dataset is made publicly available and we encourage other researchers to use it for testing their own affective state estimation methods.},
}
Continuous Emotion Detection in Response to Music Videos
M. Soleymani, S. Koelstra, I. Patras, T. Pun. In International Workshop on Emotion Synthesis, rePresentation, and Analysis in Continuous spacE (EmoSPACE) In conjunction with the IEEE FG 2011, pages 803-808, 2011.
@inproceedings{Soleymani,
author = {M. Soleymani and S. Koelstra and I. Patras and T. Pun},
booktitle = {International Workshop on Emotion Synthesis, rePresentation, and Analysis in Continuous spacE (EmoSPACE) In conjunction with the IEEE FG 2011},
title = {{Continuous Emotion Detection in Response to Music Videos}},
pages = {803--808},
year = {2011},
abstract={Viewers' preference for multimedia selection depends highly on their emotional experience. In this paper, we present an emotion detection method for music videos using central and peripheral nervous system physiological signals as well as multimedia content analysis. A set of 40 music clips eliciting a broad range of emotions were first selected. After extracting the one minute long emotional highlight of each video, they were shown to 32 participants while their physiological responses were recorded. Participants self-reported their felt emotions after watching each clip by means of arousal, valence, dominance, and liking ratings. The physiological signals included electroencephalogram, galvanic skin response, respiration pattern, skin temperature, electromyograms and blood volume pulse using plethysmograph. Emotional features were extracted from the signals and the multimedia content. The emotional features were used to train a linear ridge regressor to detect emotions for each participant using a leave-one-out cross-validation strategy. The performance of the personalized emotion detection is shown to be significantly superior to a random regressor.}
}
A Dynamic Texture based Approach to Recognition of Facial Actions and their Temporal Models
S. Koelstra, M. Pantic and I. Patras. In IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 32, number 11, pages 1940-1954, 2010.
@article{Koelstra10,
author = {S. Koelstra and M. Pantic and I. Patras},
title = {A Dynamic Texture based Approach to Recognition of Facial Actions and their Temporal Models},
journal = {IEEE Trans. Pattern Analysis and Machine Intelligence},
pages={1940--1954},
year={2010},
volume={32},
number={11},
abstract = "In this work we propose a dynamic-texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modelling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images and a novel method based on Non-rigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domain. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2\% for the MHI method and of 94.3\% for the FFD method. The generalization performance of the FFD method has been tested using the Cohn-Kanade database. Finally, we also explored the performance on spontaneous expressions in the Sensitive Artificial Listener dataset.",
}
Single Trial Classification of EEG and Peripheral Physiological Signals for Recognition of Emotions Induced by Music Videos
S. Koelstra, A. Yazdani, M. Soleymani, C. Muehl, J.-S. Lee, A. Nijholt, T. Pun, T. Ebrahimi, I. Patras. In Conference on Brain Informatics, pages 89-100, 2010.
@inproceedings{Koelstra10_1,
author = {S. Koelstra and A. Yazdani and M. Soleymani and C. Muehl and J.-S. Lee and A. Nijholt and T. Pun and T. Ebrahimi and I. Patras},
booktitle = {Conference on Brain Informatics},
pages = {89--100},
year = 2010,
title = {{Single Trial Classification of EEG and Peripheral Physiological Signals for Recognition of Emotions Induced by Music Videos}},
abstract = {{Recently, the field of automatic recognition of users' affective states has gained a great deal of attention. Automatic, implicit recognition of affective states has many applications, ranging from personalized content recommendation to automatic tutoring systems. In this work, we present some promising results of our research in classification of emotions induced by watching music videos. We show robust correlations between users' self-assessments of arousal and valence and the frequency powers of their EEG activity. We present methods for single trial classification using both EEG and peripheral physiological signals. For EEG, an average (maximum) classification rate of 55.7\% (67.0\%) for arousal and 58.8\% (76.0\%) for valence was obtained. For peripheral physiological signals, the results were 58.9\% (85.5\%) for arousal and 54.2\% (78.5\%) for valence.}},
}
EEG analysis for implicit tagging of video data
S. Koelstra, C. Muehl and I. Patras. In Workshop on Affective Brain-Computer Interfaces, Proc. ACII, pages 27-32, 2009.
@inproceedings{Koelstra09_2,
author = {S. Koelstra and C. Muehl and I. Patras},
title = {EEG analysis for implicit tagging of video data},
booktitle = {Workshop on Affective Brain-Computer Interfaces, Proc. ACII},
year = {2009},
pages = {27--32},
abstract = "In this work, we aim to find neuro-physiological indicators to validate tags attached to video content.
Subjects are shown a video and a tag and we aim to determine whether the shown tag was congruent with the presented video by detecting the occurrence of an N400 event-related potential.
Tag validation could be used in conjunction with a vision-based recognition system as a feedback mechanism to improve the classification accuracy for multimedia indexing and retrieval.
An advantage of using the EEG modality for tag validation is that it is a way of performing implicit tagging.
This means it can be performed while the user is passively watching the video.
Independent Component Analysis and repeated measures ANOVA are used for analysis.
Our experimental results show a clear occurrence of the N400 and a significant difference in N400 activation between matching and non-matching tags."
}
The FAST-3D Spatio-Temporal Interest Region Detector
S. Koelstra and I. Patras. In Workshop on Image Analysis for Multimedia Interactive Services, pages 242-245, 2009.
@inproceedings{Koelstra09,
author = {S. Koelstra and I. Patras},
title = {The FAST-3D Spatio-Temporal Interest Region Detector},
booktitle = {Workshop on Image Analysis for Multimedia Interactive Services},
year = {2009},
pages = {242--245},
abstract="Spatio-temporal interest region detectors can be used in the analysis of video to determine sparse, informative regions as candidates for feature extraction.
In this paper we compare existing detectors and introduce the new FAST-3D detector, loosely based on the FAST spatial interest region detector.
We compare the invariance of detectors to rotation, scale and compression by measuring the similarity between detected interest regions in original and transformed versions of videos.
We measure the repeatability and introduce a new similarity measure based on mutual information.
The FAST-3D detector is shown to be on par with the other detectors, while showing a significant increase in speed."
}
Non-rigid registration using free-form deformations for recognition of facial actions and their temporal dynamics
S. Koelstra and M. Pantic. In Proc. IEEE Conf. Face and Gesture Recognition, pages 1-8, 2008.
@INPROCEEDINGS{Koelstra08,
author = {S. Koelstra and M. Pantic},
title = {Non-rigid registration using free-form deformations for recognition of facial actions and their temporal dynamics},
booktitle = "Proc. IEEE Conf. Face and Gesture Recognition",
year = {2008},
pages = {1--8},
abstract = "In this paper we propose an appearance-based approach to recognition
of facial Action Units (AUs) and their temporal segments in frontal-view face videos. Non-rigid registration using free-form deformations is used to determine motion in the face region of an input video. The extracted motion fields are then used to derive motion histogram descriptors. Per AU, a combination of ensemble learners and Hidden Markov Models detects the
presence of the AU in question and its temporal segment in each frame of an input sequence.
When tested for recognition of all 27 lower and upper face AUs, occurring alone or in
combination in 264 sequences from the MMI facial expression database, an average sequence classification rate of 94.3\% was achieved."
}


