Publication:
BEE-MER: Bimodal Embeddings Ensemble for Music Emotion Recognition

dc.contributor.author: Lima Louro, Pedro Miguel
dc.contributor.author: Ribeiro, Tiago F. R.
dc.contributor.author: Malheiro, Ricardo
dc.contributor.author: Panda, Renato
dc.contributor.author: Pinto de Carvalho e Paiva, Rui Pedro
dc.date.accessioned: 2025-10-20T13:22:52Z
dc.date.available: 2025-10-20T13:22:52Z
dc.date.issued: 2025-07-07
dc.description.abstract: Static music emotion recognition systems typically focus on audio for classification, although some research has explored the potential of analyzing lyrics as well. Both approaches face challenges in accurately discerning emotions that have similar energy but differing valence, and vice versa, depending on the modality used. Previous studies have introduced bimodal audio-lyrics systems that outperform single-modality solutions by combining information from standalone systems and conducting joint classification. In this study, we propose and compare two bimodal approaches: one strictly based on embedding models (audio and word embeddings) and another following a standard spectrogram-based deep learning method for the audio part. Additionally, we explore various information fusion strategies to leverage both modalities effectively. The main conclusions of this work are the following: i) the two approaches show comparable overall classification performance; ii) the embedding-only approach leads to higher confusion between quadrants 3 and 4 of Russell's circumplex model; and iii) this approach requires significantly less computational cost for training. We discuss the insights gained from the approaches we experimented with and highlight promising avenues for future research.
dc.identifier.source-work-id: cv-prod-id-4810273
dc.identifier.uri: http://hdl.handle.net/10400.26/59278
dc.language.iso: eng
dc.peerreviewed: yes
dc.publisher: 7
dc.relation: MERGE
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.title: BEE-MER: Bimodal Embeddings Ensemble for Music Emotion Recognition
dc.type: conference paper
dcterms.references: https://zenodo.org/records/13939205
dcterms.references: https://arxiv.org/abs/2407.06060
dspace.entity.type: Publication
oaire.citation.conferenceDate: 2025-07-07
oaire.citation.conferencePlace: Graz, Austria
oaire.citation.title: 22nd Sound and Music Computing Conference – SMC 2025
oaire.version: http://purl.org/coar/version/c_ab4af688f83e57aa
relation.isAuthorOfPublication: 953b69db-f6a0-4a54-8618-93a912df6df6
relation.isAuthorOfPublication: d38dd344-0942-4fcb-b740-a4713fb170e7
relation.isAuthorOfPublication: 9cd470af-3968-45cc-ad6f-3b59e00ae823
relation.isAuthorOfPublication: 238ee6f8-61cd-49b4-9392-9e7763fd35f3
relation.isAuthorOfPublication.latestForDiscovery: 953b69db-f6a0-4a54-8618-93a912df6df6
relation.isProjectOfPublication: 32dd2667-ca4e-430b-a130-687ba6eee2e9
relation.isProjectOfPublication.latestForDiscovery: 32dd2667-ca4e-430b-a130-687ba6eee2e9
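The abstract describes bimodal fusion of audio and lyric embeddings for classifying songs into the four quadrants of Russell's circumplex model. Purely as illustration, the sketch below shows one generic embedding-level (early) fusion scheme: concatenating pre-computed modality embeddings and scoring them with a linear classifier. All dimensions, weights, and function names here are hypothetical and are not taken from the paper, which compares several fusion strategies.

```python
import numpy as np

# Four quadrants of Russell's circumplex model (valence x arousal).
QUADRANTS = ["Q1", "Q2", "Q3", "Q4"]

def fuse_embeddings(audio_emb, lyric_emb):
    """Early fusion: concatenate modality embeddings into one feature vector."""
    return np.concatenate([audio_emb, lyric_emb])

def predict_quadrant(fused, weights, bias):
    """Score the fused vector with a linear layer and take the argmax quadrant."""
    scores = weights @ fused + bias
    return QUADRANTS[int(np.argmax(scores))]

# Illustrative dimensions only: 128-d audio embedding, 300-d averaged word embedding.
rng = np.random.default_rng(0)
audio_emb = rng.standard_normal(128)
lyric_emb = rng.standard_normal(300)
fused = fuse_embeddings(audio_emb, lyric_emb)

# Randomly initialized classifier, standing in for a trained model.
weights = rng.standard_normal((len(QUADRANTS), fused.size))
bias = np.zeros(len(QUADRANTS))
print(predict_quadrant(fused, weights, bias))
```

With random weights the predicted quadrant is arbitrary; the point is only the data flow of concatenation-based fusion, one of several strategies a bimodal system might use.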

Files

Original bundle

Name: SMC_2025_Louro.pdf
Size: 203.61 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.85 KB
Format: Item-specific license agreed upon to submission