Publication details

BanglaSER : A speech emotion recognition dataset for the Bangla language

© 2022 The Author(s). Published by Elsevier Inc.

A speech emotion recognition system determines a speaker's emotional state by analyzing the speaker's speech audio signal. It is an essential yet challenging task in human-computer interaction systems and one of the most demanding areas of research using artificial intelligence and deep learning architectures. Despite being the world's seventh most widely spoken language, Bangla is still classified as a low-resource language for speech emotion recognition tasks because too little data is available. This article presents BanglaSER, a Bangla speech emotion recognition dataset, to address this gap. It consists of speech-audio data from 34 speakers aged 19 to 47 years, balanced between 17 male and 17 female nonprofessional actors. The dataset contains 1467 Bangla speech-audio recordings of five basic human emotional states: angry, happy, neutral, sad, and surprise. Three trials are conducted for each emotional state. Hence, the total comprises 3 statements × 3 repetitions × 4 emotional states (angry, happy, sad, and surprise) × 34 speakers = 1224 recordings, plus 3 statements × 3 repetitions × 1 emotional state (neutral) × 27 speakers = 243 recordings, for a total of 1467 recordings. The BanglaSER dataset was created by recording speech audio with smartphones and laptops. It has a balanced number of recordings in each category and an even distribution of male and female actors, and it would serve as an essential training dataset for Bangla speech emotion recognition models in terms of generalization.
BanglaSER is compatible with various deep learning architectures such as convolutional neural networks, long short-term memory networks, gated recurrent units, and Transformers. The dataset is available at https://data.mendeley.com/datasets/t9h6p943xy/5 and can be used for research purposes.
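
The recording count given in the abstract can be checked with a few lines of Python. This is only a sketch of the stated arithmetic; the function and variable names are illustrative and do not come from the dataset or its documentation.

```python
# Reproduce the recording count stated in the abstract.
# All names here are hypothetical; only the numbers come from the abstract.
STATEMENTS = 3
REPETITIONS = 3


def recordings(num_emotions: int, num_speakers: int) -> int:
    """Recordings = statements x repetitions x emotions x speakers."""
    return STATEMENTS * REPETITIONS * num_emotions * num_speakers


# Angry, happy, sad, and surprise were recorded by all 34 speakers.
non_neutral = recordings(num_emotions=4, num_speakers=34)
# Neutral was recorded by 27 speakers.
neutral = recordings(num_emotions=1, num_speakers=27)
total = non_neutral + neutral

print(non_neutral, neutral, total)  # 1224 243 1467
```

By the same formula, each of the four non-neutral emotions accounts for 3 × 3 × 34 = 306 recordings, while neutral accounts for 243.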

Media type:	E-article

Year of publication:	2022
Published:	2022

Contained in:	Link to full record - volume:42
Contained in:	Data in brief - 42(2022), 31 June, page 108091

Language:	English

Contributors:	Das, Rakesh Kumar [Author]; Islam, Nahidul [Author]; Ahmed, Md Rayhan [Author]; Islam, Salekul [Author]; Shatabda, Swakkhar [Author]; Islam, A K M Muzahidul [Author]

Links:	Full text

Subjects:	Bangla language; Deep Learning; Journal Article; Sound processing; Speech emotion recognition

Notes:	Date Revised 09.04.2022; published: Electronic-eCollection; Citation Status PubMed-not-MEDLINE

doi:	10.1016/j.dib.2022.108091

Funding:
Funding institution / Project title:

PPN (catalog ID):	NLM339223111

Internal format


LEADER	01000naa a22002652 4500
001	NLM339223111
003	DE-627
005	20231226002421.0
007	cr uuu---uuuuu
008	231226s2022 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1016/j.dib.2022.108091 \|2 doi
028	5	2	\|a pubmed24n1130.xml
035			\|a (DE-627)NLM339223111
035			\|a (NLM)35392615
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Das, Rakesh Kumar \|e verfasserin \|4 aut
245	1	0	\|a BanglaSER \|b A speech emotion recognition dataset for the Bangla language
264		1	\|c 2022
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 09.04.2022
500			\|a published: Electronic-eCollection
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a © 2022 The Author(s). Published by Elsevier Inc.
520			\|a The speech emotion recognition system determines a speaker's emotional state by analyzing his/her speech audio signal. It is an essential at the same time a challenging task in human-computer interaction systems and is one of the most demanding areas of research using artificial intelligence and deep machine learning architectures. Despite being the world's seventh most widely spoken language, Bangla is still classified as one of the low-resource languages for speech emotion recognition tasks because of inadequate availability of data. There is an apparent lack of speech emotion recognition dataset to perform this type of research in Bangla language. This article presents a Bangla language-based emotional speech-audio recognition dataset to address this problem. BanglaSER is a Bangla language-based speech emotion recognition dataset. It consists of speech-audio data of 34 participating speakers from diverse age groups between 19 and 47 years, with a balanced 17 male and 17 female nonprofessional participating actors. This dataset contains 1467 Bangla speech-audio recordings of five rudimentary human emotional states, namely angry, happy, neutral, sad, and surprise. Three trials are conducted for each emotional state. Hence, the total number of recordings involves 3 statements × 3 repetitions × 4 emotional states (angry, happy, sad, and surprise) × 34 participating speakers = 1224 recordings + 3 statements × 3 repetitions × 1 emotional state (neutral) × 27 participating speakers = 243 recordings, resulting in a total number of recordings of 1467. BanglaSER dataset is created by recording speech-audios through smartphones, and laptops, having a balanced number of recordings in each category with evenly distributed participating male and female actors, and would serve as an essential training dataset for the Bangla speech emotion recognition model in terms of generalization. 
BanglaSER is compatible with various deep learning architectures such as Convolutional neural networks, Long short-term memory, Gated recurrent unit, Transformer, etc. The dataset is available at https://data.mendeley.com/datasets/t9h6p943xy/5 and can be used for research purposes
650		4	\|a Journal Article
650		4	\|a Bangla language
650		4	\|a Deep Learning
650		4	\|a Sound processing
650		4	\|a Speech emotion recognition
700	1		\|a Islam, Nahidul \|e verfasserin \|4 aut
700	1		\|a Ahmed, Md Rayhan \|e verfasserin \|4 aut
700	1		\|a Islam, Salekul \|e verfasserin \|4 aut
700	1		\|a Shatabda, Swakkhar \|e verfasserin \|4 aut
700	1		\|a Islam, A K M Muzahidul \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t Data in brief \|d 2014 \|g 42(2022) vom: 31. Juni, Seite 108091 \|w (DE-627)NLM251298183 \|x 2352-3409 \|7 nnns
773	1	8	\|g volume:42 \|g year:2022 \|g day:31 \|g month:06 \|g pages:108091
856	4	0	\|u http://dx.doi.org/10.1016/j.dib.2022.108091 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a GBV_NLM
951			\|a AR
952			\|d 42 \|j 2022 \|b 31 \|c 06 \|h 108091
