500-Hours-Minnan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

Description

The 500 Hours Minnan Dialect Conversational Speech Data was collected by mobile phone from more than 1,000 native speakers, with a balanced gender ratio and geographical distribution. Speakers chose a few familiar topics from a given list and held conversations on them, which keeps the dialogue fluent and natural. The recordings were made on various mobile phones in quiet indoor environments. The audio format is 16 kHz, 16-bit, uncompressed WAV. All audio was manually transcribed, and the start and end timestamps of each effective sentence, together with speaker identification and gender, were annotated. The sentence accuracy rate is ≥ 95%.

For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1127?source=Github

Format

16kHz, 16bit, uncompressed wav, mono channel
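
As a quick sanity check after download, the format stated above can be verified per file. The following is a minimal sketch using only Python's standard-library `wave` module; the function names `check_wav_format` and `duration_seconds` are illustrative and not part of the dataset or any Nexdata tooling.

```python
import wave

# Expected values taken from the Format section of this README.
EXPECTED_SAMPLE_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1  # mono


def check_wav_format(path):
    """Return a list of mismatches; the list is empty if the file conforms."""
    problems = []
    # The wave module reads uncompressed PCM and raises wave.Error
    # for files it cannot parse.
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems


def duration_seconds(path):
    """Return the audio duration in seconds (frames divided by frame rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Summing `duration_seconds` over all files gives a rough check of the total recorded hours; the file and folder layout of the delivered data is not described here, so how the files are enumerated is left to the user.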

Recording Environment

quiet indoor environment, without echo

Recording content

dozens of topics are specified, and the speakers hold conversations on those topics while being recorded

Demographics

about 1,000 speakers, balanced for gender

Annotation

transcription text, start and end timestamps of each effective sentence, speaker identification and gender

Device

Android mobile phone, iPhone

Language

Minnan Dialect

Application scenarios

speech recognition; voiceprint recognition

Accuracy rate

sentence accuracy rate ≥ 95%

Licensing Information

Commercial License
