{"name":"MTDSR2015","tagline":"A Mandarin corpus for text-dependent speaker recognition.","body":"## Introduction to MTDSR2015 database\r\nMTDSR2015 is the first free, public Mandarin database recorded with smartphones. It is published by the **_Advanced Data & Signal Processing Laboratory_** at Peking University.\r\nThe original recording was conducted in 2015 by Junhong Liu. The name 'MTDSR2015' stands for _Mandarin corpus for Text-dependent Speaker Recognition_. This database was supported by Prof. Yuexian Zou. We hope to provide a toy database for new researchers in the field of speaker verification and speech recognition. The database is completely free for academic users.\r\nThe MTDSR2015 database aims to provide the community with a sufficient Mandarin corpus for prompted-text speaker verification research for smartphone applications. It currently contains **_52940_** audio recordings from **_181_** speakers, including **_102_** male speakers and **_79_** female speakers. Each speaker recorded **_five_** parts:\r\n\r\n 1. Twenty 8-digit sequences\r\n 2. Fifteen poems\r\n 3. Fifteen news sentences\r\n 4. Twenty to thirty phrases and daily expressions (the number varies)\r\n 5. Two lyrics\r\n\r\nThe 8-digit sequences are randomly generated, and the other materials are selected randomly from the corresponding pre-defined text database.\r\n\r\nConsidering their popularity in the Chinese market, four top-selling smartphone models were selected as voice recorders: iPhone 5C, Samsung Note3, HUAWEI mate7 and XM4.\r\n\r\nFor specific applications, the speaker population is selected to be as representative as possible of the target population, whereas databases designed for generic research purposes tend to cover the largest possible population and range of scenarios. \r\n\r\nFor the MTDSR2015 database, we consider the demographics of the population in terms of age and region, which are often considered the two main criteria that affect speaker verification engines. 
Selected speakers were between 22 and 51 years old to cover the target population that uses smartphones more often. \r\n\r\nAdditionally, we consider the effect of the regions that speakers come from. We tried to cover the provinces where people mainly speak Mandarin; the database covers 28 provinces and areas in China.\r\n\r\nIn MTDSR2015, we addressed the channel variability problem by recording with four mainstream smartphone models in the Chinese market: iPhone 5C, Samsung Note3, HUAWEI mate7 and XM4.\r\n\r\n## CONTENT\r\nThe entire package includes the full set of speech and language resources required to establish a Chinese speech recognition system.\r\n\r\n>|MTDSR2015\r\n>\r\n>|---------|-HWmate7\r\n>\r\n>|------------------|-spk001\r\n>\r\n>|---------------------------|-digits\r\n>\r\n>|------------------------------------|-001_01.wav\r\n>\r\n>|------------------------------------|-001_02.wav\r\n>\r\n>|------------------------------------|-001_03.wav\r\n>\r\n>|------------------------------------|-001_...\r\n>\r\n>|------------------------------------|-001_20.wav\r\n>\r\n>|---------------------------|-poem\r\n>\r\n>|------------------------------------|-001_21.wav\r\n>\r\n>|------------------------------------|-001_22.wav\r\n>\r\n>|------------------------------------|-001_23.wav\r\n>\r\n>|------------------------------------|-001_...\r\n>\r\n>|------------------------------------|-001_35.wav\r\n>\r\n>|------------------|-spk002\r\n>\r\n>|------------------|-spk003\r\n>\r\n>|------------------|-...\r\n>\r\n>|------------------|-spk128\r\n>\r\n>\r\n>|---------|-Samsung Note2\r\n>\r\n>|---------|-XM4\r\n>\r\n>|---------|-wav_data\r\n>\r\n>|---------|data.ls\r\n\r\n wav :signals, including the training/cv/test sets.\r\n spkXXX :stands for speaker #XXX.\r\n ls :configuration files that maintain path routes.\r\n\r\n## PERFORMANCE\r\nWe call for competition on this database. 
We conducted experiments on MTDSR2015 and RSR2015 to evaluate the performance of our proposed CDDD-SVS with different channel compensation methods, namely WCCN, NAP and LDA. The results demonstrate the effectiveness of our proposed CDDD-SVS developed with i-vector followed by WCCN, which achieves the best performance on MTDSR2015.\r\n\r\nResearchers are welcome to challenge the current state of the art and to offer advice!\r\n\r\n## LOCAL DOWNLOAD\r\nNot yet available\r\n\r\n## PUBLIC DOWNLOAD\r\nFor public download, you must fill in an [application form](yichihuang.github.io) first.\r\n\r\n## LICENSE\r\nAll the resources contained in the database are free for research institutes and individuals.\r\n\r\n No commercial use is permitted. \r\n\r\nWe would be very happy if you cite the following paper in your publications:\r\n\r\nMTDSR2015 : A Free mandarin corpus for text-dependent speaker recognition. [pdf]\r\n\r\n## PEOPLE\r\nJunhong Liu, Yuexian Zou, Yichi Huang @ADSP, Peking University Shenzhen Graduate School.\r\n\r\n## CONTACT\r\n\r\nJunhong Liu, Yuexian Zou\r\n\r\nADSP, Peking University Shenzhen Graduate School.\r\n\r\n[email protected]\r\n\r\nROOM A-306\r\n\r\nPeking University Shenzhen Graduate School\r\n\r\n[http://adsp.szpku.edu.cn/](http://adsp.szpku.edu.cn/)","google":"","note":"Don't delete this file! It's used internally to help with page regeneration."}