Change spoken corpus format into ELAN

Pre-release
@Guanchishan Guanchishan released this 19 Aug 12:18
· 211 commits to master since this release

In this version, we use ELAN to annotate the spoken corpus, except for the videos.

We plan to annotate the remaining spoken corpus with ELAN and to upload our annotation format for the text corpus.


In 0.3, we drew up the format for text corpus annotation. (2020-04-17)

In v0.3-bata.2, we saved the sound recordings inside our corpus. (2020-04-16)

In v0.3-bata.1, we began multimedia annotation, saving the annotations as .srt files. (2019-11-16)

In v0.2, we converted the text corpus content into a list data structure. (2019-11-16)

In v0.2-beta.1, we collected corpus material in conversational form and uploaded it to GitHub. However, we did not record the audio. (2019-10-26)

In v0.1.2, we began to collect a narrative spoken corpus, saved as .mp3 and .doc files. (2017-12-01)

In v0.1.1, we began collecting sound recordings of Eastern Min words and published them via Wikimedia Commons. (2017-10-30)

In v0.1, we tried to collect Eastern Min sentences. However, we did not record the audio. (2017)