<?php
/**
* The template for displaying main page.
*/
error_reporting(E_ALL);
ini_set("display_errors", 1);
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
<meta name="description" content="The Arabic Speech Corpus or the Arabic Speech Database is an annotated speech corpus for high-quality speech synthesis. The annotations are to the phoneme level and include stress marks.">
<meta name="author" content="Nawar Halabi">
<link rel="icon" href="img/favicon.ico">
<title>Arabic Speech Corpus</title>
<!-- Bootstrap core CSS -->
<link href="css/bootstrap.min.css" rel="stylesheet">
<!-- Custom styles for this template -->
<link href="css/main.css" rel="stylesheet">
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.0.0-alpha1/jquery.min.js"></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-70300143-1', 'auto');
ga('send', 'pageview');
</script>
<script src="js/analytics.js"></script>
</head>
<body>
<!-- Begin page content -->
<div class="container">
<a href="http://en.arabicspeechcorpus.com">
<img id="site-logo" src="img/logo.png" alt="logo" />
</a>
<div class="language-selector">
<a href="http://ar.arabicspeechcorpus.com/" title="مجموع (قاعدة بيانات) النطق بالعربية">Arabic</a>
|
<a href="http://en.arabicspeechcorpus.com/" title="Arabic Speech Corpus">English</a>
</div>
<div class="page-header">
<h1>Arabic Speech Corpus</h1>
</div>
<p>This speech corpus was developed as part of PhD work carried out by <a href="https://uk.linkedin.com/pub/nawar-halabi/65/532/67b" title="Nawar Halabi">Nawar Halabi</a> at the <a href="http://www.southampton.ac.uk/" title="University of Southampton">University of Southampton</a>. The corpus was recorded in south Levantine Arabic (Damascene accent) in a professional studio. Speech synthesis using this corpus has produced a high-quality, natural voice.</p>
<p>It is released here under the Creative Commons license specified below. If you need further rights, or consultancy on building Arabic speech corpora, please contact <a href="mailto:[email protected]" target="_top">Nawar Halabi</a> by email. Thank you for your interest.</p>
<p>
<a class="btn btn-success btn-lg center-block download-btn" gacode="whole-version-1" href="arabic-speech-corpus.zip" title="Download Package">Download Corpus Package</a>
</p>
<p>Please feel free to try my high-quality diacritiser for Arabic, which is based on Conditional Random Fields and can run on mobile phones.</p>
<p>
<a class="btn btn-info btn-lg center-block download-btn" gacode="whole-version-1" href="diacritiser.php" target="_blank" title="High Quality Diacritiser Demo">High Quality Diacritiser Demo</a>
</p>
<div class="page-header">
<h1>The package includes</h1>
</div>
<ul>
<li>1813 .wav files containing spoken utterances.</li>
<li>1813 .lab files containing text utterances.</li>
<li>1813 .TextGrid files containing the phoneme labels with time stamps of the boundaries where these occur in the .wav files. These files can be opened using <a href="http://www.fon.hum.uva.nl/praat/" title="praat">Praat software</a>.</li>
<li>phonetic-transcript.txt, in which every line has the form "[wav_filename]" "[Phoneme Sequence]".</li>
<li>orthographic-transcript.txt, in which every line has the form "[wav_filename]" "[Orthographic Transcript]". The orthography is in <a href="http://www.qamus.org/transliteration.htm" title="buckwalter transliteration">Buckwalter Format</a>, which is easier to handle in software that cannot read Arabic script. It can easily be converted back to Arabic script.</li>
<li>An extra 18 minutes of fully annotated speech (separate from the files above but with the same structure), which was used to evaluate the corpus (see the PhD thesis). Feel free to use it in your applications.</li>
</ul>
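<!--
A minimal sketch, not part of the corpus package, of how the two transcript
files listed above could be read. It assumes only the line layout stated in
the list: a quoted wav filename, whitespace, then a quoted transcript. The
names parseTranscriptLine and parseTranscript are hypothetical, and lines
that do not match the assumed layout are skipped rather than repaired.

```javascript
// Hypothetical helper: split one line of the form
// "[wav_filename]" "[transcript]" into its two quoted fields.
function parseTranscriptLine(line) {
  const match = /^\s*"([^"]*)"\s+"(.*)"\s*$/.exec(line);
  if (match === null) {
    return null; // the line does not follow the assumed layout
  }
  return { wavFile: match[1], transcript: match[2] };
}

// Hypothetical helper: parse the full text of a transcript file,
// ignoring blank lines and lines that do not match the layout.
function parseTranscript(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map(parseTranscriptLine)
    .filter((entry) => entry !== null);
}
```

The same function should apply to phonetic-transcript.txt and
orthographic-transcript.txt, since the list describes both with the same
two-field layout; only the meaning of the second field differs.
-->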
<div class="page-header">
<h1>Documentation</h1>
</div>
<p>More documentation will be added in the future. Please refer to Nawar Halabi's <a href="Nawar Halabi PhD Thesis Revised.pdf" title="Download Nawar's PhD Thesis">PhD Thesis</a> for more details. Please note that an apostrophe following a vowel phoneme in the corpus indicates that the vowel is in a stressed syllable. Feel free to visit the <a href="https://en.wikipedia.org/wiki/Arabic_Speech_Corpus">Arabic Speech Corpus Wikipedia page</a> for more information about the corpus.</p>
<div class="page-header">
<h1>Demo</h1>
</div>
<p><a href="https://github.com/nawarhalabi/festival-tts-arabic-voices-docker">https://github.com/nawarhalabi/festival-tts-arabic-voices-docker</a>: this repository contains a Docker image for this TTS server, which runs easily on most platforms.</p>
<p>Thank you very much to Taha Zerrouki, Ahmad Barqawi, Karim Hemina and Oussama Hemina for their work to produce this TTS:</p>
<ol><li><a href="https://github.com/linuxscout/festival-tts-arabic-voices">Festival for Arabic</a></li>
<li><a href="https://github.com/linuxscout/mishkal">Mishkal Diacritiser</a></li>
<li><a href="https://github.com/Barqawiz/Shakkala">Shakkala Diacritiser</a></li></ol>
<p>Thank you to Ali Hamdi, Ibrahim Tuffaha, Baraa' Al-Jawarneh and Mahmoud Al-Ayyoub for their work on Shakkelha which is the best diacritiser as far as I know. <a href="https://github.com/AliOsm/shakkelha">https://github.com/AliOsm/shakkelha</a></p>
<textarea id="input-text" dir="rtl" class="col-xs-12" name="arabic-text" rows="5" placeholder="Please enter text"></textarea>
<!--<div class="row">
<div class="col-xs-12">
<div class="g-recaptcha" data-sitekey="6LfAiCQUAAAAAAkuQUSoRpD6L-g4bHTOftuhI0yA"></div>
</div>
</div>-->
<input id="tts-btn-mishkal" class="btn btn-success" type="button" name="synthesise-mishkal" value="Synthesise (Mishkal as diacritiser)" />
<input id="tts-btn-shakkala" class="btn btn-success" type="button" name="synthesise-shakkala" value="Synthesise (Shakkala as diacritiser)" />
<input id="tts-btn-shakkelha" class="btn btn-success" type="button" name="synthesise-shakkelha" value="Synthesise (Shakkelha as diacritiser)" />
<audio controls>
<source id="source" src="" type="audio/wav">
Your browser does not support the audio element.
</audio>
<div id='waiting-gif'></div>
<div class="page-header">
<h1>License</h1>
</div>
<p>
<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a>.
<br />
Arabic Speech Corpus by <a href="/" title="Nawar Halabi" rel="cc:attributionURL">Nawar Halabi</a> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/"> Creative Commons Attribution 4.0 International License</a>. Based on a work at <a title="Arabic Speech Corpus" href="/" rel="dct:source">www.arabicspeechcorpus.com</a>.
</p>
<div class="page-header">
<h1>Help us keep the corpus free</h1>
</div>
<p>Developing and hosting the corpus costs time and money. You are welcome to make a contribution if you think we deserve it :)</p>
<form action="https://www.paypal.com/donate" method="post" target="_top">
<input type="hidden" name="hosted_button_id" value="GRY6H726LX5HG" />
<input type="image" src="https://www.paypalobjects.com/en_US/DK/i/btn/btn_donateCC_LG.gif" border="0" name="submit" title="PayPal - The safer, easier way to pay online!" alt="Donate with PayPal button" />
<img alt="" border="0" src="https://www.paypal.com/en_DE/i/scr/pixel.gif" width="1" height="1" />
</form>
</div>
<footer class="footer">
<div class="container">
<p class="text-muted">© <?php echo date("Y"); ?> Nawar Halabi. All rights reserved.</p>
</div>
</footer>
<script src="js/synthesise.js"></script>
</body>
</html>