-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolves #118: text to speech #119
base: master
Are you sure you want to change the base?
Conversation
branch for #118
...just for fun :P
Please have a review (before I can't stop myself and put even more in this "just trying smth real quick"-issue :D). Unfortunately currently the game is only startable from the IDE. When building with maven and starting the
:( Hint: for proper testing/debugging you should adjust the logback.xml: <logger name="de.gleex.pltcmd.game.ui" level="INFO"/>
<logger name="de.gleex.pltcmd.game.sound" level="DEBUG"/> |
game/options/src/main/kotlin/de/gleex/pltcmd/game/options/GameOptions.kt
Outdated
Show resolved
Hide resolved
Done. BTW as a future improvement I thought about "randomized effects". To not make it too monotonous have slight variations in the used effects for every voice line, for example adjust the rate to have slightly faster or slower replay. Or even better: Derive some pitching or equal effect from the callsign so that every element has "its own unique voice". But that is something we can add later. That is the reason why the filename consists of the text and the effect hash. Actually the speech example first had the option to set different effects to directly compare how they sound on the same text. And it always generated one wav file for a voice line and changing the effect did nothing :D |
import javax.sound.sampled.SourceDataLine | ||
|
||
/** | ||
* Use this object for text-to-speech. All strings given to [say] wll be read aloud one after another. This means |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Use this object for text-to-speech. All strings given to [say] wll be read aloud one after another. This means | |
* Use this object for text-to-speech. All strings given to [say] will be read aloud one after another. This means |
|
||
playLoop = GlobalScope.launch { | ||
log.debug("Launching play loop") | ||
while (isActive) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be always true as "Stopped play loop" is never logged! I think an own scope is needed that can be cancelled.
log.debug("Ignored phrase detected, not saying '$text'") | ||
return | ||
} | ||
val speechDirectory = File(FOLDER_SPEECH_FILES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
// reuse existing texts if they had the same effects | ||
mary.audioEffects = effects.toString() | ||
val path = "${speechDirectory.absolutePath}/${text.hashCode()}_${mary.audioEffects.hashCode()}.wav" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, use a model class (File
/Path
) instead of String! So you don't have to use the os specific path separator '/ '.
|
||
// reuse existing texts if they had the same effects | ||
mary.audioEffects = effects.toString() | ||
val path = "${speechDirectory.absolutePath}/${text.hashCode()}_${mary.audioEffects.hashCode()}.wav" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hashCode()
method returns the same hash only in the same JRE runtime instance! It is not fix for multiple application startups!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are deleted at (normal) JVM-termination anyways ;)
} | ||
soundFile.deleteOnExit() | ||
|
||
filenames.send(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use soundFile
instead of path
as String is not very useful.
log.debug("Starting MaryTTS engine...") | ||
mary = LocalMaryInterface() | ||
if(Mary.currentState() == Mary.STATE_RUNNING) { | ||
log.debug("MaryTTS started.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you mention in the KDoc that the startup may take some time, it would be nice to log the time it took. For me currently the difference between the two log messages is about 1.500 ms.
@@ -50,6 +55,8 @@ open class Main { | |||
open fun run() { | |||
val application = SwingApplications.startApplication(UiOptions.buildAppConfig()) | |||
|
|||
Speaker.startup() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this takes some time consider calling this asynchronous!
private fun play(filename: String) { | ||
val soundFile = File(filename) | ||
if (soundFile.exists()) { | ||
val audioStream = AudioSystem.getAudioInputStream(soundFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check if reading files is faster than calculating the text-to-speech. Because disk i/o is relatively slow. But we could also cache the bytes in memory (for example with a fixed 10 MiB pool). But this are performance considerations.
|
||
it.drain() | ||
} | ||
} finally { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you want to lock the audio when in use. So instead of an AtomicBoolean I would suggest a corresponding mutual exclusion functionality from java.util.concurrent. Maybe a Semaphore
is more approtiate as a Lock
. The benefit is that you can just wait for it to be freed instead of the while-delay in waitForQueueToEmpty()
.
Edit: This looks like a Guarded Block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS: maybe we can just use Jobs? They can be waited for (joined).
it.start() | ||
|
||
var count = 0 | ||
val buffer = ByteArray(4096) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 4096 instead of sourceDataLine.getBufferSize()
?
@ExperimentalCoroutinesApi | ||
object Speaker { | ||
|
||
private const val FOLDER_SPEECH_FILES = "./speech" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the os specific "./" and just use the name. Or provide a path object (see below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks and works good! For some implementation detail improvements see the comments.
…peech # Conflicts: # game/application/src/main/resources/logback.xml # game/ui/src/main/kotlin/de/gleex/pltcmd/game/ui/views/GameView.kt
# Conflicts: # game/application/src/main/resources/logback.xml
I stumbled upon a text-to-speech library and thought about letting it read out the radio calls. And well, a little tinkering and... here we are :)