Enhancements to ASR API #5

Merged
merged 58 commits into from
Jun 23, 2020
58 commits
4904cd6
feat: Add sink arg and make stream optional
ar13pit Apr 23, 2020
7a4fd8c
feat: Add base class for sinks
ar13pit Apr 23, 2020
ab9c6a5
feat: Make WaveFileSink inherit from AudioSinkBase
ar13pit Apr 23, 2020
b1abddb
feat: Add method to open output stream
ar13pit Apr 23, 2020
2e47aba
feat: Add a common base class definition
ar13pit May 16, 2020
526c5c5
feat: Add source and sink args
ar13pit May 16, 2020
d7fb652
feat: Add method to link elements
ar13pit May 16, 2020
d6f4259
feat: Add definition of open method
ar13pit May 16, 2020
9cd803b
feat: Change base class to AsrPipelineElementBase
ar13pit May 16, 2020
4e967fa
feat: Add args timeout and chunk
ar13pit May 16, 2020
9f23331
feat: Update subclassing around AsrPipelineElementBase
ar13pit May 16, 2020
c88c9d5
feat: Make stream instance attribute and start in start
ar13pit May 16, 2020
e03e2ea
fix: Remove AudioSourceBase
ar13pit May 16, 2020
a432281
feat: Rename module to asr
ar13pit May 16, 2020
7f82b59
feat: Add timeout as an arg
ar13pit May 16, 2020
a930738
feat: Initialize timeout arg of super class
ar13pit May 16, 2020
af32118
feat: Add abstract method register_callback
ar13pit May 16, 2020
124f6d1
feat: Move to asr module and subclass AsrPipelineElementBase
ar13pit May 16, 2020
04e726e
feat: Import Asr class from asr
ar13pit May 16, 2020
9c87654
feat: Futurize module
ar13pit May 18, 2020
f396957
feat: Make source and sink private
ar13pit May 18, 2020
f1d1105
feat: Make open, next_chunk and close abstract methods
ar13pit May 18, 2020
3fa685f
fix: Do not override method definition
ar13pit May 18, 2020
681a885
refactor: Use default implementation of stop
ar13pit May 18, 2020
a9fd722
feat: Add return statement back
ar13pit May 18, 2020
f17ac8f
feat: Add preliminary design of the pipeline class
ar13pit May 19, 2020
b54083f
test: Comment out calls to sink
ar13pit May 19, 2020
ab3f994
feat: Add AsrPipeline to module exports
ar13pit May 19, 2020
7edf517
fix: Initialize source and sink to None and set using link
ar13pit May 19, 2020
5bf8bbc
feat: Only set non empty source or sink and not replace them
ar13pit May 19, 2020
acc2154
feat: Set properties of source and sink args to point to current object
ar13pit May 19, 2020
6c80542
feat: Add optional source arg
ar13pit May 19, 2020
87073ea
feat: Add optional sink arg
ar13pit May 19, 2020
e3cc3e4
feat: Add back optional source and sink args
ar13pit May 19, 2020
5d29e3b
feat: Add support to add multiple elements
ar13pit May 19, 2020
74e56dd
test: Add structure of io test on AsrPipeline
ar13pit May 19, 2020
989c63c
feat: Replace start_state with stop_state and complete stop API
ar13pit May 19, 2020
8648311
feat: Add register callback and callback execution
ar13pit May 19, 2020
598d1e1
feat: Add a iteration counter
ar13pit May 19, 2020
9d73079
test: Complete ASR pipeline io test script
ar13pit May 19, 2020
bef5f94
fix: Fix broken pipeline element check logic
ar13pit May 19, 2020
1453a44
feat: Make register_callback optional in elements
ar13pit May 19, 2020
8eec552
refactor: Remove commented sink code
ar13pit May 19, 2020
5f44874
feat: Add a finalize Event
ar13pit May 19, 2020
a4f7281
feat: Add internal finalize state
ar13pit May 19, 2020
9b179cb
fix: Fix setting of finalize state
ar13pit May 19, 2020
ef59ca3
fix: Remove default value of arg in abstractmethod
ar13pit May 19, 2020
a27bd2f
feat: Use internal function to catch StopIteration exception
ar13pit May 20, 2020
ffea3e5
feat: Add chunksize and rate to init, add open, close defs
ar13pit May 20, 2020
ab73e8b
feat: Convert while loop into single iteration function
ar13pit May 20, 2020
104d554
test: Restructure test using AsrPipeline
ar13pit May 20, 2020
e0c31d7
fix: Store model and make model, decoder private
ar13pit May 20, 2020
9c6bd39
feat: Set stop state on StopIteration
ar13pit May 20, 2020
00761dc
feat: Log info about finalizing decoding
ar13pit May 20, 2020
17a0594
test: Remove commented code
ar13pit May 20, 2020
12c8374
feat: Minor version bump
ar13pit May 20, 2020
ddca995
fix: Use ABC instead of ABCMeta for raising exceptions in python2/3
ar13pit Jun 17, 2020
e42e8a1
feat: Raise NotImplementedError as default
ar13pit Jun 17, 2020
2 changes: 1 addition & 1 deletion setup.py
@@ -9,7 +9,7 @@
import pkgconfig


-VERSION = "0.2.0"
+VERSION = "0.3.0"
PACKAGE = "yapykaldi"
PACKAGE_DIR = os.path.join('src', 'python')

120 changes: 0 additions & 120 deletions src/python/yapykaldi/asr.py

This file was deleted.

22 changes: 22 additions & 0 deletions src/python/yapykaldi/asr/__init__.py
@@ -0,0 +1,22 @@
"""
Yapykaldi ASR: Classes and functions for ASR pipeline
"""

__all__ = [
    # From .asr
    "Asr",

    # From .pipeline
    "AsrPipeline",

    # From .sources
    "PyAudioMicrophoneSource", "WaveFileSource",

    # From .sinks
    "WaveFileSink"
]

from .asr import Asr
from .pipeline import AsrPipeline
from .sources import PyAudioMicrophoneSource, WaveFileSource
from .sinks import WaveFileSink
84 changes: 84 additions & 0 deletions src/python/yapykaldi/asr/_base.py
@@ -0,0 +1,84 @@
"""Base classes for the ASR pipeline"""
from __future__ import print_function, division, absolute_import, unicode_literals
from builtins import *
from abc import ABC, abstractmethod
from threading import Event
import pyaudio


class AsrPipelineElementBase(ABC):
    """Class AsrPipelineElementBase is the base class for all Asr Pipeline elements.
    It requires three abstract methods to be implemented:
        1. open
        2. close
        3. next_chunk
    The right order of setting up an element is:
        1. element = AsrPipelineElementBase()
        2. element.open()  # To open the file, connect the mic etc.
        3. element.start()  # Start streaming audio data
        4. element.next_chunk()  # Use the audio data
        5. element.stop()  # stop getting audio data
        6. element.close()  # close the file
    Elements need to support open and close at least once but must support
    start, next_chunk, stop several times
    """
    # pylint: disable=too-many-instance-attributes

    def __init__(self, source=None, sink=None, rate=16000, chunksize=1024, fmt=pyaudio.paInt16, channels=1, timeout=1):
        self._source = None
        self._sink = None
        self.rate = rate
        self.chunksize = chunksize
        self.format = fmt
        self.channels = channels
        self.timeout = timeout
        self._finalize = Event()

        self.link(source=source, sink=sink)

    @abstractmethod
    def open(self):
        """Abstract method to open the stream of the element. Opening may or may not start the stream."""

    @abstractmethod
    def next_chunk(self, chunk):
        """Abstract method to process a chunk generated in the source element or received from the source element"""

    @abstractmethod
    def close(self):
        """Abstract method to close the stream of the element. In this method all resources of the stream should be
        freed."""

    def start(self):
        """Optional method to start the stream of the element"""

    def stop(self):
        """Optional method to stop the stream of the element"""

    def register_callback(self, callback):
        """Register a callback to the element outside the pipeline"""
Review comment (Contributor):
Shouldn't this either be implemented or raise NotImplementedError?

Reply (Contributor Author):
If we don't implement it, the default behavior is pass. Do you suggest adding a NotImplementedError?

Reply (Contributor):
Is pass a valid behavior here? Could there be a class that works correctly with pass? If not, use NotImplementedError. Otherwise, pass is a valid default implementation.
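
The trade-off raised in this thread can be sketched with two hypothetical base classes. `SilentBase`, `StrictBase` and `Recorder` are illustration names only and are not part of yapykaldi. A docstring-only body silently returns `None`, while raising `NotImplementedError` makes a missing override visible at the call site.

```python
class SilentBase(object):
    def register_callback(self, callback):
        """Docstring-only body: calling this does nothing and returns None."""


class StrictBase(object):
    def register_callback(self, callback):
        """Subclasses that support callbacks must override this."""
        raise NotImplementedError()


class Recorder(StrictBase):
    """Hypothetical subclass that actually stores its callbacks."""

    def __init__(self):
        self.callbacks = []

    def register_callback(self, callback):
        self.callbacks.append(callback)


# The callback is dropped without any signal to the caller.
silent_result = SilentBase().register_callback(print)

# The same mistake on the strict base fails loudly.
try:
    StrictBase().register_callback(print)
    strict_raised = False
except NotImplementedError:
    strict_raised = True

# A subclass that overrides the method works as usual.
recorder = Recorder()
recorder.register_callback(print)
```

Which default is preferable depends on whether an element that ignores callbacks is considered valid, which is the question the reviewer asks above.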

        raise NotImplementedError()

    def link(self, source=None, sink=None):
        """Link a source or a sink to the element
        This method does not override preset source or sink of the element.
        :param source: (default None) A source object
        :param sink: (default None) A sink object
        """
        if (not self._source) and source:
            self._source = source
            source.link(sink=self)

        if (not self._sink) and sink:
            self._sink = sink
            sink.link(source=self)

    def finalize(self):
        """Set the finalize flag of the element"""
        self._finalize.set()
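
The `link` method above wires both directions of a connection in one call: linking a source to an element also registers the element as that source's sink. The `not self._source` and `not self._sink` guards end the mutual calls and keep an existing link from being replaced. A minimal sketch of that behaviour, using a hypothetical stand-in class `Element` that copies only the linking logic (the real base class also depends on `pyaudio` and is not reproduced here):

```python
class Element(object):
    """Hypothetical stand-in that copies only the linking logic shown above."""

    def __init__(self, name, source=None, sink=None):
        self.name = name
        self._source = None
        self._sink = None
        self.link(source=source, sink=sink)

    def link(self, source=None, sink=None):
        # Only fill an empty slot; a preset source or sink is never replaced.
        if (not self._source) and source:
            self._source = source
            # Back-link; the guard on the other side ends the recursion.
            source.link(sink=self)

        if (not self._sink) and sink:
            self._sink = sink
            sink.link(source=self)


mic = Element("mic")
asr = Element("asr", source=mic)   # also sets mic._sink to asr
wav = Element("wav", source=asr)   # also sets asr._sink to wav

other = Element("other")
asr.link(source=other)             # ignored: asr already has a source

# Walk the chain from the first element through its sinks.
chain = [mic.name, mic._sink.name, mic._sink._sink.name]
print(chain)
```

Note that a rejected link leaves both sides untouched: in this sketch `other` does not gain `asr` as a sink, because the back-link is only made when the forward link is accepted.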
122 changes: 122 additions & 0 deletions src/python/yapykaldi/asr/asr.py
@@ -0,0 +1,122 @@
"""
Yapykaldi ASR: Class definition for ASR component. It connects to a source and an optional sink
"""
from __future__ import (print_function, division, absolute_import, unicode_literals)
from builtins import *
import struct
import numpy as np
from ._base import AsrPipelineElementBase
from ..logger import logger
from ..nnet3 import KaldiNNet3OnlineDecoder, KaldiNNet3OnlineModel
from ..gmm import KaldiGmmOnlineDecoder, KaldiGmmOnlineModel
from ..utils import volume_indicator


ONLINE_MODELS = {'nnet3': KaldiNNet3OnlineModel, 'gmm': KaldiGmmOnlineModel}
ONLINE_DECODERS = {'nnet3': KaldiNNet3OnlineDecoder, 'gmm': KaldiGmmOnlineDecoder}


class Asr(AsrPipelineElementBase):
    """API for ASR"""
    # pylint: disable=too-many-instance-attributes, useless-object-inheritance

    def __init__(self, model_dir, model_type, rate=16000, chunksize=1024, debug=False, source=None, sink=None):
        """
        :param model_dir: Path to model directory
        :param model_type: Type of ASR model, 'nnet3' or 'gmm'
        :param rate: (default 16000) sampling frequency of audio data. This must be the same as the audio source
        :param chunksize: (default 1024) size of audio data buffer. This must be the same as the audio source
        :param debug: (default False) Flag to set logger to log audio chunk volume and partially decoded string and
            likelihood
        :param source: (default None) Element to be connected as source when constructing an AsrPipeline
        :type source: AsrPipelineElementBase
        :param sink: (default None) Element to be connected as sink when constructing an AsrPipeline
        :type sink: AsrPipelineElementBase
        """
        super().__init__(chunksize=chunksize, rate=rate, source=source, sink=sink)
        self.model_dir = model_dir
        self.model_type = model_type

        self._model = None
        self._decoder = None
        self._decoded_string = None
        self._likelihood = None

        self._string_partially_recognized_callbacks = []
        self._string_fully_recognized_callbacks = []

        self._debug = debug

    def open(self):
        # No definition for this method while inheriting abstract class AsrPipelineElementBase
        pass

    def close(self):
        # No definition for this method while inheriting abstract class AsrPipelineElementBase
        pass

    def next_chunk(self, chunk):
        """Method to start the recognition process on audio stream added to process queue"""
        try:
            data = np.array(struct.unpack_from('<%dh' % self.chunksize, chunk), dtype=np.float32)
        except Exception as e:  # pylint: disable=invalid-name, broad-except
            logger.error("Other exception happened: %s", e)
            raise
        else:
            if self._decoder.decode(self.rate, data, self._finalize.is_set()):
                if self._finalize.is_set():
                    logger.info("Finalized decoding with latest data chunk")

                self._decoded_string, self._likelihood = self._decoder.get_decoded_string()
                if self._debug:
                    chunk_volume_level = volume_indicator(data)
                    logger.info("Chunk volume level: %s", chunk_volume_level)
                    logger.info("Partially decoded (%s): %s", self._likelihood, self._decoded_string)

                for callback in self._string_partially_recognized_callbacks:
                    callback(self._decoded_string)

                return chunk

            raise RuntimeError("Decoding failed")

    def stop(self):
        """Stop ASR process"""
        logger.info("Stop ASR")

        logger.info("Decoding of input stream is complete")
        logger.info("Final result (%s): %s", self._likelihood, self._decoded_string)

        for callback in self._string_fully_recognized_callbacks:
            callback(self._decoded_string)

    def start(self):
        """Begin ASR process"""
        logger.info("Starting speech recognition")
        # Reset internal states at the start of a new call

        self._finalize.clear()

        logger.info("Trying to initialize %s model from %s", self.model_type, self.model_dir)
        self._model = ONLINE_MODELS[self.model_type](self.model_dir)
        logger.info("Successfully initialized %s model from %s", self.model_type, self.model_dir)

        logger.info("Trying to initialize %s model decoder", self.model_type)
        self._decoder = ONLINE_DECODERS[self.model_type](self._model)
        logger.info("Successfully initialized %s model decoder", self.model_type)

        self._decoded_string = ""
        self._likelihood = None

    def register_callback(self, callback, partial=False):
        """
        Register a callback to receive the decoded string, either partial or complete.
        :param callback: a function taking a single string as its parameter
        :param partial: (default False) flag to set callback for partial recognitions
        :return: None
        """
        if partial:
            self._string_partially_recognized_callbacks += [callback]
        else:
            self._string_fully_recognized_callbacks += [callback]
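
`next_chunk` above reads each chunk as `chunksize` little-endian signed 16-bit samples (format `'<%dh'`) before passing them to the decoder, and `register_callback` keeps separate lists for partial and final results. A minimal sketch of both ideas using only the standard library follows. `FakeAsr` and its placeholder "decoding" (it only reports the peak sample) are hypothetical stand-ins for the Kaldi model and decoder, which are not reproduced here.

```python
import struct


class FakeAsr(object):
    """Hypothetical stand-in: same callback bookkeeping, no Kaldi involved."""

    def __init__(self, chunksize=4):
        self.chunksize = chunksize
        self._partial_callbacks = []
        self._full_callbacks = []
        self._decoded_string = ""

    def register_callback(self, callback, partial=False):
        if partial:
            self._partial_callbacks.append(callback)
        else:
            self._full_callbacks.append(callback)

    def next_chunk(self, chunk):
        # '<%dh': little-endian, chunksize signed 16-bit integers (2 bytes each)
        samples = struct.unpack_from('<%dh' % self.chunksize, chunk)
        # Placeholder for decoding: report the peak absolute sample value.
        self._decoded_string = "peak=%d" % max(abs(s) for s in samples)
        for callback in self._partial_callbacks:
            callback(self._decoded_string)
        return chunk

    def stop(self):
        # Final callbacks fire once, with the latest result.
        for callback in self._full_callbacks:
            callback(self._decoded_string)


partial_results, final_results = [], []
asr = FakeAsr(chunksize=4)
asr.register_callback(partial_results.append, partial=True)
asr.register_callback(final_results.append)

asr.next_chunk(struct.pack('<4h', 0, 100, -300, 50))
asr.next_chunk(struct.pack('<4h', 10, -20, 30, -40))
asr.stop()

print(partial_results)
print(final_results)
```

A chunk shorter than `chunksize * 2` bytes makes `struct.unpack_from` raise `struct.error`; the real method logs and re-raises any exception from this step.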