20200727

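Besides snowboy (not open source), the best-known wake-word engine in the open-source community, there are several others.
They can all be found in rhasspy, an open-source offline voice assistant project:
one is porcupine, from the Canadian company Picovoice (not open source); another is mycroft-precise (open source), from a US company.
As for snowboy (not open source), I mentioned before that it was acquired by Baidu. I will skip pocketsphinx (open source) because it is too old.
In fact, if device portability is not a concern, even heavyweight projects such as deepspeech (open source), Julius (open source), and kaldi (open source)
could probably be built into embedded products,
for example on a Raspberry Pi. So the list is not limited to these few; there are likely many more.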

How to run the simple deepspeech command line on a Raspberry Pi 3B to recognize speech in a wav file.

An article I mentioned earlier describes experience with this:
https://www.seeedstudio.com/blog/2020/01/23/offline-speech-recognition-on-raspberry-pi-4-with-respeaker/
There is another one, which uses version 0.6.0:
https://dev.webonomic.nl/trying-out-deepspeech-on-a-raspberry-pi-4
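(1) Hardware: Raspberry Pi 3 and Raspberry Pi 4 are officially supported, so most ARM Linux boards should work too. Performance varies, though, as described in (5).
(2) Software: use the latest Raspbian image. I used 2020-05-27 Buster (the non-full image) and installed deepspeech==0.6.0 with the bundled python 3.7 and pip3.
Older python 3 versions and python 2 probably cannot install 0.6.0. pip3 also finds a package called deepspeech-tflite; that one can be ignored. sudo is required, otherwise the deepspeech command is not added to PATH.
(3) deepspeech command-line arguments: for the model file (--model), I recommend the one with the tflite suffix (TensorFlow Lite): output_graph.tflite.
There are two other model files, one with the pb suffix (short for protobuf) and one with the pbmm suffix (the mmap version of the protobuf). I have not tested them;
in theory the tflite version uses less memory and is less likely to crash. Two further arguments are trie (the dictionary trie) and lm (the language model). Reportedly both are optional, and the lm file in particular is very large,
but I have not tested running without them, so it is best to pass them (see the deepspeech help for usage).
(4) Package dependencies: deepspeech is said to be based on tensorflow, but installing the python package does not pull in tensorflow as a dependency. I suspect something else is going on here (presumably the runtime ships inside the package).
(5) Performance: on a Raspberry Pi 3B, recognition takes about 10 seconds (reportedly a Raspberry Pi 4 brings this down to about 2 seconds). The results are as follows
(transcript, transcription time, wav duration):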

experience proof of, 12.486s, 1.975s
why should one halt on the way, 14.917s, 2.735s
your paris efficient i said, 11.295s, 2.590s
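So this speech recognition engine has two major weaknesses: first, the official model only supports English; second, it is fairly heavyweight and needs reasonably powerful hardware to reach real-time transcription.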
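
To make the "not real-time" observation concrete, the three measurements above can be turned into a real-time factor (processing time divided by audio duration). This is only a small sketch over the numbers listed in these notes; the function and variable names are made up for illustration.

```python
# Measurements copied from the notes above:
# (transcript, transcription time in seconds, wav duration in seconds).
results = [
    ("experience proof of", 12.486, 1.975),
    ("why should one halt on the way", 14.917, 2.735),
    ("your paris efficient i said", 11.295, 2.590),
]


def real_time_factor(processing_s, audio_s):
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return processing_s / audio_s


rtfs = [real_time_factor(t, d) for _, t, d in results]

for (text, t, d), rtf in zip(results, rtfs):
    print(f"{text!r}: {t:.3f}s for {d:.3f}s of audio, RTF = {rtf:.2f}")

mean_rtf = sum(rtfs) / len(rtfs)
print(f"mean RTF = {mean_rtf:.2f}")
```

Every value is well above 1.0, which matches the conclusion that the Raspberry Pi 3B is several times too slow for real-time transcription with this setup.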

How many model formats does TensorFlow actually have?

https://blog.csdn.net/zjc910997316/article/details/82853791
lm: language model
lm and trie are optional
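
As a rough illustration of how the optional lm and trie arguments fit into the deepspeech 0.6.0 command line, the sketch below only builds the argument list and does not run anything. The helper name and the file names `test.wav`, `lm.binary`, and `trie` are placeholders I chose; only `output_graph.tflite` and the `--model`, `--lm`, `--trie` options come from the notes, and `--audio` is, as far as I know, the option the 0.6.0 client uses for the input file.

```python
import shlex


def build_deepspeech_command(audio, model="output_graph.tflite", lm=None, trie=None):
    """Return the argv list for a deepspeech command-line call (hypothetical helper).

    The lm/trie pair is added only when both are given, because the notes
    describe them as optional (and as untested without them).
    """
    argv = ["deepspeech", "--model", model]
    if lm and trie:
        argv += ["--lm", lm, "--trie", trie]
    argv += ["--audio", audio]
    return argv


# Placeholder file names; adjust to wherever the model files were unpacked.
minimal = build_deepspeech_command("test.wav")
full = build_deepspeech_command("test.wav", lm="lm.binary", trie="trie")

print(" ".join(shlex.quote(a) for a in minimal))
print(" ".join(shlex.quote(a) for a in full))
```

On the board itself, the resulting list could be passed to `subprocess.run(argv, check=True)`.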

A speech recognition implementation based on Freeswitch + Unimrcp + Google ASR

https://blog.csdn.net/chuiyg/article/details/90767769

pocketsphinx

https://github.com/Ebiroll/aiot.git

git clone https://github.com/cmusphinx/sphinxbase.git
git clone https://github.com/cmusphinx/pocketsphinx.git
cd sphinxbase
./autogen.sh
make
cd ..
cd pocketsphinx
./autogen.sh
make

src/programs/pocketsphinx_continuous -inmic yes -hmm model/en-us/en-us \
-lm model/en-us/en-us.lm.bin -dict model/en-us/cmudict-en-us.dict

simple.jsfg

#JSGF V1.0;
grammar all;
public <all> = turn ( on | off ) the lights;

src/programs/pocketsphinx_continuous -inmic yes -hmm model/en-us/en-us \
-dict model/en-us/cmudict-en-us.dict -jsgf simple.jsfg
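
To see what the grammar above restricts the recognizer to, the following sketch enumerates the sentences it accepts. It is not a JSGF parser: the rule is transcribed by hand into a list of alternatives, and the names `RULE_ALL` and `expand` are made up for illustration.

```python
from itertools import product

# Hand-written equivalent of: public <all> = turn ( on | off ) the lights;
# Each element lists the alternatives for one position in the rule.
RULE_ALL = [["turn"], ["on", "off"], ["the"], ["lights"]]


def expand(rule):
    """Return every sentence the rule accepts, in the order of the alternatives."""
    return [" ".join(words) for words in product(*rule)]


sentences = expand(RULE_ALL)
print(sentences)  # ['turn on the lights', 'turn off the lights']
```

With only two possible sentences, the decoder has a far smaller search space than with the full en-us language model used in the first command.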

DeepPavlov

https://github.com/deepmipt/DeepPavlov
http://www.diegorobot.com
https://github.com/andelf/PyAIML

website, html5

https://github.com/2fps/recorder
https://github.com/2fps/demo
https://github.com/giscafer/street-address-search/tree/40344aab0f0ed0d4b9d4deb72adeaa7a9fbd43d8

HTK

http://htk.eng.cam.ac.uk/download.shtml
https://labrosa.ee.columbia.edu/doc/HTKBook21/node1.html
https://www.zhihu.com/question/65516424
https://www.cnblogs.com/ansersion/p/4155951.html

Common Voice

https://commonvoice.mozilla.org/zh-CN/datasets

Python+TensorFlow Machine Learning in Action (Python+TensorFlow机器学习实战), Chapter 9

tensorflow_speech_recognition_demo, 9.2 Understanding digits
https://github.com/llSourcell/tensorflow_speech_recognition_demo
Speech recognition of English digits
https://blog.csdn.net/weixin_44345862/article/details/86887448
https://github.com/pannous/tensorflow-speech-recognition/blob/master/number_classifier_tflearn.py
spoken_numbers_pcm dataset
https://github.com/pannous/tensorflow-speech-recognition
see /spoken_numbers_pcm.tar
ChineseTrain, 9.3 Understanding Chinese
https://github.com/illool/TensorFlow/tree/master/ChineseTrain
http://www.openslr.org/18/
https://github.com/18515350435/TensorFlowTest/blob/master/TensorFlow/LSTM构建语音分类模型/12声音分类.py
Many other examples
https://github.com/weimingtom/TensorFlowTest/tree/master/TensorFlow
https://github.com/illool/TensorFlow
https://github.com/XqFeng-Josie/Tensorflow
Tacotron, 9.4 Speech synthesis
https://github.com/Kyubyong/tacotron

lstm

https://blog.csdn.net/yj13811596648/article/details/89499432

hmm

https://github.com/WiseDoge/plume/blob/master/plume/hmm.py
search github: forward_prob model numpy
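
For orientation when reading the linked code, here is a minimal pure-Python sketch of the HMM forward algorithm (the quantity a `forward_prob` function typically computes). It is not taken from the plume repository; the toy two-state model and all its numbers are invented, and the sketch assumes a non-empty observation sequence.

```python
from itertools import product


def forward_prob(start, trans, emit, observations):
    """Forward algorithm: probability of an observation sequence under an HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    """
    n = len(start)
    # Initialisation with the first observation.
    alpha = [start[i] * emit[i][observations[0]] for i in range(n)]
    # Recursion: sum over predecessor states, then apply the emission.
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs]
            for j in range(n)
        ]
    # Termination: sum over final states.
    return sum(alpha)


# Toy model with two states and two output symbols (made-up numbers).
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]

p = forward_prob(start, trans, emit, [0, 1, 0])
print(f"P(0, 1, 0) = {p:.6f}")

# Sanity check: the probabilities of all length-3 sequences sum to 1.
total = sum(forward_prob(start, trans, emit, list(seq)) for seq in product([0, 1], repeat=3))
print(f"sum over all sequences = {total:.6f}")
```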

vosk-api

https://alphacephei.com/vosk/models
http://t.rock-chips.com/forum.php?mod=viewthread&tid=1478

AI Development Series (6): Voice command recognition, RK3399ProD

http://t.rock-chips.com/forum.php?mod=viewthread&tid=456

Horned Sungem (角蜂鸟)

https://www.senscape.com.cn/hornedsungem/

How to Make a Simple Tensorflow Speech Recognizer

https://v.youku.com/v_show/id_XMjYzNDUzODc1Ng==.html
https://github.com/pannous/tensorflow-speech-recognition

kaldi

https://www.jianshu.com/p/4e74861b47e9

baidu ai studio, dataset

https://aistudio.baidu.com/aistudio/projectoverview/public/1?tags=23
Speech synthesis datasets for download:
CMU ARCTIC (en), Kai-Fu Lee's lab: http://festvox.org/cmu_arctic/
LJSpeech (en): 2.6G https://keithito.com/LJ-Speech-Dataset/
thchs30: Tsinghua University 30-hour dataset (Chinese), 6.4G http://www.openslr.org/18/

deepvoice3, tts

https://github.com/r9y9/deepvoice3_pytorch

tensorflow datasets

https://tensorflow.google.cn/datasets/catalog/ljspeech
(TODO) baidupan search LJSpeech-1.1.tar.bz2
see https://tensorflow.google.cn/datasets/catalog/ljspeech, open github page to get download url
https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/audio

Mastering Machine Learning with scikit-learn

https://github.com/PacktPublishing/Mastering-Machine-Learning-with-scikit-learn-Second-Edition

Speech Recognition: Free Software and Complete Privacy

https://unix.stackexchange.com/q/256138/16704

http://cmusphinx.sourceforge.net/
http://www.kiecza.net/daniel/linux/
http://www.speech.cs.cmu.edu/comp.speech/Section6/Recognition/ears.html
http://julius.osdn.jp/
http://kaldi.sourceforge.net/
https://github.com/alumae/kaldi-gstreamer-server
https://web.archive.org/web/19990508201353/http://biz.yahoo.com/bw/990426/ny_ibm_1.html
http://nico.nikkostrom.com/
http://freespeech.sourceforge.net/
http://www-i6.informatik.rwth-aachen.de/rwth-asr/
http://shout-toolkit.sourceforge.net/
http://voxhub.io/silvius
http://simon-listens.org/index.php?id=122
http://xvoice.sourceforge.net/
https://appdb.winehq.org/objectManager.php?sClass=application&iId=2077
https://sourceforge.net/projects/natlink/
https://pypi.python.org/pypi/dragonfly
https://github.com/TristenHayfield/damselfly
https://github.com/DragonComputer/Dragonfire

First experience with kaldi-based online Chinese recognition

https://note.abeffect.com/articles/2020/02/10/1581269426654.html
https://note.abeffect.com/articles/2020/02/10/1581269678158.html

Alibaba-MIT-Speech

https://github.com/alibaba/Alibaba-MIT-Speech

HyperAI (超神经)

https://hyper.ai/datasets/6792

Android NNAPI

https://github.com/JDAI-CV/DNNLibrary
https://zhuanlan.zhihu.com/p/30926958
rk3399 android example
http://wiki.friendlyarm.com/wiki/index.php/NanoPi_M4V2/zh
https://github.com/rockchip-linux/tensorflow/tree/master/tensorflow/contrib/lite/java/demo
https://tensorflow.google.cn/lite/guide/android

[Experience] [Rockchip RK1808 compute stick trial] Setting up a Linux (Ubuntu 18.04) environment to try the RK1808

http://bbs.elecfans.com/jishu_1873423_1_1.html

rknn_toolkit, for PC

http://wiki.t-firefly.com/zh_CN/Core-1808-JD4/npu_rknn_toolkit.html

Playing with TensorFlow Lite on an ARM board

https://blog.csdn.net/computerme/article/details/80345065
https://blog.csdn.net/mhsszm/article/details/80610042

Recipes for using the Sipeed Maix series

https://hrkz.tokyo/sipeed-maix-ideas/
https://github.com/andriyadi/Maix-SpeechRecognizer
https://github.com/Technica-Corporation/Speech_Recognition-Maixduino
https://en.bbs.sipeed.com/t/topic/870
CNN+CTC

How to build a super-mini voice assistant yourself

https://zhuanlan.zhihu.com/p/72896282

The Ultimate Guide To Speech Recognition With Python

https://realpython.com/python-speech-recognition/
Chinese version of the guide:
https://cloud.tencent.com/developer/article/1109408?fromSource=waitui

apiai
assemblyai
google-cloud-speech
pocketsphinx
SpeechRecognition
watson-developer-cloud
wit

The basic principles of the CTC algorithm in speech recognition, explained

https://www.cnblogs.com/qcloud1001/p/9041218.html
CTC (connectionist temporal classification) explained in plain language
https://blog.csdn.net/luodongri/article/details/77005948
CTC Algorithm Explained Part 1: Training the Network
http://xiaodu.io/ctc-explained
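
The core idea these CTC articles build on is the many-to-one mapping from a frame-level path to a label sequence: merge consecutive repeats, then drop the blank symbol. A minimal sketch of that collapse step follows; using "-" as the blank and the function name `ctc_collapse` are my own choices for illustration.

```python
from itertools import groupby

BLANK = "-"  # assumed blank symbol for this sketch


def ctc_collapse(path, blank=BLANK):
    """Map a frame-level CTC path to a label sequence: merge repeats, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(s for s in merged if s != blank)


print(ctc_collapse("hh-e-ll-lo--"))  # hello
print(ctc_collapse("aa-a"))          # aa (the blank keeps the two a's apart)
print(ctc_collapse("aaa"))           # a
```

The blank is what lets CTC output genuinely doubled letters such as the "ll" in "hello": without a blank between them, repeated frames collapse into one label.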

PaddleOCR

https://github.com/PaddlePaddle/PaddleOCR

adafruit, tensorflow lite

https://learn.adafruit.com/tensorflow-lite-for-circuit-playground-bluefruit-quickstart?view=all
https://adafruit.github.io/arduino-board-index/package_adafruit_index.json
https://learn.adafruit.com/tensorflow-lite-for-circuit-playground-bluefruit-quickstart?view=all#micro-speech-demo
https://github.com/adafruit/Adafruit_TFLite
search baidupan, tflite_tensorflow_lite_adafruit
Arduino_TensorFlowLite

esp-sr, WakeNet

https://github.com/espressif/esp-sr/blob/master/wake_word_engine/README_cn.md
https://github.com/espressif/esp-sr/tree/master/wake_word_engine
https://arxiv.org/abs/1703.05390
CRNN+CTC
https://github.com/espressif/esp-sr/tree/master/speech_command_recognition
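I used to guess that the ESP32 algorithm was LSTM+CTC, but according to the current official description it should be CRNN+CTC.
That is still a guess, and the latest version may well use a more advanced algorithm (see:
https://github.com/espressif/esp-sr/tree/master/speech_command_recognition ).
CRNN+CTC is most commonly described online as an OCR text recognition technique. Another point worth noting:
the original CRNN paper cited by Espressif,
https://arxiv.org/abs/1703.05390
(see:
https://github.com/espressif/esp-sr/blob/master/wake_word_engine/README_cn.md
), is the same CRNN work I mentioned earlier in connection with ML-KWS. So the conclusion is that the old ESP32 WakeNet (closed source) and ARM's ML-KWS (open source) share the same origin (CRNN),
MultiNet is the version with CTC added (CRNN+CTC), and the new WakeNet (closed source) is based on Dilated CNN.
All of the ESP32 algorithms use MFCC.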
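
Since MFCC is the common front end here, a small sketch of its mel-scale step may help: filter center frequencies are spaced evenly in mel, so they spread out as frequency rises. This uses the widely quoted 2595 * log10(1 + f/700) formula; the filter count (10) and the 0 to 8000 Hz range are arbitrary values for illustration, not the settings used by esp-sr.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mel (common 2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) of n_filters triangular filters spaced evenly in mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * k) for k in range(1, n_filters + 1)]


# Assumed parameters for illustration only.
centers = mel_center_frequencies(10, 0.0, 8000.0)
for k, f in enumerate(centers, start=1):
    print(f"filter {k:2d}: {f:8.1f} Hz")
```

A full MFCC pipeline would additionally frame the signal, take an FFT per frame, apply these filters, take the log, and finish with a DCT.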

Comparison of Chinese-made offline speech recognition chips

https://zhuanlan.zhihu.com/p/166078186
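Technology stage / recognition type / algorithm type / algorithm name / company type / representative vendors / main processor
1.0 / speaker-dependent / template matching / VQ\DTW / traditional / Sunplus (凌阳) / MCU or general-purpose DSP
2.0 / speaker-independent / probabilistic statistics / GMM+HMM / traditional / Nuvoton (新塘, 赛维), 山景, 九芯, ICRoute, 唯创 / MCU or general-purpose DSP
3.0 / speaker-independent / discriminative classification, deep neural networks / DNN, RNN, CNN+HMM / internet companies | pure chip makers / iFlytek (讯飞), AISpeech (思必驰), Unisound (云知声), Silan (士兰微) (Alibaba, Baidu, 互问, 华镇) | 探境, 知存, 启英, 清微, 人麦, 国芯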

Sparkfun Edge, TinyML

https://github.com/sparkfun/Tensorflow_AIOT2019
Deep learning on embedded devices: Sparkfun Edge with TensorFlow (1) Hello World
https://www.cnblogs.com/guangnianxd/p/12542184.html
Arduino BSP
https://github.com/sparkfun/Arduino_Boards/blob/master/IDE_Board_Manager/package_sparkfun_index.json
https://github.com/sparkfun/SparkFun_Edge
https://learn.sparkfun.com/tutorials/using-sparkfun-edge-board-with-ambiq-apollo3-sdk
Arduino IDE, magic wand
https://learn.sparkfun.com/tutorials/programming-the-sparkfun-edge-with-arduino

speex

Reference design for low-bitrate audio coding
http://blog.sina.com.cn/s/blog_4680937f0102ycic.html
https://github.com/xiph/opus

Intelligent-speech-robot

https://github.com/1158114251/-Intelligent-speech-robot
https://mc.dfrobot.com.cn/thread-25649-1-1.html

SmartSpeaker, stm32f407

A smart control device based on cloud speech recognition, similar to Tmall Genie and XiaoAI. The chips used are stm32f407, wm8978, and esp8266.
https://github.com/lovelyterry/SmartSpeaker

uSpeech, µSpeech, arduino, stm32

https://github.com/arjo129/uSpeech
https://arjo129.wordpress.com/experiments/µspeech/
https://hsel.co.uk/2016/01/06/stm32f0-uspeech-port/
https://github.com/pyrohaz/STM32F0-uSpeechPort

Companion code for the book "OCR in Depth: Text Recognition Based on Deep Learning" (深度实践OCR：基于深度学习的文字识别)

CRNN+CTC text recognition
https://github.com/ocrbook/ocrinaction

TinyML-ESP32

https://github.com/HollowMan6/TinyML-ESP32
https://github.com/tanakamasayuki/Arduino_TensorFlowLite_ESP32

Sunplus 61 microcontroller

https://github.com/super-1943/MCU/tree/master/sunplus
https://github.com/weimingtom/MCU/tree/master/sunplus

Some commonly used speech feature extraction algorithms

https://www.cnblogs.com/LXP-Never/p/11725378.html

(NOT GOOD, only for code reading on pudn web page) pudn, biguo100

Speech recognition program, STC12C5620AD microcontroller, using the DFT algorithm
http://www.pudn.com/Download/item/id/1988530.html
http://www.biguo100.com/news/33409.html
Simple speech recognition on the MSP430 microcontroller, workbench environment, LPCC algorithm
http://www.biguo100.com/news/9782.html
http://www.pudn.com/Download/item/id/830873.html

search phoneme speech

[Ai-Thinker ESP32 voice development board series ①] ESP32-A1S audio development board: controlling an LED with offline speech recognition

https://blog.csdn.net/Boantong_/article/details/104457259
https://docs.ai-thinker.com/esp32
https://docs.ai-thinker.com/esp32-audio-kit
https://github.com/donny681/esp-adf/tree/master/ai-examples
https://github.com/Ai-Thinker-Open/Ai-Thinker-Open_ESP32-A1S_ASR_SDK/tree/master/examples/Smart_home_scene_AI

VAD_campare

https://github.com/mengsaisi/VAD_campare

Tencent Cloud speech, Tencent Cloud Dingdang ASR speech recognition platform

https://dingdang.qq.com/doc/page/285
https://www.it610.com/article/1288354813658079232.htm
https://dingdang.qq.com/doc.html?dir=/doc/tvs/cloud/api.html

Xunfei (iflytek) WebAPI v2

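On integrating with iFlytek AIUI: if the target is not Android but a microcontroller, ARM Linux, or similar, it is best to integrate through the WebAPI V2 interface.
This sidesteps dll and so compatibility problems (officially only x86 is supported, unless you are on Android). The iFlytek WebAPI is a little odd, though: unless you publish the application as a production release,
you will not see the chatbot answers from the fallback setting (for example Turing Robot), because iFlytek does not allow production settings to be used in the test environment
(in other words, the fallback setting is not applied by default). The workaround is to append the _box suffix to the scene parameter
(you can also get the app reviewed and published, which avoids the hassle). Another thing to watch: do not enable the whitelist,
otherwise the correct chat answer is not returned either.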
https://github.com/IflytekAIUI/DemoCode/blob/master/webapi_v2/java/WebaiuiDemo.java
https://console.xfyun.cn/app/myapp
https://console.xfyun.cn/services/iat

Tencent Cloud speech recognition

https://cloud.tencent.com/document/product/1093
https://cloud.tencent.com/document/product/1093/35646
https://cloud.tencent.com/document/product/1093/37308
https://cloud.tencent.com/document/product/1093/35735
https://cloud.tencent.com/document/sdk/Java
https://github.com/TencentCloud/tencentcloud-sdk-java

VGGVox models for speaker identification and verification

https://github.com/a-nagrani/VGGVox
https://blog.csdn.net/weixin_41738734/article/details/86109333

Speaker recognition

The era of end-to-end speech recognition arrives: NetEase Hangzhou Research Institute's exploration of intelligent speech

https://cloud.tencent.com/developer/news/491629

Applications, algorithms, chips: a brief "three-in-one" analysis of speech recognition

http://news.eeworld.com.cn/xfdz/article_2017101874336_2.html

ArduinoTensorFlowLiteTutorials

https://github.com/arduino/ArduinoTensorFlowLiteTutorials

SpeechCmdRecognition

https://github.com/douglas125/SpeechCmdRecognition

Arduino Portenta H7

https://github.com/hpssjellis/my-examples-for-the-arduino-portentaH7

TTGO_T_Watch_Baidu_Rec, T-Watch

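A Baidu speech recognition terminal built with the TTGO_T_Watch.
The TTGO_T_Watch main board comes with 8M PSRAM. There are several expansion boards, and one of them integrates an INMP441 I2S microphone chip, so it can handle speech.
It works as a sound monitor: it listens to surrounding sound and recognizes it as text. The recognized text can be configured to be forwarded to other devices,
such as a Raspberry Pi, which then dispatches it to further devices for linked actions. Each recognition records at most 10 seconds of audio. One recording-to-text pass takes between 1 and 10 seconds on average.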
https://github.com/lixy123/TTGO_T_Watch_Baidu_Rec
https://github.com/Xinyuan-LilyGO/TTGO_TWatch_Library

"TensorFlow: Getting Started and in Practice" (Tensorflow入门与实战), Chapter 6 "Recurrent Neural Networks", 6.4 "Speech recognition with LSTM+CTC"

https://github.com/thewintersun/tensorflowbook/blob/master/Chapter6/asr_lstm_ctc/asr_lstm_ctc.py
search baidupan, 源代码_TensorFlow入门与实战.zip
https://www.ituring.com.cn/book/2398
Speech recognition (LSTM+CTC)
https://www.cnblogs.com/followees/p/10422809.html
FundamentalsOfAI_book_code
https://github.com/koryako/FundamentalsOfAI_book_code
search phrase: 定义一个向前计算的LSTM单元 ("define a forward-computing LSTM cell")
https://github.com/search?q=定义一个向前计算的LSTM单元，40个隐藏单元&type=code
https://github.com/luvensaitory/project
https://github.com/koryako/FundamentalsOfAI_book_code

old:
Speech recognition (LSTM+CTC)
https://www.cnblogs.com/followees/p/10422809.html
search github, tf.nn.ctc_loss reduce_mean mfcc LSTMCell
https://github.com/igormq/ctc_tensorflow_example
https://github.com/pannous/tensorflow-speech-recognition/blob/master/lstm_ctc_to_chars.py
CTC tensorflow example code walkthrough
https://blog.csdn.net/he_wen_jie/article/details/80586345

mxnet speech_recognition: notes on pitfalls

https://blog.csdn.net/zhqh100/article/details/103887097
https://github.com/baidu-research/warp-ctc/blob/master/README.zh_cn.md
https://github.com/apache/incubator-mxnet/tree/v1.7.x/example/speech_recognition
https://github.com/samsungsds-rnd/deepspeech.mxnet/tree/master/Libri_sample
https://github.com/baidu-research/ba-dls-deepspeech

LAS-asr

https://github.com/ChaosCY/LAS-asr
https://github.com/thomasschmied/Speech_Recognition_with_Tensorflow

mlpack LSTM

https://github.com/mlpack/examples/blob/master/lstm_stock_prediction/lstm_stock_prediction.cpp

Deep Audio-Visual Speech Recognition

https://www.cnblogs.com/tangbinchn/p/12809360.html
Introduction to computer vision topics | Lip-reading technology
https://zhuanlan.zhihu.com/p/48670591

LSTM Speech Recognition in practice

https://blog.csdn.net/antkillerfarm/article/details/84232764

KWS, Keyword Spotting in Noise Using MFCC and LSTM Networks, matlab

https://www.mathworks.com/help/audio/examples/keyword-spotting-in-noise-using-mfcc-and-lstm-networks.html

esp32_kws

https://github.com/42io/esp32_kws

KWS-for-XMC

https://github.com/Infineon/KWS-for-XMC

A method for training a wake-word model with kaldi

https://blog.csdn.net/cj1989111/article/details/88017908
https://github.com/xiangxyq/kaldi/tree/master/egs/wakeup_words
https://github.com/xiangxyq/3gpp_vad

Speech recognition series 4: source code walkthrough of CTC model training

https://blog.csdn.net/u012361418/article/details/90313249

A DTW-based isolated-word speech recognition system

https://blog.csdn.net/king_audio_video/article/details/90113627
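
As background for the DTW approach, here is a minimal dynamic time warping sketch on one-dimensional sequences. It is not the linked article's code: a real isolated-word recognizer would compare sequences of feature vectors (for example MFCC frames) with a vector distance, and the sample sequences below are invented.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences, using |x - y| as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


template = [1, 2, 3, 2, 1]
stretched = [1, 1, 2, 2, 3, 3, 2, 2, 1, 1]  # same shape, spoken twice as slowly
other = [3, 3, 1, 1, 3]                     # a different shape

print(dtw_distance(template, stretched))  # 0.0
print(dtw_distance(template, other))
```

Recognition then amounts to computing the distance from the input to each stored word template and picking the smallest.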

tensorflow-android-speech-kws

https://github.com/shichaog/tensorflow-android-speech-kws

Deep Learning with Applications Using Python

Chinese edition: Python Deep Learning in Practice: chatbots and face, object, and speech recognition with TensorFlow and Keras
https://github.com/Apress/Deep-Learning-Apps-Using-Python/blob/master/Chapter11_Speech%20to%20text%20and%20vice%20versa
https://github.com/NavinManaswi/Book-Deep-Learning-Applications-with-Applications-Using-Python

How can voice recognition be implemented on-device with TensorFlow?

https://developer.aliyun.com/article/592687
https://github.com/weedwind/MFCC
https://github.com/weedwind/CTC-speech-recognition
search baidupan, weedwind_MFCC-master.zip

kaldi, cvte

https://gitee.com/yangmiao123/SpeechRecognition
search baidupan, cvte.zip

Edison - Keyword Spotting on Microcontroller

https://hütter.ch/posts/edison-kws-on-mcu/
(???) STM32L4, microphone
https://github.com/noah95/edison
ref
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/speech_commands/models.py
(??? seem IMP) KWS mcu, mfcc
https://github.com/majianjia/nnom/tree/master/examples/keyword_spotting

Speech Signal Processing Lab Tutorial (MATLAB source code)

https://github.com/veenveenveen/SpeechSignalProcessingCourse

Building a small-vocabulary Chinese speech recognition platform with Kaldi (DNN)

https://veenveenveen.github.io/article/technology/ASR/ASR_Kaldi_DNN_Chinese.html#kaldi-简介
https://github.com/veenveenveen/ASR_Kaldi_DNN_Chinese
Chinese speech recognition based on the open-source HTK framework (GMM-HMM)
https://veenveenveen.github.io/article/technology/ASR/ASR_HTK_Chinese.html
https://github.com/veenveenveen/chinese_voice

The latest IoT development board for STM32 and Baidu Cloud Tiangong: operating instructions for the B-L475E-IOT01A discovery kit

https://blog.csdn.net/annic9/article/details/80434389
https://cloud.baidu.com/doc/IOT/s/7jwvy87a2
STM32L475, esp8266, esp32
https://github.com/baidu/baidu-iot-samples/tree/master/STM32/I-CUBE-BAIDU

BIC-based speech dialogue segmentation in Python (1)

https://blog.csdn.net/wblgers1234/article/details/75896605
https://github.com/zimuyanzi/BIC

Compiling kaldi

search baidupan, kaldi_20200917_pre.tar.gz, work_kaldi
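After two days, I finally finished compiling kaldi in an x86 debian virtual machine (my build environment was the February 2020 Raspberry Pi Desktop x86 image, 32-bit debian).
The code needs to be modified because some places cause problems, for example here:
jcsilva/docker-kaldi-android#11
In short, there are three steps:
(1) Run make and make openblas under tools to install the third-party libraries.
(2) Run configure and make under src to build the executables.
(3) Run run.sh under yesno to test whether the executables work.
At the end you will see a report with a WER of 0, meaning zero errors. For details, see the "Decoding and testing" section of this article: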
https://www.jianshu.com/p/09deba57f339
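
For reference, the WER reported by the yesno run is the word-level edit distance between the reference and the recognized text, divided by the number of reference words. The sketch below is a plain re-implementation for illustration, not kaldi's scoring code; the sample sentences are invented and the function assumes a non-empty reference.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (r != h)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


print(wer("yes no yes", "yes no yes"))         # 0.0, the zero-error case
print(wer("yes no yes no", "yes yes yes no"))  # 0.25, one substitution in four words
```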

How to eat Pytorch in 20 days ?

https://github.com/lyhue1991/eat_pytorch_in_20_days

kaldi-ctc

https://github.com/lingochamp/kaldi-ctc
https://zhuanlan.zhihu.com/p/23177950

Community share | Learning TinyML from scratch (1)

https://blog.csdn.net/wfing/article/details/106995562

ardu-badge, Arduino_TensorFlowLite

https://www.ardu-badge.com/Arduino_TensorFlowLite/zip
https://community.platformio.org/t/arduino-nano-33-ble-tensorflow-lite-undefined-references/14387/2
same as adafruit, Arduino_TensorFlowLite

ESP32 supports running TensorFlow Lite Micro

https://zhuanlan.zhihu.com/p/228593457

Getting started with machine learning on Arduino (Part 1)

https://blog.csdn.net/weixin_44507034/article/details/105602112
Getting started with machine learning on Arduino (Part 2)
https://blog.csdn.net/weixin_44507034/article/details/105613754
https://medium.com/tensorflow/how-to-get-started-with-machine-learning-on-arduino-7daf95b4157
https://cloud.tencent.com/developer/article/1534288

Voiceprint-Recognition

AliOS Things embedded voiceprint recognition project
https://github.com/SunYanCN/Voiceprint-Recognition
alibaba/AliOS-Things#976

aiBook

https://github.com/xiyanxiyan10/aiBook
Speech synthesis
https://github.com/ibab/tensorflow-wavenet
https://github.com/tomlepaine/fast-wavenet
Speech recognition
https://github.com/buriburisuri/speech-to-text-wavenet
https://github.com/pannous/tensorflow-speech-recognition

"Artificial Intelligence" (Tsinghua University edition), AIDemo

http://aibook.cslt.org/slides/index.html
http://aibook.cslt.org/aidemo/demo.html
Chapter 3: Listening to your voice, see
https://github.com/jcsilva/deep-clustering
search baidupan, aibook_speech.zip

search speechSeparation

https://github.com/pchao6/LSTM_PIT_Speech_Separation
https://github.com/ododoyo/DANet
