- Rasa Version
- Chinese ReadMe
- Demo Video
- DEMO-GIF
- Description
- Environment
- Import data into Neo4j
- Train a Rasa model
- Test the model with Rasa Shell
- Run this bot as a service
- Reference
- Change Log
- Rasa==2.0.x (for other versions, please check the branches)
- Demo-Video-Here
- The demo server may be slow because of its low-spec configuration.
- 2020/05/20 The new version of the chatbox has a different color.
-
This program is a QA bot based on a medical knowledge graph and Rasa 2.0.x. Neo4j stores the medical knowledge graph.
-
The dialogue management engine is Rasa Core. The Rasa pipeline is configured as follows:
pipeline:
  - name: HFTransformersNLP
    # Name of the language model to use
    model_name: "bert"
    # Pre-Trained weights to be loaded
    model_weights: "bert-base-chinese"
    # An optional path to a specific directory to download and
    # cache the pre-trained model weights.
    # The `default` cache_dir can be "C:\Users\username\.cache\torch\transformers"
    # OR ~/.cache/torch/transformers
    # See https://huggingface.co/transformers/installation.html#caching-models
    cache_dir: null
  - name: "LanguageModelTokenizer"
    # Flag to check whether to split intents
    intent_tokenization_flag: False
    # Symbol on which intent should be split
    intent_split_symbol: "_"
  # LanguageModelFeaturizer type: Dense featurizer
  - name: "LanguageModelFeaturizer"
  - name: "MitieNLP"
    model: "data/total_word_feature_extractor_zh.dat"
  - name: "MitieEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "RegexFeaturizer"
  # SklearnIntentClassifier requires dense_features for user messages
  - name: "SklearnIntentClassifier"
-
Notice: Rasa NLU and Rasa Core have been merged into Rasa.
-
Python ≈ 3.8.5
-
Download the ZIP file or use "git clone" to get this project.
-
`cd Doctor-Friende`, and remember to `conda activate` your environment.
-
For simple instructions on installing MITIE, see Install Mitie.
-
Use this command to install the required libraries and tools.
pip install -r requirements.txt
-
Tips:
-
If you are in China and your network is slow, you can use a pip mirror to speed up the download. This command applies the mirror to a single install only:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt
-
If you have a proxy, you can add --proxy=ip:port at the end of the command above.
-
-
Prerequisite: You already have a Neo4j graph to connect to.
-
Unzip `MedicalSpider/data/data.tar.gz` to `MedicalSpider/data` (do not create a new folder). `medical.json` then contains all the data you need to import into Neo4j.
-
Edit `MedicalSpider/process_data/create_graph.py` and replace the database information with your own. To avoid path errors, running `create_graph.py` through PyCharm is recommended.
-
About the spider: the web spider is based on Scrapy. To run it, just run `SpiderMain.py`. (To avoid path errors, running it through PyCharm is recommended.)
-
The Knowledge Graph contains:
- 13,635 nodes (5 labels)
- 114,163 relationships (6 types)
- Data Structure:
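| Entity Type | Quantity | Example |
|---|---|---|
| Disease | 6,143 | 百日咳 (whooping cough), 头痛 (headache) |
| Department | 54 | 儿科 (pediatrics), 小儿内科 (pediatric internal medicine) |
| Drug | 1,124 | 硫辛酸片 (lipoic acid tablets), 曲克芦丁片 (troxerutin tablets) |
| Food | 378 | 蟹肉 (crab meat), 鱿鱼(干) (dried squid) |
| Symptom | 5,936 | 角弓反张 (opisthotonos), 视网膜Roth斑 (retinal Roth spots) |
| Total | 13,635 | / |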
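The unzip step earlier in this section can also be done from Python. The sketch below uses only the standard `tarfile` module. The helper name `extract_data_archive` is hypothetical, and the default paths are the ones given above, taken relative to the project root.

```python
import tarfile
from pathlib import Path


def extract_data_archive(archive="MedicalSpider/data/data.tar.gz",
                         dest="MedicalSpider/data"):
    """Extract a .tar.gz archive into dest and return the member names.

    Hypothetical helper: the default paths assume the current working
    directory is the project root.
    """
    # Make sure the destination folder exists before extracting.
    Path(dest).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(path=dest)
    return names


# Example (run from the project root):
# print(extract_data_archive())
```

Checking that `medical.json` appears in the returned names is a quick way to confirm the archive was unpacked directly into `MedicalSpider/data` rather than into a new subfolder.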
-
Create your own Rasa training dataset with Chatito.
-
Download the MITIE model into `chat/data` from BaiDu Disk (password: p4vx) or Mega.
-
The first time you run the training command with this `Pipeline`, the BERT model is downloaded from the terminal. You can find the default cache directory at Cache Models.
-
**Important:** If errors occur when loading the BERT model, try:
- rename `bert-base-chinese-config.json` to `config.json`
- rename `bert-base-chinese-vocab.txt` to `vocab.txt`
- rename `bert-base-chinese-tf_model.h5` to `tf_model.h5`
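The three renames can be scripted. This is a minimal sketch, assuming the downloaded files sit together in one directory that you pass in; the helper name `rename_bert_files` is hypothetical, and the file names are exactly those listed above.

```python
from pathlib import Path

# Old name -> new name, as listed in the note above.
RENAMES = {
    "bert-base-chinese-config.json": "config.json",
    "bert-base-chinese-vocab.txt": "vocab.txt",
    "bert-base-chinese-tf_model.h5": "tf_model.h5",
}


def rename_bert_files(model_dir):
    """Rename the downloaded BERT files in model_dir.

    Hypothetical helper. Files that are missing, or whose target name
    already exists, are skipped. Returns the new names that were created.
    """
    model_dir = Path(model_dir)
    renamed = []
    for old, new in RENAMES.items():
        src, dst = model_dir / old, model_dir / new
        if src.exists() and not dst.exists():
            src.rename(dst)
            renamed.append(new)
    return renamed
```

Call it with the cache directory that actually holds the files on your machine; an empty return value means nothing needed renaming.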
-
To train, open a terminal, `cd chat`, then run:
rasa train -c config/config_pretrained_embeddings_mitie_zh.yml --data data/medical/M3-training_dataset_1564317234.json data/medical/stories.md --out models/medicalRasa2 --domain config/domains.yml --num-threads 5 --augmentation 100 -vv
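Because the training command is long, it can help to see it split into its arguments. The sketch below builds the same command as a list and optionally runs it with `subprocess`. The function names are hypothetical, every flag and path is copied from the command above, and it assumes `rasa` is on your `PATH` in the activated environment.

```python
import subprocess


def build_train_command(num_threads=5, augmentation=100):
    """Return the `rasa train` command from this README as an argument list."""
    return [
        "rasa", "train",
        "-c", "config/config_pretrained_embeddings_mitie_zh.yml",
        # Two data files: the NLU dataset and the stories.
        "--data",
        "data/medical/M3-training_dataset_1564317234.json",
        "data/medical/stories.md",
        "--out", "models/medicalRasa2",
        "--domain", "config/domains.yml",
        "--num-threads", str(num_threads),
        "--augmentation", str(augmentation),
        "-vv",
    ]


def train(chat_dir="chat"):
    """Run training from inside the chat directory (equivalent to `cd chat`)."""
    return subprocess.run(build_train_command(), cwd=chat_dir, check=True)
```

Calling `train()` from the project root is meant to be equivalent to running the command by hand after `cd chat`.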
-
Edit the `tracker_store` field in endpoints.yml and replace the database information with your own. (You can use either a new or an existing database; Rasa will create a table named `events`.) See the SQLAlchemy documentation for valid values of the `dialect` field; the link is provided in Tracker Store. For more information, please check the official Rasa documentation.
-
If you want to use the customized socketio, edit the DB info in `MyChannel/MyUtils.py` and make sure you have a `message_recieved` table. (You can change this table name, but then you must also change it in the `handle_message` function in `myio.py`.)
-
Edit `chat/MyActions/actions.py` and replace the Neo4j information with your own.
-
Open two terminals (or two cmd windows) and `cd` into the `chat` directory under the project root in both. (Don't forget to `conda activate` your environment.)
-
Run the Rasa action server in one terminal:
rasa run actions --actions MyActions.actions --cors "*" -vv
-
Run Rasa shell in the other terminal/cmd:
rasa shell -m models/medicalRasa2/20201026-112436.tar.gz --endpoints config/endpoints.yml -vv
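After a first conversation in the shell, one way to confirm that the tracker store is wired up is to check that the `events` table exists. The sketch below is an illustration only: it assumes a SQLite database file (one of the SQLAlchemy dialects), whereas this project's change log describes a MySQL DB, so for MySQL you would run the equivalent check with your own client. The helper name `events_table_exists` is hypothetical.

```python
import sqlite3


def events_table_exists(db_path, table="events"):
    """Return True if `table` exists in the SQLite database at db_path.

    Assumption: the tracker store uses the sqlite dialect. This is not
    the MySQL setup described elsewhere in this README.
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        return row is not None
    finally:
        conn.close()
```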
-
Complete the first five steps listed above for testing with Rasa Shell.
-
Run the Rasa server in the other terminal/cmd:
rasa run --enable-api -m models/medicalRasa2/20201026-112436.tar.gz --port 5000 --endpoints config/endpoints.yml --credentials config/credentials.yml -vv
-
Frontend webpage: ChatHTML. If you use the customized socketio, change `socketPath` in the HTML to `/mysocket.io/`.
-
Tips:
- On a server, you can use the `nohup` command to keep the process running in the background.
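Once the server from the `rasa run` step is up on port 5000, a quick smoke test that does not need the web frontend is to post a message to Rasa's REST channel. This is a sketch under one assumption: the `rest:` channel is enabled in your `credentials.yml`, which this README does not state (it configures socketio). The function names and the `test_user` sender ID are hypothetical.

```python
import json
import urllib.request


def build_message_request(message, sender="test_user",
                          base_url="http://localhost:5000"):
    """Build a POST request for Rasa's REST channel webhook."""
    payload = json.dumps({"sender": sender, "message": message}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/webhooks/rest/webhook",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(message, **kwargs):
    """Send the message and return the bot's replies as parsed JSON.

    Requires a running server with the rest channel enabled (assumption).
    """
    request = build_message_request(message, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

If the REST channel is not enabled, the server will reject this request, so treat a failure here as a credentials issue before suspecting the model.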
-
rasa-webchat, webchat.js
-
-
Update Rasa to 2.0.x, Python version 3.8.5
-
`Pipeline` changes a lot. Added a new component, `HFTransformersNLP`.
-
You should use the new model in `models/medicalRasa2`.
-
Edit `domains.yml`: change the `type` of `sure` and `pre_disease` to `any`.
-
-
-
Update Rasa to 1.7.4, Python version 3.7.9.
-
Use newly trained model only.
-
Add `session_config` to domains.yml to meet Rasa's requirements.
-
Edit line 91 in `chat/data/medical/stories.md` to `action_first`. It was originally `utter_greet`, so the bot would run `utter_greet` (as written on line 91 of that file) instead of `action_first`. This happens in `Rasa>=1.3.0`.
-
-
-
Update Rasa to 1.2.9, Python version 3.6.8
-
Introduce Tracker Store in endpoints.yml, which enables automatic storage of the tracker in your MySQL DB. Although Tracker Store is the official way to store messages, I also added a customized way to store user messages; see below.
-
I updated `myio.py` and `MyUtils.py` in `chat/MyChannel/`. There is a customized socketio in `myio.py` that lets you store user messages in a MySQL DB. It is based on `rasa.core.channels.socketio.SocketIOInput`. You can store the fields you need in the `handle_message` function.
-
The information for connecting to your MySQL DB should be provided in `MyUtils.py`.
-
Add some configuration to `credentials.yml` to enable the customized socketio.
-
Fix the demo server. It crashed on April 1st.
-