
decoding error after successful aishell train #10

Open
MNCTTY opened this issue Sep 2, 2019 · 26 comments
@MNCTTY commented Sep 2, 2019

Hi! I managed to train LAS on aishell data without errors. This is the end of the log:

Epoch 20 | Iter 441 | Average Loss 0.406 | Current Loss 0.505424 | 64.8 ms/batch
Epoch 20 | Iter 451 | Average Loss 0.409 | Current Loss 0.383116 | 64.1 ms/batch
-------------------------------------------------------------------------------------
Valid Summary | End of Epoch 20 | Time 956.81s | Valid Loss 0.410
-------------------------------------------------------------------------------------
Learning rate adjusted to: 0.000000
Find better validated model, saving to exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch20_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/final.pth.tar
# Accounting: time=21312 threads=1
# Ended (code 0) at Fri Aug 30 17:15:39 MSK 2019, elapsed time 21312 seconds

but the decoding stage gave an error:

Stage 4: Decoding
run.pl: job failed, log is in exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch20_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/decode.log
2019-08-30 17:15:39,608 (json2trn:24) INFO: reading exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch20_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/data.json
Traceback (most recent call last):
 File "/home/karina/Listen-Attend-Spell/egs/aishell/../../src/utils/json2trn.py", line 25, in <module>
   with open(args.json, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch20_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/data.json'
write a CER (or TER) result in exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch20_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/result.txt
|      SPKR        |         # Snt                   # Wrd         |      Corr              Sub              Del              Ins              Err            S.Err      |
|      Sum/Avg     |             0                       0         |       0.0              0.0              0.0              0.0              0.0              0.0      |

I don't understand why that file is missing from the directory. I thought everything run.pl needs is generated there automatically.

@KnowBetterHelps

maybe you should check the file "exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch20_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/data.json"?

@MNCTTY commented Sep 2, 2019

yes, there is no such file, as I said
but as I understand it, it should be generated at some stage, like every other file in that directory
it isn't
and I want to find out why: the log shows no errors in any of the previous stages

@KnowBetterHelps

yeah, it should be generated when decoding starts
I am running the training process now; it will finish tomorrow, and I'll see whether I get the same problem.

@MNCTTY commented Sep 2, 2019

yep
thanks

@KnowBetterHelps

I ran into the same problem, but the root cause is not that the file doesn't exist.

in my case, I found an encoding error in "exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch20_norm5_bs128_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/decode.log"

I used "export PYTHONIOENCODING=UTF-8" to fix it

@MNCTTY commented Sep 3, 2019

yep, we found the encoding error earlier and solved it in a similar way
so, after fixing the encoding error, should all of run.sh run without any errors?

@KnowBetterHelps

yes, I am waiting for decoding to finish now. the recognition results seem okay

@MNCTTY commented Sep 3, 2019

how did you run recognition without the decoding stage from run.sh?

@MNCTTY commented Sep 3, 2019

ok, our news:

we SOMEHOW managed to run the decoding stage
to do this, we copied data.json from dump/test into the folder where run.sh could not find data.json,
plus added utf-8 encoding in several new places, plus changed rec_token_id to token_id, because we thought it was a typo.

and:
stage 4 finally ran successfully
and here is what it said:


karina@karina:~/Listen-Attend-Spell/egs/aishell$ ./run.sh 
dictionary: data/lang_1char/train_chars.txt
Stage 4: Decoding
run.pl: job failed, log is in exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch1_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/decode.log
2019-09-03 19:05:02,215 (json2trn:24) INFO: reading exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch1_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/data.json
2019-09-03 19:05:02,218 (json2trn:28) INFO: reading data/lang_1char/train_chars.txt
2019-09-03 19:05:02,218 (json2trn:37) INFO: writing hyp trn to exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch1_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/hyp.trn
2019-09-03 19:05:02,218 (json2trn:38) INFO: writing ref trn to exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch1_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/ref.trn
write a CER (or TER) result in exp/train_in240_hidden256_e3_lstm_drop0.2_dot_emb512_hidden512_d1_epoch1_norm5_bs32_mli800_mlo150_adam_lr1e-3_mmt0_l21e-5_delta/decode_test_beam30_nbest1_ml100/result.txt
|      SPKR                   |      # Snt            # Wrd       |      Corr              Sub              Del              Ins              Err            S.Err      |
|      Sum/Avg                |       419             26135       |     100.0              0.0              0.0              0.0              0.0              0.0      |

how should this be interpreted?

how do I run recognition on a random new wav, and where does it write the recognized text?

do you have ANY idea why data.json isn't being generated in the decoding folder by itself?

I would be very grateful if you could answer any of these questions

ps/ do you have any spaces in your language? :D

@KnowBetterHelps

I guess:
1、it is not correct to copy test/data.json to exp/{...}/data.json. if you do that, the scoring script compares test/data.json with exp/{...}/data.json, which are identical in your case, so the result comes out 100% correct
2、how to run recognition for a random new wav, and where does it write the recognized text?
you should prepare dump/test/deltatrue/data.json, which can be generated from your data dir. look into the data preparation script
3、do you have ANY idea why data.json isn't being generated in the decoding folder by itself?
maybe it is still the encoding problem
4、and BTW, what do you mean by "do you have any spaces in your language?" : )

@MNCTTY commented Sep 4, 2019

  1. hmm
    I noticed that the aishell annotations look like text without many spaces. I opened the Chinese wiki and saw that there are some spaces, but only after periods or commas,
    so I asked
    because in russian we have lots of spaces; they separate words from each other,
    and when the data prep script deletes all the spaces, the russian annotations look strange

@MNCTTY commented Sep 4, 2019

3, by the way, what do you mean by 'encoding'? which stage in run.sh represents encoding? I thought there was only decoding, from wav to text. no?

@KnowBetterHelps

an utterance for nnet training usually contains only one sentence, so there won't be any periods or commas in it; if there are, they should be deleted before training.

"encoding" means the character encoding, like utf-8. when you work with chinese, for example opening a file that contains chinese text, you should always be careful about the encoding.

@MNCTTY commented Sep 9, 2019

I managed to run decoding from start to end; the problem really was the encoding (it needed to be added to some other files as well)
but! for some reason the results are still 100% corr. I don't understand why that is

by the way, do you know how to load a pretrained model to continue training?

@KnowBetterHelps

for some reason results are still 100% corr
can you show me an example of your train/...../data.json?

how to load a pretrained model to continue training
I didn't find any train_stage option like in the kaldi recipes, so it might not support pre-training

@MNCTTY commented Sep 10, 2019

can you show me an example of your train/...../data.json?

you mean dump/train/deltatrue/data.json?
here it is

I'm also attaching dump/test/deltatrue/data.json
and the data.json from the decoding folder, generated during decoding
I renamed them train_data.json, test_data.json, and decode_data.json to distinguish them easily in the attachment

Archive.zip

@KnowBetterHelps

looks like you were using the script directly on your own data: рон не отрываясь смотрел на письмо которое уже начало с углов дымиться (roughly: "Ron stared at the letter, which had already begun to smolder at the corners")

in chinese, one syllable can be one word, like "我" (one token), which means "me"; but in your language, "рон" would be split into "р о н" (three tokens). maybe you should modify the script to better fit your data, for example one word per token ("рон" as one token).
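
a rough sketch of the difference (plain Python, not the repo's actual dictionary-building code):

# Character-level tokenization (what the aishell recipe does) vs.
# word-level tokenization, which may suit a space-delimited language
# like Russian better. The sentence is the example quoted above.
sent = u"рон не отрываясь смотрел на письмо"

char_tokens = list(sent.replace(" ", ""))  # 'р', 'о', 'н', ...
word_tokens = sent.split()                 # 'рон', 'не', ...

print(" ".join(char_tokens))
print(" ".join(word_tokens))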

@MNCTTY commented Sep 11, 2019

so, did I understand you correctly? you're saying it's better to build a vocab of many tokens that are meaningful pieces of the language,
and predict those pieces instead of letters.

maybe it makes sense, since I can take such a vocab from bert for russian

can you tell me, please, which files I should look at to change this? I mean, if I just put a new vocab in place of the old one, nothing would change, right?

@KnowBetterHelps

I am doing some similar work on code-switch recognition; for english I am going to use subword 'BPE' units, not letters. for example, catch --> ca tch, not 'c a t c h'
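
for instance, a sketch with the sentencepiece library ("corpus.txt", the vocab size, and the exact subword splits are assumptions; the real pieces depend on the training corpus):

import sentencepiece as spm

# Train a small BPE model on a hypothetical one-sentence-per-line text
# file, then segment a word into subword units.
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="bpe", vocab_size=1000, model_type="bpe"
)
sp = spm.SentencePieceProcessor(model_file="bpe.model")
print(sp.encode("catch", out_type=str))  # e.g. ['▁ca', 'tch']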

@KnowBetterHelps

can you tell me, please, which files I should look at to change this
the scripts in data preparation, specifically the one that generates data.json

@MNCTTY commented Sep 11, 2019

I am doing some similar work on code-switch recognition; for english I am going to use subword 'BPE' units, not letters. for example, catch --> ca tch, not 'c a t c h'

yeah, the bert vocab uses bpe exactly for constructing the vocab
plus, there are huge complete vocabs for english; maybe you can use them, since google had much more data to construct them
for russian they are much smaller, but still complete enough

@MNCTTY commented Sep 13, 2019

ok, I found the real cause of the 100% correct results in result.txt

the problem was in json2trn.py:
the source code created two absolutely identical files, ref and hyp, from the decode data.json. but we know they must be different: hyp contains the model's predictions, ref contains the references from the test data.json
I fixed it in my local copy, and result.txt is now correct (no longer 100% correct)

maybe it should be fixed in the source code too.
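
a minimal sketch of what the fix amounts to (the key names "rec_text" and "text" are assumptions about the data.json layout, following ESPnet-style conventions; adjust them to the real fields):

import json

# Write hyp.trn from the model's predictions and ref.trn from the
# reference labels, instead of writing the same field to both files,
# which is what produced the 100% score.
with open("data.json", "r") as f:
    utts = json.load(f)["utts"]

with open("hyp.trn", "w") as hyp, open("ref.trn", "w") as ref:
    for utt_id, info in utts.items():
        out = info["output"][0]
        hyp.write("%s (%s)\n" % (out["rec_text"], utt_id))  # hypothesis
        ref.write("%s (%s)\n" % (out["text"], utt_id))      # reference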

@MNCTTY commented Sep 23, 2019

okay
I've done something wrong: now hyp.trn is being created empty. can somebody tell me which files besides json2trn.py are responsible for its creation? please
maybe I will figure this out tomorrow, but if someone already knows and answers before then, that would be cool

@KnowBetterHelps

it will create something like this, from exp/***/decode/data.json

hyp.trn
过 去 的 就 不 要 想 了 (T0055G2375-T0055G2375S0447)
天 气 下 降 注 意 身 体 (T0055G2286-T0055G2286S0457)
浦 中 市 剧 中 人 街 最 儿 我 独 醒 事 已 见 放 (T0055G0915-T0055G0915S0468)

ref.trn
过 去 的 就 不 要 想 了 (T0055G2375-T0055G2375S0447)
天 气 下 降 注 意 身 体 (T0055G2286-T0055G2286S0457)
补 充 诗 句 众 人 皆 醉 而 我 独 醒 是 以 见 放 (T0055G0915-T0055G0915S0468)

@MNCTTY commented Sep 25, 2019

it's strange: even though I have exp/***/decode/data.json, and it's not empty and looks pretty correct, I still get an empty hyp.trn
but ref.trn is not empty at all and looks correct too

@ben-8878

@MNCTTY did you manage to solve your problem? I'm hesitating over whether to use this tool
