GitHub removes HTML-like tags from Jupyter notebooks #1

Closed
mobassir94 opened this issue Dec 27, 2022 · 0 comments

mobassir94 commented Dec 27, 2022

As we are working on code-mixed book readers, we had to write code that tags text by language. For example, have a look at the tag_text() function in https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/TTS_text_preprocessing/in_depth_mlt_text_processing_for_bn_TTS.ipynb. On GitHub it shows:

# tag the text
for m in parts:
    if len(m.strip())>1:text=text.replace(m,f"{m}")
# clean-tags
text=text.replace("start",'')
text=text.replace("end",'')

instead of the correct code, which should be :

# tag the text
for m in parts:
    if len(m.strip())>1:text=text.replace(m,f"</ar><SPLIT><bn>{m}</bn><SPLIT><ar>")
# clean-tags
text=text.replace("</ar><SPLIT><bn>start",'<bn>')
text=text.replace("end</bn><SPLIT><ar>",'</bn>')

The correct version can be found in the corresponding .py file (converted from .ipynb to .py before uploading to GitHub): https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/TTS_text_preprocessing/in_depth_mlt_text_processing_for_bn_tts.py
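To make the effect of the missing tags concrete, here is a minimal sketch that runs both variants of the snippet on the same input. The function names `tag_stripped` and `tag_correct`, the sample string, and the `parts` list are hypothetical and only for illustration; in the real tag_text() function `parts` comes from earlier logic that is not shown in this issue.

```python
def tag_stripped(text, parts):
    # Variant as displayed on GitHub: the tags are gone, so the replacement
    # swaps each part with itself and the text is left untagged.
    for m in parts:
        if len(m.strip()) > 1:
            text = text.replace(m, f"{m}")
    # Without the tags these calls just delete the words "start" and "end".
    text = text.replace("start", "")
    text = text.replace("end", "")
    return text


def tag_correct(text, parts):
    # Variant from the .py file: each part is wrapped in language tags.
    for m in parts:
        if len(m.strip()) > 1:
            text = text.replace(m, f"</ar><SPLIT><bn>{m}</bn><SPLIT><ar>")
    text = text.replace("</ar><SPLIT><bn>start", "<bn>")
    text = text.replace("end</bn><SPLIT><ar>", "</bn>")
    return text


# Hypothetical input, not taken from the repository.
sample = "AAA bangla BBB"
parts = ["bangla"]

stripped = tag_stripped(sample, parts)
correct = tag_correct(sample, parts)
print(stripped)
print(correct)
```

With the tags removed, the function returns its input unchanged, so any later step that splits on the tags has nothing to split on.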

As the example above shows, if a notebook contains HTML-like tags inside Python code, then after the notebook is uploaded to GitHub, GitHub automatically replaces all of those tags with '' (an empty string), which turns the originally correct code into broken code. To avoid such errors, we have uploaded .py conversions of all the notebooks that broke after being uploaded to GitHub. The problematic notebooks and their corresponding corrected .py scripts are listed below:

  1. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/TTS_text_preprocessing/in_depth_mlt_text_processing_for_bn_TTS.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/TTS_text_preprocessing/in_depth_mlt_text_processing_for_bn_tts.py

  2. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/code-mixed%20book%20readers/tafsir-jalalayn-book-reader-tts.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/code-mixed%20book%20readers/tafsir-jalalayn-book-reader-tts.py

  3. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/code-mixed%20book%20readers/tafsir_bayan_reader.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/code-mixed%20book%20readers/tafsir_bayan_multilingual_(bn%2Bar)_tts_based_qtafsir_reader.py

  4. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/Multilingual_(ben%2Bara)_tts_inference_colab_demo.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/multilingual_(ben%2Bara)_tts_inference_colab_demo.py

  5. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/v1_Multilingual_(ben%2Bara)_tts_based_quranic_tafsir_reader.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/v1_multilingual_(ben%2Bara)_tts_based_quranic_tafsir_reader.py

  6. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/v2_Multilingual_(bn%2Bar)_tts_based_Qtafsir_reader.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/mlt_TTS_inference_demo/v2_multilingual_(bn%2Bar)_tts_based_qtafsir_reader.py

  7. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/prepare_dataset/banglanmt_tafsir-ibn-kathir-en-to-bn-translator.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/prepare_dataset/banglanmt_tafsir-ibn-kathir-en-to-bn-translator.py

  8. broken notebook : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/prepare_dataset/tafsir-al-jalalayn-en-to-bn-translator.ipynb

    original code : https://github.com/mobassir94/comprehensive-bangla-tts/blob/main/prepare_dataset/tafsir-al-jalalayn-en-to-bn-translator.py
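To check whether a local copy of one of the notebooks listed above still has its tags, a rough sketch like the following could scan the code cells for HTML-like tags. It assumes the standard .ipynb JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`); the helper names and the in-memory example notebook are hypothetical. The pattern is deliberately simple and may also match ordinary code such as `a<b>c`, so treat the hits as hints only.

```python
import json
import re

# Matches simple HTML-like tags such as <bn>, </ar> or <SPLIT>.
TAG_RE = re.compile(r"</?[A-Za-z][A-Za-z0-9]*>")


def load_notebook(path):
    """Load a .ipynb file as a plain dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_tags_in_notebook(nb):
    """Return (cell_index, line) pairs for code-cell lines containing tags."""
    hits = []
    for i, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # "source" may be stored as a list of lines or as a single string.
        if isinstance(source, list):
            source = "".join(source)
        for line in source.splitlines():
            if TAG_RE.search(line):
                hits.append((i, line))
    return hits


# Hypothetical in-memory notebook used instead of a real file.
example_nb = {
    "cells": [
        {"cell_type": "markdown", "source": ["<b>not code</b>"]},
        {"cell_type": "code", "source": ["x = 1\n", "t = t.replace(m, '<bn>' + m + '</bn>')\n"]},
    ]
}
hits = find_tags_in_notebook(example_nb)
print(hits)
```

For a real file, `find_tags_in_notebook(load_notebook("some_notebook.ipynb"))` would be the equivalent call; an empty result for a notebook that is supposed to contain tags suggests that they were removed.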

If you try to run the broken notebooks above locally, you should expect errors. To fix them, check the corresponding original code (.py files), then find and replace the lines of code where GitHub automatically removed the HTML-like tags.
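Finding the affected lines by hand can be tedious, so here is a hedged sketch that uses Python's standard `difflib` to pair up lines that differ between the notebook code and the .py code. The function name `changed_lines` is hypothetical, and the two input strings are the snippets quoted earlier in this issue. A converted .py file may also contain extra comment lines added by the converter; those show up as insertions rather than replacements and are ignored here.

```python
import difflib


def changed_lines(notebook_code, script_code):
    """Return (notebook_lines, script_lines) pairs for every replaced block."""
    nb_lines = notebook_code.splitlines()
    py_lines = script_code.splitlines()
    matcher = difflib.SequenceMatcher(None, nb_lines, py_lines, autojunk=False)
    pairs = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" marks blocks that exist on both sides but differ.
        if op == "replace":
            pairs.append((nb_lines[i1:i2], py_lines[j1:j2]))
    return pairs


broken = """# tag the text
for m in parts:
    if len(m.strip())>1:text=text.replace(m,f"{m}")
# clean-tags
text=text.replace("start",'')
text=text.replace("end",'')
"""

fixed = """# tag the text
for m in parts:
    if len(m.strip())>1:text=text.replace(m,f"</ar><SPLIT><bn>{m}</bn><SPLIT><ar>")
# clean-tags
text=text.replace("</ar><SPLIT><bn>start",'<bn>')
text=text.replace("end</bn><SPLIT><ar>",'</bn>')
"""

pairs = changed_lines(broken, fixed)
for nb_block, py_block in pairs:
    print("notebook:", nb_block)
    print("script:  ", py_block)
```

Each pair shows the notebook lines next to the script lines that should replace them.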
