subword

kkaryl / AI6127-Deep_NLP

This repository contains the source code for the assignments in NTU's MSAI course AI6127, Deep Neural Networks for Natural Language Processing (2019 Sem 2).

nlp ner language-model subword msai

Updated Dec 11, 2020
Jupyter Notebook

explanare / char-iit

A causal intervention framework to learn robust and interpretable character representations inside subword-based language models

subword interpretability character-level-language-model causal-intervention

Updated Jul 10, 2023
Jupyter Notebook

TiMauzi / dawg

The concept of DAWGs is based on: Blumer, A. et al. (1985). The smallest automaton recognizing the subwords of a text. Theoretical Computer Science, 40, 31–55.

nlp tree parsing tree-structure theoretical-computer-science dawg subword subword-segmentation subwords

Updated Sep 13, 2022
Java

Ishan-Kotian / Tokenizer_NLP

Tokenization splits a piece of text into smaller units called tokens. Tokens can be words, characters, or subwords, so tokenization falls broadly into three types: word, character, and subword (character n-gram) tokenization.

cat nlp count tensorflow tokenizer natural-language character sentence keras-classification-models subword nerual-network imdb-dataset deep-learning-architectures rnn-keras smaller-units tokenizer-nlp

Updated Jun 30, 2021
Jupyter Notebook
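
The three tokenization types named in the entry above can be illustrated with a minimal Python sketch. It is not taken from the repository: the function names (`word_tokens`, `char_tokens`, `char_ngram_tokens`) and the `<`/`>` boundary markers are assumptions made for illustration. It follows the description's reading of subwords as fixed-length character n-grams; learned subword vocabularies such as BPE work differently.

```python
import re


def word_tokens(text):
    """Word-level tokens: lowercase runs of word characters (punctuation dropped)."""
    return re.findall(r"\w+", text.lower())


def char_tokens(text):
    """Character-level tokens: every non-whitespace character."""
    return [ch for ch in text if not ch.isspace()]


def char_ngram_tokens(word, n=3):
    """Subword tokens as character n-grams of a single word.

    The word is wrapped in '<' and '>' (an assumed convention) so that
    prefixes and suffixes are distinguishable from word-internal n-grams.
    """
    padded = f"<{word}>"
    if len(padded) <= n:
        # Word too short to slide a window over: keep it as one token.
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


if __name__ == "__main__":
    sentence = "Tokenizers split text."
    print(word_tokens(sentence))
    print(char_tokens("split"))
    print(char_ngram_tokens("text", n=3))
```

Word tokens keep the vocabulary readable but cannot represent unseen words. Character tokens cover everything at the cost of long sequences. Character n-grams sit between the two.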

Scitator / subword-nmt

Subword Neural Machine Translation

deep-learning seq2seq neural-machine-translation language-model subword

Updated Jun 20, 2017
Python

subword

Here are 15 public repositories matching this topic...

scarletcho / KoLM

zouharvi / tokenization-scorer

lallubharteja / KWS-Scripts

cooelf / subMrc

andreasgrv / johnny

cooelf / subword_seg

wang-h / FMDL

scarletcho / subword-mikolov

jluo41 / NLPText

burcgokden / BERT-Subword-Tokenizer-Wrapper

kkaryl / AI6127-Deep_NLP

explanare / char-iit

TiMauzi / dawg

Ishan-Kotian / Tokenizer_NLP

Scitator / subword-nmt