
[WIP] Train Vietnamese Dependency Parsing

Vu Anh edited this page Jul 1, 2021 · 3 revisions

In this work, we build a Vietnamese dependency parser that uses biaffine attention in a graph-based parsing architecture, trained on the VLSP 2020 Dependency Parsing dataset.

Models Description


Input vectors

The input vector is composed of two parts: the word embedding and the CharLSTM representation of the word. The two vectors are concatenated to form the encoder input.
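As a minimal sketch of that composition (the dimensions and the random vectors below are illustrative, not the model's actual values; in the real model the character vector is the output of a CharLSTM run over the word's characters):

```python
import numpy as np

rng = np.random.default_rng(0)
word_dim, char_dim = 100, 50  # hypothetical embedding sizes

word_emb = rng.normal(size=(word_dim,))   # word embedding for one token
char_repr = rng.normal(size=(char_dim,))  # stand-in for the CharLSTM output

# The encoder input is the concatenation of the two parts.
x = np.concatenate([word_emb, char_repr])
assert x.shape == (word_dim + char_dim,)
```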

Biaffine Attention Mechanism

Compute the score of a dependency via biaffine attention:
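The standard biaffine form (Dozat & Manning 2017) scores an arc from head j to dependent i as s[i, j] = h_dep[i]^T U h_head[j], with a bias column appended to the dependent side so the bilinear term subsumes a head-only linear term. A toy numpy sketch (dimensions and random vectors are illustrative, not the trained model's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 8  # sentence length and representation size (toy values)

h_dep = rng.normal(size=(n, d))   # dependent-side representations
h_head = rng.normal(size=(n, d))  # head-side representations

# Append a bias column to the dependent side so the biaffine form
# also covers a linear term over the head representation.
h_dep_aug = np.concatenate([h_dep, np.ones((n, 1))], axis=1)

U = rng.normal(size=(d + 1, d))  # biaffine weight matrix

# scores[i, j] = h_dep_aug[i] @ U @ h_head[j]
scores = np.einsum("ia,ab,jb->ij", h_dep_aug, U, h_head)
assert scores.shape == (n, n)
```

At parsing time, each token's head is then chosen from its row of scores (greedily or with a maximum-spanning-tree decoder).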

Parameter settings

Model parameters

| Component | Hyper-Parameter | Value |
|---|---|---|
| Embedding (BERT) | n_bert_layers | 4 |
| | dimension | 768 |
| LSTM Encoder | n_lstm_hidden | 400 |
| | n_lstm_layers | 3 |
| | lstm_dropout | 0.33 |

Training Parameters

| Hyper-Parameter | Value |
|---|---|
| optimizer | Adam |

Choosing the right batch_size (5000) helped us a lot.
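The settings above can be collected into a single configuration dict (the key names are hypothetical; only the values come from the tables and the note above):

```python
# Hypothetical key names mirroring the hyperparameter tables above.
config = {
    "n_bert_layers": 4,
    "bert_dim": 768,
    "n_lstm_hidden": 400,
    "n_lstm_layers": 3,
    "lstm_dropout": 0.33,
    "optimizer": "Adam",
    "batch_size": 5000,  # the value that worked well in our runs
}
```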

Training Data

VLSP 2020 Dataset

Train: 8151 sentences, Test: 1122 sentences

Notes

  • Using wandb logs is very handy: we can easily watch logs and loss graphs with nearly zero setup.