This project is based on the paper "A Neural Probabilistic Language Model" (Bengio et al., 2003) and the very informative videos by Andrej Karpathy on building probabilistic language models in PyTorch.
I use a German name dataset to train a probabilistic language model with 10-dimensional character embeddings, a hidden layer of 100 neurons (configurable), and a cross-entropy loss.
At inference time, new names can be generated by sampling from the probability distribution output by the neural net.
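The architecture described above can be sketched roughly as follows. This is a minimal, untrained illustration in the spirit of the Bengio et al. MLP; the hyperparameter names (`block_size`, `vocab_size`, etc.) and the context length of 3 are assumptions for the sketch, not necessarily the project's actual values.

```python
import torch
import torch.nn.functional as F

# Assumed hyperparameters for illustration
vocab_size = 30   # e.g. letters plus umlauts and a start/end token
block_size = 3    # context length: how many characters predict the next one
emb_dim = 10      # 10-dimensional character embeddings
n_hidden = 100    # hidden layer with 100 neurons (configurable)

g = torch.Generator().manual_seed(42)
C  = torch.randn((vocab_size, emb_dim),            generator=g)         # embedding table
W1 = torch.randn((block_size * emb_dim, n_hidden), generator=g) * 0.1   # hidden layer
b1 = torch.randn(n_hidden,                         generator=g) * 0.01
W2 = torch.randn((n_hidden, vocab_size),           generator=g) * 0.1   # output layer
b2 = torch.randn(vocab_size,                       generator=g) * 0.01

def forward(X):
    # X: (batch, block_size) integer character indices
    emb = C[X]                                           # (batch, block_size, emb_dim)
    h = torch.tanh(emb.view(X.shape[0], -1) @ W1 + b1)   # (batch, n_hidden)
    return h @ W2 + b2                                   # logits: (batch, vocab_size)

# One forward pass on a random batch with the cross-entropy loss
X = torch.randint(0, vocab_size, (4, block_size), generator=g)
Y = torch.randint(0, vocab_size, (4,), generator=g)
loss = F.cross_entropy(forward(X), Y)
```

In training, `loss.backward()` would then supply gradients for a plain gradient-descent update of `C`, `W1`, `b1`, `W2`, and `b2`.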
Sample names are:
- marlo
- emmergo
- walrainmart
- tilldalfine
- thande
- walde
- berga
- bilthaherd
- herolin
- gerwi
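Sampling names like those above works by repeatedly feeding the last few characters through the net and drawing the next character from the resulting distribution. The sketch below uses untrained stand-in weights so it runs end to end (its output is gibberish); the vocabulary, the end-token index 0, and names like `itos` are assumptions for illustration.

```python
import torch

torch.manual_seed(42)
vocab_size, block_size = 28, 3
# Untrained stand-in parameters; with trained weights the samples
# would resemble the names listed above.
C  = torch.randn(vocab_size, 10)
W1 = torch.randn(block_size * 10, 100) * 0.1
b1 = torch.zeros(100)
W2 = torch.randn(100, vocab_size) * 0.1
b2 = torch.zeros(vocab_size)

# Assumed character set: index 0 is a combined start/end token
itos = {i: ch for i, ch in enumerate('.abcdefghijklmnopqrstuvwxyz-')}

def sample_name(max_len=20):
    context = [0] * block_size            # start with end tokens as padding
    out = []
    for _ in range(max_len):
        emb = C[torch.tensor([context])]              # (1, block_size, 10)
        h = torch.tanh(emb.view(1, -1) @ W1 + b1)     # (1, 100)
        logits = h @ W2 + b2                          # (1, vocab_size)
        probs = torch.softmax(logits, dim=1)
        ix = torch.multinomial(probs, num_samples=1).item()
        if ix == 0:                       # sampled the end token: name is done
            break
        out.append(itos[ix])
        context = context[1:] + [ix]      # slide the context window
    return ''.join(out)

print(sample_name())
```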
The training, evaluation, and test code was written as a learning exercise and is not the most efficient way to build a neural net.