To use a pre-trained word2vec file, first download it and place it in this directory. DeepQA supports both the .bin format of the Google News word2vec embeddings and the .vec format of the Facebook fastText embeddings. vec2bin.py is a small utility script that converts a .vec file to a .bin file, which reduces disk usage and improves loading time.
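
To illustrate what such a conversion involves, here is a minimal sketch. It is not the code of vec2bin.py; the function name `vec_to_bin` is hypothetical. It assumes the .vec file starts with a "vocab_size dim" header line followed by one "word v1 v2 ... vN" line per entry, and it writes the commonly used word2vec binary layout: the same text header, then each word followed by a space and its values as little-endian float32.

```python
import struct


def vec_to_bin(vec_path, bin_path):
    """Convert a text .vec file to a word2vec-style .bin file.

    Hypothetical helper for illustration only; returns the number of
    vectors written.
    """
    written = 0
    with open(vec_path, "r", encoding="utf-8") as src, open(bin_path, "wb") as dst:
        # Header line: "<vocab_size> <dim>"
        vocab_size, dim = (int(x) for x in src.readline().split())
        dst.write("{} {}\n".format(vocab_size, dim).encode("utf-8"))

        for line in src:
            line = line.rstrip()
            if not line:
                continue  # ignore blank lines
            # Split from the right so the last `dim` fields are the values.
            parts = line.rsplit(" ", dim)
            if len(parts) != dim + 1:
                raise ValueError("Malformed line for word #{}".format(written))
            word, values = parts[0], [float(v) for v in parts[1:]]
            # Word, a single space, then `dim` little-endian float32 values.
            dst.write(word.encode("utf-8") + b" ")
            dst.write(struct.pack("<{}f".format(dim), *values))
            written += 1

    if written != vocab_size:
        raise ValueError("Header announced {} vectors, found {}".format(vocab_size, written))
    return written


# Example call (file names are placeholders):
# vec_to_bin("wiki.en.vec", "wiki.en.bin")
```

Storing each value as 4 raw bytes instead of a decimal string is what makes the .bin file smaller and faster to load. Some writers also emit a newline after each vector; this sketch does not, so check what your loader expects.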

Usage:

python main.py --initEmbeddings --embeddingSource=wiki.en.bin

Google News embeddings: https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing

FastText embeddings: https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md

More details on word2vec and these pre-trained vectors: https://code.google.com/archive/p/word2vec/
