Skip to content

Latest commit

 

History

History
16 lines (11 loc) · 741 Bytes

README.md

File metadata and controls

16 lines (11 loc) · 741 Bytes

In order to use a pre-trained word2vec file, you must first download it and place it here. DeepQA supports both the .bin format of the Google News word2vec embeddings, and the .vec format of the Facebook fasttext embeddings. The vec2bin.py is a small utility script to convert a .vec to a .bin file, which reduces disk space and improve the loading time.

Usage:

python main.py --initEmbeddings --embeddingSource=wiki.en.bin

Google News embeddings: https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing

FastText embeddings: https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md

More details on word2vec and these pre-trained vectors: https://code.google.com/archive/p/word2vec/