Simple python util to load glove embeddings and process tweets

dhartung/python-glove-loader

Python Glove Loader

A simple loader for GloVe embeddings that supports a memory-saving embedding store and preprocessing for Twitter data.

The following example shows how it works:

from glove import load_embedding, tokenize, HashTagMode

# Tokenize tweets (thanks to @tokestermw and @ppope for the base implementation; see the method docs)
tokens = tokenize(
    "Hello #World, this is a sample util from @dhartung to " +
    "illustrate the power of https://github.com/dhartung/python-glove-loader :)",
    hashtag_mode=HashTagMode.REPLACE
)
print(tokens)

# Load the whole embedding into memory
glove = load_embedding("./glove.twitter.27B.25d.txt", keep_in_memory=True)
# OR: don't load the full embedding into memory (saves a lot of RAM with large embeddings)
glove = load_embedding("./glove.twitter.27B.25d.txt", keep_in_memory=False)

# Read embeddings
print(glove.get_embedding("potato"))
print(glove.get_embeddings(["potato", "house", "mouse"]))

# Parse tweets
print(
    glove.get_tweet_embeddings(
        "Hello #World, this is a sample util from @dhartung to " +
        "illustrate the power of https://github.com/dhartung/python-glove-loader :)"
    )
)
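
The example above does not show how `keep_in_memory=False` saves RAM. One common approach, sketched here as an assumption and not as a description of this library's internals, is to scan the GloVe text file once, remember the byte offset of each word's line, and read a single line from disk on each lookup. The helper names `build_offset_index` and `read_vector` are hypothetical and are not part of this package.

```python
import io
from typing import BinaryIO, Dict, List


def build_offset_index(stream: BinaryIO) -> Dict[str, int]:
    """Map each word to the byte offset at which its line starts."""
    index: Dict[str, int] = {}
    offset = stream.tell()
    # readline() keeps tell() accurate, so offsets can be reused with seek().
    for line in iter(stream.readline, b""):
        word = line.split(b" ", 1)[0].decode("utf-8")
        index[word] = offset
        offset = stream.tell()
    return index


def read_vector(stream: BinaryIO, index: Dict[str, int], word: str) -> List[float]:
    """Seek to the word's line and parse only that one vector."""
    stream.seek(index[word])
    parts = stream.readline().decode("utf-8").rstrip("\n").split(" ")
    return [float(value) for value in parts[1:]]


# Tiny in-memory stand-in for a GloVe text file ("word v1 v2 ..." per line).
sample = io.BytesIO(b"potato 0.1 0.2 0.3\nhouse 0.4 0.5 0.6\nmouse -0.7 0.8 0.9\n")
index = build_offset_index(sample)
print(read_vector(sample, index, "house"))  # [0.4, 0.5, 0.6]
```

With a real file, the same functions would take a handle opened with `open(path, "rb")`. Only the word-to-offset dictionary stays in memory, at the cost of one disk seek per lookup.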
