Intro to Natural Language Processing with Python
package: nltk

Create a Dictionary

Text Similarity Analysis
    tf-idf
    Jaccard index
    Hashing
        MinHash
        SimHash
    Reference: https://moz.com/devblog/near-duplicate-detection/

Sentiment Analysis
    How do you analyze the synchronized time series of sentiment?
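
Example sketch for the Jaccard index and MinHash items under Text Similarity Analysis. This is a minimal illustration, not the method from the referenced Moz post: it uses only the Python standard library rather than nltk, splits on whitespace instead of using a real tokenizer, and simulates a family of hash functions by salting MD5 with a seed. The function names (shingles, jaccard, minhash_signature, minhash_similarity) are hypothetical and chosen for this example. SimHash is not covered here.

```python
import hashlib


def shingles(text, k=2):
    """Return the set of k-word shingles from lowercased, whitespace-split text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard index: |A intersect B| / |A union B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _seeded_hash(item, seed):
    # Assumption: salting MD5 with a seed stands in for a family of
    # independent hash functions. Any well-mixed hash would do.
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash over the set."""
    if not items:
        raise ValueError("cannot build a MinHash signature of an empty set")
    return [min(_seeded_hash(x, seed) for x in items) for seed in range(num_hashes)]


def minhash_similarity(sig_a, sig_b):
    """Fraction of matching signature positions; estimates the Jaccard index."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

set_a = shingles(doc_a, k=2)
set_b = shingles(doc_b, k=2)

exact = jaccard(set_a, set_b)  # 6 shared shingles out of 10 distinct -> 0.6
estimate = minhash_similarity(minhash_signature(set_a), minhash_signature(set_b))

print("exact Jaccard:", exact)
print("MinHash estimate:", estimate)
```

The exact Jaccard index needs both full shingle sets, while the MinHash signatures are fixed-length lists that can be stored and compared on their own. The estimate gets closer to the exact value as num_hashes grows, at the cost of more hashing per document.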