NLP (Natural Language Processing) Python analyzer for Twitter
NLP for Twitter is a collection of routines that lets you download the tweets of a specific user and run NLP analysis on them.
The results will include:
- Wordcloud chart
- Sentiment scatter chart
- Tweet frequency chart
- Word count
- Sentence count
- Vocabulary richness
- Number of stopwords used
- Number of profanity words
- Estimated text reading time
- Most common words
- Sentiment analysis
Libraries:
- Python 3.0 or higher
- Twython
- NLTK
- TextBlob
- Wordcloud
Twitter:
- consumer_key
- consumer_secret
- access_token
- access_token_secret
Create a file named auth.py and add the following lines:
consumer_key = 'your consumer key'
consumer_secret = 'your consumer secret'
access_token = 'your access token'
access_token_secret = 'your token secret'
If you do not have a consumer key yet, create one through a Twitter developer account.
That's it!
To download tweets:
python tweets_get.py --user <twitter_user> [--count <number_of_tweets>]
Note: the maximum number of tweets is 200.
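For orientation, here is a minimal sketch of what a download step could look like with Twython. It is not the contents of tweets_get.py: the helper names (clamp_count, download_tweets, save_tweets, main), the output file name, and the JSON layout (a plain list of tweet objects) are all assumptions made for illustration.

```python
import json

MAX_TWEETS = 200  # upper limit stated in the note above


def clamp_count(count):
    """Keep the requested number of tweets within 1..MAX_TWEETS."""
    return max(1, min(int(count), MAX_TWEETS))


def download_tweets(client, user, count=MAX_TWEETS):
    """Fetch a user's recent tweets through a Twython-like client."""
    return client.get_user_timeline(screen_name=user, count=clamp_count(count))


def save_tweets(tweets, path):
    """Write tweets to a JSON file (assumed layout: a list of tweet objects)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(tweets, handle, ensure_ascii=False, indent=2)


def main(user, count):
    # Imported lazily so the helpers above work without Twython installed.
    from twython import Twython
    import auth  # the auth.py file described above

    client = Twython(auth.consumer_key, auth.consumer_secret,
                     auth.access_token, auth.access_token_secret)
    # Hypothetical output name; the real script may use a different one.
    save_tweets(download_tweets(client, user, count), user + "_tweets.json")
```

Passing the client in as an argument keeps the clamping logic easy to check without network access or real credentials.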
To analyze tweet sentiment:
python tweets_sentiment.py --file <tweets_file.json>
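As a rough idea of how sentiment could be scored with TextBlob, the sketch below computes a polarity value per tweet and groups the values into coarse labels. The function names, the labels, and the zero threshold are illustrative assumptions; tweets_sentiment.py may work differently.

```python
def polarity(text):
    """Return TextBlob's polarity score for text, a float in [-1.0, 1.0]."""
    from textblob import TextBlob  # imported lazily; requires the textblob package
    return TextBlob(text).sentiment.polarity


def label(score, threshold=0.0):
    """Map a polarity score to a coarse label (threshold is an illustrative choice)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def summarize(scores):
    """Count how many scores fall under each label."""
    counts = {"positive": 0, "neutral": 0, "negative": 0}
    for score in scores:
        counts[label(score)] += 1
    return counts
```

A typical use would be summarize(polarity(t) for t in texts), where texts is the list of tweet texts loaded from the JSON file.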
To perform text analysis on tweets:
python tweets_text_analysis.py --file <tweets_file.json> --lang <en|es>
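Several of the reported metrics can be approximated with the standard library alone. The sketch below is an assumption about how they might be defined, not the script's actual logic: vocabulary richness is taken as unique words divided by total words, reading time assumes 200 words per minute, and the tiny stopword set stands in for NLTK's stopwords corpus.

```python
import re
from collections import Counter

# Tiny illustrative stopword set; NLTK's stopwords corpus is the fuller choice.
STOPWORDS = {"a", "an", "and", "is", "of", "the", "to"}
WORDS_PER_MINUTE = 200  # assumed reading speed


def text_metrics(text, top_n=3):
    """Compute simple text statistics for a block of text."""
    # Lowercase words, allowing Spanish accented letters and apostrophes.
    words = re.findall(r"[a-záéíóúüñ']+", text.lower())
    # Naive sentence split on runs of ., ! or ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    content_words = [w for w in words if w not in STOPWORDS]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "vocabulary_richness": len(set(words)) / len(words) if words else 0.0,
        "stopword_count": len(words) - len(content_words),
        "reading_time_seconds": 60.0 * len(words) / WORDS_PER_MINUTE,
        "most_common": Counter(content_words).most_common(top_n),
    }
```

A profanity count could follow the same pattern as the stopword count, using the word lists shipped with the repository.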
To plot the sentiment scatter chart:
python tweets_scatter_v2.py --file <tweets_file.json>
To plot tweet sentiment and tweeting frequency:
python tweets_graph.py --file <tweets_file.json>
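Tweeting frequency can be derived by counting tweets per day. The sketch below assumes the JSON file holds a list of tweet objects, each with a created_at field in the Twitter v1.1 timestamp layout; the function names are hypothetical and the plotting step is left out.

```python
import json
from collections import Counter
from datetime import datetime

# Timestamp layout used by the Twitter v1.1 API,
# e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def tweets_per_day(tweets):
    """Count tweets per calendar day, using each timestamp's own UTC offset."""
    days = Counter()
    for tweet in tweets:
        created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        days[created.date().isoformat()] += 1
    # Sort by date so the result is ready to plot as a time series.
    return dict(sorted(days.items()))


def load_tweets(path):
    """Load a JSON file assumed to hold a list of tweet objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

The resulting dictionary maps ISO dates to counts, which can be passed directly to a bar or line chart.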
To listen to tweets:
python tweets_listener.py --track <keyword> [--lang <en|es>]
Note: the number of tweets to listen to is controlled by a hard-coded constant. Adjust this value to suit your needs.
Feel free to open an issue in the repo.
Please be advised that this repository contains files listing profanity words in English (EN) and Spanish (ES). The only purpose of these files is to support NLP research and analysis of texts, and they should be treated as such.