ADS-509_Final_Project_Team1

Analytical text mining

Short description of your project and objectives:

We use Tweepy, a Python client for the Twitter API, to scrape tweets from the official accounts of several popular cryptocurrencies, and we build a supervised machine learning model that classifies each tweet by its label (crypto type). We then build topic models using several text mining techniques and evaluate how well the topics identified in the tweets line up with the labels (crypto types). The end goal for deployment is to use Flask to serve our model(s) on a local website (localhost) that predicts the crypto type from keyword(s) entered by the user.

Description of your selected dataset (data source, number of variables, size of dataset, etc.):

We pull tweets for five cryptocurrencies from the Twitter API, targeting approximately 1000 tweets per cryptocurrency. We call “client.get_users_tweets” up to 15 times per account, with 100 tweets per pull. Because some cryptocurrencies have fewer tweets than others, we ended up with:

Bitcoin: 1498 tweets
Ethereum: 1497 tweets
Cardano: 1496 tweets
Dogecoin: 957 tweets
Shiba Inu: 770 tweets

This gives a total of 6218 tweets in two columns: “crypto_type” for the crypto type and “tweets” for the text of each tweet.
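The pull loop described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `pull_user_tweets` and `build_rows` and the `accounts` mapping are hypothetical, and `client` is only assumed to behave like a `tweepy.Client`, whose `get_users_tweets` response exposes `.data` and a `.meta` dict carrying `next_token` while older tweets remain.

```python
def pull_user_tweets(client, user_id, max_pulls=15, per_pull=100):
    """Collect up to max_pulls * per_pull tweet texts for one account.

    `client` is assumed to behave like tweepy.Client: get_users_tweets()
    returns an object with `.data` (a list of tweets, or None) and
    `.meta` (a dict that carries "next_token" while more pages exist).
    """
    texts = []
    token = None
    for _ in range(max_pulls):
        response = client.get_users_tweets(
            id=user_id, max_results=per_pull, pagination_token=token
        )
        for tweet in response.data or []:
            texts.append(tweet.text)
        token = (response.meta or {}).get("next_token")
        if token is None:  # the account has no older tweets to return
            break
    return texts


def build_rows(client, accounts, **kwargs):
    """Return (crypto_type, tweet) rows; `accounts` maps label -> user id."""
    return [
        (label, text)
        for label, user_id in accounts.items()
        for text in pull_user_tweets(client, user_id, **kwargs)
    ]
```

Stopping as soon as no `next_token` comes back is one way to explain why accounts with shorter timelines, such as Dogecoin and Shiba Inu, end up well below the 1500-tweet ceiling.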

Methodology

Classification Models:

Naive Bayes
Support Vector Machine (SVM)
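To make the first of these concrete, here is a small from-scratch sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is written in plain Python for illustration only; the project itself may well rely on a library implementation (in scikit-learn the corresponding estimator is `MultinomialNB`), and the class name `TinyNaiveBayes`, the `tokenize` helper, and the toy training texts are all made up rather than taken from crypto_tweets.csv. The SVM is not shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real preprocessing."""
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy examples, not rows from crypto_tweets.csv
train_texts = [
    "lightning network halving miners",
    "halving miners block reward",
    "smart contracts merge staking",
    "staking validators smart contracts",
]
train_labels = ["Bitcoin", "Bitcoin", "Ethereum", "Ethereum"]

model = TinyNaiveBayes().fit(train_texts, train_labels)
print(model.predict("miners halving"))
print(model.predict("staking contracts"))
```

Predicting from a few keywords, as in the two calls at the end, mirrors the kind of user input the planned Flask page would pass to the model.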

Topic Models:

Non-Negative Matrix Factorization (NMF)
Latent Semantic Analysis (LSA)
Latent Dirichlet Allocation (LDA)
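Whichever topic model is used, one simple way to check how well its topics line up with the crypto labels is to assign each tweet its dominant topic, cross-tabulate topics against labels, and compute purity (the share of tweets that fall under their topic's majority label). The sketch below assumes the dominant-topic ids are already available; purity is an illustrative choice here, not necessarily the measure used in the notebook, and the function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def topic_label_table(topics, labels):
    """Count how often each dominant topic co-occurs with each label."""
    table = defaultdict(Counter)
    for topic, label in zip(topics, labels):
        table[topic][label] += 1
    return table


def purity(table):
    """Share of documents that fall in their topic's majority label."""
    total = sum(sum(row.values()) for row in table.values())
    majority = sum(max(row.values()) for row in table.values())
    return majority / total


# Hypothetical dominant-topic ids for six tweets, with their true labels
topics = [0, 0, 0, 1, 1, 2]
labels = ["Bitcoin", "Bitcoin", "Ethereum", "Ethereum", "Ethereum", "Dogecoin"]

table = topic_label_table(topics, labels)
for topic in sorted(table):
    print(topic, dict(table[topic]))
print(round(purity(table), 3))
```

A purity near 1 would mean each topic is dominated by a single cryptocurrency, while a value near the share of the largest class would suggest the topics cut across the labels.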

Folders and files:

.ipynb_checkpoints
__pycache__
nlp_app
sentiment
twitter
.DS_Store
ADS-509 Team 1 Project Status Update Form.docx
ADS509_Team1_Final_Project.ipynb
README.md
crypto_tweets.csv

dingyiduan7/ADS-509_Final_Project_Team1
