Virality Predictor project, built on the 'https://www.kaggle.com/gspmoreira/articles-sharing-reading-from-cit-deskdrop?select=shared_articles.csv' dataset.
Instructions:
A folder called Virality_Predictor is expected under the home directory:
~/Virality_Predictor/
Folder contents:
~/Virality_Predictor/
    Collaborative_Filtering_EN.ipynb
    Collaborative_Filtering_EN_PT.ipynb
    Collaborative_Filtering_PT.ipynb
    Collaborative_Filtering_Utils.py
    TFIDF-Regression-EN.ipynb
    TFIDF-Regression-PT.ipynb
    TFIDF_Regression_Utils.py
    TFIDF_Classification_EN.ipynb
    TFIDF_Classification_PT.ipynb
    TFIDF_Classification_Utils.py
    Utils.py
    data_analysis_articles.ipynb
    data_analysis_users.ipynb
    nltk_data/
        corpora
    datasets/
        cleaned_articles_test_EN_text.csv
        cleaned_articles_test_EN_upsampled_text.csv
        cleaned_articles_test_PT_text.csv
        cleaned_articles_test_PT_upsampled_text.csv
        cleaned_articles_train_EN_text.csv
        cleaned_articles_train_EN_upsampled_text.csv
        cleaned_articles_train_PT_text.csv
        cleaned_articles_train_PT_upsampled_text.csv
        shared_articles.csv
        users_interactions.csv
    models/
        CF_EN_PT_norm.pkl
        Classification_EN_pipeline.pkl
        Classification_PT_pipeline.pkl
        CF_EN_PT_raw.pkl
        CF_PT_norm.pkl
        CF_PT_raw.pkl
        Regression_EN_pipeline.pkl
        Regression_PT_pipeline.pkl
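Before opening the notebooks, it can help to confirm that the folder is where the project expects it. The snippet below is a minimal sketch, not part of the project: the helper name missing_entries and the short list of entries it checks are assumptions chosen for illustration, and the list can be extended with any other file from the contents above.

```python
from pathlib import Path

# Hypothetical subset of the expected contents; extend as needed.
EXPECTED_ENTRIES = ["Utils.py", "nltk_data", "datasets", "models"]


def missing_entries(root=None, expected=EXPECTED_ENTRIES):
    """Return the names from `expected` that do not exist under `root`.

    `root` defaults to ~/Virality_Predictor, the location the README expects.
    """
    root = Path(root) if root is not None else Path.home() / "Virality_Predictor"
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing under ~/Virality_Predictor:", ", ".join(missing))
    else:
        print("Folder layout looks complete.")
```

An empty result means every checked entry was found; anything listed should be created or copied into place before running the notebooks.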
To run the models, run the ‘jupyter notebook’ command from the command line. The notebooks show the latest state of the models. The saved models are in the models/ directory under ~/Virality_Predictor/.
Each .ipynb file corresponds to one problem, for example Collaborative Filtering using articles in English (Collaborative_Filtering_EN.ipynb).
Each Jupyter notebook contains a commented-out cell with the command for loading the corresponding model.
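For reference, loading a saved model outside the notebooks might look like the sketch below. It assumes the .pkl files were written with Python's standard pickle module; the notebooks may use a different loader (the commented cell in each notebook is authoritative). The helper name load_model and the MODELS_DIR constant are hypothetical.

```python
import pickle
from pathlib import Path

# Assumed location, following the folder layout described in this README.
MODELS_DIR = Path.home() / "Virality_Predictor" / "models"


def load_model(filename, models_dir=MODELS_DIR):
    """Unpickle one saved model file from the models directory.

    Assumes the file was produced with the standard pickle module.
    """
    path = Path(models_dir) / filename
    with path.open("rb") as fh:
        return pickle.load(fh)


if __name__ == "__main__":
    # Example file name taken from the models/ listing.
    model = load_model("Classification_EN_pipeline.pkl")
    print(type(model))
```

Unpickling only works when the packages used to build the model are importable, so run this inside the project's virtualenv. Only unpickle files you trust, since pickle can execute arbitrary code on load.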
The packages below are installed in the project’s virtualenv.