This repository contains R scripts run on various datasets in RStudio. I hope it helps anyone who wants to build a career in data science or big data.
You will need RStudio to run the scripts, so install it before working through the code below. To download RStudio, click here.
To begin with the basics of data science, go through the Practice (Basics) folder in the repository.
No. | Name | File |
---|---|---|
1. | Basics | practice.r |
2. | Confidence Interval | Confidence_Interval.r |
3. | Probability | Probability.r |
Next is descriptive statistical analysis, often carried out as part of exploratory data analysis (EDA).
No. | Dataset Name | File |
---|---|---|
1. | Carbon Dioxide(CO2) | Descriptive_Stats_CO2.r |
2. | Air Quality | Descriptive_Stats_airquality.r |
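
As a rough illustration of what a descriptive-statistics pass can look like, here is a minimal sketch on the built-in `airquality` dataset. It is not taken from `Descriptive_Stats_airquality.r`; the variable names `missing_per_column`, `ozone_mean`, `ozone_sd`, and `monthly_ozone` are made up for this example, and only base R is assumed.

```r
# Minimal EDA sketch on the built-in airquality dataset (base R only).
data(airquality)

# Structure and per-column summaries
str(airquality)
summary(airquality)

# Count missing values in each column
missing_per_column <- colSums(is.na(airquality))
print(missing_per_column)

# Mean and standard deviation of Ozone, ignoring missing values
ozone_mean <- mean(airquality$Ozone, na.rm = TRUE)
ozone_sd   <- sd(airquality$Ozone, na.rm = TRUE)

# Mean Ozone per month; the formula interface drops rows with NA by default
monthly_ozone <- aggregate(Ozone ~ Month, data = airquality, FUN = mean)
print(monthly_ozone)
```

The same steps should carry over to the `CO2` dataset by swapping the data frame and column names.
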
Now let's go through various algorithms.
No. | Name | File |
---|---|---|
1. | Hypothesis Testing | Hypothesis Testing.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Newspaper Data | NewspaperData.CSV | Newspaper_LinearRegression.r |
2. | Waist Circumference-Adipose Tissue | WC-AT.csv | WC-AT_LinearRegression.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Cars | Cars.csv | Cars_Multi_Linear_Regression.r |
2. | Corolla | Toyota_Corolla.csv | Toyota_Multi_Linear_Regression.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Claimants | Claimants.csv | Logistic Regression.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Titanic | Titanic.csv | Titanic_Association_Rule.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Cat | Cat.jpg | Example1_PCA.r |
2. | University | Universities.csv | Universities_PCA.r |
No | Name | Dataset | Hierarchical Clustering | K-Means Clustering |
---|---|---|---|---|
1. | Universities | Universities.csv | Universities_Heirarchical_Clustering.r | K-Means_Clustering.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Unemployment | Survival_Unemployment.csv | Survival_Unemployment.r |
No | Name | File | Bagging | Bagging and Boosting |
---|---|---|---|---|
1. | Example 1 | DecisionTree.r | Decision_tree_Bagging.r | Decision_Tree_Bagging_Boosting.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Cancer | KNN.csv | K-Nearest_Neighbour.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Iris | Available in R Datasets | random_forest.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Concrete | concrete.csv | Concrete_Neural_Network.r |
No | Name | Dataset | File |
---|---|---|---|
1. | Letter Data | LetterData.csv | LetterData_Support_Vector_Machine.r |
No | Name | Dataset | File |
---|---|---|---|
1. | SMS Spam | sms_spam.csv | Naive_Bayes_Sms_Spam.r |
No | Name | Dataset | Prediction File | File |
---|---|---|---|---|
1. | Amtrak | Amtrak.csv | Predict_new.xlsx | Amtrak_Forecasting.r |
2. | Aviation | Aviation.csv | --- | Aviation_Exponential_Smooting_Forecasting.r |
The analysis requires a list of positive words and a list of negative words.
No | Name | Dataset | File |
---|---|---|---|
1. | Emotion Mining | Amazon Nokia Lumia Reviews.txt | Emotion_Mining_Amazon.r |
2. | Sentiment Analysis | McD_Small.csv | Sentiment Analysis_McD.r |
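
To show why the positive and negative word lists are needed, here is a minimal sketch of lexicon-based scoring in base R. It is only an illustration, not the method used in the scripts above: the tiny word lists, the sample reviews, and the `score_sentiment` helper are all hypothetical.

```r
# Hypothetical mini-lexicons; a real analysis would load the full word lists
positive_words <- c("good", "great", "love", "excellent")
negative_words <- c("bad", "poor", "hate", "terrible")

# Score = (number of positive matches) - (number of negative matches)
score_sentiment <- function(text, pos, neg) {
  # Lower-case, replace everything except letters and spaces, split into words
  cleaned <- gsub("[^a-z ]", " ", tolower(text))
  words <- unlist(strsplit(cleaned, " +"))
  sum(words %in% pos) - sum(words %in% neg)
}

# Made-up sample reviews for illustration only
reviews <- c("Great phone, I love the camera",
             "Terrible battery and poor screen",
             "It arrived on Monday")

scores <- sapply(reviews, score_sentiment,
                 pos = positive_words, neg = negative_words,
                 USE.NAMES = FALSE)
print(scores)
```

A score above zero suggests a positive review, below zero a negative one, and zero means no lexicon word was matched.
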
To extract the reviews of a particular product from Amazon, run the code below in RStudio.
This code works only for products on Amazon.
The code needed varies from site to site.
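
    # Install once, then load. rvest provides read_html(), html_nodes() and html_text().
    install.packages("rvest")
    install.packages("XML")
    install.packages("magrittr")
    library(rvest)
    library(XML)
    library(magrittr)

    # Amazon Reviews #############################
    # Reviews-page URL; the page number is appended after "=" inside the loop
    aurl <- "URL of Product Reviews page"
    amazon_reviews <- NULL
    for (i in 1:10) {
      # Build the URL of page i and download it
      murl <- read_html(paste(aurl, i, sep = "="))
      # Extract the text of every review on that page
      page_reviews <- murl %>%
        html_nodes(".review-text") %>%
        html_text()
      amazon_reviews <- c(amazon_reviews, page_reviews)
    }
    length(amazon_reviews)  # number of reviews collected
    write.table(amazon_reviews, "apple.txt", row.names = FALSE)
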
I ran this code to extract reviews of the Apple MacBook Air; do check it out.
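
The loop above builds each page address with `paste(aurl, i, sep = "=")`, so `aurl` presumably has to end with the name of the page parameter. The sketch below shows the strings this produces. The example URL and the `pageNumber` parameter name are assumptions for illustration, not values taken from the repository.

```r
# Hypothetical placeholder; substitute the real reviews-page URL.
# The trailing "pageNumber" parameter name is an assumption.
aurl <- "https://www.example.com/product-reviews/ITEM?pageNumber"

# paste(..., sep = "=") joins the base URL and each page number with "="
page_urls <- paste(aurl, 1:3, sep = "=")
print(page_urls)
```

Printing the URLs before calling `read_html()` is a cheap way to confirm that the pattern matches what the site expects.
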
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Buyer Ratio | .pptx | BuyerRatio.csv | BuyerRatio.r |
2. | Customer Order Form | .pptx | Customer+OrderForm.csv | Customer+OrderForm.r |
3. | Cutlet Diameter | .pptx | Cutlets.csv | Cutlet_Hyp_Test.r |
4. | Fantaloons | .pptx | Fantaloons.csv | Fantaloons.r |
5. | Lab | .pptx | LabTAT.csv | Lab_Hyp_Anova_test.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Calories Consumed | .txt | Calories_Consumed.csv | Calories_Simple_Linear.r |
2. | Delivery Time Data | .txt | Delivery_Time.csv | Delivery_Simple_Linear_Regression.r |
3. | Employee Data | .txt | Emp_Data.csv | Emp_Simple_Linear.r |
4. | Salary Data | .txt | Salary_Data.csv | Salary_Simple_Linear.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | 50 Startups | .txt | 50_Startups.csv | 50_Startup_Multi_Linear.r |
2. | Computer Data | .txt | Computer_Data.csv | Computer_Data_Multi_Linear.r |
3. | Toyota Corolla | .txt | ToyotaCorolla.csv | ToyotaCorolla_Multi_Linear.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Bank | .txt | Bank-Full.csv | Bank_logistic_Regression.r |
2. | Credit Card | .txt | Creditcard.csv | Creditcard_Logistic_regression.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Books | .txt | Book.csv | Book.r |
2. | Groceries | .txt | Groceries.csv | Groceries.r |
3. | Movies | .txt | My_Movies.csv | My_Movies.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Crime Data | .txt | Crime_Data.csv | Crime_Data_Clustering.r |
2. | East West Airlines | .txt | EastWestAirlines.xlsx | EastWestAirlines_Cluster.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Wine | .txt | Wine.csv | Wine_PCA.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Company Data | .txt | Company_Data.csv | Company_Data.r |
2. | Fraud Check | .txt | Fraud_Check.csv | Fraud_Check.r |
3. | Iris | --- | Available in R Datasets | Iris_ctree.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Company Data | .txt | Company_Data.csv | Company_Data.r |
2. | Fraud Check | .txt | Fraud_Check.csv | Fraud_Check.r |
3. | Iris | --- | Available in R Datasets | Iris.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Glass Data | .txt | Glass.csv | Glass.r |
2. | Zoo | .txt | Zoo.csv | Zoo.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | 50 Startups | .txt | 50_Startups.csv | 50_Startups.r |
2. | Concrete | .txt | Concrete.csv | Concrete.r |
3. | Forest Fires | .txt | Forestfires.csv | Forestfires.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Forest Fires | .txt | Forestfires.csv | Forestfires.r |
2. | Salary Data | .txt | Salary_Data_Train.csv, Salary_Data_Test.csv | SalaryData.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Salary Data | .txt | SalaryData_Train.csv, SalaryData_Test.csv | SalaryData.r |
2. | SMS Data | .txt | Sms_Raw_NB.csv | Sms_Raw_NB.r |
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Airlines Data | .txt | Airlines+Data.xlsx | Airlines+Data.r |
2. | Coca Cola Sales | .txt | CocaCola_Sales_Rawdata.xlsx | CocaCola_Sales_Rawdata.r |
3. | Plastic Sales | .txt | PlasticSales.csv | PlasticSales.r |
This analysis requires lists of positive words, negative words, and stop words.
No. | Name | Problem Statement | Dataset | File |
---|---|---|---|---|
1. | Amazon HP Review | .txt | HP Reviews.txt | Amazon_HP_Reviews.r |
2. | IMDB Paatal Lok WebSeries Review | .txt | Paatal_Lok_Reviews.txt | IMDB_Paatal_Lok.r |
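
Stop words are usually filtered out before any counting or scoring so that frequent filler words do not dominate the result. The base-R sketch below illustrates that step under stated assumptions: the short `stop_words` vector, the sample sentence, and the `remove_stop_words` helper are hypothetical and are not taken from the scripts listed above.

```r
# Hypothetical stop-word list; a real run would load the full stop-words file
stop_words <- c("the", "is", "a", "and", "this", "it")

# Clean a text, split it into words, and drop the stop words
remove_stop_words <- function(text, stops) {
  cleaned <- gsub("[^a-z ]", " ", tolower(text))
  words <- unlist(strsplit(trimws(cleaned), " +"))
  words[!(words %in% stops)]
}

# Made-up sample sentence for illustration only
tokens <- remove_stop_words("This laptop is fast and the screen is bright.",
                            stop_words)
print(tokens)

# Word frequencies of the remaining tokens
print(table(tokens))
```

The remaining tokens can then be matched against the positive and negative word lists.
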
Thank you.