Repo showing some of the data science projects I have done including preprocessing and supervised learning.

LHamnett/Data_Science_and_Machine_Learning_projects

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 

Repository files navigation

Data-Science-Projects

This repo showcases some of my data science projects, covering data preprocessing and supervised learning.

The repo is organised into two folders: preprocessing and supervised learning.

In the preprocessing folder we have:

  1. Create_shuffled_train_val_split_for_image_datasets - Shuffles and combines several image datasets into training and validation splits.
  2. Data_preparation_feature_analysis_star_ratings_sklearn - Uses big data methods (such as Dask and C++) to process a large volume of data as efficiently as possible, clean it, combine it and save it in a suitable format.
  3. Energy_price_data_preprocessing - Gathers data from several Excel files and applies a series of preprocessing steps to produce a single cleaned dataframe.
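
As a rough illustration of the first item, pooling several image datasets and producing a shuffled train/validation split could look like the sketch below. This is not the code from the project folder: the function name `shuffled_train_val_split`, its parameters and the list of file extensions are all assumptions made for the example.

```python
import random
from pathlib import Path


def shuffled_train_val_split(dataset_dirs, val_fraction=0.2, seed=0,
                             extensions=(".jpg", ".jpeg", ".png")):
    """Pool image paths from several folders, shuffle them, and split.

    Hypothetical helper for illustration only; returns (train, val) lists.
    """
    paths = []
    for folder in dataset_dirs:
        # Walk each dataset folder recursively and keep only image files.
        # Sorting first makes the result independent of filesystem order.
        paths.extend(
            p for p in sorted(Path(folder).rglob("*"))
            if p.is_file() and p.suffix.lower() in extensions
        )

    # A seeded generator keeps the split reproducible between runs.
    rng = random.Random(seed)
    rng.shuffle(paths)

    # The first n_val shuffled paths form the validation set.
    n_val = int(len(paths) * val_fraction)
    return paths[n_val:], paths[:n_val]
```

Calling `shuffled_train_val_split(["dataset_a", "dataset_b"], val_fraction=0.2)` (folder names are placeholders) would return two disjoint lists of paths drawn from both datasets.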

In the supervised learning folder we have:

  1. Fraud_classification_sklearn - Compares different classification algorithms for predicting whether an individual is a person of interest in the Enron fraud case.
  2. Employee_leave_logistic_regression_sklearn - A logistic regression using sklearn to predict, for scheduling purposes, whether an employee will take a large amount of time away from the office for a given absence reason.
  3. Road_safety_regression_prediction_ensemble_sklearn - Compares different regression models and ensemble methods for predicting the road safety score (1 star - 5 stars) of a road from a set of road features.
  4. Energy_price_model_selection_optimisation_sklearn_optuna - Loads data, automatically compares a number of different regression models for predicting energy prices, chooses the best model and applies automated hyperparameter tuning, then visualises the results of the tuned model on a test set.
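
The model comparison step described in the last item can be sketched as follows. This is a minimal illustration rather than the project's code: it uses synthetic data from `make_regression` in place of the energy price data, the helper `pick_best_regressor` and the three candidate models are assumptions, and it covers only the comparison and test-set scoring, not the Optuna tuning or the plots.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score, train_test_split


def pick_best_regressor(X, y, candidates, cv=5):
    """Return (best_name, scores), where scores maps name -> mean CV RMSE.

    Hypothetical helper for illustration only.
    """
    scores = {}
    for name, model in candidates.items():
        # sklearn reports negated RMSE so that higher is always better.
        neg_rmse = cross_val_score(
            model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        scores[name] = -neg_rmse.mean()
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Synthetic stand-in for the real dataset (assumption).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate models chosen for the example only.
candidates = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

best_name, cv_rmse = pick_best_regressor(X_train, y_train, candidates)

# Refit the winner on the full training set and score it on held-out data.
best_model = candidates[best_name].fit(X_train, y_train)
test_r2 = best_model.score(X_test, y_test)
```

Selecting on cross-validated error from the training split alone keeps the test set untouched until the final evaluation, which is the same separation the project description implies.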
