My goal for this 100 Days of Code challenge is to learn Data Science and Machine Learning from scratch by working through:
- Learning Fundamentals of Python
- Python Libraries for Data Science
- Data Manipulation and Preprocessing
- Machine Learning Basics
- Advanced Machine Learning Techniques
- Deep Learning and Neural Networks
- Model Evaluation and Deployment
- Data Science Project and Wrap-Up
- Project 1: Bank Management System: Python, OOPS, and MySQL Database
- 25 Days Completion: Successful Completion of 25 Days in 100 Days of Data Science Code
- 50 Days Completion: Successful Completion of 50 Days in 100 Days of Data Science Code

Sun | Mon | Tues | Wed | Thurs | Fri | Sat |
---|---|---|---|---|---|---|
- | - | - | - | - | - | 1 |
2 | 3 | 4 | 5 | 6 | 7 | 8 |
9 | 10 | 11 | 12 | 13 | 14 | 15 |
16 | 17 | 18 ✅ | 19 ✅ | 20 ✅ | 21 ✅ | 22 ✅ |
23 ✅ | 24 ✅ | 25 ✅ | 26 ✅ | 27 ✅ | 28 ✅ | 29 ✅ |
30 ✅ | 31 ✅ | - | - | - | - | - |

Sun | Mon | Tues | Wed | Thurs | Fri | Sat |
---|---|---|---|---|---|---|
- | - | 1 ✅ | 2 ✅ | 3 ✅ | 4 ✅ | 5 ✅ |
6 ✅ | 7 ✅ | 8 ✅ | 9 ✅ | 10 ✅ | 11 ✅ | 12 ✅ |
13 ✅ | 14 ✅ | 15 ✅ | 16 ✅ | 17 ✅ | 18 ✅ | 19 ✅ |
20 ✅ | 21 ✅ | 22 ✅ | 23 ✅ | 24 ✅ | 25 ✅ | 26 ✅ |
27 ✅ | 28 ✅ | 29 ✅ | 30 ✅ | 31 ✅ | - | - |

Sun | Mon | Tues | Wed | Thurs | Fri | Sat |
---|---|---|---|---|---|---|
- | - | - | - | - | 1 ✅ | 2 ✅ |
3 ✅ | 4 ✅ | 5 ✅ | 6 ✅ | 7 ✅ | 8 ✅ | 9 ✅ |
10 ✅ | 11 ✅ | 12 ✅ | 13 ✅ | 14 ✅ | 15 ✅ | 16 ✅ |
17 ✅ | 18 ✅ | 19 ✅ | 20 ✅ | 21 ✅ | 22 ✅ | 23 ✅ |
24 ✅ | 25 ✅ | 26 ✅ | 27 ✅ | 28 ✅ | 29 ✅ | 30 ✅ |

Sun | Mon | Tues | Wed | Thurs | Fri | Sat |
---|---|---|---|---|---|---|
1 ✅ | 2 ✅ | 3 | 4 | 5 | 6 | 7 |
8 | 9 | 10 | 11 | 12 | 13 | 14 |
15 | 16 | 17 | 18 | 19 | 20 | 21 |
22 | 23 | 24 | 25 | 26 | 27 | 28 |
29 | 30 | 31 | - | - | - | - |

- Control flow statements like if-else conditions and loops.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Concept of modules.
- How to import and use built-in modules as well as create your own.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Python's built-in data structures such as lists, tuples, dictionaries, and sets.
- Also, learn about indexing, slicing, and manipulating these data structures.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Read from and write to files in Python.
- Learn about exception handling and how to handle errors using try-except blocks.
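
As a rough illustration of the two topics above, the sketch below writes a file, reads it back, and handles two expected errors with try-except blocks. The helper names (`write_lines`, `read_lines`, `safe_divide`) and the file names are made up for this example and are not taken from the repository.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write each string in `lines` to `path`, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


def read_lines(path):
    """Return the lines of `path`, or an empty list if the file is missing."""
    try:
        with open(path, encoding="utf-8") as handle:
            return [line.rstrip("\n") for line in handle]
    except FileNotFoundError:
        # Catch the specific error we expect rather than using a bare `except`.
        return []


def safe_divide(a, b):
    """Return a / b, or None when b is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


# Work in a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    notes_path = os.path.join(tmp_dir, "notes.txt")
    write_lines(notes_path, ["first line", "second line"])
    saved = read_lines(notes_path)
    missing = read_lines(os.path.join(tmp_dir, "absent.txt"))
```

Using `with open(...)` closes the file even if an exception is raised inside the block, which is why no explicit `close()` call appears.
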
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Class Declaration
- Object Instantiation
- Constructor and Destructor
- Built-in Class Attributes and Functions
- Instance, Class and Static Variables and Functions.
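
The items above can be seen together in one small class. This is only a sketch: the `Account` class, its attributes, and the sample owners are hypothetical and chosen for illustration.

```python
class Account:
    """A toy bank account used to contrast class-level and instance-level state."""

    bank_name = "Demo Bank"  # class variable: shared by every instance
    _count = 0               # class variable: number of accounts created

    def __init__(self, owner, balance=0.0):
        # The constructor runs on instantiation and sets instance variables.
        self.owner = owner
        self.balance = balance
        Account._count += 1

    def deposit(self, amount):
        """Instance method: works on one particular account."""
        if not Account.is_valid_amount(amount):
            raise ValueError("amount must be positive")
        self.balance += amount
        return self.balance

    @classmethod
    def count(cls):
        """Class method: works on state shared by the whole class."""
        return cls._count

    @staticmethod
    def is_valid_amount(amount):
        """Static method: a helper that needs neither `self` nor `cls`."""
        return amount > 0


# Object instantiation
first = Account("Asha", 100.0)
second = Account("Ravi")

# Built-in class attributes
class_name = Account.__name__
class_doc = Account.__doc__
```

A destructor (`__del__`) is left out here because when it runs depends on garbage collection, which makes it awkward to demonstrate reliably.
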
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Data Abstraction
- Encapsulation
- Inheritance
- Polymorphism.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Higher Order Functions
- List Comprehensions
- Regular Expressions (RegEx)
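
A compact sketch of the three topics above, using only the standard library. The sample data and the simplified email pattern are assumptions made for the example; the pattern is not a full email validator.

```python
import re
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# List comprehension: square only the even numbers.
squares_of_evens = [n * n for n in numbers if n % 2 == 0]

# Higher-order functions: map() and reduce() take a function as an argument.
doubled = list(map(lambda n: n * 2, numbers))
total = reduce(lambda acc, n: acc + n, numbers, 0)


def apply_twice(func, value):
    """A user-defined higher-order function: apply `func` two times."""
    return func(func(value))


# Regular expression: pull email-like tokens out of free text.
text = "Contact: alice@example.com, bob@example.org"
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
emails = EMAIL_PATTERN.findall(text)
```
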
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Setting Up MySQL Connection
- Executing SQL Queries.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Database Setup
- Python Environment Setup
- Database Connectivity
- Create Basic Classes
- Customer Management.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Account Management (Create Account, List Account Details)
- Basic Error Handling (Apply Validations on Input Values)
- Testing and Debugging (Checking Input Value Validations).
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Transfer Operation
- Final Testing and Documentation
- Clean Up and Deployment.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Introduction to NumPy
- Installing NumPy
- Creating NumPy arrays
- Array indexing and slicing
- Array reshaping and resizing
- Stacking and splitting arrays.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Element-wise Operations
- Aggregation Functions
- Linear Algebra with NumPy.
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Descriptive statistics
- Random number generation
- Sorting and searching arrays
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Introduction to Pandas
- Install Pandas
- Types of Data Structures : Series, DataFrames
- Importing and Exporting DataFrames
- DataFrame Functions
- Accessing DataFrames : Indexing, Slicing, loc[], iloc[].
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Advanced Indexing and Selection - (Label-based indexing, boolean indexing, and advanced slicing)
- Combining DataFrames - (Concatenation, merging, and joining techniques)
- Data Manipulation
- Advanced Data Manipulation - (reshaping data, pivoting, and melting)
- Data Aggregation and Grouping - (groupby() and other aggregation Functions)
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Basic Data Cleaning and Pre-Processing:
- Removing Duplicates
- Fixing Wrong Data
- Cleaning Data of Wrong Format
- Cleaning Empty Cells
- dropna(), fillna()
- drop_duplicates()
- Data Transformation - ( apply() and map() )
- Working with Text Data - Functions of str attribute
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Feature Engineering:
- Data Normalization
- Data Scaling
- Data Standardization
- Time Series Analysis and Resampling:
- Working with datetime data
- Date offsets
- Resampling time series data
- Datetime index
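
For the scaling items in the list above, here is a minimal pure-Python sketch of min-max normalization and z-score standardization. In practice this is usually done with Pandas or scikit-learn; the function names and the sample values below are illustrative assumptions.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


sample = [10, 20, 30, 40, 50]
normalized = min_max_scale(sample)
standardized = standardize(sample)
```

Normalization keeps the shape of the distribution but bounds it, while standardization centers it; which one fits depends on the model being trained.
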
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Matplotlib:
- Installation of Matplotlib library
- Import Matplotlib library
- Matplotlib Pyplot:
- Plotting x and y points
- Plotting without line
- Matplotlib Markers (Types, Color, Size)
- Matplotlib Line (LineStyle, Line colors, line width)
- Single Plot with multiple lines
- Matplotlib Labels and Title (Create Label, Create Title, Set font properties to Title and Label, Title Position)
- Adding Grid Lines (Line Properties of grid)
- Matplotlib Bars:
- Vertical Bars
- Horizontal Bars
- Bar colors
- Bar width
- Bar height
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Subplots:
- subplot() function
- Title for each subplot
- Super title of Plot
- Matplotlib Scatter Plot:
- Create Scatter Plots
- Compare Plots
- Color each dot
- ColorMap for dots
- Combine Color, Size and Alpha values
- Matplotlib Histograms:
- Create Histogram
- Matplotlib Pie Charts:
- Create Pie Chart
- Labels
- startangle
- Explode
- Shadow
- Colors
- Legend
- Header
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Seaborn:
- Installation of Seaborn
- Import Seaborn library
- Different types of plots:
- Relational Plots
- Categorical Plots
- Distribution Plots
- Regression Plots
- Categorical Plots:
- Bar Plot
- Count Plot
- Box Plot
- Violinplot
- Stripplot
- Swarmplot
- Factorplot
- Distribution Plots:
- Histogram
- Distplot
- Jointplot
- Pairplot
- Rugplot
- KDE Plot
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Customizing Seaborn Plots:
- Changing Figure Aesthetics
- Removal of Spines
- Changing the Figure size
- Scaling the plots
- Setting the Style Temporarily
- Color Palette - (Diverging, Sequential, Default color palette)
-
Multiple Plots with Seaborn:
- Using Matplotlib - (add_axes(), subplot(), subplot2grid() functions)
- Using Seaborn - (FacetGrid() method, PairGrid() method)
-
Relational Plot Types:
- relplot()
- Scatter Plot
- Line Plot
-
Regression Plot Types:
- lmplot
- regplot
-
Matrix Plots:
- HeatMap
- Clustermap
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Introduction
- Features
- Applications
- Identifiers:
- Keywords
- Variables and Constants
- Operators in python
- Data types in python
- String data type and operations
- List data type and operations
- Tuple data type and operations
- Set data type and operations
- Dictionary data type and operations
- Control Statements in python:
- Decision making
- looping statements
- looping control statements
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Introduction
- Installation
- Import
- Create arrays in python
- Array creation using NumPy Functions
- zeros
- ones
- arange
- linspace
- eye
- identity
- fromiter
- Accessing array elements
- Indexing and Slicing
- Random number Generation
- rand()
- random()
- ranf()
- randint()
- randn()
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Introduction - Install, Import
- Data Structures:
- Series
- DataFrames
- DataFrames
- Importing and Exporting
- Functions - columns, describe(), info(), head(), tail(), isna()
- Accessing DataFrames - loc[], iloc[]
- Basic Data Cleaning:
- Empty Cells
- Wrong Format Data
- Fixing Wrong Data
- Removing Duplicates
- Apply filters
- apply()
- map() - Using Dictionary, Series, Function for mapping
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Artificial Intelligence:
- Machine Learning:
- Difference between Artificial Intelligence and Machine Learning
- Applications of Machine Learning
- Limitations of Machine Learning
- Types of Machine Learning
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
- Comparisons between all types
GitHub Repository: Source Code
LinkedIn post: Daily Update
- 1. Data Preprocessing:
- Data Cleaning
- Feature Selection/Extraction
- Normalization/Scaling
- Encoding Categorical Variables
- Splitting Data
- 2. Model Training:
- Selecting a Model
- Initializing Parameters
- Training Loop
- Gradient Descent (for Optimization)
- Hyperparameter Tuning
- 3. Model Evaluation:
- Metrics
- Cross-Validation
- Confusion Matrix
- ROC and AUC
- Overfitting and Underfitting
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Cross-Validation
- Evaluation Metrics:
- Accuracy
- Precision
- Recall
- F1-Score
- Area Under Curve (AUC) and Receiver Operating Characteristic (ROC)
- Confusion Matrix
- Overfitting and Underfitting Detection:
- Overfitting
- Underfitting
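
The evaluation metrics listed above all derive from the four confusion-matrix counts. The sketch below computes them by hand for a binary problem; the function names and the toy labels are assumptions for illustration, and a library such as scikit-learn would normally be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, made up for the example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

A large gap between these scores on training data and on held-out data is one common sign of overfitting, while low scores on both suggest underfitting.
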
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Underfitting:
- Choosing a more complex model
- Adding more features
- Fine-tuning hyperparameters
- Overfitting:
- Collect more data
- Feature selection
- Cross-validation
- Regularization techniques
- Early stopping
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Linear Regression Introduction
- Simple Linear Regression:
- Assumptions of Simple LR
- Equation of Simple LR
- Applications of Linear Regression
- Working of Linear Regression
- Finding goodness of fit
- Examples of Linear Regression
- Implementation of Simple Linear Regression
- Real-world Application: Salary Prediction
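
To make the equation of simple linear regression (y = intercept + slope * x) and the goodness-of-fit idea concrete, here is a least-squares sketch in plain Python. The experience and salary figures are invented for the example and do not come from the dataset used in the notebook.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Goodness of fit: share of the variance in y explained by the line."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: years of experience vs. salary (in thousands).
years = [1, 2, 3, 4, 5]
salary = [30, 35, 40, 45, 50]

slope, intercept = fit_simple_linear_regression(years, salary)
fit_quality = r_squared(years, salary, slope, intercept)
predicted_for_six_years = intercept + slope * 6
```
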
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Multiple Linear Regression (MLR):
- Key points of MLR
- Equation of MLR
- Assumptions of MLR
- Implementation of MLR using Python
- Real-world Application: Student Performance Analysis
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Classification
-
Types of Learners:
- Lazy Learners: Store the training dataset first and wait until the test dataset arrives before classifying.
- Eager Learners: Build a classification model from the training dataset before receiving the test dataset.
-
Types of Classification Algorithms:
- Logistic Regression
- Decision Trees
- Random Forest
- Support Vector Machines (SVM)
- K-Nearest Neighbors (KNN)
- Naive Bayes
- Neural Networks
-
Terminologies in Classification:
- Features and Labels
- Training and Testing Data
- Confusion Matrix
- Precision, Recall, F1-Score
- ROC and AUC Curve
-
Types of Classification:
- Binary Classification: Two classes (e.g., Yes/No)
- Multiclass Classification: Multiple distinct classes (e.g., Cat/Dog/Horse)
-
Model Evaluation Techniques for Classification (used to assess how well a model fits):
- Accuracy
- Precision and Recall
- F1-Score
- ROC Curve and AUC
- Confusion Matrix
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Logistic Regression:
- Logistic Function (Sigmoid Function)
- Assumptions of Logistic Regression
- Types of Logistic Regression:
- Binary / Binomial
- Multinomial
- Ordinal
- Terminologies involved in Logistic Regression
- Implementation of Logistic Regression
- Difference between Linear Regression and Logistic Regression
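
The logistic (sigmoid) function is what turns a linear score into a probability between 0 and 1. The sketch below shows that step and the thresholding that follows it; the weights, bias, and function names are hypothetical, and no training is performed here.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^-z), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(features, weights, bias, threshold=0.5):
    """Binary decision: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


# Hypothetical, already-fitted parameters.
weights = [0.5, -0.25]
bias = 0.0
probability = predict_probability([1.0, 2.0], weights, bias)
label = predict_label([1.0, 2.0], weights, bias)
```

This is also the main contrast with linear regression: the linear part is the same, but the output is squashed into a probability and then thresholded into a class.
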
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Decision Tree:
- Components of a Decision Tree
- Root Node
- Internal Nodes
- Leaf Nodes
- Attribute Selection Measures (ASM):
- Entropy
- Information Gain
- Gini Index
- How Decision Trees Work
- Advantages of Decision Trees
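
The attribute selection measures named above can be computed directly from class counts. This is a small sketch with made-up labels; the function names are chosen for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def gini_index(labels):
    """Gini impurity: chance of mislabeling a randomly drawn sample."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2
                     for count in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of its child splits."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy node with a perfectly separating split.
parent = ["yes"] * 5 + ["no"] * 5
left, right = ["yes"] * 5, ["no"] * 5
gain = information_gain(parent, [left, right])
```

A decision tree picks, at each node, the split with the highest information gain (or the lowest weighted Gini index), which is how the root and internal nodes are chosen.
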
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Decision Tree Implementation Setup:
- Data Pre-processing
- Model Training
- Predicting the Results
- Model Evaluation Techniques
- Examples for Decision Tree Implementation:
- IRIS Flower Classification
- Red Wine Quality Prediction
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Ensemble Methods:
- Bagging
- Boosting
- Stacking
- Advantages of Ensemble Methods
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Gradient Boosting in Machine Learning:
- What is Gradient Boosting
- Key Components of Gradient Boosting
- How Gradient Boosting Works
- Benefits of Gradient Boosting
GitHub Repository: Source Code
LinkedIn post: Daily Update
- AdaBoost and XGBoost:
- AdaBoost (Adaptive Boosting)
- XGBoost (Extreme Gradient Boosting)
- Advantages of AdaBoost and XGBoost
- Applications
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Random Forests:
- What are Random Forests
- Key Components of Random Forests
- How Random Forests Work
- Benefits of Random Forests
- Real-world Applications of Random Forests
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Random Forest Implementation:
- Step-by-Step Approach
- IRIS Flower Prediction
- Red Wine Quality Prediction
- Hyperparameter Tuning:
- Unlocking Model Potential
- GridSearchCV
- RandomizedSearchCV
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Decision Tree in Action
- Enchantment of Random Forests
- Social Media Ads prediction
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Introduction to SVM
- Terminologies used in SVM
- Advantages of SVM
- Limitations of SVM
GitHub Repository: Source Code
LinkedIn post: Daily Update
- SVM Implementation:
- Linear SVM (Social Media Ads) : Kaggle Notebook
- Non-Linear SVM (IRIS Flower Prediction) : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- SVM Regression Implementation:
- Salary Prediction : Kaggle Notebook
- Boston Housing Price Prediction : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- KNN Introduction
- Distance Metrics:
- Euclidean Distance
- Manhattan Distance
- Minkowski Distance
- How KNN works
- How to choose value of 'K'
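
The three distance metrics above are closely related: Minkowski distance with p = 1 is Manhattan distance and with p = 2 is Euclidean distance. The sketch below shows that, followed by a bare-bones majority-vote KNN. The sample points and labels are invented, and ties are not handled specially.

```python
from collections import Counter


def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan_distance(a, b):
    return minkowski_distance(a, b, 1)


def euclidean_distance(a, b):
    return minkowski_distance(a, b, 2)


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: euclidean_distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Two well-separated toy groups.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
near_origin = knn_predict(points, labels, (0.5, 0.5), k=3)
near_far_group = knn_predict(points, labels, (5.5, 5.5), k=3)
```

An odd `k` is often chosen for two-class problems to reduce tied votes; the best value is usually found by comparing validation scores for several candidates.
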
GitHub Repository: Source Code
LinkedIn post: Daily Update
- KNN Classification:
- IRIS Flower Prediction : Kaggle Notebook
- Mushroom Classification : Kaggle Notebook
- KNN Regression:
- Employee Salary Prediction : Kaggle Notebook
- Student Performance Prediction : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- KNN Regression:
- House Price Prediction : Kaggle Notebook
- KNN Classification:
- BMI Classification : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- What is AI
- What is ML
- Machine Learning
- Model Evaluation Techniques in ML
- Classification: Accuracy Score, Confusion Matrix, Classification Report
- Regression: Mean Absolute Error, Mean Squared Error, Root Mean Squared Error
- Exploratory Data Analysis (EDA)
- Handling Outliers
- Removing Outliers
- Transforming Values
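
For the regression metrics in the list above, a minimal sketch of how each is computed from true and predicted values. The numbers are made up for the example.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences; penalizes large errors more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the MSE, expressed in the same units as the target."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


actual = [3, 5, 7]
predicted = [2, 5, 9]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
```
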
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Resource Allocation in 5G Network Service Project:
- Data Pre-Processing
- Implementation:
- Polynomial Regression
- SVM Regression
- KNN Regression
- Model Evaluation:
- Mean Absolute Error
- Mean Squared Error
- Root Mean Squared Error
- Kaggle Notebook : Link to Notebook
- Comparison of Model Performances (Multiple Bar Charts)
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Gender Classification Project:
- Data Pre-Processing
- Implementation:
- Logistic Regression
- Decision Tree
- Random Forest
- SVM Classification
- KNN Classification
- Model Evaluation:
- Accuracy Score
- Confusion Matrix
- Classification Report
- Kaggle Notebook : Link to Notebook
- Comparison of Model Performances (Bar Chart)
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Introduction to Cross-Validation
- What is Cross Validation
- Why is Cross Validation Important
- Advantages of Cross Validation
- Limitations of Cross Validation
- Types of Cross-Validation:
- Leave-One-Out Cross-Validation (LOOCV)
- Leave-P-Out Cross Validation (LPOCV)
- K-Fold Cross-Validation
- Stratified K-Fold Cross-Validation
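
The core of K-Fold cross-validation is just a way of splitting sample indices so that every sample is used for testing exactly once. Below is a sketch without shuffling or stratification; the generator name is an assumption made for this example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
```

Setting `k` equal to the number of samples gives Leave-One-Out cross-validation. Stratified K-Fold additionally keeps the class proportions roughly equal in every fold, which this sketch does not attempt.
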
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
K-Fold Cross-Validation : Kaggle Notebook
-
Stratified K-Fold Cross-Validation : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
EDA on Walmart Dataset : Kaggle Notebook
-
EDA on FIFA 19 Dataset : Kaggle Notebook
-
EDA on Restaurant Dataset : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- The Curse of Dimensionality
- The Importance of Dimensionality Reduction
- Dimensionality Reduction Techniques:
- Feature Selection
- Feature Extraction
- Dimension Reduction
- Advantages of Dimensionality Reduction
- Limitations of Dimensionality Reduction
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Some common terms used in PCA algorithm
- Uses of PCA
- Advantages of Principal Component Analysis
- Limitations of Principal Component Analysis
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Step 1 : Covariance Matrix Computation
- Step 2 : Compute Eigenvalues and Eigenvectors of Covariance Matrix to Identify Principal Components
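
The two steps above can be followed by hand for two features, where the eigenvalues of the symmetric 2x2 covariance matrix have a closed form. This sketch is limited to that two-feature case and uses invented data; for real datasets a routine such as `numpy.linalg.eigh` would normally be used.

```python
import math


def covariance_matrix_2d(xs, ys):
    """Step 1: sample covariance matrix of two features."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equal-length features with >= 2 samples")
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    var_x = sum((x - x_mean) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - y_mean) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - x_mean) * (y - y_mean)
                 for x, y in zip(xs, ys)) / (n - 1)
    return [[var_x, cov_xy], [cov_xy, var_y]]


def _unit(vector):
    length = math.hypot(*vector)
    return tuple(component / length for component in vector)


def eigen_symmetric_2x2(matrix):
    """Step 2: eigenvalues (largest first) and unit eigenvectors."""
    a, b = matrix[0]
    _, c = matrix[1]
    centre = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    values = (centre + spread, centre - spread)
    if b == 0:
        # Already diagonal: the axes themselves are the eigenvectors.
        vectors = (((1.0, 0.0), (0.0, 1.0)) if a >= c
                   else ((0.0, 1.0), (1.0, 0.0)))
    else:
        vectors = tuple(_unit((b, lam - a)) for lam in values)
    return values, vectors


# Two perfectly correlated toy features.
feature_x = [2, 4, 6, 8]
feature_y = [1, 3, 5, 7]
cov = covariance_matrix_2d(feature_x, feature_y)
eigenvalues, eigenvectors = eigen_symmetric_2x2(cov)
```

The eigenvector with the largest eigenvalue is the first principal component; here all variance lies along the diagonal direction, so the second eigenvalue is zero up to rounding.
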
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Pre-processed Data
- Calculated Covariance Matrix
- Eigenvalues and Eigenvectors
- Sorted Eigenvalues
- Select Principal Components
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Data Preparation
- Importing Scikit-learn
- Standardization
- PCA Implementation
- Explained Variance
- Dimensionality Reduction
- Visualization
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- What is Feature Selection?
- Why is Feature Selection Necessary?
- Techniques in Feature Selection
- Univariate feature selection
- Feature importance from tree-based models
- Recursive Feature Elimination (RFE)
- L1-based feature selection
- Correlation-based feature selection
- Steps in Feature Selection:
- Data Pre-Processing
- Feature Scoring
- Feature Selection
- Advantages of Feature Selection:
- Improved model performance
- Faster training and prediction
- Enhanced model interpretability
- Reduced risk of overfitting
- Easier visualization of data
- Limitations of Feature Selection:
- It may result in information loss.
- It can be challenging to decide which features to select.
- Some methods might not work well for all types of data.
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Introduction to Filter Methods
-
Steps in Filter Methods:
- Data Pre-Processing
- Feature Scoring
- Feature Selection
-
Common Techniques in Filter Methods:
- Correlation-based Feature Selection
- Information Gain
- Chi-square Test
- Fisher's Score
- Missing Value Ratio
-
Advantages of Filter Methods:
- Simplicity
- Speed
- Independence
-
Limitations of Filter Methods:
- Independence
- Suboptimal Results
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Introduction to Wrapper Methods
-
Steps in Wrapper Methods:
- Subset Selection
- Model Building
- Model Evaluation
-
Common Techniques in Wrapper Methods:
- Forward Selection Method
- Backward Elimination Method
- Exhaustive Feature Selection Method
- Recursive Feature Elimination (RFE) Method
-
Advantages of Wrapper Methods:
- Optimal Features
- Model-Specific
-
Limitations of Wrapper Methods:
- Computationally Intensive
- Model Dependency
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Introduction to Embedded Methods
-
Steps in Embedded Methods:
- Feature Selection While Building
- Model Training
- Feature Importance Assessment
-
Common Techniques in Embedded Methods:
- Random Forest Importance
- Lasso (L1 Regularization)
- Ridge (L2 Regularization)
- Elastic Net (L1 and L2 Regularization)
-
Advantages of Embedded Methods:
- Feature Relevance
- Model Compatibility
-
Limitations of Embedded Methods:
- Model Dependency
- May Miss Correlations
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Key EDA Operations Performed:
- Data Loading
- Data Exploration
- Data Visualization
- Statistical Insights
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Key SVR Operations Performed:
- Data Loading
- Data Pre-processing
- Feature Selection
- Splitting Data
- SVR Model Building
- Model Training
- Model Evaluation
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Key Operations Performed:
- Data Loading
- Data Pre-processing
- Collaborative Filtering
- Movie Recommendations
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Key Operations Performed:
- Data Loading
- Data Exploration
- Linear Regression Implementation
- Model Evaluation
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Key Operations Performed:
- Data Loading
- Data Exploration
- Linear Regression Implementation
- Model Evaluation
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Key Operations Performed:
- Data Loading
- Data Exploration
- Data Visualization
- Insights Extraction
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Key Operations Performed:
- Data Loading
- Data Exploration
- Data Visualization
- Insights Extraction
GitHub Repository: Source Code
LinkedIn post: Daily Update
-
Key Operations Performed:
- Data Loading
- Data Exploration
- Data Visualization
- Insights Extraction
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Intro to Clustering
- Types of Clustering:
- Partitioning Clustering
- Density-Based Clustering
- Distribution Model-Based Clustering
- Hierarchical Clustering
- Fuzzy Clustering
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Commonly Used Clustering Algorithms:
- K-means Algorithm
- Hierarchical Clustering
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
- Agglomerative Clustering
- Gaussian Mixture Model (GMM)
- Applications of Clustering:
- Customer Segmentation
- Image Compression
- Anomaly Detection
- Document Classification
- Advantages of Clustering:
- Pattern Discovery
- Data Reduction
- Scalability
- Interpretability
GitHub Repository: Source Code
LinkedIn post: Daily Update
- K-means Clustering:
- Initialization
- Assignment
- Update Centroids
- Repeat
- Customer Clustering : Kaggle Notebook
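
The four K-means steps listed above (initialization, assignment, centroid update, repeat) map almost line for line onto code. The sketch below is plain Python with a deterministic initialization (the first `k` points), which is an assumption made so the result is reproducible; real implementations usually pick random or k-means++ starting centroids. The points are invented and are not the customer data from the notebook.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iterations=100):
    """Return (centroids, assignments) after running Lloyd's algorithm."""
    # Initialization: use the first k points as starting centroids.
    centroids = [points[i] for i in range(k)]
    assignments = None
    for _ in range(max_iterations):
        # Assignment: attach every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Repeat until the assignments stop changing.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return centroids, assignments


toy_points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0),
              (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = k_means(toy_points, k=2)
```
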
GitHub Repository: Source Code
LinkedIn post: Daily Update
- K-means Clustering:
- Initialization
- Assignment
- Update Centroids
- Repeat
- Credit Card Clustering : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Visualize Clustering Exercises : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- Online Retail Clustering : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
- What Can We Achieve with Hierarchical Clustering:
- Hierarchical Insights
- Data Exploration
- Decision Support
GitHub Repository: Source Code
LinkedIn post: Daily Update