GitHub - patilsanket48/Telco-Churn-Analysis-Classification-Pipeline-: Predict behavior to retain customers. We can analyze all relevant customer data and develop focused customer retention programs

Open the file: Telco Churn Analysis Final

Problem Statement

Churn is one of the biggest problems in the telecom industry. Research has shown that the average monthly churn rate among the top 4 wireless carriers in the US is 1.9% to 2%.

"Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs." [IBM Sample Data Sets]

Data information

Each row represents a customer; each column contains a customer attribute, as described in the column metadata.

The data set includes information about:

Customers who left within the last month – the column is called Churn
Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
Demographic info about customers – gender, age range, and if they have partners and dependents

Initial plan for data exploration

Get an overview of the data
Check the datatypes of the features
Check for null values and handle them
Feature engineering (one-hot encoding, skewness check)
Data exploration (visual analysis)
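The exploration steps above can be sketched roughly as follows. This is a minimal illustration with pandas, not the notebook's actual code: the tiny in-memory table, its column names, the blank `TotalCharges` entry, and the choice to drop the affected row are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical miniature sample; the real data set has many more rows and columns.
df = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 22],
    "Contract": ["Month-to-month", "One year", "Month-to-month",
                 "Two year", "Month-to-month", "One year"],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 99.65, 89.10],
    # Assumed to arrive as text, with one blank entry.
    "TotalCharges": ["29.85", "1889.5", " ", "1840.75", "820.5", "1949.4"],
    "Churn": ["No", "No", "Yes", "No", "Yes", "No"],
})

# Check the datatypes of the features.
print(df.dtypes)

# Convert the text column to numbers; unparseable entries become NaN.
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")

# Check for null values and handle them (dropping rows is one possible action).
null_counts = df.isnull().sum()
df = df.dropna(subset=["TotalCharges"])

# Skewness check on the numeric features.
skewness = df[["tenure", "MonthlyCharges", "TotalCharges"]].skew()
print(skewness)

# Encode the target as 0/1 and one-hot encode the categorical feature.
df["Churn"] = (df["Churn"] == "Yes").astype(int)
encoded = pd.get_dummies(df, columns=["Contract"], drop_first=True)
print(encoded.head())
```

Dropping rows is only one way to handle missing values; imputing them may be preferable when many rows are affected.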

Plan for ML model execution (KNN, Logistic Regression, RandomForest)

Hyperparameter tuning
ML algorithm fitting, prediction
Result assessment: confusion matrix, ROC, and precision/recall curves
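As a rough sketch of this plan, the snippet below tunes a Random Forest with scikit-learn's `GridSearchCV`, fits it, predicts, and computes the assessment metrics. The synthetic data from `make_classification`, the parameter grid, and the choice of recall as the tuning score are assumptions for illustration, not the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, precision_recall_curve, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the churn data (assumption).
X, y = make_classification(n_samples=400, n_features=10,
                           weights=[0.75, 0.25], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Hyperparameter tuning: a small, arbitrary grid scored on recall.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, scoring="recall", cv=3)

# Fitting and prediction with the best parameters found.
search.fit(X_train, y_train)
y_pred = search.predict(X_test)
y_score = search.predict_proba(X_test)[:, 1]

# Result assessment.
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
auc = roc_auc_score(y_test, y_score)   # area under the ROC curve
precision, recall, thresholds = precision_recall_curve(y_test, y_score)

print(search.best_params_)
print(cm)
print(f"ROC AUC: {auc:.3f}")
```

The same pattern should apply to KNN and Logistic Regression by swapping the estimator and its parameter grid.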

Summary & Key Findings

So far, the Random Forest model has performed well.

The KNN and Random Forest models predicted similar results.
The Random Forest model performed well on the positive class, which is the main focus of this problem, with a very high recall (sensitivity).
Our model has 88% recall. In problems like this, a good recall value is expected.
Precision and recall trade off against each other: they generally cannot both increase at the same time, so you need to find a point where both are acceptably high.
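To make the terms concrete, here is a small pure-Python sketch of how precision, recall (sensitivity), and specificity are computed from predicted scores, and how moving the decision threshold trades one off against another. The labels and scores are made up for illustration and are not taken from the notebook.

```python
def metrics_at_threshold(labels, scores, threshold):
    """Return (precision, recall, specificity) when score >= threshold means 'churn'."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    precision = tp / (tp + fp)    # of predicted churners, how many really churned
    recall = tp / (tp + fn)       # sensitivity: of real churners, how many were caught
    specificity = tn / (tn + fp)  # of real non-churners, how many were left alone
    return precision, recall, specificity


# Hypothetical labels (1 = churned) and model scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]

strict = metrics_at_threshold(labels, scores, 0.5)
lenient = metrics_at_threshold(labels, scores, 0.25)

print("threshold 0.50:", strict)
print("threshold 0.25:", lenient)
```

With these made-up numbers, lowering the threshold from 0.5 to 0.25 raises recall from 0.75 to 1.0 while precision falls from 0.75 to about 0.67, which is the trade-off described above.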

Overall, 86% is a very good result.

Suggestions for next steps

Some features may be more important than others, and dropping the less useful ones could produce a better model.
Backward feature selection could give better results here.
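One way to try backward feature selection is scikit-learn's `SequentialFeatureSelector` with `direction="backward"`. The sketch below is an assumption about how this next step might look, not something the notebook contains: the synthetic data, the Logistic Regression estimator, the recall scoring, and the target of four features are all placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in: 8 features, only 3 of which carry signal (assumption).
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)

# Start from all features and repeatedly drop the one whose removal
# hurts the cross-validated recall the least.
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=4,
    direction="backward",
    scoring="recall",
    cv=3,
)
selector.fit(X, y)

kept = selector.get_support()      # boolean mask over the original columns
X_reduced = selector.transform(X)  # data restricted to the kept features

print("kept feature indices:", [i for i, keep in enumerate(kept) if keep])
print("reduced shape:", X_reduced.shape)
```

Whether the reduced feature set actually improves the model would still need to be checked against a held-out test set.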
