Excel, Power BI, Azure ML
The motivation behind this project was to help a telecommunications company reduce customer churn and increase revenue. Customer churn is a major challenge for telecommunications companies as it can significantly impact their bottom line. By predicting which customers are at high risk of churning, the company can develop targeted retention programs to reduce churn and improve customer loyalty.
Additionally, this project aimed to demonstrate the power of data analysis, machine learning, and visualization tools in improving business outcomes. By using tools such as Excel, Power BI, and Azure ML, we were able to extract insights from the data, build a predictive model, and provide actionable recommendations to the company.
Overall, the motivation behind this project was to help a company address a critical business challenge while also showcasing the potential of data-driven solutions in improving business outcomes.
About Data
The Telco Customer Churn dataset is a collection of customer information from IBM Sample Data Sets. Its purpose is to support predicting customer behavior so that customers can be retained, which involves analyzing relevant customer data and developing customer retention programs.
The dataset contains information about customers who have left within the last month (represented by the Churn column), as well as the services each customer has signed up for, including phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies. Additionally, the dataset includes customer account information such as how long they have been a customer, their contract type, payment method, paperless billing, monthly charges, and total charges. Finally, demographic information about customers is also included, such as their gender, age range, and whether or not they have partners and dependents.
In this project, I successfully predicted customer churn in a telecommunications company using customer data. I started by cleaning and analyzing the data using Excel, identifying patterns and trends in customer behavior and churn. I then created a dashboard using Power BI to provide visualizations and insights into customer behavior and churn patterns.
Next, I used Azure ML to build a machine learning model to predict which customers were most likely to churn. The model was trained on historical data and used various algorithms to determine the best approach for prediction. I fine-tuned the model by testing and evaluating its performance on a validation set, and achieved a high accuracy rate.
Finally, I used the model to identify customers who were at high risk of churning, allowing the telecommunications company to develop targeted retention programs to reduce customer churn. The project helped the company to retain more customers and increase their revenue.
Overall, this project demonstrated the power of data analysis, machine learning, and visualization tools in predicting customer churn and improving business outcomes in the telecommunications industry.
Step 1 - Data cleaning and transformation actions in Excel
Recoded the values in the "MultipleLines" column to "SingleLine", "MultipleLine", and "No Phone", which made the "PhoneService" column redundant so it could be removed.
Changed the data type of the "SeniorCitizen" column to "Text" and recoded the values to "Yes" and "No".
Created a new column called "FamilyStatus" based on "Partner" and "Dependents" columns. The values are "Single" if the customer has no partner and no dependents and "Multi Member" otherwise.
Combined "StreamingMovies" and "StreamingTV" columns into a new column called "No_StreamingService" to show how many streaming services a customer uses.
Merged "OnlineSecurity", "OnlineBackup", "DeviceProtection" & "TechSupport" columns into a new column called "No_OtherService" to show how many other supportive services a customer uses.
Filled the missing values in the "TotalCharges" column with "0", based on the observation that these particular customers have a "0" value in their "tenure" column.
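The cleaning itself was done in Excel. For readers who prefer code, the same rules can be sketched in Python as a function over one customer record. This is only an illustration: the function name clean_row is hypothetical, and the raw values it expects ("No phone service" in "MultipleLines", 0/1 in "SeniorCitizen", a blank "TotalCharges") are assumptions about the raw file that should be checked against the actual data. The sketch also leaves the original service columns in place; dropping them after counting is a separate choice.

```python
def clean_row(row):
    """Apply the Step 1 rules to one customer record (a dict of strings)."""
    out = dict(row)

    # Assumed raw values: "No", "Yes", "No phone service".
    lines_map = {
        "No": "SingleLine",
        "Yes": "MultipleLine",
        "No phone service": "No Phone",
    }
    out["MultipleLines"] = lines_map[row["MultipleLines"]]
    out.pop("PhoneService", None)  # now redundant

    # Assumed raw encoding: 1 = senior citizen, 0 = not.
    out["SeniorCitizen"] = "Yes" if str(row["SeniorCitizen"]) == "1" else "No"

    # "Single" only when there is neither a partner nor dependents.
    single = row["Partner"] == "No" and row["Dependents"] == "No"
    out["FamilyStatus"] = "Single" if single else "Multi Member"

    # Count how many services in each group are marked "Yes".
    streaming = ("StreamingTV", "StreamingMovies")
    other = ("OnlineSecurity", "OnlineBackup", "DeviceProtection", "TechSupport")
    out["No_StreamingService"] = sum(row[col] == "Yes" for col in streaming)
    out["No_OtherService"] = sum(row[col] == "Yes" for col in other)

    # A blank TotalCharges is filled with 0.
    total = str(row["TotalCharges"]).strip()
    out["TotalCharges"] = float(total) if total else 0.0
    return out


# Made-up record, used only to show the call.
sample = {
    "PhoneService": "No", "MultipleLines": "No phone service",
    "SeniorCitizen": "0", "Partner": "Yes", "Dependents": "No",
    "StreamingTV": "Yes", "StreamingMovies": "No",
    "OnlineSecurity": "No", "OnlineBackup": "Yes",
    "DeviceProtection": "No", "TechSupport": "Yes",
    "tenure": "0", "TotalCharges": " ",
}
cleaned = clean_row(sample)
```

Keeping the rules in one function makes it easy to rerun them if the source file is refreshed, which is harder to guarantee with manual spreadsheet edits.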
Step 2 - Creating a Dashboard in Power BI to Gain Insights
After completing the data cleaning and transformation actions in Step 1, the next step was to create a dashboard in Power BI to explore the customer dataset and gain insights. The dashboard provided an overview of the customer dataset, including key metrics such as customer demographics, products and services used, and customer churn rate. The dashboard included visualizations such as bar charts, pie charts, and tables to help users easily identify trends and patterns in the data.
Step 3 - Building & Training Models in Azure ML
In Step 3, we used Azure ML to build a predictive model for identifying customers likely to churn at a telecommunications company. Our target variable was "Churn", and we included all other columns as features. We used 70% of our data for modeling and applied two classification models: a Two-Class Boosted Decision Tree and a Two-Class Logistic Regression model. Both models predict whether a customer is likely to churn, which gives the company a basis for taking proactive measures to retain those customers.
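The 70% split was done inside Azure ML. A minimal Python sketch of the same idea is shown here. It assumes a random shuffle with a fixed seed and that the remaining 30% is held out for evaluation, neither of which is stated above. The function name split_rows and its parameters are hypothetical.

```python
import random


def split_rows(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows reproducibly and split them into two parts."""
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: ten row identifiers instead of real customer records.
train_rows, held_out_rows = split_rows(range(10))
```

Fixing the seed matters when two models are compared, since both should be trained and evaluated on exactly the same partitions.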
Here is an overall view of the Power BI dashboard from Step 2. If you would like to explore the dashboard and interact with the visualizations yourself, you can access it via my GitHub.
Some examples of insights from the dashboard:
Month-to-month contracts have a high churn rate
Customers on one-year and two-year contracts are more likely to stay with the company
Paperless billing is the most common choice
There may be a correlation between paperless billing and churn
Electronic check is the most common payment method
Electronic checks have a higher churn rate compared to other payment methods
Bank transfers and credit cards are popular and associated with lower churn
Mailed checks are less popular but have a low churn rate
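Insights like these come down to comparing the churn rate across the categories of one column. The dashboard does this visually. A small Python sketch of the underlying calculation is given here, with a hypothetical helper churn_rate_by and made-up records that are not the project's data.

```python
from collections import defaultdict


def churn_rate_by(rows, column):
    """Return {category: share of customers with Churn == "Yes"}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        if row["Churn"] == "Yes":
            churned[row[column]] += 1
    return {key: churned[key] / totals[key] for key in totals}


# Made-up records, used only to show the call.
toy_rows = [
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
]
rates = churn_rate_by(toy_rows, "Contract")
```

The same helper works for any categorical column, for example "PaymentMethod" or "PaperlessBilling", by changing the second argument.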
Based on the insights gained from the dashboard, there are several recommendations that can be implemented to decrease the churn rate. Firstly, promoting the benefits of adding dependents or partners to the service could be a strategy to increase customer retention. Additionally, offering special offers for long-term contracts could incentivize customers to commit to the service for a longer period. Bundling services together could also provide added value to customers and make it more difficult for them to switch to competitors. Lastly, implementing loyalty rewards, such as discounts or free services, could further strengthen customer loyalty and reduce the likelihood of churn.
The ML models' outcomes can be observed below:
Although there is still room for improvement, the Logistic Regression model outperformed the Boosted Decision Tree model, as shown by its higher accuracy, precision, F1 score, and AUC. However, the Logistic Regression model had lower recall. It also did better on true negatives and false positives.
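For reference, accuracy, precision, recall, and F1 all follow from the four confusion-matrix counts. The sketch below uses the standard definitions. The counts passed in are invented for illustration and are not this project's results. AUC is not included because it depends on the predicted scores rather than on a single confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # flagged churners that really churned
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # actual churners that were flagged
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented counts, used only to show the call.
example = classification_metrics(tp=30, fp=10, tn=50, fn=10)
```

The definitions also show why a model can lead on precision while trailing on recall: fewer false positives raise precision, while missed churners (false negatives) lower recall. For a retention program, recall may deserve extra weight, since a missed churner is a lost customer.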