In the digital age, every business relies on technology, yet ironically it has become harder for companies to find customers and to keep them. Customers' needs, expectations, behaviours, values, and even tastes are evolving ever faster. Competition is also stronger in every sector, because creating a company has become easier and new opportunities appear every day.
Nowadays, loyalty is earned by a company that not only responds to customer needs but anticipates them. Loyal customers are more valuable than new customers, who cost more to acquire and do not spend as much. According to the Pareto principle, about 80% of a company's business is driven by 20% of its customers. Loyal customers matter because they form the customer base that allows a business to grow and maintain high profits.
In 2021, the average churn rate in telecom businesses was 22%. It has been increasing every year, mainly because of strong competition. The telecommunications sector has a large number and variety of service providers, and customers can switch from one to another easily. Retaining customers individually is difficult because most telecom companies have too many customers to devote more than the minimum necessary time to each one.
Our agency has been hired by a telecom company as an AI expert to build a model that will enable them to identify customers with a high churn probability, based on two years of historical customer data. They would like to put in place retention strategies aimed at the customers most likely to churn, so that they can concentrate their efforts and minimize their expenses.
Our goal is to enable the telecom company to predict, with good precision, which customers have a high probability of churn, so that it can activate the right retention strategies. Customer churn is one of the most important metrics a growing business needs to evaluate.
Definition
"Customer churn is the percentage of customers that stopped using your company's product or service during a certain time frame" (HubSpot).
Churn prediction, powered by data and machine learning, helps the company identify customers who are likely to churn and understand the reasons behind it. It also enables the relevant teams to develop tactics to improve customer retention.
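To make the churn definition quoted above concrete, here is a minimal sketch of the calculation in Python. It assumes that the percentage is taken over the number of customers at the start of the time frame, which the quoted definition does not specify. The function name churn_rate and the figures 1000 and 220 are illustrative only and are not taken from the project dataset.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return churn as a percentage of the customers present at the start of the period.

    Assumption: the denominator is the customer count at the start of the period.
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return 100.0 * customers_lost / customers_at_start


# Illustrative figures only: 220 of 1000 customers left during the period.
example_rate = churn_rate(customers_at_start=1000, customers_lost=220)
print(f"Churn rate: {example_rate:.1f}%")
```

Other conventions exist, for example dividing by the average number of customers over the period, so the chosen denominator should be stated whenever a churn figure is reported.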
We have a database with demographic and account-related information about the customers and the services they subscribed to with the telecommunications company. The problem has customer input variables and one output variable, which is our target: "Customer churn". Because the target is known for each customer, we will use supervised learning models to make predictions.
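As a rough sketch of what this supervised setup means in practice, the snippet below separates a few hypothetical customer records into input variables and a binary target. The field names tenure_months and monthly_charges, the column name "Churn", its "Yes"/"No" encoding, and the helper split_features_target are all assumptions made for illustration; the real dataset may use different names and values.

```python
# Hypothetical records: the values and most field names are made up for illustration.
records = [
    {"customerID": "A-001", "tenure_months": 2, "monthly_charges": 70.5, "Churn": "Yes"},
    {"customerID": "A-002", "tenure_months": 34, "monthly_charges": 56.9, "Churn": "No"},
    {"customerID": "A-003", "tenure_months": 45, "monthly_charges": 42.3, "Churn": "No"},
]


def split_features_target(rows, target="Churn", id_column="customerID"):
    """Split rows into input variables (features) and a 0/1 target.

    The identifier column is dropped because it carries no predictive information.
    Assumption: the target is stored as the strings "Yes" and "No".
    """
    features, labels = [], []
    for row in rows:
        features.append({k: v for k, v in row.items() if k not in (target, id_column)})
        labels.append(1 if row[target] == "Yes" else 0)
    return features, labels


X, y = split_features_target(records)
print(X[0])  # input variables of the first customer
print(y)     # 1 = churned, 0 = stayed
```

A supervised model is then fitted on pairs of input variables and known outcomes, and used to estimate the churn probability of customers whose outcome is not yet known.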
First, we will clean and transform this dataset. We will then explore the data by visualizing it, to understand which information will be most useful for our analysis and whether some adjustments are needed (outliers, categories, etc.). Then we will explore the following prediction models:
We will focus our report and our analysis on the following questions:
From the dataset's descriptive information, we already know that there are 7,043 rows, each corresponding to a customerID, and 21 columns (features). The columns describe customer account information, demographic information, the services each customer subscribed to, and customer churn relative to the previous month: