Cognizant Speeds Customer Churn Analysis for Telecom Service Provider

“MATLAB is one of the differentiators for us on client engagements. No matter what industry our client is in, and no matter what data they ask us to analyze—text, audio, images, or video—MATLAB enables us to provide clear results faster.”

Challenge

Analyze gigabytes of customer data and develop models to predict telecom customer churn and identify its principal drivers

Solution

Use MATLAB to preprocess customer data and build churn prediction models using neural networks, decision trees, and logistic regression

Results

  • Algorithm development time cut from weeks to days
  • Project completed in less than half the allotted time
  • Hundreds of gigabytes of data analyzed
Communication tower.

In the telecom industry, retaining existing customers costs less than attracting new customers. As a result, telecom companies focus on reducing the customer churn rate—the number of customers switching to another provider over a specific period.

Cognizant was tasked by a major telecom company with analyzing business data on customers and developing data analytics to predict churn, determine its key drivers, and identify customers at highest risk of switching to another provider. By using MATLAB®, Cognizant analysts were able to rapidly implement and evaluate several churn-analysis approaches and to deliver their results ahead of schedule.

“MATLAB provided a rich set of functions and capabilities that enabled us to maintain our focus on the deliverable rather than writing underlying code,” says Dr. G Subrahmanya VRK Rao, senior director of Technology at Cognizant. “With MATLAB, we tested several approaches and quickly identified the one that best met the needs of our customer.”

Challenge

Established methods for analyzing churn include machine learning techniques such as neural networks, decision trees, and logistic regression. Cognizant analysts needed to evaluate each of these approaches to find the one that produced the most accurate predictions. These predictions would be based on 500 gigabytes of sample customer data, which included records of customer classification tiers, products, and delivery states (on-time, delayed, and advanced). The analysts needed to preprocess and analyze this large data set. They wanted to generate plots and other visualizations that they could use to evaluate algorithm output and communicate the results to their customer.

Solution

Cognizant used MATLAB, Statistics and Machine Learning Toolbox™, and Deep Learning Toolbox™ to accelerate the analysis of large data sets and accurately predict customer churn.

Working in MATLAB, Dr. Rao and his team imported the raw data and cleaned it by developing preprocessing algorithms that used built-in MATLAB functions to remove records with null, negative, and missing values.
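
The cleaning rule described above (drop any record with a null, negative, or missing value) can be sketched in a few lines. This is an illustrative Python sketch, not the team's MATLAB code; the field names (`customer_id`, `resolution_hours`, `monthly_spend`) and the sample records are hypothetical.

```python
import math

def is_valid(record, numeric_fields):
    """Return True if every numeric field is present, non-null, and non-negative."""
    for field in numeric_fields:
        value = record.get(field)  # a missing key yields None
        if value is None:
            return False
        if isinstance(value, float) and math.isnan(value):
            return False
        if value < 0:
            return False
    return True

def clean_records(records, numeric_fields):
    """Keep only the records that pass the validity check."""
    return [r for r in records if is_valid(r, numeric_fields)]

# Hypothetical sample records, invented for illustration.
raw = [
    {"customer_id": 1, "resolution_hours": 4.0, "monthly_spend": 30.0},
    {"customer_id": 2, "resolution_hours": None, "monthly_spend": 45.0},   # null value
    {"customer_id": 3, "resolution_hours": -2.0, "monthly_spend": 20.0},   # negative value
    {"customer_id": 4, "monthly_spend": 50.0},                             # missing field
    {"customer_id": 5, "resolution_hours": float("nan"), "monthly_spend": 10.0},
    {"customer_id": 6, "resolution_hours": 1.5, "monthly_spend": 0.0},
]

NUMERIC_FIELDS = ["resolution_hours", "monthly_spend"]
cleaned = clean_records(raw, NUMERIC_FIELDS)
kept_ids = [r["customer_id"] for r in cleaned]
print(kept_ids)
```

Only records 1 and 6 survive in this toy input; a zero value is kept because it is neither null nor negative.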

Using Database Toolbox™, they stored the preprocessed data in a MySQL database and performed join queries to prepare data stored in multiple tables for analysis. The team accelerated this computationally intensive task by using Parallel Computing Toolbox™ to execute it in parallel on a multicore processor.
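
To illustrate the kind of join query involved, the sketch below combines two tables into one analysis-ready row per customer. It is written in Python with an in-memory SQLite database standing in for MySQL and Database Toolbox; the table names, column names, and rows are hypothetical, and the parallel execution mentioned above is not shown.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL store.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, tier TEXT);
    CREATE TABLE deliveries (customer_id INTEGER, product TEXT, delivery_state TEXT);
""")

# Hypothetical rows, invented for illustration.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "gold"), (2, "silver"), (3, "bronze")],
)
conn.executemany(
    "INSERT INTO deliveries VALUES (?, ?, ?)",
    [
        (1, "broadband", "on-time"),
        (2, "broadband", "delayed"),
        (2, "voice", "delayed"),
        (3, "voice", "advanced"),
    ],
)

# Join the two tables and count delayed deliveries per customer.
rows = conn.execute("""
    SELECT c.customer_id,
           c.tier,
           SUM(CASE WHEN d.delivery_state = 'delayed' THEN 1 ELSE 0 END) AS delayed_count
    FROM customers AS c
    JOIN deliveries AS d ON d.customer_id = c.customer_id
    GROUP BY c.customer_id, c.tier
    ORDER BY c.customer_id
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

Aggregating inside the query keeps the per-customer feature (here, a count of delayed deliveries) in a single row, which is the shape most modeling functions expect.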

The team then explored the data, generating scatter plots in MATLAB to identify outliers and to visualize the average time taken to detect and resolve service issues, a key customer metric for churn.

The team used Deep Learning Toolbox to create, train, and simulate a neural network for churn prediction.

Using Statistics and Machine Learning Toolbox, the team developed one churn prediction algorithm based on decision trees and another based on multivariate logistic regression.
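
For readers unfamiliar with multivariate logistic regression, the following is a minimal from-scratch sketch in Python. It is not the team's implementation, which relied on Statistics and Machine Learning Toolbox; the two features (counts of delayed responses and delayed deliveries), the toy data, and the learning rate and epoch count are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, learning_rate=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(X, y):
            z = sum(w * x for w, x in zip(weights, row)) + bias
            error = sigmoid(z) - label  # derivative of log loss w.r.t. z
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias

def predict(weights, bias, row, threshold=0.5):
    """Return 1 (churn) if the predicted probability reaches the threshold, else 0."""
    z = sum(w * x for w, x in zip(weights, row)) + bias
    return 1 if sigmoid(z) >= threshold else 0

# Hypothetical toy data: [delayed_responses, delayed_deliveries] per customer.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [4, 3], [5, 2], [3, 4], [6, 5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = customer churned

weights, bias = fit_logistic(X, y)
predicted = [predict(weights, bias, row) for row in X]
print(predicted)
```

The fitted weights indicate how strongly each feature pushes the predicted churn probability up or down, which is one reason logistic regression is convenient when the goal includes identifying drivers as well as predicting outcomes.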

Their analysis identified three key drivers of churn: delayed responses, delayed delivery of services, and problems with quality of service. After comparing results from the neural network, the decision tree, and logistic regression, the team found that logistic regression produced the most accurate churn predictions.
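
One simple way to compare candidate models is to score each one's predictions against the same held-out labels. The Python sketch below assumes plain accuracy as the metric, which the source does not specify; the labels and predictions are invented toy values and say nothing about the accuracy figures the team actually observed.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def best_model(y_true, predictions_by_model):
    """Score every model and return the top scorer's name with all scores."""
    scores = {
        name: accuracy(y_true, preds)
        for name, preds in predictions_by_model.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Invented toy labels and predictions, for illustration only.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
predictions = {
    "neural_network":      [1, 0, 1, 1, 0, 0, 0, 0],
    "decision_tree":       [1, 1, 0, 1, 0, 0, 1, 0],
    "logistic_regression": [1, 0, 0, 1, 0, 1, 1, 0],
}

winner, scores = best_model(y_true, predictions)
print(winner, scores)
```

Because churners are usually a minority of customers, a real comparison would likely also look at metrics such as precision and recall rather than accuracy alone.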

Results

  • Algorithm development time cut from weeks to days. “The rich set of functions and capabilities in MATLAB, Deep Learning Toolbox, and Statistics and Machine Learning Toolbox enables us to try different ideas very fast,” says Dr. Rao. “We could develop and test algorithms in MATLAB within a matter of days.”

  • Project completed in less than half the allotted time. “With MATLAB, we could concentrate on higher-level analysis,” says Dr. Rao. “This, combined with the time savings we achieved by preprocessing the data in MATLAB, enabled us to provide results to our customer well before half of the project’s stipulated time had elapsed.”

  • Hundreds of gigabytes of data analyzed. “Our telecom client had a massive accumulation of data going back more than 10 years,” says Dr. Rao. “MATLAB made it easy to clean, visualize, and analyze more than 500 gigabytes of data with no additional software or add-ons.”