Churn Prediction

Customer churn prediction is as important as ever. Dive into the data, get to know your customers in detail, and turn that knowledge into action.



We all know how important it is to keep the current customer base happy.

We all know the saying: "It costs more to acquire new customers than to retain existing ones."

So the question becomes: What are you doing about it for your business?

Here's some help for you:

  • Get a view and understanding of the state of your business and customers
  • Understand the tools you need or can choose from, including Machine Learning algorithms and tools
  • Use your established model(s) to power new initiatives on your various channels




Disclaimer: The focus here is on customer retention only. Keep that in mind when reading the following.


The first step is to identify the features that have significant importance to customer retention, and they have to be specific features. Examples can be:

  • Was a product out of stock (and possibly substituted), or was it cancelled for some reason?
  • How many orders are placed over time, say every month: is the number increasing or decreasing?
  • Did the customer get a discount on the order or not?
  • Maybe order size matters?
  • Location, such as zip code or state, can potentially also influence the outcome.



There are also "soft" features that are equally important but hard to measure, so we'll leave those out of this example. It's also important to distinguish between different types of churn:


  • Voluntary Churn: Subscription services where the user decides to cancel a running subscription
  • Non-contractual Churn: Users leaving a potential order before finalizing their transaction, also known as an abandoned cart


In our data-set we have both web orders and subscription orders. It could also be interesting to look at different fulfilment types: if it's retail, you might have both pick-up in store and multiple delivery options. Again, we will keep it simple in our case.


The first challenge is to get this data and to gather it all in one place. Our approach is to get as much as possible, but remember that some data is better than nothing and some visibility is better than none. For a web-oriented setup you'll get a lot of information simply by looking at the orders placed, so start there. Abandoned cart information should also be possible to gather, depending on the system you're using.



Now you have the data. The next step is to visualize it to get a first overview, so you can begin the process of identifying which data, which "features", will be relevant for your Machine Learning model. A simple way to get started is to draw some graphs from the data you have.

Below are some examples from our data-set:


[Figure: Churn Prediction - Distribution of Credit Cards]

[Figure: Churn Prediction - Orders and Unique Users over time]

You should spend some time diving into the base data and graphing it out. It will give you valuable information, and it is information you will need when you start identifying the features.
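
As a starting point for that kind of overview, here is a minimal sketch in plain Python that tallies the two views shown in the figures: orders per credit card type, and orders and unique users per month. The sample rows, the tuple layout and the function names are assumptions made for illustration; they are not taken from our actual data-set.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical sample orders: (user_id, order_date, card_type)
orders = [
    ("u1", date(2023, 1, 5), "VISA"),
    ("u2", date(2023, 1, 9), "MC"),
    ("u1", date(2023, 2, 2), "VISA"),
    ("u3", date(2023, 2, 14), "VISA"),
    ("u1", date(2023, 2, 20), "VISA"),
]

def card_distribution(orders):
    """Count orders per credit card type."""
    return Counter(card for _, _, card in orders)

def monthly_overview(orders):
    """Return {(year, month): (order_count, unique_user_count)}."""
    counts = Counter()
    users = defaultdict(set)
    for user_id, order_date, _ in orders:
        key = (order_date.year, order_date.month)
        counts[key] += 1
        users[key].add(user_id)
    return {key: (counts[key], len(users[key])) for key in sorted(counts)}

if __name__ == "__main__":
    # Crude text "bar chart" of card types
    for card, n in card_distribution(orders).most_common():
        print(f"{card:5} {'#' * n} ({n})")
    # Orders and unique users per month
    for (year, month), (n_orders, n_users) in monthly_overview(orders).items():
        print(f"{year}-{month:02d}: {n_orders} orders, {n_users} unique users")
```

Once the numbers look sensible in text form, the same tallies can be fed into whatever charting tool you prefer.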


Let's look into how many customers have churned so far.

So how do we really know? There are a lot of factors behind why you could lose customers, and again: how do you measure that? Some ideas below:

  1. Could we measure if the number of orders per time-interval is declining?
  2. Could we measure if the $ amount per time-interval is declining?
  3. Could we check whether the user stops placing orders per time-interval and simply disappears?

Let's keep it simple and look at how many orders users place per month and whether they disappear. Later on we'll look at declining orders, as we might be able to win those users back before they're really gone. As a sample, let's look at the raw data. What attributes do we have?

  • Id
  • Order status: Cancelled, Settled, Pending
  • Order Date
  • Order Type: Pickup, Delivery
  • Discount: Yes/No
  • CreditCard type: MC, VISA etc
  • State
  • Order modification: Was the order modified after it was placed? Yes/No. For example: order line count changed, product substituted.
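
With attributes like these, the simple "orders per month, and do they disappear" definition can be sketched as follows. This is only an illustration under stated assumptions: a customer identifier is assumed to be available on each order, cancelled orders are ignored, and a user counts as churned after three months without an order. The threshold, the sample rows and the function names are all hypothetical choices.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows using a subset of the attributes listed above:
# (user_id, order_status, order_date)
orders = [
    ("u1", "Settled", date(2023, 1, 10)),
    ("u1", "Settled", date(2023, 2, 11)),
    ("u1", "Settled", date(2023, 6, 3)),
    ("u2", "Settled", date(2023, 1, 20)),
    ("u2", "Cancelled", date(2023, 5, 2)),
    ("u3", "Pending", date(2023, 5, 28)),
]

def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month

def orders_per_month(orders):
    """Return {user_id: {(year, month): count}}, ignoring cancelled orders."""
    result = defaultdict(lambda: defaultdict(int))
    for user_id, status, order_date in orders:
        if status != "Cancelled":
            result[user_id][(order_date.year, order_date.month)] += 1
    return {user: dict(months) for user, months in result.items()}

def churned_users(orders, as_of, inactive_months=3):
    """Users whose last non-cancelled order is at least
    `inactive_months` months before the month of `as_of`."""
    last_seen = {}
    for user_id, status, order_date in orders:
        if status == "Cancelled":
            continue
        idx = month_index(order_date)
        last_seen[user_id] = max(last_seen.get(user_id, idx), idx)
    cutoff = month_index(as_of) - inactive_months
    return sorted(user for user, idx in last_seen.items() if idx <= cutoff)

if __name__ == "__main__":
    print(orders_per_month(orders))
    print(churned_users(orders, as_of=date(2023, 6, 30)))
```

The resulting churned/not-churned flag is the kind of label a classification model can then be trained against; the inactivity window should be tuned to how often your customers normally order.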




In the area of customer churn and customer retention there are a number of "classical" algorithms and models that are often used. The most popular are SVM (Support Vector Machines), Random Forest, Artificial Neural Networks, Logistic Regression and the Naive Bayes algorithm. There are a lot of very good resources on these models if you need the details (see references further down). Luckily for us, we do not need all the details.

There are a lot of tools and products to help you do this - let's list a few:

For this example, we'll go with Azure ML Studio.


Among the most common classification models used in predicting churn, we find Support Vector Machines (SVM), Logistic Regression, Random Forest, Artificial Neural Networks, and the Naive Bayes algorithm. Using machine learning does require some insight and understanding, but there is a lot of good help and there are a lot of good tools out there. Most products nowadays have these features built in, which makes them easy to apply even without understanding the core of it. Our perspective is that as long as it can be verified that the result is better with the ML/AI applied than without it, it's good. So, as with everything, make sure that you can measure, so that you can verify what you're doing.


Again, in this case we'll keep it simple and use Azure ML Studio to build our model and expose a web service we can use. Fast-forwarding, this is what we end up with:


[Figure: Churn Prediction Experiment 1]


Establishing the right model takes some time. You will have to try out various algorithms, add and remove features from your data-set, and see which combination gives the best result in terms of accuracy and precision. The various tools will help you, and there's a lot of guidance on which algorithms work best for various cases.
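
To make "accuracy and precision" concrete, here is a small sketch in plain Python that computes both from a list of actual outcomes and a list of predictions (1 = churned, 0 = stayed). The sample labels are made up for illustration, and the "always predict no churn" baseline is our own addition to show why a model should be compared against doing nothing.

```python
def confusion_counts(actual, predicted):
    """Return (true_pos, false_pos, true_neg, false_neg) for binary labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif not p and a:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def accuracy(actual, predicted):
    """Share of all predictions that were correct."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0

def precision(actual, predicted):
    """Share of predicted churners who actually churned."""
    tp, fp, _, _ = confusion_counts(actual, predicted)
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical labels: 1 = churned, 0 = stayed
actual = [1, 0, 0, 1, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 0, 0, 1]
baseline = [0] * len(actual)  # "nobody churns" baseline

if __name__ == "__main__":
    print("model   :", accuracy(actual, predicted), precision(actual, predicted))
    print("baseline:", accuracy(actual, baseline), precision(actual, baseline))
```

Note that the baseline can reach a decent accuracy simply because most customers do not churn, which is why accuracy alone is rarely enough to judge a churn model.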



The various tools, including ML Studio, can expose your model as a web service. This means that you can start using it on your various channels and systems, and in this case we can start using it to predict whether a customer is likely to churn or not. We can use this in a lot of different scenarios. Examples could be:

  • Send marketing emails or texts to customers who are likely to churn
  • Show custom offers or single-use-coupon codes when they use your web site
  • Mail specific offers to their home address 
  • etc.
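
A channel that wants to act on the prediction could call the exposed web service roughly like this. Treat it as a sketch only: the URL, the API key, the JSON payload shape, the bearer-token header and the 0.5 threshold are all placeholders we made up for illustration. The real request and response schema is defined by the service you deploy, so check what your tool generates.

```python
import json
import urllib.request

# Hypothetical values: replace with the endpoint and key of your own service.
SCORING_URL = "https://example.com/churn/score"
API_KEY = "replace-me"

def build_scoring_request(features, url=SCORING_URL, api_key=API_KEY):
    """Build a POST request carrying one customer's features as JSON.

    The {"features": ...} payload shape is an assumption, not a real schema.
    """
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def score_customer(features):
    """Call the service and return its parsed JSON response (network call)."""
    request = build_scoring_request(features)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))

def choose_action(churn_probability, threshold=0.5):
    """Pick a retention action from a predicted churn probability."""
    if churn_probability >= threshold:
        return "send_retention_offer"
    return "no_action"
```

For example, `choose_action(0.8)` returns `"send_retention_offer"`, which a website or email system could map to one of the initiatives listed above.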

Look out for a video that demonstrates the use of Sitecore CDP to present personalized messages and offers based on this churn model and the exposed web service. It will be posted here soon.


Please reach out to us with any comments and questions on churn prediction - we'll be happy to help your business.