Let’s start with a joke

There’s a tongue-in-cheek joke in the data science community that goes as follows: “A data scientist is someone who spends 80% of their time ‘cleaning’ the data, 5% of their time building models/analyzing the data, and 15% of their time bitching about cleaning the data.”

What is a data scientist?

There’s some truth to the joke above, and beyond that, you’re looking at a wonderfully interdisciplinary array of skills: machine learning, statistics, programming, and data analysis/SQL.

I’ve found the best data scientists are also grounded in business reality, not theoretical abstractions. They’ve come to appreciate that data science serves at the pleasure of the business, not the other way around (unless you are TikTok and the entire user experience is ostensibly powered by data science and machine-learning recommendations).

The predictive pillar

If data science were the Acropolis of Athens, one of the most important pillars would certainly be machine learning. Machine-learning algorithms use statistics to find patterns in massive amounts of data. Once a pattern is identified, it can then be exploited for business value, often via the inherent power of prediction.

Anticipation is the ultimate power. Losers react; leaders anticipate.

Trying to predict churn

To run a successful software business today, you have to do three fundamental things: 1) build a product that delivers value to customers, 2) acquire customers efficiently, and 3) help customers achieve said value so they don’t leave. I’ve spent the majority of my career focused on number 3.

When a software customer leaves, it’s commonly referred to as “churn.” In a business model built on recurring revenue—as most SaaS companies are—the compounding effect of churn can be downright devastating. For example, losing 3% of your revenue each month might not sound like a lot, but compounded over a year, 3% monthly churn equates to roughly 31% of your revenue walking out the door!
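If you want to sanity-check that compounding for yourself, the back-of-the-envelope arithmetic fits in a few lines of Python:

```python
# Back-of-the-envelope: compounding effect of 3% monthly churn over a year.
monthly_churn = 0.03
retained_after_year = (1 - monthly_churn) ** 12   # ~0.694 of revenue retained
annualized_churn = 1 - retained_after_year        # ~0.306 of revenue lost
print(f"Annualized churn at 3% monthly: {annualized_churn:.1%}")  # -> 30.6%
```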

Therefore, figuring out how to predict churn could be a lucrative activity, and that is where machine learning comes in. . .

We are going to skip deep learning & neural networks

At this juncture, we could follow the White Rabbit down the machine learning rabbit hole into a Neural Network Wonderland—but that is an article for another day. Suffice to say there are some INSANE things happening in the space right now, e.g. AlphaZero achieved superhuman chess abilities in just 4 hours, trained entirely via “self-play.” In other words: the totality of human skill at chess was surpassed by a machine learning algorithm before lunchtime. But I digress. . .

Building the model

We didn’t have a data science team in-house, so we partnered with an external one. They helped us 1) clean up the data/tables in Redshift, 2) craft the queries in Mode (using SQL and Python), and 3) ultimately generate models that we evaluated for accuracy.

One of the early frontrunners was XGBoost, a gradient-boosting algorithm with a strong track record in data science competitions, which correctly predicted churn 70% of the time:

Source: Stratus Data
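For a feel of what that modeling step looks like, here is a minimal sketch of training and evaluating an XGBoost churn classifier. It assumes a prepared, account-level feature table with a binary churned label; the file and column names are hypothetical, not our production schema.

```python
# Minimal sketch: train and evaluate an XGBoost churn classifier.
# Assumes an account-level feature table with a binary "churned" label;
# file and column names are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

accounts = pd.read_csv("accounts_with_features.csv")  # hypothetical export from Mode
X = accounts.drop(columns=["account_id", "churned"])
y = accounts["churned"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```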

However, given the competitive nature of our approach, XGBoost soon had a contender: Random Forest. The two stacked up closely, albeit with different virtues:

Source: Stratus Data
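A head-to-head comparison can be as simple as cross-validating both candidates on the same features. A rough sketch, reusing the `X` and `y` from the previous snippet:

```python
# Rough sketch: compare XGBoost and Random Forest via 5-fold cross-validation.
# `X` and `y` are the feature matrix and churn labels from the previous snippet.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

candidates = {
    "XGBoost": XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1),
    "Random Forest": RandomForestClassifier(n_estimators=300, random_state=42),
}

for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy {scores.mean():.2%} (std {scores.std():.2%})")
```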

Under the hood

Within any machine learning model, there is an analysis of “features” underway—things that may or may not carry predictive importance for the outcome you care about, e.g. predicting churn. The features that were most predictive of churn included: 1) product usage, 2) changes to product usage, 3) Alexa Rank, and 4) # of Twitter followers.
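For tree-based models like Random Forest, pulling those feature importances out is straightforward. A sketch, again reusing the `X` and `y` from the earlier snippets (the feature names themselves live in the columns of `X`):

```python
# Sketch: inspect which features a fitted Random Forest leans on most.
# `X` and `y` come from the earlier snippets.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=300, random_state=42).fit(X, y)

importances = (
    pd.Series(rf.feature_importances_, index=X.columns)
    .sort_values(ascending=False)
)
print(importances.head(10))  # top 10 most predictive features
```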

Conclusion

We ultimately decided that Random Forest was the best model for our use case, and the easier of the two to operationalize in Salesforce: I run the Python notebook in Mode, then upload a CSV to Salesforce. This creates a Churn Prediction Score for each customer and enables my team to prioritize and take corrective action.
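The last mile is unglamorous: score every account and hand Salesforce a CSV. Roughly, reusing the objects from the earlier sketches (the output column names are hypothetical, not the actual Salesforce field names):

```python
# Sketch: score each account and write a CSV for upload to Salesforce.
# `rf`, `accounts`, and `X` come from the earlier snippets; column names
# are hypothetical.
import pandas as pd

scores = pd.DataFrame({
    "account_id": accounts["account_id"],
    "churn_prediction_score": rf.predict_proba(X)[:, 1],  # probability of churn
})
scores.to_csv("churn_prediction_scores.csv", index=False)
```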

Yin and yang

Of course, no machine learning model is perfect, so our team decided to treat the Churn Prediction Score as our “quantitative safety net”: the model nominates at-risk accounts, and a human is then free to agree or disagree. After all, human intuition is powerful.

Last week I ran the model and it nominated 10 net-new at-risk accounts for Q3 that no human had previously flagged. After some investigation, the team determined 7 of the 10 accounts—or 70%—were indeed at-risk.

I’m glad we have a safety net.