
You Often Want Your Models To Be Wrong

A few weeks back, I was in a discussion with some analytics executives when one gentleman made a point that sounded odd at first. He suggested that in many cases we actually want the predictions we make with our models to be wrong, not right. After hearing his explanation, however, I totally agreed with him. This post will explain his point and illustrate it with some examples.

Many Events We Model Target Desired Outcomes

We often think about models in the context of trying to achieve something positive, with accurate predictions helping us do so. For example, we build a pricing model to set a more profitable price. Or, we build a sales forecast to plan a budget and logistics for the coming quarter. Or, we predict who is most likely to respond to an offer so that we can maximize the sales from that offer.

In each of these cases, we benefit from our models being as accurate as possible, and we want our actual results to match our predictions closely. To help with that, we’ll track results in detail and match them against the model predictions with great care. When reality does not play out as predicted, pressure is put on the data science team and the business to do a better job. But what about cases where a predicted outcome is bad?

Bad Outcomes Are Often Modeled Too

There are also situations where we predict or forecast something that has negative connotations for our business or community. We might project how many diabetes deaths will occur in the coming year. Or, we might predict how many flawed products will roll off the assembly line today. Or, we might predict how many insurance claims to expect due to a major storm.

In each of these cases, we also use historical data to come up with predictions that tell us what to expect. The key here is that those predictions will come true only if we hold all other factors steady. In other words, they will hold if we don’t change anything we’re doing compared to what was being done when the training data was collected. But, if an outcome is bad, why sit idly by waiting for the bad outcome to happen?

Intervening To Make Your Models Wrong

The gentleman referenced previously made the terrific point that once you have your prediction of how many bad outcomes to expect, don’t just sit idly waiting for people to die or claims to be filed. Try to change the outcomes through intervention! Don’t wait for insurance claims to roll in after a storm. Instead, educate policyholders on actions they can take before the storm to minimize damage and, therefore, claims. Don’t sit back watching bad products roll off the assembly line. Instead, try to tune the process so that quality is improved. In other words, don’t think of your predictions of negative outcomes as a target you want to hit like you would with positive outcomes. Rather, think of the predictions as a worst-case scenario that you very much want to minimize.

This is a distinctly different approach. Rather than taking actions and validating that those actions work as expected so that your predictions come true, take actions specifically intended to make your predictions inaccurate. The best outcome when you’re modeling something bad is to make the model predictions as wrong as possible by mitigating the damage before it can happen.

Accurately Assessing Your Interventions

Of course, if you’re successful in offsetting some of the predicted negativity, your models will look less accurate according to typical metrics. It is necessary to account for this when setting expectations. People are used to thinking that a missed forecast is a bad thing, but that isn’t the case if the forecast is missed because fewer bad things happened due to proactive interventions.

It is obviously best to use controlled A/B testing so that you can cleanly validate that your results would have been as predicted without intervention. Without a control group, it won’t be possible to tell how much of the gap was due to your interventions and how much was due to the models being fundamentally incorrect. That’s a problem. Of course, some situations might make it ethically dubious to hold out a control group. For example, do you really want to leave 10,000 houses exposed to a storm just to validate that the damage mitigation ideas you have work? Is that fair to the 10,000 homeowners who have damage that could have been avoided? There are numerous issues to consider.
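For readers who want to see the accounting spelled out, here is a minimal sketch in Python of how a holdout comparison might split the gap between prediction and reality into model error and intervention effect. The function, the groups, and the claim counts are hypothetical illustrations I am assuming for the example, not anything from a real deployment.

```python
# Hypothetical sketch: separating the effect of an intervention from plain
# model error using a holdout control group. All names and numbers below are
# illustrative assumptions, not figures from the post.

def assess_intervention(pred_control, actual_control, pred_treated, actual_treated):
    """Split the forecast miss into model error and intervention effect.

    The miss in the control group (no intervention) estimates how wrong the
    model is on its own; any additional miss in the treated group is
    attributed to the intervention (a difference-in-differences style
    comparison, which assumes the two groups are comparable).
    """
    model_error = actual_control - pred_control    # miss with no intervention
    total_gap = actual_treated - pred_treated      # miss including intervention
    intervention_effect = total_gap - model_error  # negative = fewer bad outcomes
    return model_error, intervention_effect


# Made-up example: predicted vs. actual storm claims per 10,000 policies
model_err, effect = assess_intervention(
    pred_control=120, actual_control=115,  # model over-predicted slightly on its own
    pred_treated=120, actual_treated=80,   # treated policyholders filed far fewer claims
)
print(f"Model error alone: {model_err}, estimated intervention effect: {effect}")
```

The subtraction only attributes the remaining gap to the intervention if the control and treated groups are truly comparable, which is exactly what the ethical concerns above can make hard to arrange.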

The main takeaway from this blog is simply that you should examine your portfolio of models to identify situations where you have been dutifully tracking your models’ accuracy and feeling good about it, but where interventions that make reality miss the model predictions would be a better way to go. It won’t be the most common scenario, but I’m willing to bet that most businesses do have some cases where aiming for less accurate models is a better path. You might just find a few in your business if you look into it.

Originally published by the International Institute for Analytics

Bill Franks, Chief Analytics Officer, helps drive IIA's strategy and thought leadership, as well as heading up IIA's advisory services. IIA's advisory services help clients navigate common challenges that analytics organizations face throughout each annual cycle. Bill is also the author of Taming The Big Data Tidal Wave and The Analytics Revolution. His work has spanned clients in a variety of industries, ranging in size from Fortune 100 companies to small non-profit organizations. You can learn more at http://www.bill-franks.com.
