
Millions of Models Under Management

Financial asset management firms are often evaluated in terms of their “assets under management.” The leading firms have billions or even trillions of dollars that they are managing. In analytics and AI, few organizations currently approach those numbers of models under management. However, I suspect that it won’t be long (and it may already be the case in some firms) before the number of models under management approaches the millions.

This, of course, is a radical departure from the past. Even the most analytically-focused firms would have had a few hundred models in production a few years ago. Given the “artisanal” approach to analytics that most organizations have employed for decades, it’s likely that model management was also treated in an artisanal or even haphazard fashion—if it was done at all. Like old programming code, old models in production were often neglected and unmanaged. They sometimes lacked documentation, details on their provenance, and even records about who created them. That may have been OK if the organization wasn’t too dependent upon them.

Models That Matter

Now, of course, models derived from analytics or machine learning are increasingly viewed as an enterprise asset. If many of an organization’s operational decisions are made on the basis of such models, it’s pretty important to manage them well after they go into production. Many industries are developing models that matter to their long-term survival—financial services, healthcare, online businesses, and even manufacturing (predictive maintenance models, for example). In a bank, it’s important to know why you decided to extend credit to a customer or not; in insurance, you might want to keep track of why you charged a certain price to insure someone’s life or property. In these industries and increasing numbers of others, if you don’t know what your model is supposed to do, what data it was trained on, what the assumptions are behind it, whether it still does a good job of predicting, and any peculiarities of the code that deploys it, you are headed for trouble.
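To make that checklist concrete, here is a minimal sketch of what a single inventory entry for a production model might record. Everything in it is an assumption for illustration: the `ModelRecord` class, its field names, the 0.05 tolerance, and the sample values are hypothetical and do not come from any particular vendor's tool.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ModelRecord:
    """Hypothetical inventory entry for one production model."""
    name: str
    purpose: str             # what the model is supposed to do
    owner: str               # who created or maintains it
    training_data: str       # what data it was trained on
    assumptions: list[str]   # assumptions behind the model
    deployment_notes: str    # peculiarities of the code that deploys it
    baseline_score: float    # predictive quality measured at deployment
    recent_scores: list[tuple[date, float]] = field(default_factory=list)

    def log_score(self, when: date, score: float) -> None:
        """Record a fresh evaluation of the model on recent data."""
        self.recent_scores.append((when, score))

    def needs_review(self, tolerance: float = 0.05) -> bool:
        """True if the latest score fell more than `tolerance` below baseline."""
        if not self.recent_scores:
            return True  # never evaluated since deployment
        _, latest = max(self.recent_scores, key=lambda pair: pair[0])
        return latest < self.baseline_score - tolerance


# Illustrative values only.
record = ModelRecord(
    name="credit-approval-v3",
    purpose="Decide whether to extend credit to an applicant",
    owner="risk-analytics team",
    training_data="Loan applications, 2016-2018",
    assumptions=["Applicant mix resembles the training period"],
    deployment_notes="Scored nightly in batch",
    baseline_score=0.80,
)
record.log_score(date(2020, 1, 31), 0.79)
record.log_score(date(2020, 2, 29), 0.70)
print(record.needs_review())
```

Even a record this small answers the questions above: what the model is for, what it was trained on, what it assumes, and whether it still predicts well enough to stay in production.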

Of course, model management is not a new idea. A Vanderbilt professor I didn’t know named Robert W. Blanning, for example, was writing about the idea in 1982 (apparently he was also a scholar of “the management implications of artificial intelligence,” so there is really little new under the sun). Dr. Blanning is no longer with us, so I can’t ask him why the field never really took off. But I’m pretty sure that it didn’t thrive in any industry other than banking, where regulators forced banks to document and manage their models. Outside of that industry, there just weren’t enough models that matter to motivate firms to manage them.

More Models, More Need to Manage Them

Not only do models matter more than they used to, but we are also creating a lot more of them. Both the supply of and demand for models have increased dramatically. On the supply side, automated machine learning (AutoML) systems are proliferating from multiple vendors, and they can create models much faster than human analysts or data scientists could on their own. One leading vendor of AutoML is DataRobot, where I am an advisor, and that company claimed in mid-2019 that a billion models had been created on its cloud platform. Of course, that doesn’t mean that a billion models have been deployed into production, but it does suggest that the supply of models is increasing rapidly.

The demand for models is also increasing as companies generate more data that needs to be analyzed, make greater use of external data, and need more granularity and precision in their models. This is already resulting in many tens of thousands of models within particular business domains, and will no doubt total in the millions at some firms (though I don’t know of any organizations that have totaled their enterprise model count recently). For example, in its sales propensity modeling area alone, Cisco Systems generates about 250,000 models each year that generate 11 billion scores each quarter to predict what each of 160 million businesses around the world is likely to buy from the company. And Cisco has a variety of other business domains that generate many additional models.

Managing a Multiplicity of Models

When a company recognizes that it is, or will soon be, dealing with millions of important models, how should it change its behavior? There are technological solutions that it can adopt, of course. These range from “model factory” capabilities from analytics and machine learning vendors like SAS and DataRobot, to a “machine learning platform” from Cloudera, to open source approaches for monitoring model reliability and performance using Kubernetes. Cloud vendors such as AWS, Azure, and Google Cloud all have some degree of model management tools in place as well.

In addition to technology, managing millions of models requires that the organization, and many of its leaders, have an attitude of asset management. This means they should view their models as a valuable business capability that is worthy of substantial investment and should be guarded and curated with care. If a company’s management isn’t that serious about data and analytics, they probably won’t have such an attitude.

Companies will also need to invest in people to oversee the model management process. I don’t like the term “governance,” but if I did this is where I would use it. What’s necessary are people who oversee the process of creating, deploying, monitoring, retraining, and retiring models. Technology alone can’t do it, but the people who do this work should of course be familiar with the technologies and use them to enhance their productivity and effectiveness.

You can tell that there is a change in the way we are viewing analytics and machine learning models from the language we are beginning to see in this area. One regularly sees terms like “pipelines,” “platforms,” “factories,” and “MLOps” (if you don’t have an “Ops” term in your technology domain, you might as well not exist these days). The operations and production-oriented language correctly suggests that we will be generating lots and lots of models. If we want them to be a valuable asset to our businesses, we need to think of them as “model assets under management.”