Back in 2012 I wrote (with D.J. Patil, who went on to become the Chief Data Scientist in the White House) an article in Harvard Business Review called “Data Scientist.” Nobody remembers the title or much about the content of the article, but many remember the subtitle: “Sexiest Job of the 21st Century.” At the time (and still today), these jobs paid well, were difficult to fill, and required a very high level of analytical and computational expertise. But a more accurate subtitle might have been “Sexiest Job of the 2010-2019 Decade,” because I am not sure how much longer data scientists will be in great demand.
Here’s why: automated machine learning, or AML. Many organizations currently need their data scientists to help with machine learning, that is, using data to train models that can then make predictions or classifications on other data. It’s a powerful tool that has many different applications. Historically, one needed data science skills to do it. Now, however, many of the activities in machine learning are being automated. Vendors and open source projects such as DataRobot, SAS, Auto-WEKA, and Google Cloud are creating proprietary or open source AML tools to automate such tasks as data preparation, variable or feature engineering, algorithm selection, and even model deployment and explanation. The available AML tools aren’t fully autonomous yet, but they do eliminate or reduce the need for heavy data science expertise in doing machine learning.
And this trend is starting to have an impact on staffing. The term “citizen data scientist” was coined a few years ago without much precision about its meaning. The gist of the concept was that some aspects of data science and analytics could be performed by those with relatively little experience. I’m not sure how valid the idea was in the past, but it’s becoming increasingly possible now. Algorithm selection, for example, no longer requires that one understand the difference between logistic regression, random forest, and gradient boosting models; the machine simply figures out which type of algorithm fits the data best. As we discussed during the IIA 2018 Predictions and Priorities webcast, this kind of automation is what’s driving us toward a “post-algorithmic era” of analytics.
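To make the algorithm selection point concrete, here is a minimal sketch of the idea in plain Python. It is not how DataRobot or any other AML product works internally. The three candidate models are deliberately simple stand-ins for logistic regression, random forest, and gradient boosting, and every name in it (make_toy_data, cv_accuracy, select_best, CANDIDATES) is hypothetical, invented for this illustration. The loop scores each candidate with cross-validation on synthetic data and keeps whichever scores best, which is roughly what "the machine figures out which algorithm fits" means.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def majority_class(train_X, train_y, test_X):
    """Baseline: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return [label for _ in test_X]


def nearest_centroid(train_X, train_y, test_X):
    """Predict the label whose mean feature vector is closest."""
    rows_by_label = {}
    for x, label in zip(train_X, train_y):
        rows_by_label.setdefault(label, []).append(x)
    centroids = {
        label: [sum(col) / len(col) for col in zip(*rows)]
        for label, rows in rows_by_label.items()
    }
    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in test_X]


def one_nearest_neighbor(train_X, train_y, test_X):
    """Predict the label of the single closest training point."""
    preds = []
    for x in test_X:
        nearest = min(range(len(train_X)), key=lambda j: sq_dist(x, train_X[j]))
        preds.append(train_y[nearest])
    return preds


# Hypothetical candidate pool; a real AML tool would search far more options.
CANDIDATES = {
    "majority_class": majority_class,
    "nearest_centroid": nearest_centroid,
    "one_nearest_neighbor": one_nearest_neighbor,
}


def cv_accuracy(model, X, y, folds=5):
    """Mean accuracy over interleaved folds (every folds-th row is held out)."""
    scores = []
    for k in range(folds):
        held_out = set(range(k, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in held_out]
        train_y = [t for i, t in enumerate(y) if i not in held_out]
        test_X = [X[i] for i in sorted(held_out)]
        test_y = [y[i] for i in sorted(held_out)]
        preds = model(train_X, train_y, test_X)
        correct = sum(1 for p, t in zip(preds, test_y) if p == t)
        scores.append(correct / len(test_y))
    return sum(scores) / len(scores)


def select_best(candidates, X, y):
    """Score every candidate and return the best name plus all scores."""
    scores = {name: cv_accuracy(model, X, y) for name, model in candidates.items()}
    return max(scores, key=scores.get), scores


def make_toy_data(n=100, seed=0):
    """Synthetic two-class data: two Gaussian blobs, labels alternating by row."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 3.0 * label
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best, scores = select_best(CANDIDATES, X, y)
    for name, score in scores.items():
        print(f"{name}: {score:.2f}")
    print("selected:", best)
```

The analyst calling select_best never has to know how the candidates differ. That is the convenience, and it is also why judgment about the data and the business problem still matters: the loop can only choose among the options it is given, using the metric it is given.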
I’ve been doing a little research on AML, talking to customers of DataRobot about the benefits they are seeing from the technology. And there are some disturbing messages for data scientists in some of the interviews. One manager at a bank, for example, noted:
“I am an advocate of the citizen data scientist camp. We hire a lot of data scientists, but with automated machine learning you don’t really need to know much about technology or math or programming—it gives business analysts superpowers. If I have to choose between people who know data science, and analysts who understand the business problem, the data, and the behavior of customers, I will take the analysts every time—they are much more useful. Data scientists sometimes resist the automated machine learning tools—they say they can beat them, but they usually can’t. And unless they find a way to automate the business intuition piece, they won’t be able to catch up with the analyst equipped with automated ML tools.”
And several other managers I spoke with had similar comments. To be sure, none of the people I interviewed said their organizations were no longer hiring data scientists. But several did hope not to need as many in the future.
Of course, there are several ways that data scientists can address this issue and make themselves more valuable to organizations. Here are a few ideas:
Data scientists can embrace automated tools themselves—the combination of their expertise and the productivity gained from automated machine learning tools can make them unbeatable;
They can try to acquire business insights and experience in addition to the technical capabilities they possess;
They can focus on the aspects of machine learning and analytics that are not yet automated; deep learning is a good example at the moment.
Data scientist jobs are still sexy, but their sex appeal (more accurately, the demand for them in the marketplace) is somewhat at risk. The last thing data scientists should do if they want to retain their allure is to resist new technologies that can improve their productivity and performance, and that of other analytics users within their organizations. No matter what your training and skills, it’s not very sexy to turn your back on the capabilities of new technology.