The Challenges of Educating the Next Generation of Data Scientists

By Diego Klabjan, Sep 13, 2012

Analytics is booming, and there is a gap between the supply of skilled analysts and the high demand in the market.  In the U.S., the first comprehensive analytics program was established in 2007.  Prior to that, existing programs focused on the components of analytics - statistics, information technology, data mining and business intelligence.  It took another four years before other U.S. schools picked up on the trend and started slowly supplying students with advanced analytics degrees.

Today, in 2012, there are still only a handful of graduate and undergraduate programs focused on analytics.  Some companies, like EMC, have found the talent shortage so severe that they have created their own programs to train and certify analysts.  Given the high demand for these skills, why are universities so far behind?

A decade ago, bioengineering was a degree and skill in heavy demand, and almost immediately numerous schools started research and educational programs in that field. What differs between analytics and bioengineering when it comes to establishing an educational program within a university or college? The primary difference is that the area of analytics is extremely broad. It spans computer science, statistics, operations research, and business. As a result, there is a lack of focused research programs and departments in analytics, and it is challenging for a traditional academic department to offer a program in analytics.

While bioengineering spans the areas of biology/medicine and engineering, it was quickly embraced by engineering schools. Bioengineering departments formed quickly, and with them came undergraduate and graduate majors, advanced graduate degrees, and comprehensive research groups. Although bioengineering also covers multiple areas, the vast majority of these departments are housed within engineering schools or colleges. Analytics is much broader than bioengineering, and it encompasses several traditional research and education areas within a university. For this reason there is ambiguity about who should host an analytics program, or even an ‘analytics department.’

North Carolina State elegantly addressed this issue by forming a new standalone unit, the Institute for Advanced Analytics. The institute operates independently as a university-wide collaboration that can draw faculty from any of the university’s ten colleges, allowing them to participate in the analytics program on an equal footing. Northwestern University tackled this problem differently. Northwestern’s program is a professional program, which means that it is largely autonomous. While it is housed within the Department of Industrial Engineering and Management Sciences, it draws instructors from all around the university.

Changes to the program, in particular to the curriculum, do not have to be approved by the university graduate school, which offers great flexibility in adjusting the curriculum to rapidly changing business trends and needs. Northwestern also offers an online program, the Master of Science in Predictive Analytics. This program is administered by the School of Continuing Education, which can also be considered a standalone unit. Although this school was not established for the purpose of offering a degree in analytics, it is not associated with any traditional college or department, so its organizational structure is closely aligned with the North Carolina State model. The big differentiating factor between these two degrees is full-time vs. online delivery of education. Outside of North Carolina State and Northwestern, most other analytics-based programs have been developed within schools of business. These programs focus more on the business aspect of analytics, so schools of business are a natural fit for them.

Bioengineering prospered because the entire discipline rose with it, something that has not yet been observed in analytics. As of today, no department including “analytics” in its name exists. There are no undergraduate or graduate majors exclusively in analytics. On the research end, analytics is based on traditional disciplines such as statistics, operations research, and machine learning. In recent years, research activity has been driven by increased computing capabilities for handling ever larger data sets. At the same time, the business world has become more data-driven, and it is hard to stay competitive without data-driven decision making. To meet these real-world needs, researchers are adapting and extending known techniques to bigger data sets. The Hadoop ecosystem for handling big data is definitely an interesting research area.

Hopefully these research directions will eventually lead to a ‘Department of Analytics’ and thus a ‘Ph.D. in Analytics.’ This will also spawn new educational programs in analytics. Many U.S. institutions of higher education realize the need to prepare the future workforce for analytics, yet there are still debates about the best way to form analytics-based programs. The highly interdisciplinary nature of analytics poses challenges that are yet to be resolved. The North Carolina State model of a new standalone unit, the Northwestern principle of a professional program, and confining analytics to a business focus are three current ways of coping with this fact. Due to the increasing demand for skilled professionals in analytics, universities should move quickly in adopting one of these three business models or find a new one that best fits their needs.

About the author

Diego Klabjan is a professor at Northwestern University, Department of Industrial Engineering and Management Sciences. He is also Founding Director, Master of Science in Analytics. He obtained his doctorate from the H. Milton Stewart School of Industrial and Systems Engineering of the Georgia Institute of Technology in 1999 and joined the University of Illinois at Urbana-Champaign the same year. In 2007 he became an associate professor at Northwestern. He is the recipient of the first prize of the 2000 Transportation Science Dissertation Award and has also received various other awards with graduate students. His research and expertise are focused on analytics, with a concentration on sustainability, transportation, healthcare, supply chain management, and retail. Professor Klabjan has led projects with large companies such as FedEx Express, GM, and United Continental, and he is also assisting numerous start-ups with their analytics needs. He is also the CEO and founder of Eco Green Analytics LLC.