For my first blog posting, I decided to focus on one of the hottest topics in analytics right now: “Big Data”. Big Data is typically defined as very large, newer sources of data that don’t necessarily have a convenient or consistent structure. Think web logs, sensor data, RFID data, and other similar data streams. With such a wide range of new data streams coming online in rapid succession, Big Data has quickly risen in importance and visibility.
As with any new topic getting a lot of attention, there are all sorts of claims about how Big Data is going to fundamentally change everything. As a person who has “grown up” over the past few decades in roles both doing analytics and managing analytic projects, I don’t buy that. Certainly, Big Data is something that will play an increasing role in corporate analytics. It will provide some big benefits and lead to some terrific new analytics. But from the view of an analyst in the trenches or an analytics leader running the team, will Big Data fundamentally change what you do and why you do it? Let’s explore…
Analysts have been at the forefront of exploring new data sources for a long time. Who first started to analyze call detail records within telecom companies? Analysts. I was doing churn analysis against mainframe tapes at AT&T in my first job. This was at a time when reports or analysis on such data were far from standard. Who first started digging into retail point of sale (POS) data to figure out what nuggets it held? Analysts. Originally, the thought of tracking tens to hundreds of thousands of products across thousands of stores was considered a huge problem. Today, not so much. The analytical professionals who first dipped their toes into such sources were dealing with what at the time was an unthinkably large amount of data to try to analyze. Many people doubted it was possible and even questioned the value of such data. Sounds a bit like Big Data, doesn’t it?
Analysts have always sought out new, interesting data sources. They’ve also always pushed scalability to the limit. So, in my mind, Big Data isn’t really going to change much about what analysts are doing and why. Sure, the problems addressed will evolve due to the new data sources just as they always have. But, at the end of the day, analysts will simply be exploring new, unthinkably large data sets as they have always done.
One commonly accepted aspect of Big Data is that what qualifies as Big Data will change over time. As more capacity and scale become available, what is Big Data today won’t necessarily be Big Data tomorrow. That sounds a lot like how call detail records and POS data are no longer considered all that big.
What the Big Data trend will change are some of the tactics that analysts utilize to do their work. New tools such as MapReduce will be added alongside SAS and SQL to help deal more effectively with the flood of data. Complex filtering algorithms will be developed to help parse out the few meaningful pieces from a raw stream of Big Data. Modeling and forecasting processes will be updated to include Big Data inputs on top of the currently existing inputs.
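To make the MapReduce and filtering idea a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration of the pattern, not how any particular MapReduce framework is used in practice. The log format, the sample lines, and every function name below are my own assumptions for the example. The map step filters a raw stream down to the few records of interest (server errors, in this made-up case), and the reduce step totals them by page.

```python
from collections import defaultdict

# Hypothetical raw web-log lines in an assumed "<timestamp> <page> <status>" format.
RAW_LOG = [
    "2011-01-01T00:00:01 /home 200",
    "2011-01-01T00:00:02 /cart 500",
    "malformed line",
    "2011-01-01T00:00:03 /cart 500",
    "2011-01-01T00:00:04 /home 404",
]


def map_line(line):
    """Map step: emit (page, 1) for server errors and drop everything else."""
    parts = line.split()
    if len(parts) != 3 or not parts[2].isdigit():
        return []  # unparseable record: filtered out of the stream
    _, page, status = parts
    return [(page, 1)] if int(status) >= 500 else []


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: total the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def error_counts(lines):
    """Run the map, shuffle, and reduce steps over an iterable of log lines."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    print(error_counts(RAW_LOG))
```

In a real Big Data setting the map and reduce steps would run in parallel across many machines, but the analyst's logic stays about this simple: throw away most of the stream and summarize what is left.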
Big Data will drive new and innovative analytics. It will force analysts to continue to get creative to work within scalability constraints. Big Data will only get bigger over time. But analysts are prepared for this. Incorporating Big Data really isn’t different from what they’ve always done. It is simply the next generation of data sources to understand and put to use. Organizations need to turn their analytics professionals loose and let them do what they do best. They are fully capable of taming Big Data if provided the opportunity and support to do so.