Privacy is always a flash point. With the advent of big data, privacy is only going to become more of a concern. The fact is that many sources of big data contain highly detailed information on what people are doing. While there are many valid uses for most of these big data sources, it is critical that companies, governments, and other institutions, such as universities, take privacy very seriously.
As consumers have become aware of some of the data that is being collected about them, there has been backlash. Recent flare-ups with web companies like Facebook are good examples. The extent to which the tracking of behavior on the internet occurs – with Facebook, Google, and other public sites capturing data about who you are, what you are doing, where you are going, and what you want – is not known to most people. Even though many privacy policies technically declare intentions to collect and use data, the dozens of pages of “legalese” terms used aren’t read or understood by most people. There is often not a fully transparent, opt-in process with plain language declaring corporate intentions. In addition, some companies have pushed the limits of what their privacy policies technically allow. This is a bad practice that has resulted in trouble.
As much as I get excited as an analytics professional to dig into all the new big data sources available, I become equally hesitant as a customer and citizen. As I predicted on the International Institute for Analytics’ 2012 predictions call, I believe that privacy concerns will be a major influence on how big data itself, and the use of it, evolves. There will need to be an extremely high level of trust between organizations that want our data and those of us who provide it. That trust must be earned and maintained. All it will take is a few cases of violated trust, intentional or not, to derail the relationship and set us all back.
I have wondered what the “big moment” will be that causes everyone to realize how much about them is exposed and leads to a major popular revolt. Honestly, I thought the big blowup in December around Carrier IQ would be that moment. For those of you who missed the news, Carrier IQ (http://www.carrieriq.com/) is a company that provides software to mobile device manufacturers. The software collects usage information aimed at helping telecommunications companies and mobile device manufacturers identify hardware or network issues.
Someone posted a YouTube video showing the software capturing much more data than anyone would expect or want. The phone was even capturing key presses such as when you entered a password on a secure website. Naturally, this caused a huge uproar. (You can view this series of articles from CNNMoney for more detail: Part 1, Part 2, Part 3, and Part 4.)
In the end, it was determined that manufacturer HTC had accidentally turned on a debug mode in phones sent to consumers that was only supposed to be turned on during testing. The phones were capturing the information locally, but not sending it back to the carriers. However, even having such information stored on your phone is a huge risk if someone steals your phone and knows where to look for your passwords.
Ultimately, I came away believing that none of the companies involved had any dishonest intention. However, due to some errors in process and security procedures, a situation arose with potentially harmful impacts for users, including identity theft or the emptying of accounts by crooks able to access the data.
This is one example of how the world of big data enables analytics and actions that are very much like Big Brother. Without immense care and caution, organizations can wander into a quagmire like the carriers, software providers, and device manufacturers did with Carrier IQ. Those involved can only hope that no customer traces a crime committed against them back to the loss of a phone and the data that the software was storing on it.
Trust will be required for big data to realize its potential. If trust exists between consumers and corporations, for example, an organization or industry group can come up with its own guidelines that restrict how big data is stored and used. If that does not happen, governments will step in – and their regulations will likely end up being more restrictive and expensive, and could carry unintended consequences, as they often do. One way or another, the Big Brother potential for big data must be addressed soon.
While I was wrong that the Carrier IQ incident would be “the moment,” I do believe such a moment is coming. Perhaps it will be the first leak of the electronic medical records of a prominent politician, or shoppers discovering that their loyalty cards carry embedded RFID chips that monitor their in-store activities without their consent (to my knowledge no retailers do this, but it is possible with today’s technology). We have been fortunate thus far that most coverage of big data analytics in the news has been favorable as the world marvels at the pure potential (see this week’s Wall Street Journal article). We must anticipate possible negative perceptions and actions, and give big data the respect and care it requires. This is the only way to avoid the perception, or reality, of big data becoming too much of a Big Brother for people to accept. If that happens, it may never reach its potential.