Welcome to the Real World
One of the great things about the blogosphere is that you are allowed to change your mind.
In my last blog I defined Big Data as any body of information that is so big it cannot be analyzed directly for profitable use in its raw form. Since I wrote that I’ve had a number of conversations with Big Data providers, tool-makers and users, and I’ve come to the conclusion that my definition was a bit too facile – and insufficiently empirical. So let’s try it again:
Big Data is any data that requires massively parallel computational techniques to handle.
At first glance this new definition may not seem much different – other than a bit techier – but in fact I think it better captures what is fundamentally different about the world of Big Data. In particular, it recognizes that Big Data is not just an extension of the number crunching that the computer industry has been doing for the last seventy years. Rather, it represents a revolutionary new approach to the Natural World. Specifically, Big Data encompasses the process of instrumenting (that is, automating) parts of the world that digital technology has largely left untouched.
That process of instrumenting the things we use and the actions we take will change everything – not least the way we see the world around us and how we actually live in that world. Consider the act of shopping. Right now, we can gather data at the point of sale – i.e., what people are buying, in what quantities and at what prices. We can even do a little bit of sampling – following select customers around to study the pattern of their movement through the store, the number of items they look at before they make a decision and so on. And just this limited amount of information has already transformed the way retailers package, display, promote and price their goods.
But Big Data promises to take this process to a level not only impossible before now, but in some ways unimaginable. Suddenly, it will be possible to track not only every customer, but also their height, weight and age, how long they looked at each item, what they started to buy before changing their mind, how long they stood in line, and a dizzying array of other variables. Hidden within these mountains of data lie insights and correlations that are as yet invisible to us – but which may prove to be extremely valuable in commerce, academia, government and scientific research. In an important sense, the age of sampling – that is, of clever approximation – is coming to an end. With Big Data, we are now entering the era when we will measure every data point, and then have the tools and computational power to crunch the resulting terabytes to find the secrets that lie within.
We have already been given a glimpse of what this new epoch of Big Data will look like. The Internet, after all, is a vast Big Data environment, in which millions of servers and billions of personal computers and smart devices are continuously gathering great caches of data on every single one of its users. And indeed, it can be said that every time we do a Google search on our laptop computer we are experiencing the immense power of Big Data to encode the world – in this case, the World Wide Web. This single form of Big Data has transformed almost every corner of modern life, and delivered to each of us access to knowledge not available to even the most powerful and richest people in the world a half-century ago.
You can think then of Big Data as the extension of the Web and search into the rest of the non-digital world. And this will be accomplished by billions of instruments, sensors and eventually nanomachines. These devices – many of them the tiniest of future generations of microprocessors and microcontrollers – will be embedded in everything from retail items to our own bodies, even tossed into the clouds to ride with the winds. And each will be continuously streaming data, in real time and with unprecedented precision, about the tiniest changes taking place in its little corner of the world – weather patterns, water consumption, the flow of people in their daily lives while working, shopping, learning, playing, living. To put it in the simplest terms, smartness is going everywhere.
Easy to say, but what it all means is far from clear. For example, what happens when everything you touch is recorded? What happens when the commercial world and governments decide to embark on an automated capture of information on a scale never before seen?
One obvious answer is that we’ll get a lot more of what we are getting now: mass customization of products and services, microprogramming of content, personalized/localized apps. That’s really what companies are talking about when they use the term ‘data mining’ – i.e., digging even deeper into existing caches of data to find useful information. Supercharging the status quo is always the big promise of technology revolutions. But the bigger result is usually something completely different. The integrated circuit was invented to make smaller radios, the microprocessor to make smaller calculators, the Internet to speed communications between government agencies and universities. But they found their true earth-shaking applications elsewhere. I’m convinced the same will be true for Big Data: that in making the shift from the digital computer/telecom world to the much bigger analog/natural world, Big Data is going to find some extraordinary, even historic, applications (Epidemiology? Resource management? Micro-meteorology?) that we can’t yet even imagine and that may change our lives as completely as the Internet and cellular telephony have done.
And then there is the third factor, the X factor: the as-yet unknown-unknown application that emerges only after Big Data has reached into hundreds of applications and become pervasive in our lives. For example, what if there is some pattern in nature or human behavior that only becomes visible after we begin looking at octillions of bits of sensory data – and what if that pattern proves a revelation about humanity or the natural world? What then? Just as it took the personal computer revolution, the Internet and smartphones to fully realize the potential of social networks, what if Big Data produces its own meta-effect that fundamentally changes the way we relate to ourselves, others and the world around us?
Those are heavy thoughts, and possibly ones we may never have to deal with. In the meantime, Big Data is coming fast – and even in its simplest forms will carry in its train a host of opportunities, promises and threats. We’ll address all three of those in the months ahead.
Originally posted on Forbes.com.