I have recently made investments in businesses that are set on tackling the world’s biggest data challenges. Why? Because the insight that will eventually be gained from big data will undoubtedly change the world.
Big data may currently be the most overused buzz phrase in circulation, and although the topic is discussed everywhere, it is misunderstood by many. Pretty much everyone has their own definition.
While the technology giants Google, Yahoo, Facebook, Twitter and others have all recruited the brightest minds in an attempt to find patterns in the social media chaos, big data is about much more than them.
In 2012, Gartner updated their definition of big data to “...high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization”.
While it’s a mouthful, I like this ‘three v’ definition. It highlights that big data is about much more than data volume. Perhaps more relevant is data velocity, the speed of the data flow. This represents a bigger challenge because when data is in flux, the window in which to extract usable knowledge is smaller. The final ‘v’ is about managing a variety of data types, something that poses challenges for current technology.
The Gartner definition recognises that completely new forms of processing are required and neatly captures the purpose of big data. It’s about gaining new insights and making discoveries that will result in new ways of seeing our world. I hope it will enable us to make some of the critical decisions that doubtless await us and our children. But we just can’t get to where we want to be by building more of the same. Fresh thinking is needed.
I am intrigued by the developments being made to advance our data analysis capabilities, that is, new methods of searching through the vast data that we have already accumulated. New AI tools have an exciting role to play here. I also believe our future lies in engineering totally new methods of analysing truly massive volumes of flowing data, too vast to ever consider storing.
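To make the idea of analysing flowing data without storing it a little more concrete, here is a minimal sketch in Python. It uses Welford's single-pass method to keep a running mean and variance while each reading is seen once and then thrown away. The names `RunningStats`, `summarise` and `sensor_readings` are hypothetical and chosen purely for illustration; this is one well-known streaming technique, not a description of how any particular system mentioned here works.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class RunningStats:
    """Summary of a stream that never holds more than one reading in memory."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        # Welford's update: fold one new reading into the summary.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Population variance of everything seen so far.
        return self.m2 / self.count if self.count else 0.0


def summarise(stream: Iterable[float]) -> RunningStats:
    """Consume a stream once, discarding each reading after use."""
    stats = RunningStats()
    for reading in stream:
        stats.update(reading)
    return stats


def sensor_readings(n: int) -> Iterator[float]:
    # Hypothetical stand-in for a live data source (an assumption for the demo).
    for i in range(n):
        yield float(i % 10)


stats = summarise(sensor_readings(100_000))
print(stats.count, stats.mean, stats.variance)
```

The memory used stays constant however long the stream runs, which is the property that matters when the data can never be stored. The compromise is that only the statistics chosen in advance survive; anything else in the data is lost.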
Big data analysis methods are designed to balance speed and function, and everything is always a compromise. What if radically different technology could enable us to analyse everything, miss nothing and learn exciting new truths as a result?
There is big data and then there is BIG data. Consider the volume of data involved in running the Large Hadron Collider. If all of the data were to be captured from its 150 million sensors, it would equate to 500 exabytes per day. That’s almost 200 times more than the rest of the world’s sources combined. The new Square Kilometre Array being built in Australia and South Africa will require bandwidth bigger than the entire current global internet. Significant compromise is made in Computational Fluid Dynamics, a field of engineering that is used to predict the weather, model aerodynamics and calculate how other gases and fluids flow through a system. Huge amounts of data are discarded in an MRI machine because they cannot be stored or processed. These sophisticated machines dump much of the data and still resort to printing an image for a doctor to ‘eyeball’ in order to make a diagnosis. What could be achieved if all of this data could be analysed?
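To put the Large Hadron Collider figures into perspective, the arithmetic can be sketched in a few lines of Python. The inputs are simply the numbers quoted above, taken at face value. Two assumptions are mine: an exabyte is treated as 10^18 bytes, and "almost 200 times" is rounded to exactly 200.

```python
EXABYTE = 10**18                 # assumption: decimal exabyte
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the text, taken at face value.
lhc_bytes_per_day = 500 * EXABYTE
lhc_sensors = 150_000_000
ratio_to_rest_of_world = 200     # "almost 200 times", rounded for illustration

# Total flow rate if every sensor reading were captured.
bytes_per_second = lhc_bytes_per_day / SECONDS_PER_DAY

# Average flow rate per sensor.
per_sensor_bytes_per_second = bytes_per_second / lhc_sensors

# Daily output the comparison implies for all other sources combined.
rest_of_world_bytes_per_day = lhc_bytes_per_day / ratio_to_rest_of_world

print(f"Total: {bytes_per_second / 10**15:.1f} petabytes per second")
print(f"Per sensor: {per_sensor_bytes_per_second / 10**6:.1f} megabytes per second")
print(f"Rest of world: {rest_of_world_bytes_per_day / EXABYTE:.1f} exabytes per day")
```

On these assumptions the full stream works out at close to 6 petabytes every second, or roughly 39 megabytes per second from each sensor, against an implied 2.5 exabytes per day for everything else combined.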
To search seriously big data we need some seriously new technology. Advances are being made in both quantum and optical computing, and I think the next generation of computing might just be optical.
We can all comprehend that you can encode a lot of data into a single beam of light. A single optical fibre can already hold three million concurrent phone calls or 90,000 TV channels. Why then is light not the answer to our processing needs? It has been recognised since the 1960s that it is possible to perform mathematical operations with light. Optical processing could also be very low energy, unlike quantum computing, which involves cooling components close to absolute zero. For example, Optalysys aims to build a computer 1,000 times faster than any supercomputer that exists today, yet capable of being run on a domestic power supply. This is achievable in part because you can process in parallel with light in a way that you cannot with electricity.
Big data is becoming increasingly relevant to all aspects of our lives, but our ability to process it is not even close to being on the right scale. Our ambitions are tempered by the very real limits of electronic processing. However, advances are coming fast and there is no doubt that big data is driving them. The answers will certainly have big implications for our future.