What is Big Data?
If you ask ten different people, you might get ten different definitions of Big Data. To some, it’s all about data visualisation. To others, it’s predictive analytics. A true IT geek might even talk about “sending code to the data” and “schema-on-read”. However, a fairly well-accepted definition is “a dataset that is too large and complex to be processed by a single machine”.
Where has it come from?
Big Data grew out of the companies that you see and deal with every day: Facebook, Google, Amazon, LinkedIn, Netflix. They pioneered the technologies to cope with the huge volume of data that has come into being in recent years, mostly via the internet.
Why is Big Data big?!
By commonly cited estimates, around eighty percent of the world’s data is unstructured. That means things like video, photos and Facebook status updates, and these files are, well, big. Put lots of them together, from lots of different people, and they’re really big.
People talk about the “three Vs of Big Data”: Volume, Variety and Velocity. There’s lots of data out there, in lots of different formats, and people want it quickly. Ninety percent of the world’s data has been created in the last ten years and there’s only going to be more of it. We need new and clever ways of managing it. More recent thinking extends the list further, currently to eight Vs: Volume, Variety, Velocity, Value, Veracity, Variability, Virality and Viscosity.
Why do I care?
You care, or should, because Big Data is an unavoidable part of your life, through social media, your business, or both. The organisations that have embraced Big Data, like Uber and Airbnb, are the ones with huge success stories; data is all they own, not taxis or hotel rooms. The enterprises that don’t change their mind-set to fully exploit their rich data sources, and think creatively about them, are the ones that will be left behind and will probably cease to exist.
What’s this Hadoop thing I keep hearing about?
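Hadoop is not just one thing, but a set of tools; an ecosystem for handling Big Data. It stores basically any type of data across a cluster of machines using the Hadoop Distributed File System (HDFS), and processes that data using the MapReduce distributed computing paradigm. And it’s Open Source under the Apache licence, meaning it’s free to use.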
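If you’re curious what MapReduce actually looks like, here is a minimal sketch of the idea in plain Python: the classic word count. It is not Hadoop code and runs on a single machine; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are our own, chosen purely for illustration. On a real cluster, the map and reduce steps would run in parallel on many machines, and the framework would handle the grouping step in between.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key.

    In a real framework this happens between map and reduce;
    here it is simulated with an in-memory dictionary.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all the values for one key into a total."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in sequence over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data is everywhere"]
    print(word_count(docs))
```

The point of splitting the work this way is that each document can be mapped independently and each word can be reduced independently, which is what lets the job be spread across many machines.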
OK, you’ve convinced me! What do I do?
Sign up for our free Big Data Discovery workshop, or get in touch with AgilityWorks today.