Firstly, what is Big Data? Well, almost anything you want it to be, apparently; the hard part is discovering exactly what the questions are, and the answers are much easier to find.
This was the topic of the opening talk given by Phil Claridge, Chief Innovation Architect at JDSU, at the Big Data Science Day hosted by Sci-Tech, Daresbury, as part of the International Festival of Business 2014. Phil’s talk actually posed the question: What turns Data into Big Data? He had discovered that the answer depends on who is asked. Ask an engineer and they will say it is more data than can fit on a single machine; ask a business analyst and they will instantly see the commercial opportunities and the ability to access new, timely and accurate information, along with the legal concerns surrounding usage.
First, though, it is a case of knowing your data. Very different technologies, investments and skills are required today to interpret the information that data contains. Structured data (tables, lists, spreadsheets, databases and mathematical analysis) is the easier starting point; unstructured data (documents, blogs, web pages, tweets) is much harder to process, but may be of more value.
Then there is always the question: Do you need a data lake or a real-time process? A data lake is simply all of your data housed in one archive; you access it intelligently by formulating questions and then gathering the results via a pipeline. Phil explained that because this method may take hours, or even days, it cannot be construed as real-time data. However, if you ‘filter the pipe’ you can sift through the data as it arrives, which is great for questions you have already anticipated.
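To make the distinction concrete, here is a minimal Python sketch of the two approaches. It is only an illustration and not something shown in the talk: the function names query_lake and filter_pipe, the event records and the is_purchase question are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, str]  # hypothetical record shape, for illustration only


def query_lake(archive: List[Event], question: Callable[[Event], bool]) -> List[Event]:
    """Data-lake style: scan everything already stored, then return the matches."""
    return [event for event in archive if question(event)]


def filter_pipe(stream: Iterable[Event], question: Callable[[Event], bool]) -> Iterator[Event]:
    """'Filter the pipe' style: test each record as it arrives and pass matches on at once."""
    for event in stream:
        if question(event):
            yield event


def is_purchase(event: Event) -> bool:
    """A question anticipated in advance (an assumed example)."""
    return event["action"] == "buy"


# Made-up sample events.
events = [
    {"user": "a", "action": "view", "item": "kettle"},
    {"user": "b", "action": "buy", "item": "toaster"},
    {"user": "a", "action": "buy", "item": "kettle"},
]

batch_result = query_lake(events, is_purchase)                 # answer arrives after a full scan
stream_result = list(filter_pipe(iter(events), is_purchase))   # matches emerge one by one
```

Both calls find the same records. The difference is when: the lake query can be asked any question but only answers after scanning the archive, while the filter yields each match as soon as the record arrives, provided the question was known beforehand.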
Amazon’s Kinesis, a service for processing streaming data as it arrives, is a prime example of this; stream processing of this kind is what lets a retailer like Amazon anticipate what you are going to buy before you have bought it, resulting in a high-speed, efficient service for its customers. Without the mapping process, though, you cannot join Big Data to make Big Information. Phil went on to explain that some of the most valuable data that third parties are currently aggregating is our consolidated data: postcode, name, date of birth, Wi-Fi MAC address and store loyalty card, to name just a few he mentioned. The result? These ‘identity aggregators’ will emerge as new businesses.
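The mapping step mentioned above can be pictured as a lookup table that links an identifier in one data set to an identifier in another. The following Python sketch is purely illustrative: the records, field names and the card_to_mac table are invented, and nothing here reflects how any real aggregator works.

```python
# Two separate, made-up data sets that share no common key.
purchases = [
    {"loyalty_card": "C1", "store": "Leeds", "spend": 42.50},
    {"loyalty_card": "C2", "store": "York", "spend": 17.00},
]
wifi_sightings = [
    {"mac": "aa:bb", "location": "Leeds station"},
    {"mac": "cc:dd", "location": "York high street"},
    {"mac": "ee:ff", "location": "Hull docks"},
]

# The hypothetical mapping: which loyalty card belongs with which MAC address.
card_to_mac = {"C1": "aa:bb", "C2": "cc:dd"}


def join_via_mapping(purchases, sightings, mapping):
    """Attach sighting locations to each purchase whose card appears in the mapping."""
    locations_by_mac = {}
    for sighting in sightings:
        locations_by_mac.setdefault(sighting["mac"], []).append(sighting["location"])

    joined = []
    for purchase in purchases:
        mac = mapping.get(purchase["loyalty_card"])
        if mac is None:
            continue  # no mapping entry, so this record cannot be joined
        joined.append({**purchase, "locations": locations_by_mac.get(mac, [])})
    return joined


joined = join_via_mapping(purchases, wifi_sightings, card_to_mac)
```

With an empty mapping the function returns nothing at all, which is the point being made: the two data sets only become joined-up information once something connects their identifiers.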
So can you leverage free Big Data?
Apparently so. Data sets are being released by both the Government and private enterprises; examples include Ordnance Survey Open Data, Transport for London and HESA. Utilising this data enables powerful visualisation of where high-value subscribers are spending their time. From it can emerge valuable precision planning data for advertising agencies, retailers, brand managers and others. By measuring footfall and understanding how quickly target customer groups are moving past specific locations, they can decide where best to advertise.
You can discover more about what Phil does with data here.
Big Data Sample Sources
Image courtesy of NATS: a data visualisation of air traffic in Europe, created from real flight data. This fascinating video shows the air traffic that flies on a typical summer day and highlights the intensity of the operation in Europe.