Understanding Big Data with a simple example – 1

Here we are going to understand the concept of Big Data in a simple and easy-to-understand manner. There is a lot of hype around Big Data today; websites and blogs have created this hype by citing examples such as Facebook, Google and Twitter, which generate petabytes of data every day. Or they talk about the Large Hadron Collider project, where some people believed that the experiment could create a black hole on Earth and thus destroy the whole world! That fear was proven wrong, but it was also believed that the experiment produced such an enormous amount of data that the scientists discard much of it, as they would not be able to look through or analyze it all. It seems the scientists are hoping they haven’t discarded anything valuable.

Though all these facts are interesting and to some extent true, they fail to capture the underlying essence of Big Data: what it is, and why it is important to small businesses that don’t need to deal with the huge amounts of data that companies like Google and Facebook handle. So here we try to understand the concept of Big Data in a very basic manner, through an example:

Let us take a classic example in which an organization, say a bank, is trying to find the optimal price of a new product so as to maximize its profit. Let us say the product is a travel insurance policy. What organizations typically did in the 90s was to hire a survey company to gather feedback from sample groups of people; a few industry experts would then debate the results and fix the price of the new product. The problem is that the input from the survey companies and the experience of the industry experts formed too small a knowledge base to derive the optimal price accurately. Accuracy improved with the advent of data warehousing technologies.

Organizations then realized that a lot of data was available to them that could help them arrive at the optimal price. For example, an organization had mainframe databases holding a lot of information about customers and their activity. Then there were its websites, whose weblogs show which products people are taking an interest in. Additionally, customers’ transaction logs provide information on their spending habits. And finally, it could look outside the organization: at competitors’ pricing, at surveys of market trends, and at third-party statistics on accidents, which help estimate the probability of a claim being made in a particular area.
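To make the last point concrete, here is a minimal Python sketch of how third-party statistics could be turned into a claim probability per area. The area names, the figures and the function name claim_probability_by_area are all made up for illustration; a real insurer would use far more data and a more careful model.

```python
# Hypothetical third-party statistics: (area, policies sold, claims made).
# All values are invented for illustration only.
accident_stats = [
    ("north", 2000, 60),
    ("south", 1500, 90),
    ("east", 500, 10),
]


def claim_probability_by_area(stats):
    """Return the observed claim frequency (claims / policies) for each area."""
    return {
        area: claims / policies
        for area, policies, claims in stats
        if policies > 0  # skip areas with no policies to avoid dividing by zero
    }


claim_prob = claim_probability_by_area(accident_stats)
for area, probability in sorted(claim_prob.items()):
    print(f"{area}: {probability:.1%}")
```

The observed frequency is the simplest possible estimate. It only shows the idea that raw counts from outside the organization become a number the pricing decision can use.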

With data warehousing, an organization takes the most relevant data from each of these sources, stacks it together on one big, expensive server, and runs smartly written, complex algorithms to find the optimal price. This is the basic idea of the data warehouse, which is the current industry standard for decision support systems. Please note that data is the underlying foundation of decisions; data, when processed, finds its meaning as a decision support system.
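The idea of combining a few relevant numbers from each source and then searching for the best price can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: every figure, the names expected_profit and optimal_price, and the rule of never pricing above the competitor are hypothetical, and real warehouse algorithms are far more complex.

```python
# Toy "warehouse": one small, relevant extract from each source.
# Every number below is invented for illustration only.
survey_buy_rate = {40: 0.30, 50: 0.24, 60: 0.15, 70: 0.08}  # price -> share who would buy
claim_probability = 0.04     # assumed, from third-party accident statistics
average_claim_cost = 500.0   # assumed, from internal claim history
competitor_price = 65.0      # assumed, from competitor research
market_size = 10_000         # assumed number of potential customers


def expected_profit(price, buy_rate):
    """Expected profit at a price: customers times (price minus expected claim cost)."""
    customers = market_size * buy_rate
    expected_cost_per_policy = claim_probability * average_claim_cost
    return customers * (price - expected_cost_per_policy)


def optimal_price(buy_rates, price_cap):
    """Try every surveyed price up to the cap and return the most profitable one."""
    profits = {
        price: expected_profit(price, rate)
        for price, rate in buy_rates.items()
        if price <= price_cap  # assumption: never price above the competitor
    }
    best = max(profits, key=profits.get)
    return best, profits[best]


best_price, best_profit = optimal_price(survey_buy_rate, competitor_price)
print(f"Best price: {best_price}, expected profit: {best_profit:.0f}")
```

The sketch simply tries every candidate price and keeps the one with the highest expected profit. With only four candidate prices this is trivial, but it mirrors the principle described above: gather the most relevant data in one place, then compute the decision from it.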