The Three Vs – Volume, Velocity and Variety
Big Data has become a popular term for the exponential growth and availability of data, both structured and unstructured. It matters to a business because more data, when analyzed well, can yield more accurate analysis, and more accurate analysis supports better decision-making. Better decisions in turn mean greater operational efficiency, lower costs and reduced risk, all of which help the company's bottom line.
In 2001, industry analyst Doug Laney described what is now called Big Data in terms of the three Vs – Volume, Velocity and Variety.
- Volume: Big Data is large, and several factors contribute to the growing volume of data being generated. Transaction-based data that has been stored in relational databases for years makes up part of it. Unstructured data streaming in from social media also plays a role, and the amount of sensor and machine-to-machine data keeps increasing. In the past, storing this much data was a problem; over time, however, storage costs have decreased.
- Velocity: Big Data streams in at very high speeds, so it must be dealt with in a timely manner. RFID tags, sensors and smart meters generate large amounts of data within short periods. Reacting quickly enough to data arriving at this rate is one of the challenges companies face. The rate of data flow can also be highly inconsistent, with periodic peaks; this is especially true on social media when something trends. Daily, seasonal and event-triggered peak loads can be difficult to manage, especially when unstructured data is involved.
- Variety: Data today comes in many formats. Structured data resides in traditional databases and file systems. Unstructured data includes text documents, emails, video, audio and log files. Because Big Data comes from various sources, the challenge lies in managing, merging and governing these different varieties of data. The data has to be connected and correlated during the analysis phase in order to extract useful information from it.
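
To make the Variety point concrete, here is a minimal Python sketch of connecting a structured source to an unstructured one. Everything in it is an assumption made for illustration: the sample data, the `customer=` convention in the log lines, and the names `TRANSACTIONS_CSV`, `LOG_LINES` and `correlate` are hypothetical. Real pipelines work at a far larger scale and usually rely on dedicated tooling.

```python
import csv
import io
import re
from collections import defaultdict

# Hypothetical structured source: transaction rows as they might be
# exported from a relational database.
TRANSACTIONS_CSV = """customer_id,amount
c1,20.50
c2,5.00
c1,4.50
"""

# Hypothetical unstructured source: free-text log lines.
LOG_LINES = [
    "2024-01-01 12:00:01 customer=c1 opened support ticket",
    "2024-01-01 12:00:07 customer=c3 viewed pricing page",
    "2024-01-01 12:01:30 customer=c1 closed support ticket",
]

# Assumed convention: each log line mentions its customer as "customer=<id>".
CUSTOMER_PATTERN = re.compile(r"customer=(\w+)")


def correlate(transactions_csv, log_lines):
    """Combine both sources into one summary record per customer."""
    profiles = defaultdict(lambda: {"total_spent": 0.0, "log_events": 0})

    # Structured data: columns are known, so fields are read by name.
    for row in csv.DictReader(io.StringIO(transactions_csv)):
        profiles[row["customer_id"]]["total_spent"] += float(row["amount"])

    # Unstructured data: the shared key has to be extracted from the text.
    for line in log_lines:
        match = CUSTOMER_PATTERN.search(line)
        if match:
            profiles[match.group(1)]["log_events"] += 1

    return dict(profiles)


profiles = correlate(TRANSACTIONS_CSV, LOG_LINES)
for customer_id in sorted(profiles):
    print(customer_id, profiles[customer_id])
```

The shared customer identifier is what allows the two varieties of data to be correlated. In practice, finding or constructing such a common key across sources is often the hard part of the merging and governing work described above.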