“Data” is defined as ‘the quantities, characters, or symbols on which operations are performed by a computer, which may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media’, as a quick Google search will show.
The concept of Big Data is not complex. As the name suggests, “Big Data” refers to volumes of data too large to be stored, managed, processed, or analysed efficiently with traditional tools. Because the amount of data grows exponentially (more than 500 terabytes are uploaded to Facebook alone in a single day), analysing it is a real problem.
However, there is also huge potential in the analysis of Big Data. Proper management and study of this data can help companies make better decisions based on usage statistics and user interests, thereby helping them grow. Some companies have even come up with new products and services based on insights gained from Big Data analysis.
Classification is essential for the study of any subject. Big Data is broadly classified into three main types:
1. Structured data
Structured data refers to data that is already stored in databases in an ordered manner. It accounts for about 20% of all existing data and is the type most used in programming and computer-related activities.
There are two sources of structured data: machines and humans. All data received from sensors, web logs, and financial systems is classified as machine-generated. This includes data from medical devices, GPS data, usage statistics captured by servers and applications, and the huge volume of data that moves through trading platforms, to name a few.
Human-generated structured data mainly includes everything a person enters into a computer, such as a name and other personal details. When a person clicks a link on the internet, or even makes a move in a game, data is created. Companies can use this data to understand customer behaviour and make appropriate decisions and modifications.
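To make the idea of ordered, row-and-column data concrete, here is a minimal Python sketch that stores a few click records in a database table and counts clicks per user. The table name, column names, and sample records are hypothetical and chosen only for illustration; a real system would have its own schema.

```python
import sqlite3


def clicks_per_user(rows):
    """Store (user_name, link) click records in a table and count clicks per user."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    # Every row must fit this fixed schema; that is what makes the data structured.
    conn.execute("CREATE TABLE clicks (user_name TEXT NOT NULL, link TEXT NOT NULL)")
    conn.executemany("INSERT INTO clicks VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT user_name, COUNT(*) FROM clicks "
        "GROUP BY user_name ORDER BY user_name"
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: each tuple is one click event.
sample_clicks = [("asha", "/home"), ("ben", "/pricing"), ("asha", "/pricing")]
print(clicks_per_user(sample_clicks))
```

Because every record has the same fields, a single query is enough to summarise the data, which is why structured data is the easiest type to analyse with traditional tools.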
2. Unstructured data
While structured data resides in traditional row-column databases, unstructured data is the opposite: it has no clear storage format. It makes up the rest of the data created, about 80% of the total. Most of the data a person encounters belongs to this category, and until recently there was not much to do with it except store it or analyse it manually.
Like structured data, unstructured data is classified by source as machine-generated or human-generated. Machine-generated unstructured data includes satellite images, scientific data from various experiments, and radar data captured by various technologies.
Human-generated unstructured data is found in abundance across the internet, since it includes social media data, mobile data, and website content. This means that the pictures we upload to our Facebook or Instagram accounts, the videos we watch on YouTube, and even the text messages we send all contribute to the gigantic heap that is unstructured data.
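As a rough illustration of why unstructured data is harder to work with, the Python sketch below counts the most frequent words in a few free-text messages. The messages and the function name are made up for this example, and simple word counting is only one basic way to approach text; it is not a full analysis method.

```python
import re
from collections import Counter


def top_words(messages, n=3):
    """Return the n most frequent words across a list of free-text messages."""
    words = []
    for message in messages:
        # There are no fields or columns to rely on, so the text
        # has to be split into words with a pattern first.
        words.extend(re.findall(r"[a-z']+", message.lower()))
    return Counter(words).most_common(n)


# Hypothetical sample messages with no fixed format.
sample_messages = [
    "Loved the new phone camera",
    "The camera is great",
    "camera battery could be better",
]
print(top_words(sample_messages, n=2))
```

Unlike a database table, the text gives no hint about which words matter, so the structure has to be imposed by the analysis itself.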
3. Semi-structured data
The line between unstructured and semi-structured data has always been unclear, since most semi-structured data appears unstructured at a glance. Semi-structured data is information that is not in the traditional database format of structured data but contains some organizational properties that make it easier to process. For example, NoSQL documents are considered semi-structured, since they contain keywords (such as field names) that can be used to process the document easily.
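The role of these organizational properties can be sketched in Python with two JSON documents, a format commonly used by document-style NoSQL stores. The documents, field names, and helper function below are hypothetical; the point is only that the records do not share a fixed schema, yet their keys still make them easy to process.

```python
import json

# Hypothetical documents: they share some keys but not a fixed schema.
raw_documents = [
    '{"name": "Asha", "email": "asha@example.com"}',
    '{"name": "Ben", "phones": ["555-0100", "555-0101"], "city": "Pune"}',
]


def extract_field(documents, key):
    """Return the value stored under `key` in each document, or None if it is absent."""
    return [json.loads(doc).get(key) for doc in documents]


print(extract_field(raw_documents, "name"))
print(extract_field(raw_documents, "city"))
```

A field that is missing from a document simply yields None here, whereas a structured table would require every row to have the same columns.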
Big Data analysis has definite business value, as it can help a company achieve cost reductions and dramatic growth. So do not wait too long to exploit the potential of this excellent business opportunity.