Big Data is a phrase used to describe a volume of structured and unstructured data so large that it is difficult to process using traditional database and software techniques. In most enterprise scenarios, the data is too big, moves too fast, or exceeds current processing capacity.
Big Data has the potential to help companies improve operations and make faster, more intelligent decisions. When captured, formatted, manipulated, stored, and analyzed, this data can give a company useful insight to increase revenues, gain or retain customers, and improve operations.
Is Big Data a Volume or a Technology?
While the term may seem to refer to the volume of data, that isn’t always the case. The term Big Data, especially when used by vendors, may refer to the technology (including tools and processes) that an organization requires to handle and store large amounts of data. The term is believed to have originated with Web search companies that needed to query very large distributed aggregations of loosely structured data.
An example of Big Data might be petabytes (1,024 terabytes) or exabytes (1,024 petabytes) of data consisting of billions to trillions of records about millions of people, all from different sources (e.g., Web, sales, customer contact center, social media, mobile data, and so on). The data is typically loosely structured and is often incomplete and inaccessible.
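To make the scale concrete, the unit arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 1,024 multiplier comes from the definition given here, while the helper name records_that_fit and the 1 KB record size are assumptions chosen for the example, not figures from any survey or vendor.

```python
# Storage units built on the 1,024 multiplier used in the text.
KILOBYTE = 1024
TERABYTE = KILOBYTE ** 4  # bytes in one terabyte
PETABYTE = KILOBYTE ** 5  # 1,024 terabytes
EXABYTE = KILOBYTE ** 6   # 1,024 petabytes


def records_that_fit(total_bytes: int, bytes_per_record: int) -> int:
    """Return how many fixed-size records fit in total_bytes.

    Hypothetical helper for illustration; real records vary in size.
    """
    return total_bytes // bytes_per_record


if __name__ == "__main__":
    # Assumed record size of 1 KB, purely for illustration.
    record_size = KILOBYTE
    print("Terabytes per petabyte:", PETABYTE // TERABYTE)
    print("Petabytes per exabyte:", EXABYTE // PETABYTE)
    print("1 KB records per petabyte:", records_that_fit(PETABYTE, record_size))
```

Under that assumed record size, a single petabyte holds roughly a trillion records, which is consistent with the "billions to trillions of records" range described above.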
Types of Business Datasets
When dealing with larger datasets, organizations have difficulty creating, manipulating, and managing Big Data. Big Data is a particular problem in business analytics because standard tools and procedures are not designed to search and analyze massive datasets.
As research from Webopedia parent company QuinStreet demonstrates, Big Data initiatives are poised for explosive growth. QuinStreet surveyed 540 enterprise decision-makers involved in Big Data and found that the datasets of interest to many businesses today include traditional structured databases of inventories, orders, and customer information, as well as unstructured data from the Web, social networking sites, and intelligent devices.