WHAT IS BIG DATA? It’s both a term and a new field of work. The term describes the large volume of data, both structured and unstructured, that is analysed for insights that lead to better decisions. As a field of work, it refers to ways of analysing, systematically extracting information from, or otherwise dealing with data sets that are often too large or complex for traditional data-processing application software (like spreadsheets).
Image Credit: Getty
WHEN DID THE CONCEPT OF BIG DATA EMERGE? The need to understand all available data has been around far longer than the term “big data” itself. In 1663, John Graunt dealt with “overwhelming amounts of information” while he studied the bubonic plague, which was then ravaging Europe.
Image Credit: Gulf News Archive
FIRST PROGRAMMABLE COMPUTER, 1936-1938: The Z1 was created by German Konrad Zuse in his parents' living room between 1936 and 1938. It is considered to be the first electromechanical binary programmable computer, and the first really functional modern computer.
Image Credit: Wikipedia
1943-1946: The ENIAC was designed by J. Presper Eckert and John Mauchly at the University of Pennsylvania. Construction began in 1943 and was not completed until 1946. It occupied about 1,800 square feet, used about 18,000 vacuum tubes and weighed about 30 tons. Although a judge later ruled that the Atanasoff-Berry Computer (ABC) was the first digital computer, many still consider the ENIAC to be the first because it was fully functional.
Image Credit: Wikipedia
THE FIRST DIGITAL COMPUTER, 1943: The Colossus was the first electronic programmable computer, developed by Tommy Flowers and first demonstrated in December 1943. During World War II, the British, desperate to crack Nazi codes, invented a machine that scanned for patterns in messages intercepted from the Germans. Colossus scanned 5,000 characters a second, reducing the workload from weeks to mere hours. Colossus was the first data processor. Two years later, in 1945, John von Neumann published a paper on the Electronic Discrete Variable Automatic Computer (EDVAC), the first “documented” discussion of program storage, which laid the foundation of today’s computer architecture.
Image Credit: Wikipedia
LIBRARIES: The US Library of Congress. Patrons walk through the Main Reading Room of the Library of Congress Jefferson Building during an open house that made the room accessible to visitors without reading cards. In 1944, Fremont Rider, Wesleyan University Librarian, published The Scholar and the Future of the Research Library, in which he estimated that American university libraries were doubling in size every 16 years. Given this growth rate, Rider speculated that the Yale Library in 2040 would have “approximately 200,000,000 volumes, which will occupy over 6,000 miles of shelves…a cataloging staff of over six thousand persons.” That obviously hasn’t happened.
Image Credit: Will Newton for The Washington Post/The Washington Post
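Rider's estimate of a doubling every 16 years can be turned into simple arithmetic. The sketch below is only an illustration: the function name `growth_factor` is made up for this example, and it assumes the doubling rate stays perfectly constant from 1944 to 2040, which is exactly the assumption that did not hold.

```python
def growth_factor(years: float, doubling_period: float) -> float:
    """Return how many times a quantity multiplies after `years`,
    assuming it doubles once every `doubling_period` years."""
    return 2 ** (years / doubling_period)


# From Rider's 1944 estimate to his 2040 projection: 96 years.
# 96 / 16 = 6 doublings, so collections would be 2**6 = 64 times larger.
factor = growth_factor(2040 - 1944, 16)
print(factor)  # 64.0
```

A 64-fold increase within a century shows why steady doubling quickly leads to implausible numbers.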
WHAT DOES “BIG DATA” MEAN TODAY? The term “big data” refers to data that is so large, fast or complex that it’s difficult or impossible to process using traditional methods. A massive amount of information can inundate a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters.
Image Credit: Wikipedia
“INFORMATION EXPLOSION AND INFORMATION OVERLOAD”: About 80 years ago, there emerged the first attempts to quantify the growth rate in the volume of data or what has popularly been known as the “information explosion”. The term was first used in 1941, according to the Oxford English Dictionary.
Image Credit: Getty Images/iStockphoto
BIG DATA IN THE INFORMATION AGE: In the 2000s, in the midst of the IT revolution, the concept of big data started gaining traction. In 2005, Roger Mougalas of O'Reilly Media, a book publishing and online learning platform, coined the term “big data”. A year earlier, in 2004, the company had popularised the term “Web 2.0”. “Big data” refers to a large set of data that is almost impossible to manage and process using traditional business intelligence tools.
Image Credit: Stock image
USE CASES FOR BIG DATA: Organizations and businesses use big data for numerous reasons, such as discovering patterns and trends in human behavior, in human interaction with technology, and in the consumption of products and information, including news. These findings can then inform decisions that affect how we live, work and play.
WHAT IS THE RELATIONSHIP BETWEEN BIG DATA AND “ANALYTICS”? Big data involves accessing and storing large amounts of information for the purpose of “analytics”.
Image Credit: AP
MAINSTREAM DEFINITION OF BIG DATA TODAY: Industry analyst Doug Laney articulated the now-mainstream definition of big data as the “3 V’s”. (1) VOLUME: Organizations collect data from a variety of sources, including business transactions, smart (IoT) devices, industrial equipment, videos, social media and more. In the past, storing it would have been a problem, but cheaper storage on platforms like data lakes and Hadoop has eased the burden. (2) VELOCITY: With the growth of the Internet of Things, data streams into businesses at an unprecedented speed and must be handled in a timely manner. RFID tags, sensors and smart meters are driving the need to deal with these torrents of data in near-real time. (3) VARIETY: Data comes in all types of formats, from structured, numeric data in traditional databases to unstructured text documents, emails, video, audio, stock ticker data and financial transactions.
Image Credit: Reuters
WHAT’S THE ROLE OF COMPUTERS? Big data is made possible by increasingly powerful, inexpensive computers, propelled by "Moore's Law" (the observation that the number of transistors on a microchip doubles about every two years, while the cost of computers is halved). Armed with computers able to process massive amounts of data, analysts can identify patterns and associations (correlations). Computers collect the data, “massage” and sift it, link it together, and interpret or filter vast amounts of information to get to what's most relevant to an organization or person, so that informed decisions can be made.
Image Credit: Gulf News Archive
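To get a feel for the pace implied by Moore's Law, the sketch below works out how long steady doubling takes to reach a given increase. It is an idealised illustration only: the function name `years_to_multiply` is hypothetical, and the code assumes a doubling period of exactly two years, whereas the observation itself says only "about" two years.

```python
import math


def years_to_multiply(factor: float, doubling_period_years: float = 2.0) -> float:
    """Years needed for a quantity to grow by `factor`,
    assuming it doubles every `doubling_period_years` years."""
    return math.log2(factor) * doubling_period_years


print(years_to_multiply(2))               # 2.0 (a single doubling)
print(round(years_to_multiply(1000), 1))  # 19.9 (about two decades for a 1,000-fold gain)
```

Under these assumptions, a thousand-fold increase in transistor count takes roughly 20 years, which helps explain how data processing that was once impractical became routine.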
AS AN INDIVIDUAL, HOW DO I GENERATE DATA? When you use a smartphone, chat with your family and friends on social media, do a Google search or watch YouTube videos, you generate data. When you use GPS, play online music, buy an airline ticket, board a train or exercise using a smartwatch, you generate data.
Image Credit: Courtesy sdcongress.com
DATA ANALYTICS: The ability to slice and dice “big data” makes it possible to perform predictive analytics, user behaviour analytics, and other advanced data analytics methods that extract value from data. The term seldom refers to a particular size of data set. Analysis of data sets can find new correlations to spot business trends, prevent diseases, combat crime and so on.
Image Credit: WEF
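Finding a correlation, as mentioned above, can be illustrated with a small example. This is a minimal sketch, not a description of any particular analytics product: it computes the standard Pearson correlation coefficient by hand, and the two data series (`ad_spend` and `sales`) are invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical figures, made up for this example only.
ad_spend = [10, 20, 30, 40, 50]
sales = [12, 25, 29, 41, 55]

r = pearson(ad_spend, sales)
print(round(r, 2))  # close to 1: the two series rise together
```

A value near 1 or -1 signals a strong linear association, while a value near 0 signals little or none. A correlation on its own does not show that one thing causes the other.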
WHAT ARE THE BIGGEST HURDLES FOR BIG DATA? Challenges include capturing data, storage, analysis, search, sharing, transfer, visualization, querying, updating, privacy and data source. Here's one practical application of data mining, especially in genetic research: translating big data into useful insights for research and innovation could significantly improve preventive medicine, early detection of diseases, and the treatment of common diseases, including various forms of cancer.
Image Credit: Courtesy: American University of Sharjah
WHAT’S ONE BENEFIT OF BIG DATA FOR ME? As you read this, billions of miles’ worth of video taken by car-mounted cameras is being collected and “massaged” to train AI-based systems that allow self-driving vehicles to learn how to drive better under different road conditions. Big data is also used to spot the onset of diseases or our predisposition to them.
WHO ARE THE TOP USERS OF BIG DATA? Scientists, business executives, medical practitioners, advertisers and governments alike regularly run into difficulties with large data sets in areas including Internet searches, fintech, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics, connectomics, complex physics simulations, biology and environmental research.
Image Credit: Pexels