2.2.3. Connecting the data dots - quanganh2001/Google-Data-Analytics-Professional-Certificate-Coursera GitHub Wiki
Big and small data
As a data analyst, you will work with data both big and small. Both kinds of data are valuable, but they play very different roles.
Whether you work with big or small data, you can use it to help stakeholders improve business processes, answer questions, create new products, and much more. Big data, however, comes with its own challenges and benefits. The following table explores the differences between big and small data.
Small data | Big data |
---|---|
Describes a data set made up of specific metrics over a short, well-defined time period | Describes large, less-specific data sets that cover a long time period |
Usually organized and analyzed in spreadsheets | Usually kept in a database and queried |
Likely to be used by small and midsize businesses | Likely to be used by large organizations |
Simple to collect, store, manage, sort, and visually represent | Takes a lot of effort to collect, store, manage, sort, and visually represent |
Usually already a manageable size for analysis | Usually needs to be broken into smaller pieces in order to be organized and analyzed effectively for decision-making |
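The last row of the table, breaking big data into smaller pieces, can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the `chunked` and `total_by_chunks` helpers, the chunk size, and the generated sales figures are all hypothetical.

```python
from itertools import islice


def chunked(records, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def total_by_chunks(records, size):
    """Sum a large stream piece by piece instead of loading it all at once."""
    total = 0
    chunks_seen = 0
    for chunk in chunked(records, size):
        total += sum(chunk)
        chunks_seen += 1
    return total, chunks_seen


# Hypothetical data: one million made-up sales amounts, produced lazily
# so that only one chunk is held in memory at a time.
sales = (day % 100 for day in range(1_000_000))
total, chunks_seen = total_by_chunks(sales, size=10_000)
print(total, chunks_seen)
```

Each chunk is small enough to inspect or summarize on its own, which is the same idea an analyst applies when querying a database for one month or one region at a time.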
Challenges and benefits
Here are some challenges you might face when working with big data:
- Many organizations deal with data overload: far too much unimportant or irrelevant information.
- Important data can be buried under all of the unimportant data, which makes it harder to find and use. This can lead to slower, less efficient decision-making.
- The data you need isn’t always easily accessible.
- Current technology tools and solutions still struggle to provide measurable and reportable data. This can lead to unfair algorithmic bias.
- There are gaps in many big data business solutions.
Now for the good news! Here are some benefits that come with big data:
- Storing and analyzing large amounts of data can help companies identify more efficient ways of doing business and save a lot of time and money.
- Big data helps organizations spot trends in customer buying patterns and satisfaction levels, which can help them create new products and solutions that will make customers happy.
- By analyzing big data, businesses get a much better understanding of current market conditions, which can help them stay ahead of the competition.
- As in our earlier social media example, big data helps companies keep track of their online presence—especially feedback, both good and bad, from customers. This gives them the information they need to improve and protect their brand.
The three (or four) V words for big data
When thinking about the benefits and challenges of big data, it helps to think about the three Vs: volume, variety, and velocity. Volume describes the amount of data. Variety describes the different kinds of data. Velocity describes how fast the data can be processed. Some data analysts also consider a fourth V: veracity. Veracity refers to the quality and reliability of the data. These are all important considerations related to processing huge, complex data sets.
Volume | Variety | Velocity | Veracity |
---|---|---|---|
The amount of data | The different kinds of data | How fast the data can be processed | The quality and reliability of the data |
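As a rough illustration of how three of the Vs might be measured on a data set, here is a minimal Python sketch. The `profile` function, its record layout, and the sample rows are hypothetical, and treating veracity as the share of filled-in fields is only a crude stand-in for data quality. Velocity is left out because it depends on the system that processes the data.

```python
def profile(rows):
    """Return rough volume, variety, and veracity measures for a list of dicts."""
    values = [value for row in rows for value in row.values()]
    volume = len(rows)  # volume: how many records there are
    # variety: which kinds of values appear (missing values are ignored)
    variety = sorted({type(v).__name__ for v in values if v is not None})
    # veracity (crude proxy): share of fields that are actually filled in
    missing = sum(1 for v in values if v is None)
    veracity = 1 - missing / len(values) if values else 0.0
    return {"volume": volume, "variety": variety, "veracity": veracity}


# Hypothetical customer feedback records, invented for illustration.
rows = [
    {"customer": "A", "rating": 5, "comment": "Great"},
    {"customer": "B", "rating": None, "comment": "Late delivery"},
    {"customer": "C", "rating": 3.5, "comment": None},
    {"customer": "D", "rating": 4, "comment": "OK"},
]
summary = profile(rows)
print(summary)
```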
Test your knowledge on connecting the data dots
Question 1
Describe the key differences between small data and big data. Select all that apply.
- Small data is typically stored in a database. Big data is typically stored in a spreadsheet.
- Small data focuses on short, well-defined time periods. Big data focuses on change over a long period of time.
- Small data involves datasets concerned with a small number of specific metrics. Big data involves datasets that are larger and less specific.
- Small data is effective for analyzing day-to-day decisions. Big data is effective for analyzing more substantial decisions.
Explain: The last three statements are correct. Small data involves a small number of specific metrics over a shorter period of time. It’s effective for analyzing day-to-day decisions. Big data involves larger and less specific datasets and focuses on change over a long period of time. It’s effective for analyzing more substantial decisions.
Question 2
Which of the following is an example of small data?
A. The bed occupancy rate for a hospital for the past decade
B. The number of steps someone walks in a day
C. The trade deficit between two countries over a hundred years
D. The total absences of all high school students
The correct answer is B. The number of steps someone walks in a day. Explain: The number of steps someone walks in a day is a single, specific metric measured over a short, well-defined time period, which makes it an example of small data.
Question 3
The amount of exercise time it takes for a single person to burn a minimum of 400 calories is a problem that requires big data. True or False?
A. True
B. False
The correct answer is B. False. Explain: This problem can be solved using small data. It involves a specific metric (400 calories) for a single person over a short, defined period of time (the amount of exercise time).
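To see why Question 3 only needs small data, the calculation can be worked through with a handful of records. The sketch below is an illustration under stated assumptions: the workout log and its numbers are made up, and it assumes calories are burned at a steady average rate.

```python
# Hypothetical workout log for one person: (minutes exercised, calories burned).
sessions = [(30, 240), (45, 360), (20, 160)]

total_minutes = sum(minutes for minutes, _ in sessions)
total_calories = sum(calories for _, calories in sessions)

# Average burn rate across the logged sessions (assumed to stay constant).
calories_per_minute = total_calories / total_minutes

target_calories = 400
minutes_needed = target_calories / calories_per_minute
print(f"{minutes_needed:.0f} minutes")
```

Three rows that fit comfortably in a spreadsheet are enough to answer the question, which is what makes this a small data problem.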