3.1.1. Collecting Data - sj50179/Google-Data-Analytics-Professional-Certificate GitHub Wiki
First-party data
- Data collected by an individual or group using their own resources
Second-party data
- Data collected by a group directly from its audience and then sold
Third-party data
- Data collected from outside sources who did not collect it directly
Data collection considerations
- How the data will be collected
- Choose data sources
- Decide what data to use
- How much data to collect
- Select the right data type
- Determine the time frame
Question 1
In instances when collecting data from an entire population is challenging, data analysts may choose to use what?
- A selection
- A sample
- A specimen
- A segment
Correct. In instances when collecting data from an entire population is challenging, data analysts may choose to use a sample. A sample is a part of a population that is representative of that population.
Question 2
The data-collection process involves deciding what data to use, determining how much data to collect, and selecting the right data type. Which of the following are also steps in the data-collection process? Select all that apply.
- Creating data visualizations
- Determining the time frame
- Analyzing the data to answer business questions
- Choosing data sources
Correct. Determining the time frame and choosing data sources are steps in the data collection process.
Selecting the right data
Following are some data-collection considerations to keep in mind for your analysis:
How the data will be collected
Decide whether you will collect the data using your own resources or receive it (and possibly purchase it) from another party. Data that you collect yourself is called first-party data.
Data sources
If you don’t collect the data using your own resources, you might get data from second-party or third-party data providers. Second-party data is collected directly by another group and then sold. Third-party data is sold by a provider that didn’t collect the data themselves. Third-party data might come from a number of different sources.
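The distinction between first-, second-, and third-party data can be sketched as a small classification rule. This is only an illustration: `DataSource`, `classify_source`, and the example names (`my_team`, `hospital`, `data_broker`) are hypothetical and do not come from the course material.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    collector: str  # who originally gathered the data
    provider: str   # who is handing (or selling) the data to you


def classify_source(source: DataSource, me: str) -> str:
    """Label a source as first-, second-, or third-party data."""
    if source.collector == me:
        # You collected it with your own resources.
        return "first-party"
    if source.provider == source.collector:
        # Another group collected it directly and passed it on.
        return "second-party"
    # The provider did not collect the data itself.
    return "third-party"


print(classify_source(DataSource("my_team", "my_team"), me="my_team"))
print(classify_source(DataSource("hospital", "hospital"), me="cdc"))
print(classify_source(DataSource("many_sites", "data_broker"), me="my_team"))
```

In practice the line between second- and third-party data depends on knowing who actually collected the data, which a provider may or may not disclose.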
Solving your business problem
Datasets can show a lot of interesting information. But be sure to choose data that can actually help solve your business problem. For example, if you are analyzing trends over time, make sure you use time series data, that is, data that includes dates.
How much data to collect
If you are collecting your own data, make reasonable decisions about sample size. A random sample from existing data might be fine for some projects. Other projects might need more strategic data collection to focus on certain criteria. Each project has its own needs.
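As a minimal sketch of drawing a random sample from existing data, the snippet below uses Python's standard `random` module. The helper name `draw_random_sample`, the stand-in population of 1,000 record IDs, and the sample size of 50 are assumptions chosen for illustration, not recommendations from the course.

```python
import random


def draw_random_sample(records, sample_size, seed=None):
    """Return sample_size records chosen at random, without replacement."""
    if sample_size > len(records):
        raise ValueError("sample_size cannot exceed the number of records")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(records, sample_size)


# Hypothetical population: 1,000 record IDs standing in for existing data.
population = list(range(1, 1001))
sample = draw_random_sample(population, sample_size=50, seed=42)
print(len(sample))
```

A project that needs to focus on certain criteria could filter `population` first and then sample from the filtered list.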
Time frame
If you are collecting your own data, decide how long you will need to collect it, especially if you are tracking trends over a long period of time. If you need an immediate answer, you might not have time to collect new data. In this case, you would need to use historical data that already exists.
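When historical data already exists, applying a time frame can be as simple as filtering records by date. The sketch below assumes each record is a dictionary with a `date` field; the helper `within_time_frame` and the sample sales figures are made up for illustration.

```python
from datetime import date


def within_time_frame(records, start, end):
    """Keep records whose 'date' falls between start and end, inclusive."""
    return [r for r in records if start <= r["date"] <= end]


# Hypothetical historical data that already exists.
historical = [
    {"date": date(2021, 1, 15), "sales": 120},
    {"date": date(2021, 6, 30), "sales": 150},
    {"date": date(2022, 2, 1), "sales": 170},
]

recent = within_time_frame(historical, date(2021, 6, 1), date(2022, 12, 31))
print(len(recent))
```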
Use the flowchart below if data collection relies heavily on how much time you have:
[Flowchart image not included]
Test your knowledge on collecting data
Question 1
Which method of data collection is most often used by scientists?
- Surveys
- Questionnaires
- Interviews
- Observations
Correct. Observation is the method of data-collection most often used by scientists.
Question 2
Organizations such as the U.S. Centers for Disease Control and Prevention (CDC) often use data collected from hospitals. What kind of data is collected by hospitals, then sold to the CDC for its own analysis?
- First-party data
- Multiple-party data
- Third-party data
- Second-party data
Correct. Data gathered by hospitals and then sold to the CDC is an example of second-party data.
Question 3
Fill in the blank: In data analytics, a _____ refers to all possible data values in a certain dataset.
- source
- sample
- population
- representation
Correct. In data analytics, a population refers to all possible data values in a certain dataset.