4.2.5.Video quiz - quanganh2001/Google-Data-Analytics-Professional-Certificate-Coursera GitHub Wiki

Recognize and remedy dirty data

Fill in the blank: Data _____ is a cleaning feature to check the accuracy and quality of data before adding or importing it.

A. validation

B. mapping

C. security

D. governance

The correct answer is A. validation. Explain: Data validation is a cleaning feature to check the accuracy and quality of data before adding or importing it.

Cleaning data from multiple sources

You can review the spreadsheet data along with the instructor as she makes important points about merging data from different organizations. To use the spreadsheet templates, click the links below and select "Use Template."

Links to templates: International Logistics Association Memberships - Data to Merge

Global Logistics Association - Data to Merge

OR

If you don't have a Google account, you can download the spreadsheets directly from the attachments below:

  1. International Logistics Association Memberships - Data to Merge
  2. Global Logistics Association - Data to Merge

Data-cleaning features in spreadsheets

Would you like to follow along with the instructor? To use the template for the spreadsheet, click the link below and select "Use Template."

Link to template: International Logistics Association Memberships - Data for Cleaning

OR

If you don't have a Google account, you can download the spreadsheet directly from the attachment below.

You can select more than one range, A:E, G:G, and I:L by clicking "Add Another Range" to produce the same result in the video.

Conditional formatting is a spreadsheet tool that changes how cells appear when values meet specific conditions. True or False?

A. True

B. False

Because conditional formatting is a spreadsheet tool that changes how cells appear when values meet specific conditions.

The instructor will right-click the bottom tab for the sheet and select Duplicate.

To continue following along with the instructor, click the link to the spreadsheet template below and select "Use Template."

Link to template: Cosmetics Inc. - Data for Cleaning

OR

If you don't have a Google account, you can download the spreadsheet directly from the attachment below.

What is the term for a character that indicates the beginning or end of a data item, such as a comma?

A. Condition

B. Marker

C. Substring

D. Delimiter

The correct answer is D. Delimiter. Explain: A delimiter is a character that indicates the beginning or end of a data item.

Optimize the data-cleaning process

Would you like to follow along with the instructor using the same spreadsheet? To use the spreadsheet template, click the link below and select "Use Template."

This is the same spreadsheet used in another video. You don't need to copy the template or download the spreadsheet again unless you want a fresh copy of the file.

Link to template: International Logistics Association Memberships - Data for Cleaning

OR

If you don't have a Google account, you can download the spreadsheet directly from the attachment below.

Would you like to follow along with the instructor using the same spreadsheet? To use the spreadsheet template, click the link below and select "Use Template."

Link to template: Cosmetics Inc. - Data for Optimizing

OR

If you don't have a Google account, you can download the spreadsheet directly from the attachment below.

A data analyst wants to determine the length of a text string by counting the number of characters it contains. They can use the MID function. True or False?

A. True

B. False

It is false statement. Explain: The MID function retrieves characters from the middle of the text you supply. LEN is a function data analysts use to determine the length of a text string by counting the number of characters it contains.

Different data perspectives

To follow along with the instructor using the same spreadsheet, click the link below and select "Use Template."

Link to template: Cosmetics, Inc. Data for Pivot Table and VLOOKUP

OR

If you don't have a Google account, you can download the spreadsheet in the attachment below.

The menu option has slightly changed. To insert a pivot table select Insert and Pivot Table.

When scrolling through the chart editor, it would be beneficial to hover over the data bar to get a better understanding of the data surrounding the outlier. After doing this, you will receive the data pop-up showing that the smallest data bar has a financial value at $0.73.

You may now use this information to help locate the outlier in the Price column, which can be identified as the Greenest Mascara product.

Would you like to follow along with the instructor using the same spreadsheets? To use the spreadsheet templates, click the links below and select "Use Template."

NOTE: The Logistics Association Merger sheet is the final merged result in this video. You can examine it to spot-check the merged data. You will not need to merge it yourself.

Link to templates: International Logistics Association Memberships

Global Logistics Association - Data for CONCATENATE

Logistics Association Merger

OR

If you don't have a Google account, you can download the spreadsheets from the attachments below:

  1. International Logistics Association Memberships
  2. Global Logistics Association - Data for CONCATENATE
  3. Logistics Association Merger

Pro Tip: Use CONCATENATE to help you format the data as it is merged.

Coming up, if you enter =CONCATENATE(D2, E2) as demonstrated by the instructor, the results will appear like this: 25 Dyas RdSte. 101

You could manually clean the data later to add a space between Rd and Ste., but why not let CONCATENATE do the work for you?

Because CONCATENATE merges strings, you can enter an additional string to insert a space between Rd and Ste.

Enter =CONCATENATE(D2, " ", E2) instead and you will have an address that is formatted like this: 25 Dyas Rd Ste. 101

Much better!

If you're following along with the video, you can get a feel for data validation by spot-checking the data in the Logistics Association Merger spreadsheet.

  • Was all the data merged?

Add the number of rows of data in the International Logistics Association spreadsheet to the number of rows of data in the Global Logistics Association spreadsheet. Then, check that the total number of rows matches the number of rows of data in the Logistics Association Merger spreadsheet.

  • Was membership data migrated correctly?

Randomly select a few members in the International Logistics Association spreadsheet and compare their data in the Logistics Association Merger spreadsheet. Do the same for a few members in the Global Logistics Association spreadsheet.

  • Are the data formats in the merged spreadsheet consistent?

Check the data formats for each column of the Logistics Association Merger spreadsheet.

Did you discover that the dates in the Membership valid through column are inconsistently formatted? Some dates have the format of 2/17/2021 and others have the format of Tuesday, November 9, 2021.

To fix this inconsistency, select the entire column (Column J), select Format from the main menu, select Number, and then choose the date format you would like all cells in the column to have.