5.3.1.VLOOKUP for data aggregation - sj50179/Google-Data-Analytics-Professional-Certificate GitHub Wiki

Aggregate data for analysis

Aggregation

  • Collecting or gathering many separate pieces into a whole

Data aggregation

  • The process of gathering data from multiple sources in order to combine it into a single summarized collection
  • Data aggregation helps data analyst identify trends, make comparisons and gain insights that wouldn't be possible if each of the data elements were analyzed on its own.

Data can also be aggregated over a given time period to provide statistics such as:

  • Averages
  • Minimums
  • Maximums

Subquery

  • A query within another query

Preparing for VLOOKUP

VLOOKUP (Vartical Lookup)

  • A function that searches for a certain value in a column to return a corresponding piece of information
  • =VLOOKUP(Name of the item, Cell range, Column index number, Approximate matches)

VALUE

  • A function that converts a text string that represents a number to a numerical value

Remove duplicates

  • A tool that automatically searches for and eliminates duplicate entries from a spreadsheet

Question

If you enter FALSE as the last input parameter in a VLOOKUP function, VLOOKUP will search for _____.

  • an exact match
  • the closest match
  • a text string
  • a numeric value

Correct. If you enter FALSE as the last input parameter in a VLOOKUP function, VLOOKUP will search for an exact match.

Identifying common VLOOKUP errors

Recognize the limitations of VLOOKUP and fix some of the most common problems

Troubleshooting questions

  • How should I prioritize these issues?
  • In a single sentence, what's the issue I'm facing?
  • What resources can help me solve the problem?
  • How can I stop this problem from happening in the future?

VLOOKUP only returns the first match it finds

Absolute reference

  • A reference that is locked so that rows and columns won't change when copied

MATCH

  • A function used to locate the position of a specific lookup value

VLOOKUP core concepts

Functions can be used to quickly find information and perform calculations using specific values. In this reading, you will learn about the importance of one such function, VLOOKUP, or Vertical Lookup, which searches for a certain value in a spreadsheet column and returns a corresponding piece of information from the row in which the searched value is found.

When do you need to use VLOOKUP?

Two common reasons to use VLOOKUP are:

  • Populating data in a spreadsheet
  • Merging data from one spreadsheet with data in another

VLOOKUP syntax

A VLOOKUP function is available in both Microsoft Excel and Google Sheets. You will be introduced to the general syntax in Google Sheets. (You can refer to the resources at the end of this reading for more information about VLOOKUP in Microsoft Excel.)

**VLOOKUP(10003, A2:B26, 2, FALSE)**

Here is the syntax.

**VLOOKUP(search_key, range, index, [is_sorted])**

search_key

  • The value to search for.
  • For example, 42, "Cats", or I24.

range

  • The range to consider for the search.
  • The first column in the range is searched to locate data matching the value specified by search_key.

index

  • The column index of the value to be returned, where the first column in range is numbered 1.
  • If index is not between 1 and the number of columns in range, #VALUE! is returned.

is_sorted

  • Indicates whether the column to be searched (the first column of the specified range) is sorted. TRUE by default.
  • It’s recommended to set is_sorted to FALSE. If set to FALSE, an exact match is returned. If there are multiple matching values, the content of the cell corresponding to the first value found is returned, and #N/A is returned if no such value is found.
  • If is_sorted is TRUE or omitted, the nearest match (less than or equal to the search key) is returned. If all values in the search column are greater than the search key, #N/A is returned.

What if you get #N/A?

As you have just read, #N/A indicates that a matching value can't be returned as a result of the VLOOKUP. The error doesn’t mean that anything is actually wrong with the data, but people might have questions if they see the error in a report. You can use the IFNA function to replace the #N/A error with something more descriptive, like “Does not exist.”

**IFNA(#N/A, "Does not exist?)**

Here is the syntax.

**IFNA(value, value_if_na)**

value

  • This is a required value.
  • The function checks if the cell value matches the value; such as #N/A.

value_if_na

  • This is a required value.
  • The function returns this value if the cell value matches the value in the first argument; it returns this value when the cell value is #N/A.

Helpful VLOOKUP reminders

  • TRUE means an approximate match, FALSE means an exact match on the search key. If the data used for the search key is sorted, TRUE can be used.
  • You want the column that matches the search key in a VLOOKUP formula to be on the left side of the data. VLOOKUP only looks at data to the right after a match is found. In other words, the index for VLOOKUP indicates columns to the right only. This may require you to move columns around before you use VLOOKUP.
  • After you have populated data with the VLOOKUP formula, you may copy and paste the data as values only to remove the formulas so you can manipulate the data again.

VLOOKUP resources for Microsoft Excel

VLOOKUP may slightly differ in Microsoft Excel, but the overall concepts can still be generally applied. Refer to the following resources if you are working with Excel.

Test your knowledge on VLOOKUP

TOTAL POINTS 5

Question 1

To change a text string in spreadsheet cell F8 to a numerical value, what is the correct function?

  • =MATCH(F8)
  • =NUM(F8)
  • =CONVERT(F8)
  • =VALUE(F8)

Correct. To change the text string in spreadsheet cell F8 to a numerical value, the correct syntax is =VALUE(F8). Within the parenthesis, the VALUE syntax must include a reference to the specific cell whose value the function should convert.

Question 2

What is the purpose of an absolute reference within a function, such as "$C$3"?

  • To remove unnecessary instructions from a formula or function
  • To represent missing values in a formula or function
  • To make formulas and functions unconditional
  • To lock rows and columns so they won't change when a function is copied

Correct. The purpose of an absolute reference is to lock the reference to a row or column so values won't change when a function is copied.

Question 3

In VLOOKUP, TRUE tells the function to search for exact matches, and FALSE tells the function to look for approximate matches.

  • True
  • False

Correct. In VLOOKUP*,* TRUE tells the function to search for approximate matches, and FALSE tells the function to look for exact matches.

Question 4

The following is a selection from a spreadsheet:

= A B C
1 Country Population in 2020 (millions) Growth in population 2000-2020
2 China 1,439,323,776 13.4%
3 India 1,380,004,385 37.1%
4 United States 331,002,651 17.3%
5 Indonesia 273,523,615 27.7%
6 Pakistan 220,892,340 44.9%
7 Brazil 212,559,417 21.9%
8 Nigeria 206,139,589 66.3%
9 Bangladesh 164,689,383 27.9%
10 Russia 145,934,462 -0.8%

To search for the population of Nigeria, what is the correct VLOOKUP syntax?

  • =VLOOKUP(Nigeria, A2:C10, 3, true)
  • =VLOOKUP(Nigeria, A2:C10, 3, false)
  • =VLOOKUP(Nigeria, A2,C10, 2, true)
  • =VLOOKUP("Nigeria", A2:C10, 2, false)

Correct. To search for the population of Nigeria, the syntax is =VLOOKUP("Nigeria", A2:C10, 2, false). “Nigeria” is the reference. A2:C10 is the table array. The 2 indicates the position of the column from which the value should be returned. And the word false instructs the function to return an exact match.

Question 5

The following is a selection from a spreadsheet:

= A B C D
1 Location Building Height Year completed
2 Dubai Burj Khalifa 2,717 feet 2010
3 Shanghai Shanghai Tower 2,073 feet 2015
4 Mecca Makkah Royal Clock Tower 1,972 feet 2012
5 Shenzhen Ping An Finance Center 1,965 feet 2017
6 St. Petersburg Lakhta Center 1,516 feet 2019
7 Chicago Willis Tower 1,451 feet 1974

To search for the height of the building in Mecca, what is the correct VLOOKUP syntax?

  • =VLOOKUP("Mecca", A2:D7, 3, false)
  • =VLOOKUP(Mecca, A2,D7, 3, true)
  • =VLOOKUP(Mecca, A2:D7, 2, false)
  • =VLOOKUP(Mecca, A2:D7, 2, true)

Correct. To search for the height of the building in Mecca, the correct syntax is =VLOOKUP("Mecca", A2:D7, 3, false). “Mecca” is the reference. A2:D7 is the table array. The 3 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.