4.3.4.Weekly challenge 3 - quanganh2001/Google-Data-Analytics-Professional-Certificate-Coursera GitHub Wiki
Glossary: Terms and definitions
We’ve covered a lot of terms—some of which you may have already known, and some of which are new. To make it easy to remember what a word means, we created this glossary of terms and definitions.
To use the glossary for this course item, click the link below and select “Use Template.”
Link to glossary: Week 3 Glossary
OR
If you don’t have a Google account, you can download the glossary directly from the attachment below.
Course 4 Week 3 Glossary _ DA terms and definitions
Weekly challenge 3
1st grade
Question 1
After a company merger, a data analyst receives a dataset with billions of rows of data. They need to leverage this data to identify insights for upper management. What tool would be most efficient for the analyst to use?
A. Word processor
B. Spreadsheet
C. SQL
D. CSV
The correct answer is C. SQL
Question 2
A team of data analysts is working on a large project that will take months to complete and contains a huge amount of data. They need to document their process and communicate with multiple databases. The team decides to use a SQL server as the main analysis tool for this project and SQL for the queries. What makes this the most efficient tool? Select all that apply.
- SQL records queries and changes throughout a project.
- SQL contains commands that build visualizations.
- SQL efficiently handles large amounts of data.
- SQL allows you to connect to multiple databases.
Question 3
A data analyst has added a massive table to their database on accident and needs to remove the table. What command can the analyst use to correct their mistake?
A. REMOVE TABLE IF EXISTS
B. INSERT INTO
C. DROP TABLE IF EXISTS
D. DROP TABLE IF NOT EXISTS
The correct answer is C. DROP TABLE IF EXISTS
Question 4
You are working with a database table named invoice that contains invoice data. The table includes a column for customer_id. You want to remove duplicate entries for customer_id and get a count of total customers in the database.
You write the SQL query below. Add a DISTINCT clause that will remove duplicate entries from the customer_id column.
NOTE: The three dots (...) indicate where to add the clause.
SELECT COUNT(DISTINCT customer_id)
FROM
invoice
Output:
+-----------------------------+
| COUNT(DISTINCT customer_id) |
+-----------------------------+
| 59 |
+-----------------------------+
What is the total number of customers in the database?
A. 59
B. 43
C. 84
D. 105
The correct answer is A. 59. Explain: The clause COUNT(DISTINCT(customer_id)) will remove duplicate entries from the customer_id column and return a correct count for the total customers.
Question 5
You are working with a database table named customer that contains customer data. The table includes columns about customer location such as city, state, country, and postal_code. The state names are abbreviated. You want to check for city names that are greater than 5 characters long.
You write the SQL query below. Add a LENGTH function that will return any city names that are greater than 5 characters long.
NOTE: The three dots (...) indicate where to add the clause.
SELECT
*
FROM
customer
WHERE LENGTH(postal_code) > 7;
Output:
+-------------+------------+-----------+--------------------------------------------------+---------------------------------+---------------------+-------+----------------+-------------+--------------------+--------------------+-------------------------------+----------------+
| customer_id | first_name | last_name | company | address | city | state | country | postal_code | phone | fax | email | support_rep_id |
+-------------+------------+-----------+--------------------------------------------------+---------------------------------+---------------------+-------+----------------+-------------+--------------------+--------------------+-------------------------------+----------------+
| 1 | Luís | Gonçalves | Embraer - Empresa Brasileira de Aeronáutica S.A. | Av. Brigadeiro Faria Lima, 2170 | São José dos Campos | SP | Brazil | 12227-000 | +55 (12) 3923-5555 | +55 (12) 3923-5566 | [email protected] | 3 |
| 10 | Eduardo | Martins | Woodstock Discos | Rua Dr. Falcão Filho, 155 | São Paulo | SP | Brazil | 01007-010 | +55 (11) 3033-5446 | +55 (11) 3033-4564 | [email protected] | 4 |
| 11 | Alexandre | Rocha | Banco do Brasil S.A. | Av. Paulista, 2022 | São Paulo | SP | Brazil | 01310-200 | +55 (11) 3055-3278 | +55 (11) 3055-8131 | [email protected] | 5 |
| 12 | Roberto | Almeida | Riotur | Praça Pio X, 119 | Rio de Janeiro | RJ | Brazil | 20040-020 | +55 (21) 2271-7000 | +55 (21) 2271-7070 | [email protected] | 3 |
| 13 | Fernanda | Ramos | None | Qe 7 Bloco G | Brasília | DF | Brazil | 71020-677 | +55 (61) 3363-5547 | +55 (61) 3363-7855 | [email protected] | 4 |
| 16 | Frank | Harris | Google Inc. | 1600 Amphitheatre Parkway | Mountain View | CA | USA | 94043-1351 | +1 (650) 253-0000 | +1 (650) 253-0000 | [email protected] | 4 |
| 17 | Jack | Smith | Microsoft Corporation | 1 Microsoft Way | Redmond | WA | USA | 98052-8300 | +1 (425) 882-8080 | +1 (425) 882-8081 | [email protected] | 5 |
| 18 | Michelle | Brooks | None | 627 Broadway | New York | NY | USA | 10012-2612 | +1 (212) 221-3546 | +1 (212) 221-4679 | [email protected] | 3 |
| 20 | Dan | Miller | None | 541 Del Medio Avenue | Mountain View | CA | USA | 94040-111 | +1 (650) 644-3358 | None | [email protected] | 4 |
| 53 | Phil | Hughes | None | 113 Lupus St | London | None | United Kingdom | SW1V 3EN | +44 020 7976 5722 | None | [email protected] | 3 |
+-------------+------------+-----------+--------------------------------------------------+---------------------------------+---------------------+-------+----------------+-------------+--------------------+--------------------+-------------------------------+----------------+
What is the the last name of the customer in row 1 of your query result?
NOTE: The query index starts at 1 not 0.
A. Tremblay
B. Gonçalves
C. Almeida
D. Philips
The correct answer is B. Gonçalves. Explain: The function LENGTH(city) > 5 will return any state names that are greater than 2 characters long. The complete query is SELECT * FROM customer WHERE LENGTH(city) > 5. The LENGTH function counts the number of characters a string contains. The country Ireland is in row 1 of your query result.
Question 6
A data analyst is cleaning transportation data for a ride-share company. The analyst converts the data on ride duration from text strings to floats. What does this scenario describe?
A. Processing
B. Visualizing
C. Typecasting
D. Calculating
The correct answer is C. Typecasting
Question 7
The CAST function can be used to convert the DATE datatype to the DATETIME datatype. True or False?
A. True
B. False
It is true statement
Question 8
Fill in the blank: The _____ function can be used to return non-null values in a list.
A. CONCAT
B. COALESCE
C. TRIM
D. CAST
The COALESCE function can be used to return non-null values in a list.
Question 9
You are working with a database table that contains invoice data. The table includes columns about billing location such as billing_city, billing_state, and billing_country. You use the SUBSTR function to retrieve the first 4 letters of each billing city name, and use the AS command to store the result in a new column called new_city.
You write the SQL query below. Add a statement to your SQL query that will retrieve the first 4 letters of each billing city name and store the result in a new column as new_city.
NOTE: The three dots (...) indicate where to add the statement.
NOTE: SUBSTR takes in three arguments being column, starting_index, ending_index
SELECT invoice_id, SUBSTR(billing_city,1,4) AS new_city FROM invoice ORDER BY billing_city;
Output:
+------------+----------+
| invoice_id | new_city |
+------------+----------+
| 32 | Amst |
| 161 | Amst |
| 184 | Amst |
| 206 | Amst |
| 258 | Amst |
| 379 | Amst |
| 390 | Amst |
| 23 | Bang |
| 45 | Bang |
| 97 | Bang |
| 218 | Bang |
| 229 | Bang |
| 284 | Bang |
| 7 | Berl |
| 29 | Berl |
| 30 | Berl |
| 40 | Berl |
| 52 | Berl |
| 95 | Berl |
| 104 | Berl |
| 224 | Berl |
| 225 | Berl |
| 236 | Berl |
| 247 | Berl |
| 269 | Berl |
+------------+----------+
(Output limit exceeded, 25 of 412 total rows shown)
What invoice ID number is in row 7 of your query result?
NOTE: The query index starts at 1 not 0.
A. 97
B. 390
C. 23
D. 206
The correct answer is B. 390. Explain: The statement SUBSTR(billing_city, 1, 4) AS new_city
will retrieve the first 4 letters of each city name and store the result in a new column as new_city. The complete query is SELECT invoice_id, SUBSTR(billing_city, 1, 4) AS new_city FROM invoice ORDER BY billing_city
. The SUBSTR function extracts a substring from a string. This function instructs the database to return 4 characters of each billing city, starting with the first character. The invoice ID number 390 appears in row 7 of your query result.
2nd grade
Question 1
A junior data analyst joins a new company. The analyst learns that SQL is heavily utilized within the organization. Why would the organization choose to invest in SQL? Select all that apply.
- SQL is a programming language that can also create web apps
- SQL can handle huge amounts of data.
- SQL is a powerful software program.
- SQL is a well-known standard in the professional community.
Question 2
A team of data analysts is working on a large project that will take months to complete and contains a huge amount of data. They need to document their process and communicate with multiple databases. The team decides to use a SQL server as the main analysis tool for this project and SQL for the queries. What makes this the most efficient tool? Select all that apply.
- SQL records queries and changes throughout a project.
- SQL contains commands that build visualizations.
- SQL efficiently handles large amounts of data.
- SQL allows you to connect to multiple databases.
Question 3
You’ve been working on a large project for your organization that has spanned many months. Throughout the project you have created multiple tables to save your progress and store data you may need later on. Because the project is ending soon, you decide to do some housekeeping and clean up the tables you will no longer need. What command will you use to accomplish this task?
A. DROP COLUMN IF EXISTS
B. DROP IF EXISTS TABLE
C. DROP TABLE IF EXISTS
D. DROP ROW IF EXISTS
The correct answer is C. DROP TABLE IF EXISTS
Question 4
You are working with a database table named invoice that contains invoice data. The table includes columns for customer_id and total. You want to remove duplicate customers and identify which unique customers have a total greater than 5.
You write the SQL query below. Add a DISTINCT clause that will remove duplicate entries from the customer_id column.
NOTE: The three dots (...) indicate where to add the clause.
SELECT DISTINCT(customer_id)
FROM
invoice
WHERE total > 5
Output:
+-------------+
| customer_id |
+-------------+
| 8 |
| 14 |
| 23 |
| 46 |
| 52 |
| 2 |
| 25 |
| 31 |
| 40 |
| 4 |
| 10 |
| 19 |
| 42 |
| 48 |
| 57 |
| 21 |
| 27 |
| 36 |
| 59 |
| 6 |
| 15 |
| 38 |
| 44 |
| 53 |
| 17 |
+-------------+
(Output limit exceeded, 25 of 59 total rows shown)
What customer_id number is located in row 5?
NOTE: The query index starts at 1 not 0.
A. 52
B. 57
C. 14
D. 4
The correct answer is A. 52. Explain: The clause DISTINCT customer_id will remove duplicate entries from the customer_id column. The complete query is SELECT DISTINCT customer_id FROM invoice WHERE total > 5.
Question 5
You are working with a database table that contains customer data. The table includes columns about customer location such as city, state, country, and postal_code. The state names are abbreviated. You want to check for state names that are greater than 2 characters long.
You write the SQL query below. Add a LENGTH function that will return any state names that are greater than 2 characters long.
SELECT
*
FROM
customer
WHERE LENGTH(state) > 2
NOTE: The three dots (...) indicate where to add the clause.
Output:
+-------------+------------+-----------+---------+-------------------+--------+--------+-----------+-------------+--------------------+------+----------------------+----------------+
| customer_id | first_name | last_name | company | address | city | state | country | postal_code | phone | fax | email | support_rep_id |
+-------------+------------+-----------+---------+-------------------+--------+--------+-----------+-------------+--------------------+------+----------------------+----------------+
| 46 | Hugh | O'Reilly | None | 3 Chatham Street | Dublin | Dublin | Ireland | None | +353 01 6792424 | None | [email protected] | 3 |
| 55 | Mark | Taylor | None | 421 Bourke Street | Sidney | NSW | Australia | 2010 | +61 (02) 9332 3633 | None | [email protected] | 4 |
+-------------+------------+-----------+---------+-------------------+--------+--------+-----------+-------------+--------------------+------+----------------------+----------------+
What country is in row 1 of your query result? (Hint: you will have to scroll to the right with your mouse or track pad to locate the indicated column.)
NOTE: The query index starts at 1 not 0.
A. Ireland
B. France
C. India
D. Chile
The correct answer is A. Ireland. Explain: The function LENGTH(state) > 2
will return any state names that are greater than 2 characters long. The complete query is SELECT * FROM customer WHERE LENGTH(state) > 2
. The LENGTH function counts the number of characters a string contains. The country Ireland appears in row 1 of your query result.
Question 6
In SQL databases, what data type refers to a number that does not contain a decimal?
A. Integer
B. String
C. Boolean
D. Float
The correct answer is A. Integer
Question 7
A data analyst notices their Boolean column is incorrectly storing True/False values as strings. What function can the analyst use to convert the data type from a string to Boolean?
A. LENGTH
B. TRIM
C. CAST
D. SUBSTR
The correct answer is C. CAST
Question 8
After joining multiple tables you find your data contains a significant amount of null values. What function can you use to return only the non-null values in a list ?
A. COALESCE
B. CONCAT
C. CAST
D. TRIM
The correct answer is A. COALESCE.
Question 9
You are working with a database table that contains invoice data. The table includes columns about billing location such as billing_city, billing_state, and billing_country. You use the SUBSTR function to retrieve the first 4 letters of each billing city name, and use the AS command to store the result in a new column called new_city.
You write the SQL query below. Add a statement to your SQL query that will retrieve the first 4 letters of each billing city name and store the result in a new column as new_city.
NOTE: The three dots (...) indicate where to add the statement.
NOTE: SUBSTR takes in three arguments being column, starting_index, ending_index
SELECT invoice_id, SUBSTR(billing_city,1,4) AS new_city FROM invoice ORDER BY billing_city;
Output:
+------------+----------+
| invoice_id | new_city |
+------------+----------+
| 32 | Amst |
| 161 | Amst |
| 184 | Amst |
| 206 | Amst |
| 258 | Amst |
| 379 | Amst |
| 390 | Amst |
| 23 | Bang |
| 45 | Bang |
| 97 | Bang |
| 218 | Bang |
| 229 | Bang |
| 284 | Bang |
| 7 | Berl |
| 29 | Berl |
| 30 | Berl |
| 40 | Berl |
| 52 | Berl |
| 95 | Berl |
| 104 | Berl |
| 224 | Berl |
| 225 | Berl |
| 236 | Berl |
| 247 | Berl |
| 269 | Berl |
+------------+----------+
(Output limit exceeded, 25 of 412 total rows shown)
What invoice ID number is in row 7 of your query result?
NOTE: The query index starts at 1 not 0.
A. 390
B. 206
C. 97
D. 23
The correct ID number is A. 390. Explain: The statement SUBSTR(billing_city, 1, 4) AS new_city
will retrieve the first 4 letters of each city name and store the result in a new column as new_city. The complete query is SELECT invoice_id, SUBSTR(billing_city, 1, 4) AS new_city FROM invoice ORDER BY billing_city
. The SUBSTR function extracts a substring from a string. This function instructs the database to return 4 characters of each billing city, starting with the first character. The invoice ID number 390 appears in row 7 of your query result.