5.1.7.Video quiz - quanganh2001/Google-Data-Analytics-Professional-Certificate-Coursera GitHub Wiki

The analysis process

During which analysis phase does an analyst sort and filter data?

A. Transform

B. Format and adjust

C. Get input from others

D. Organize

The correct answer is B. Format and adjust

Explain: An analyst sorts and filters data during the format and adjust analysis phase.

Always a need to organize

Fill in the blank: _____ involves arranging data into a meaningful order to make it easier to understand, analyze, and visualize.

A. Filtering

B. Sorting

C. Organizing

D. Prioritizing

The correct answer is B. Sorting. Explain: Sorting involves arranging data into a meaningful order to make it easier to understand, analyze, and visualize.

To narrow the scope of a query, a data analyst filters by a particular criteria. This type of filtering must be done one variable at a time. True or False?

A. True

B. False

It is false statement. Explain: Filtering can be done by single variable or multiple variables, depending on the query’s needs.

More on sorting and filtering

To follow along with the instructor, you should have already uploaded the movie data to BigQuery as a CSV file. If you haven't, follow the instructions in Optional: Upload the movie dataset to BigQuery .

You are working on an international project and need to invoice your customers for the work you complete. The database you use contains an invoices table. The invoices table contains the following columns: InvoiceId, CustomerId, InvoiceDate, BillingAddress, BillingCity, BillingState, BillingCountry, BillingPostalCode, Total.

Create a query to return all the columns from this table for only customers in Germany who have an invoice total greater than $5.

Hint (if needed) - Your query should have the following structure:

SELECT _____
FROM   _____
WHERE _____  AND _____

How many rows are returned from this query?

Explain: Twelve rows are returned when making the following query:

SELECT * FROM invoices WHERE BillingCountry='Germany' AND Total > 5;

Results:

+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+
| InvoiceId | CustomerId | InvoiceDate         | BillingAddress          | BillingCity | BillingState | BillingCountry | BillingPostalCode | Total |
+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+
|        12 |          2 | 2009-02-11 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart   |         None | Germany        | 70174             | 13.86 |
|        40 |         36 | 2009-06-15 00:00:00 | Tauentzienstraße 8      | Berlin      |         None | Germany        | 10789             | 13.86 |
|        52 |         38 | 2009-08-08 00:00:00 | Barbarossastraße 19     | Berlin      |         None | Germany        | 10779             |  5.94 |
|        67 |          2 | 2009-10-12 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart   |         None | Germany        | 70174             |  8.91 |
|        95 |         36 | 2010-02-13 00:00:00 | Tauentzienstraße 8      | Berlin      |         None | Germany        | 10789             |  8.91 |
|       138 |         37 | 2010-08-23 00:00:00 | Berger Straße 10        | Frankfurt   |         None | Germany        | 60316             | 13.86 |
|       193 |         37 | 2011-04-23 00:00:00 | Berger Straße 10        | Frankfurt   |         None | Germany        | 60316             | 14.91 |
|       236 |         38 | 2011-10-31 00:00:00 | Barbarossastraße 19     | Berlin      |         None | Germany        | 10779             | 13.86 |
|       241 |          2 | 2011-11-23 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart   |         None | Germany        | 70174             |  5.94 |
|       269 |         36 | 2012-03-26 00:00:00 | Tauentzienstraße 8      | Berlin      |         None | Germany        | 10789             |  5.94 |
|       291 |         38 | 2012-06-30 00:00:00 | Barbarossastraße 19     | Berlin      |         None | Germany        | 10779             |  8.91 |
|       367 |         37 | 2013-06-03 00:00:00 | Berger Straße 10        | Frankfurt   |         None | Germany        | 60316             |  5.94 |
+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+

The AND clause allows you to write a query with more than one condition. This means that this query will return a list of 12 customers to charge that are from Germany and have invoices totaling more than $5.

Sorting datasets

Would you like to follow along with the instructor using a similar spreadsheet?

If so, you may click the link below and select "Use Template."

Link to template: movie spreadsheet

Or, if you don't have a Google account, download the attached file below.

Movie Data Started Project

Heads up! The sorted results shown in the video aren't entirely accurate. Although the release dates appear to be sorted in ascending order, notice that many of the movie titles seem to be in somewhat alphabetical order, too. And that's too much of a coincidence. If you're a data analyst, you should suspect that something's not right with the data.

The movie titles in Column A could have been previously sorted alphabetically and the data wasn't restored to its original unsorted format. This can happen in a real work environment.

If you're following along with the video using the spreadsheet that was provided, the first several rows of the sorted data will appear correctly as shown below: rk1gg-uxQweNYIPrsXMHSQ_2e5f2f466884433ea03c74895fd95df1_movie_data_sorted_by_release_date

The key takeaway is the difference between sorting a sheet by a column and sorting a range (the values in the column).

In the last step, values in Column A (movie titles) were sorted in A to Z order but the corresponding release dates in Column B weren't sorted. This caused movies to be listed with incorrect release dates. For instance, before you sorted the values in Column A in A to Z order, The Devil Inside was the movie with the earliest release date of 2012-01-06. After you sorted the movie titles in Column A in A to Z order, 10 Cloverfield Lane became the movie with the release date of 2012-01-06. But this is the wrong release date for that movie!

To sort the movie titles in A to Z order and preserve the correct release dates, sort the sheet instead of the range.

If you have been following along with the video, do the following:

  • Click the Undo button to remove the previous sort on Column A.
  • Click any cell in Column A and click Data, select Sort Sheet, and then select Sort sheet by column A (A to Z).

Now you should see the movie 10 Cloverfield Lane at the top with the correct release date of 2016-03-08.

image

Question: When you use the menu function Sort Range, nothing in your spreadsheet gets rearranged except for the fields in the specified range. True or False?

A. True

B. False

It is true statement. Explain: The menu function Sort Range only sorts fields within the specified range. Data across rows are not kept together when sorted.

Which menu sort function is used to keep data together across rows?

A. Sort Sheet

B. Sort Column

C. Sort Row

D. Sort Range

The correct answer is A. Sort Sheet. Explain: Sort Sheet is the menu sort function that keeps data together across rows. It also sorts all the data in the sheet, based on the ranking of a specified column.

The SORT function

If you would like to follow along with the instructor in this video, you can use the same spreadsheet in Google Sheets.

To use the spreadsheet template, click the link below and select “Use Template.”

Link to template: Party Plan Spreadsheet

OR

If you don't have a Google account, you can download the attached file below.

Party Plan Spreadsheet

The menu option has slightly changed. Click Data, then Sort range, then Advanced range sorting options.

Question: Which of the following statements accurately describe differences between the sort from a spreadsheet's Data tab and a written SORT function? Select all that apply.

  • The sort from a spreadsheet's Data tab overwrites the cells containing the unsorted data with the sorted data, while a written SORT function inserts the sorted data in a different cell range.
  • The sort from a spreadsheet's Data tab can only sort data by a single sort condition or column, while a written SORT function can sort data by multiple columns.
  • The sort from a spreadsheet's Data tab can exclude a header row in the data range from being sorted, while the data range for a written SORT function should never contain a header row.
  • The sort from a spreadsheet's Data tab can sort in ascending or descending order, while a written SORT function automatically assumes ascending order.

Explain: The sort from a spreadsheet's Data tab can exclude a header row in the data range from being sorted, while the data range for a written SORT function should never contain a header row.

Sorting queries in SQL

You are working on a project about music and have a table of genres you need to sort. The Genres table contains the columns GenreId and Name.

Write a SQL query to return the name of each genre from this table in alphabetical order.

SELECT Name FROM Genres ORDER BY Name;

What are the first and last music genres returned, respectively? Separate your answers by a comma followed by a space.

Results:

+--------------------+
| Name               |
+--------------------+
| Alternative        |
| Alternative & Punk |
| Blues              |
| Bossa Nova         |
| Classical          |
| Comedy             |
| Drama              |
| Easy Listening     |
| Electronica/Dance  |
| Heavy Metal        |
| Hip Hop/Rap        |
| Jazz               |
| Latin              |
| Metal              |
| Opera              |
| Pop                |
| R&B/Soul           |
| Reggae             |
| Rock               |
| Rock And Roll      |
| Sci Fi & Fantasy   |
| Science Fiction    |
| Soundtrack         |
| TV Shows           |
| World              |
+--------------------+

Explain: Alternative and World are returned when making the following query:

SELECT Name FROM genres ORDER BY Name

Not specifying ascending or descending in your query will sort by ascending order by default. This is appropriate for sorting in alphabetical order.

The database you use contains a Tracks table. The table contains the following columns: TrackId, Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, and UnitPrice.

Write a SQL query to pull all columns from the Tracks table for only tracks with Chris Cornell as the composer. Sort the results in descending order by GenreId.

SELECT * FROM Tracks WHERE Composer = "Chris Cornell" ORDER BY GenreId DESC;

Results:

+---------+---------------------------+---------+-------------+---------+---------------+--------------+----------+-----------+
| TrackId | Name                      | AlbumId | MediaTypeId | GenreId | Composer      | Milliseconds |    Bytes | UnitPrice |
+---------+---------------------------+---------+-------------+---------+---------------+--------------+----------+-----------+
|    3388 | You Know My Name          |     270 |           2 |      23 | Chris Cornell |       240255 |  3940651 |      0.99 |
|    3387 | Disappearing Act          |     270 |           2 |      23 | Chris Cornell |       273320 |  4476203 |      0.99 |
|    3386 | Silence the Voices        |     270 |           2 |      23 | Chris Cornell |       267376 |  4379597 |      0.99 |
|    3385 | Finally Forever           |     270 |           2 |      23 | Chris Cornell |       217035 |  3565098 |      0.99 |
|    3384 | Your Soul Today           |     270 |           2 |      23 | Chris Cornell |       205959 |  3385722 |      0.99 |
|    3383 | Scar On the Sky           |     270 |           2 |      23 | Chris Cornell |       220193 |  3616618 |      0.99 |
|    3381 | Killing Birds             |     270 |           2 |      23 | Chris Cornell |       218498 |  3588776 |      0.99 |
|    3380 | Ghosts                    |     270 |           2 |      23 | Chris Cornell |       231547 |  3799745 |      0.99 |
|    3379 | She'll Never Be Your Man  |     270 |           2 |      23 | Chris Cornell |       204078 |  3355715 |      0.99 |
|    3378 | Safe and Sound            |     270 |           2 |      23 | Chris Cornell |       256764 |  4207769 |      0.99 |
|    3377 | Arms Around Your Love     |     270 |           2 |      23 | Chris Cornell |       214016 |  3516224 |      0.99 |
|    3376 | Poison Eye                |     270 |           2 |      23 | Chris Cornell |       237120 |  3890037 |      0.99 |
|    3375 | No Such Thing             |     270 |           2 |      23 | Chris Cornell |       224837 |  3691272 |      0.99 |
|    2522 | Bleed Together            |     203 |           1 |       1 | Chris Cornell |       232202 |  7597074 |      0.99 |
|    2520 | Blow Up The Outside World |     203 |           1 |       1 | Chris Cornell |       347898 | 11379527 |      0.99 |
|    2519 | Burden In My Hand         |     203 |           1 |       1 | Chris Cornell |       292153 |  9659911 |      0.99 |
|    2518 | Pretty Noose              |     203 |           1 |       1 | Chris Cornell |       253570 |  8317931 |      0.99 |
|    2517 | Fell On Black Days        |     203 |           1 |       1 | Chris Cornell |       282331 |  9256082 |      0.99 |
|    2515 | The Day I Tried To Live   |     203 |           1 |       1 | Chris Cornell |       321175 | 10507137 |      0.99 |
|    2514 | Spoonman                  |     203 |           1 |       1 | Chris Cornell |       248476 |  8289906 |      0.99 |
|    2513 | Rusty Cage                |     203 |           1 |       1 | Chris Cornell |       267728 |  8779485 |      0.99 |
|    2512 | Outshined                 |     203 |           1 |       1 | Chris Cornell |       312476 | 10274629 |      0.99 |
|    2508 | Loud Love                 |     203 |           1 |       1 | Chris Cornell |       297456 |  9660953 |      0.99 |
+---------+---------------------------+---------+-------------+---------+---------------+--------------+----------+-----------+

What is the name of the first track that the query returns?

The name of the first track that the query returns is You Know My Name.

Explain: You Know My Name is the first value returned in the Name column when making the following query:

SELECT * FROM Tracks WHERE Composer='Chris Cornell' ORDER BY GenreId DESC

When executed, this query returns a list of tracks that were composed by Chris Cornell. The list would be sorted by GenreId in descending order.