5.4.2. Learn more SQL calculations - sj50179/Google-Data-Analytics-Professional-Certificate GitHub Wiki

Queries and calculations

Operator

  • A symbol that names the type of operation or calculation to be performed in a formula

The syntax of a query is its structure

SELECT
    columnA,
    columnB,
    columnA + columnB AS columnX
FROM
    table_name

Column A  | Column B  | Column X
----------|-----------|---------------
A1        | B1        | Sum of A1 + B1
A2        | B2        | Sum of A2 + B2
A3        | B3        | Sum of A3 + B3

SELECT
    columnA,
    columnB,
    columnC,
    (columnA + columnB) * columnC AS columnX
FROM
    table_name

Column A  | Column B  | Column C  | Column X
----------|-----------|-----------|---------------
A1        | B1        | C1        | (A1 + B1) * C1
A2        | B2        | C2        | (A2 + B2) * C2
A3        | B3        | C3        | (A3 + B3) * C3
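
As a minimal sketch of the two patterns above, the following Python snippet runs both queries against an in-memory SQLite database. The table and column names follow the notes; the sample values are made up for illustration and are not course data.

```python
import sqlite3

# In-memory database with hypothetical sample values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (columnA INTEGER, columnB INTEGER, columnC INTEGER)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
)

# Add two columns and expose the result under the alias columnX.
sums = conn.execute(
    "SELECT columnA, columnB, columnA + columnB AS columnX FROM table_name"
).fetchall()

# Parentheses make the addition happen before the multiplication.
products = conn.execute(
    "SELECT columnA, columnB, columnC, (columnA + columnB) * columnC AS columnX "
    "FROM table_name"
).fetchall()

print(sums)
print(products)
```

Without the parentheses, multiplication would be evaluated first, so columnA + columnB * columnC would give a different result.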

Modulo

  • An operator (%) that returns the remainder when one number is divided by another
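
A quick way to see the modulo operator at work is to run it on literal numbers. This sketch assumes SQLite (through Python's sqlite3 module), where % is the modulo operator; the numbers are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 17 divided by 5 is 3 with a remainder of 2; 20 divided by 5 leaves no remainder.
remainder, even_division = conn.execute("SELECT 17 % 5, 20 % 5").fetchone()

print(remainder, even_division)
```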

Question

To find the average of a set of values, a data analyst can type AVERAGE in both spreadsheets and SQL.

  • TRUE
  • FALSE

Correct. The AVERAGE function finds the average of a set of values in spreadsheets. When using SQL, the function is AVG.
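
The difference can be checked directly. The sketch below assumes SQLite, with a hypothetical scores table: AVG works, while AVERAGE is not a known function there and raises an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and values, used only to illustrate AVG.
conn.execute("CREATE TABLE scores (score REAL)")
conn.executemany("INSERT INTO scores VALUES (?)", [(80.0,), (90.0,), (100.0,)])

# AVG is the SQL aggregate for the mean.
average = conn.execute("SELECT AVG(score) FROM scores").fetchone()[0]

# The spreadsheet name AVERAGE is not defined in SQLite, so this fails.
try:
    conn.execute("SELECT AVERAGE(score) FROM scores")
    average_exists = True
except sqlite3.OperationalError:
    average_exists = False

print(average, average_exists)
```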

Question

When using SQL, which of the following are reasons for using underscores in column names? Select all that apply.

  • It verifies that the values in the columns are accurate
  • It keeps the column names readable
  • It helps avoid potential issues with servers and applications
  • It tells the server that the values in the columns are for calculations

Correct. Using underscores instead of spaces helps avoid potential issues with servers and applications. It also helps to keep the column names readable.
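
One concrete issue with spaces is that the name must be quoted every time it is used. The sketch below assumes SQLite and a hypothetical sales table with one column of each style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one name with a space (must be quoted), one with an underscore.
conn.execute('CREATE TABLE sales ("total cost" REAL, total_cost REAL)')
conn.execute("INSERT INTO sales VALUES (12.5, 12.5)")

# The underscore name works without any quoting.
underscore_value = conn.execute("SELECT total_cost FROM sales").fetchone()[0]

# The name with a space only works when quoted.
quoted_value = conn.execute('SELECT "total cost" FROM sales').fetchone()[0]

# Unquoted, SQLite reads `total` as a column and `cost` as its alias,
# and there is no column called `total`, so the query fails.
try:
    conn.execute("SELECT total cost FROM sales")
    unquoted_worked = True
except sqlite3.OperationalError:
    unquoted_worked = False

print(underscore_value, quoted_value, unquoted_worked)
```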

Question

What will appear in the Total_Small_Bags_Cost column after the following query is run?

SELECT
    Small_Bags,
    Small_Bags_Cost,
    Small_Bags * Small_Bags_Cost AS Total_Small_Bags_Cost
FROM 
    avocado_data.avocado_prices

  • The sum of the values in the Small_Bags and Small_Bags_Cost columns
  • The product of the values in the Small_Bags and Small_Bags_Cost columns
  • The sum of the values in the “Small_Bags_Cost” and “avocado_data.avocado_prices” columns
  • The product of the values in the “Small_Bags_Cost” and “avocado_data.avocado_prices” columns

Correct. The product of the values in the Small_Bags and Small_Bags_Cost columns will appear in the Total_Small_Bags_Cost column. The query will multiply the values in the two columns.

Calculations with other statements

Extract command

  • Lets us pull out one part of a given date (such as the year) to use in a query

Example:

SELECT
    EXTRACT(YEAR FROM STARTTIME) AS year,
    COUNT(*) AS number_of_rides
FROM
    `bigquery-public-data.new_york_citibike.citibike_trips`
GROUP BY    
    year
ORDER BY 
    year DESC
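
The query above uses BigQuery syntax. To try the same idea locally, the sketch below swaps in SQLite, which has no EXTRACT; strftime('%Y', ...) plays the same role there. The trips table and its three timestamps are hypothetical stand-ins for the public Citi Bike data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the citibike_trips table.
conn.execute("CREATE TABLE trips (starttime TEXT)")
conn.executemany(
    "INSERT INTO trips VALUES (?)",
    [("2016-03-01 08:15:00",), ("2016-07-19 17:40:00",), ("2017-01-05 09:00:00",)],
)

# strftime('%Y', ...) returns the year as text, so it is cast to an integer.
rides_per_year = conn.execute(
    """
    SELECT
        CAST(strftime('%Y', starttime) AS INTEGER) AS year,
        COUNT(*) AS number_of_rides
    FROM trips
    GROUP BY year
    ORDER BY year DESC
    """
).fetchall()

print(rides_per_year)
```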

Question

Which of the following options will be returned with this query?

SELECT
    Yes_Responses, 
    Total_Responses,
    Total_Responses - Yes_Responses AS No_Responses
FROM
    Survey_1

  • The total when “Yes” responses are sorted from “No” responses
  • The number of “Yes” and “No” responses combined
  • The difference when "Yes" responses are subtracted from the total number of responses
  • The result of dividing the total number of responses by “Yes” responses

Correct. This query will return the difference when "Yes" responses are subtracted from the total number of responses. The calculation will appear in a new column named No_Responses. To divide these values instead, you would use the division operator (/).
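
Here is a minimal sketch of that query, assuming SQLite and made-up survey counts. It also adds a hypothetical Yes_Share column to show the division operator; multiplying by 1.0 first avoids integer division in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical survey counts, not taken from the course.
conn.execute("CREATE TABLE Survey_1 (Yes_Responses INTEGER, Total_Responses INTEGER)")
conn.executemany("INSERT INTO Survey_1 VALUES (?, ?)", [(30, 40), (12, 48)])

rows = conn.execute(
    """
    SELECT
        Yes_Responses,
        Total_Responses,
        Total_Responses - Yes_Responses AS No_Responses,
        Yes_Responses * 1.0 / Total_Responses AS Yes_Share
    FROM Survey_1
    """
).fetchall()

print(rows)
```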

Question

What is the purpose of the EXTRACT command in a query?

  • To extract a part from a given date
  • To extract a single word from a string
  • To extract a single value from a column
  • To extract a column from a database

Correct. The purpose of the EXTRACT command in a query is to extract a part from a given date. The EXTRACT command can extract any part from a date/time value.

Test your knowledge on SQL calculations

Question 1

You are working with a database table that contains invoice data. The table includes columns for invoice_line_id (line items for each invoice), invoice_id, unit_price, and quantity (the number of purchases in each line item). Each invoice contains multiple line items. You want to know the total price for each of the first 5 line items in the table. You decide to multiply unit price by quantity to get the total price for each line item, and use the AS command to store the total in a new column called line_total.

Add a statement to your SQL query that calculates the total price for each line item and stores it in a new column as line_total.

SELECT
    invoice_line_id,
    invoice_id,
    unit_price,
    quantity,
    (unit_price * quantity) AS line_total
FROM
    invoice_item
LIMIT 5

+-----------------+------------+------------+----------+------------+
| invoice_line_id | invoice_id | unit_price | quantity | line_total |
+-----------------+------------+------------+----------+------------+
|               1 |          1 |       0.99 |        1 |       0.99 |
|               2 |          1 |       0.99 |        1 |       0.99 |
|               3 |          2 |       0.99 |        1 |       0.99 |
|               4 |          2 |       0.99 |        1 |       0.99 |
|               5 |          2 |       0.99 |        1 |       0.99 |
+-----------------+------------+------------+----------+------------+

What total appears in row 1 of your query result?

  • 7.92
  • 3.96
  • 0.99
  • 1.98

Correct. You add the statement unit_price * quantity AS line_total to calculate the total price for each line item and store it in a new column as line_total. The complete query is SELECT invoice_line_id, invoice_id, unit_price, quantity, unit_price * quantity AS line_total FROM invoice_item LIMIT 5. The AS command gives a temporary name to the new column. The total 0.99 appears in row 1 of your query result.

Question 2

In a SQL query, which calculation does the modulo (%) operator perform?

  • It applies an exponent to a value
  • It returns the remainder of a division calculation
  • It converts a decimal to a percent
  • It finds the square root of a number

Correct. The modulo operator returns the remainder of a division calculation when included in a SQL query.

Question 3

You are working with a dataset with the column name “firstquarterexpenses.” How can you rename this column to make it more readable?

  • first_quarter_expenses
  • first quarter expenses
  • Firstquarterexpenses
  • first+quarter+expenses

Correct. You can rename the column first_quarter_expenses. Using underscores between words helps avoid potential issues while keeping the names readable.