5.4.1.Get started with data calculations - sj50179/Google-Data-Analytics-Professional-Certificate GitHub Wiki

Functions and conditions

Question

A data analyst is using the following formula: =COUNTIF(C2:C50, "=100"). Which part of the formula names the condition that the data must meet in order to be counted?

  • COUNTIF
  • C2
  • C50
  • =100

Correct. In the formula =COUNTIF(C2:C50, "=100"), "=100" names the condition that the data must meet in order to be counted. In this formula, cells C2 through C50 will be counted if their value equals 100.

SUMIF

  • A function that adds numeric data based on one condition

**=SUMIF(range, criteria/condition, [sum_range])**

Functions with multiple conditions

In this reading, you will learn more about conditional functions and how to construct functions with multiple conditions. Recall that conditional functions and formulas perform calculations according to specific conditions. Previously, you learned how to use functions like SUMIF and COUNTIF that have one condition. You can use the SUMIFS and COUNTIFS functions if you have two or more conditions. You will learn their basic syntax in Google Sheets, and check out an example.

Refer to the resources at the end of this reading for information about similar functions in Microsoft Excel.

SUMIF to SUMIFS

The basic syntax of a SUMIF function is: =SUMIF(range, criterion, [sum_range])

The range is where the function searches for the condition that you have set. The criterion is the condition you are applying, and the sum_range is the range of cells that will be added up. The square brackets mark sum_range as optional; if you leave it out, the cells in range itself are summed.

For example, you might have a table with a list of expenses, their cost, and the date they occurred.

You could use SUMIF to calculate the total price of fuel in this table, like this:

**=SUMIF(A1:A9, "Fuel", B1:B9)**
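To make the logic of the formula above concrete, here is a minimal Python sketch of what SUMIF computes. The expense rows are hypothetical stand-ins for the table described above, and `sumif` is an illustrative helper rather than a spreadsheet API. It uses exact equality, which is a simplification of how spreadsheets match criteria.

```python
# Hypothetical expense rows: (expense, cost, date). Illustrative data only.
expenses = [
    ("Fuel", 48.00, "12/14/2020"),
    ("Food", 12.34, "12/14/2020"),
    ("Fuel", 36.00, "12/15/2020"),
    ("Coffee", 4.50, "12/15/2020"),
]


def sumif(criteria_values, criterion, sum_values):
    """Add each sum_values entry whose paired criteria_values entry equals criterion."""
    return sum(s for c, s in zip(criteria_values, sum_values) if c == criterion)


names = [row[0] for row in expenses]  # plays the role of column A
costs = [row[1] for row in expenses]  # plays the role of column B

fuel_total = sumif(names, "Fuel", costs)
print(fuel_total)  # 84.0
```

The helper walks the two columns in parallel, which mirrors how the range and sum_range line up row by row.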

You could also build in multiple conditions by using the SUMIFS function. SUMIF and SUMIFS are very similar, but SUMIFS can include multiple conditions, and its sum_range comes first instead of last.

The basic syntax is: 

=SUMIFS(sum_range, criteria_range1, criterion1, [criteria_range2, criterion2, ...])

The square brackets let you know that the second criteria_range and criterion pair is optional. The ellipsis at the end of the statement lets you know that you can repeat these parameters as many times as needed. For example, if you wanted to calculate the sum of the fuel costs for one date in this table, you could create a SUMIFS statement with multiple conditions, like this:

=SUMIFS(B1:B9, A1:A9, "Fuel", C1:C9, "12/15/2020")

This formula gives you the total cost of every fuel expense from the date listed in the conditions. In this example, C1:C9 is our second criteria_range and the date 12/15/2020 is the second criterion. As long as you follow the basic syntax, you can add up to 127 conditions to a SUMIFS statement!
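The way SUMIFS combines conditions can be sketched in Python as follows. This is only an illustration under assumptions: the rows are hypothetical, `sumifs` is a made-up helper, and criteria are compared with plain equality. The point is that a row contributes to the total only when every condition is true for it.

```python
# Hypothetical expense rows: (expense, cost, date). Illustrative data only.
expenses = [
    ("Fuel", 48.00, "12/14/2020"),
    ("Fuel", 36.00, "12/15/2020"),
    ("Fuel", 20.00, "12/15/2020"),
    ("Coffee", 4.50, "12/15/2020"),
]


def sumifs(sum_values, *criteria):
    """criteria is a flat sequence: values1, criterion1, values2, criterion2, ..."""
    pairs = list(zip(criteria[0::2], criteria[1::2]))
    total = 0.0
    for i, value in enumerate(sum_values):
        # A row is included only if it satisfies every (values, criterion) pair.
        if all(values[i] == criterion for values, criterion in pairs):
            total += value
    return total


names = [row[0] for row in expenses]
costs = [row[1] for row in expenses]
dates = [row[2] for row in expenses]

fuel_on_date = sumifs(costs, names, "Fuel", dates, "12/15/2020")
print(fuel_on_date)  # 56.0
```

The argument order of the helper follows the SUMIFS syntax: the values to sum come first, followed by any number of range and criterion pairs.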

COUNTIF to COUNTIFS

Just like the SUMIFS function, COUNTIFS allows you to create a COUNTIF function with multiple conditions.

The basic syntax for COUNTIF is:

=COUNTIF(range, criterion)

Just like SUMIF, you set the range and then the condition that needs to be met. For example, if you wanted to count the number of times Food came up in the Expenses column, you could use a COUNTIF function like this:

=COUNTIF(A1:A9, "Food")

COUNTIFS has the same basic pattern as SUMIFS, but without a sum_range:

**=COUNTIFS(criteria_range1, criterion1, [criteria_range2, criterion2, ...])**

The criteria_range and criterion are in the same order, and you can add more conditions to the end of the function. So, if you wanted to find the number of times Coffee appeared in the Expenses column on 12/15/2020, you could use COUNTIFS to apply those conditions, like this:

=COUNTIFS(A1:A9, "Coffee", C1:C9, "12/15/2020")

This formula follows the basic syntax to create conditions for “Coffee” and the specific date. Now we can find every instance where both of these conditions are true.
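As a rough Python sketch of the counting logic, the hypothetical helper below counts rows where all conditions hold. The column values are invented for illustration, and matching is done with simple equality.

```python
# Hypothetical columns; the values are illustrative only.
names = ["Coffee", "Food", "Coffee", "Coffee", "Food"]
dates = ["12/14/2020", "12/15/2020", "12/15/2020", "12/15/2020", "12/16/2020"]


def countifs(*criteria):
    """criteria is a flat sequence: values1, criterion1, values2, criterion2, ..."""
    pairs = list(zip(criteria[0::2], criteria[1::2]))
    n_rows = len(pairs[0][0])
    # Count the rows for which every (values, criterion) pair matches.
    return sum(
        1
        for i in range(n_rows)
        if all(values[i] == criterion for values, criterion in pairs)
    )


food_count = countifs(names, "Food")  # one condition, like COUNTIF
coffee_on_date = countifs(names, "Coffee", dates, "12/15/2020")
print(food_count, coffee_on_date)  # 2 2
```

With a single pair the helper behaves like COUNTIF; each extra pair narrows the count further.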

For more information

SUMIFS and COUNTIFS are just two examples of functions with multiple conditions. They help demonstrate how multiple conditions can be built into the basic syntax of a function. But, there are other functions with multiple conditions that you can use in your data analysis. There are a lot of resources available online to help you get started with these other functions:

  • How to use the Excel IFS function: This resource includes an explanation and example of the IFS function in Excel. This is a great reference if you are interested in learning more about IFS. The example is a useful way to understand this function and how it can be used.
  • VLOOKUP in Excel with multiple criteria: Similar to the previous resource, this resource goes into more detail about how to use VLOOKUP with multiple criteria. Being able to apply VLOOKUP with multiple criteria will be a useful skill, so check out this resource for more guidance on how you can start using it on your own spreadsheet data.
  • INDEX and MATCH in Excel with multiple criteria: This resource explains how to use the INDEX and MATCH functions with multiple criteria. It also includes an example which helps demonstrate how these functions work with multiple criteria and actual data.
  • Using IF with AND, OR, and NOT functions in Excel: This resource combines IF with AND, OR, and NOT functions to create more complex functions. By combining these functions, you can perform your tasks more efficiently and cover more criteria at once.

Composite functions

SUMPRODUCT

  • A function that multiplies arrays and returns the sum of those products
  • =SUMPRODUCT(array1, [array2], ...)
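A short Python sketch of the same idea: multiply the arrays position by position, then add up the products. The `sumproduct` helper and the sample quantities and prices are assumptions made for illustration.

```python
from math import prod


def sumproduct(*arrays):
    """Multiply the arrays element by element and return the sum of the products."""
    if len({len(a) for a in arrays}) > 1:
        raise ValueError("all arrays must have the same length")
    return sum(prod(values) for values in zip(*arrays))


# Hypothetical data: units sold and the price per unit for three products.
quantities = [2, 5, 3]
unit_prices = [10, 4, 7]

total_revenue = sumproduct(quantities, unit_prices)
print(total_revenue)  # 2*10 + 5*4 + 3*7 = 61
```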

Test your knowledge on data calculations

TOTAL POINTS 3

Question 1

What is the correct spreadsheet formula for multiplying 50 and 233?

  • =50x233
  • 50*233
  • =50*233
  • 50x233

Correct. =50*233 is the correct formula for multiplying 50 and 233. Formulas begin with an equal sign (=). This is followed by the values to be computed. An asterisk (*) is the multiplication operator in spreadsheets.

Question 2

The following is a selection of a spreadsheet:

| | A | B |
|---|---|---|
| 1 | Expense | Amount |
| 2 | Rent | $680.00 |
| 3 | Healthcare | $101.00 |
| 4 | Groceries | $185.00 |
| 5 | Clothing | $41.00 |
| 6 | Transportation | $22.00 |
| 7 | Mobile phone | $48.00 |
| 8 | Dining out | $79.00 |
| 9 | Car insurance | $65.00 |
| 10 | Dog walker | $40.00 |
| 11 | Gym membership | $19.00 |
| 12 | Manicure | $23.00 |

You are trying to determine what percentage of your monthly income is spent on big-ticket items, such as rent and groceries. To add together only the values from Column B that cost more than $150, what is the correct syntax?

  • =SUMIF(B2:B12,>150)
  • =SUMIF(B2:B12,">150")
  • =SUMIF(B2:B12,<150)
  • =SUMIF(B2:B12,"<150")

Correct. To add together only the values from Column B that cost more than $150, the correct syntax is =SUMIF(B2:B12,">150"). B2:B12 is the range, and more than 150 (">150") is the criterion. The criterion is wrapped in quotation marks because it contains a comparison operator.
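To check the result of this formula by hand, the same calculation can be sketched in Python using the amounts from the spreadsheet selection in this question. The variable names are illustrative.

```python
# Amounts from cells B2:B12 of the spreadsheet selection in Question 2.
amounts = {
    "Rent": 680.00,
    "Healthcare": 101.00,
    "Groceries": 185.00,
    "Clothing": 41.00,
    "Transportation": 22.00,
    "Mobile phone": 48.00,
    "Dining out": 79.00,
    "Car insurance": 65.00,
    "Dog walker": 40.00,
    "Gym membership": 19.00,
    "Manicure": 23.00,
}

# Equivalent of the ">150" criterion: keep only amounts above 150.
big_ticket = {name: amount for name, amount in amounts.items() if amount > 150}
big_ticket_total = sum(big_ticket.values())

print(sorted(big_ticket))  # ['Groceries', 'Rent']
print(big_ticket_total)    # 865.0
```

Only Rent and Groceries pass the condition, so the formula returns their combined amount.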

Question 3

A data analyst is working with a spreadsheet from a cosmetics company.

You may click the link to create a copy of the dataset: Cosmetics Inc.

Which of the following is an example of an array in this spreadsheet?

  • Cells D7 and D14
  • All cells with number values
  • All cells with values greater than 100
  • The values in cells B2 through B31

Correct. The values in cells B2 through B31 together are an example of an array. An array is a collection of values in spreadsheet cells.

Pivot...pivot...pivot

Pivot tables

  • Let you view data in multiple ways to find insight and trends

Calculated field

  • A new field within a pivot table that carries out certain calculations based on the values of other fields

Question

A calculated field within a pivot table is used to carry out calculations based on what?

  • The values of other fields
  • The syntax of the available formulas
  • The function in the calculated field
  • The filtered values

Correct. A calculated field within a pivot table is used to carry out calculations based on the values of other fields. The calculated field is added as an additional row or column in a pivot table.

Elements of a pivot table

Pivot tables make it possible to view data in multiple ways in order to identify insights and trends. They can help you quickly make sense of larger data sets by comparing metrics, performing calculations, and generating reports. They’re also useful for answering specific questions about your data.

A pivot table has four basic parts: rows, columns, values, and filters.

The rows of a pivot table organize and group data you select horizontally. For example, the Release Date values can be used to create rows that group the data by year.

The columns organize and display values from your data vertically. Similar to rows, columns can be pulled directly from the data set or created using values.

Values are used to calculate and count data. This is where you input the variables you want to measure. This is also how you create calculated fields in your pivot table. As a refresher, a calculated field is a new field within a pivot table that carries out certain calculations based on the values of other fields.

Finally, the filters section of a pivot table enables you to apply filters based on specific criteria — just like filters in regular spreadsheets.

Being able to use all four parts of the pivot table editor will allow you to compare different metrics from your data and execute calculations, which will help you gain valuable insights.
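One way to picture how the four parts work together is the small Python sketch below. It is an analogy, not how any spreadsheet program implements pivot tables. The movie rows, the field choices, and the `pivot` helper are all hypothetical: release year is used for rows, genre for columns, the SUM of revenue for values, and an optional rating check for filters.

```python
from collections import defaultdict

# Hypothetical movie rows: (release_year, genre, rating, revenue_in_millions).
movies = [
    (2019, "Comedy", "PG", 120),
    (2019, "Drama", "R", 80),
    (2020, "Comedy", "PG", 60),
    (2020, "Comedy", "R", 40),
    (2020, "Drama", "PG", 90),
]


def pivot(rows, keep):
    """Rows = release year, columns = genre, values = SUM of revenue, filters = keep."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        if not keep(row):  # filters: skip rows that fail the criteria
            continue
        year, genre, _rating, revenue = row
        table[year][genre] += revenue  # values: summed per row/column cell
    return {year: dict(columns) for year, columns in table.items()}


all_movies = pivot(movies, keep=lambda row: True)
pg_only = pivot(movies, keep=lambda row: row[2] == "PG")

print(all_movies)  # {2019: {'Comedy': 120, 'Drama': 80}, 2020: {'Comedy': 100, 'Drama': 90}}
print(pg_only)     # {2019: {'Comedy': 120}, 2020: {'Comedy': 60, 'Drama': 90}}
```

Changing the filter changes which source rows are summarized, while the source data itself stays untouched.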

Using pivot tables for analysis

Pivot tables can be a useful tool for answering specific questions about a dataset so you can quickly share answers with stakeholders. For example, a data analyst working at a department store was asked to determine the total sales for each department and the number of products they each sold. The stakeholders were also interested in knowing exactly which department generated the most revenue.

Instead of making changes to the original spreadsheet data, they used a pivot table to answer these questions and easily compare the sales revenue and number of products sold by each department.

They used the department as the rows for this pivot table to group and organize the rest of the sales data. Then, they input two Values as columns: the SUM of sales and a count of the products sold. They also sorted the data by the SUM of sales column in order to determine which department generated the most revenue.
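The analysis described above can be approximated in Python as a group-and-summarize step. This is a sketch with hypothetical sales rows and made-up field names: it groups by department, computes the SUM of sales and a count of products, then sorts by the SUM of sales.

```python
from collections import defaultdict

# Hypothetical sales rows: (department, product, sales). Illustrative data only.
sales_rows = [
    ("Toys", "Puzzle", 120.0),
    ("Home", "Lamp", 300.0),
    ("Toys", "Robot", 80.0),
    ("Garden", "Hose", 50.0),
    ("Home", "Rug", 450.0),
]

# Rows of the pivot table: one entry per department.
summary = defaultdict(lambda: {"sum_of_sales": 0.0, "product_count": 0})
for department, _product, sales in sales_rows:
    summary[department]["sum_of_sales"] += sales  # first value: SUM of sales
    summary[department]["product_count"] += 1     # second value: count of products

# Sort by the SUM of sales, largest first, to find the top department.
ranked = sorted(summary.items(), key=lambda item: item[1]["sum_of_sales"], reverse=True)
top_department = ranked[0][0]

for department, values in ranked:
    print(department, values["sum_of_sales"], values["product_count"])
print(top_department)  # Home
```

As with the pivot table, the original `sales_rows` list is never modified; the summary is built alongside it.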

Pivot tables are an effective tool for data analysts working with spreadsheets because they highlight key insights from the spreadsheet data without having to make changes to the spreadsheet. Coming up, you will create your own pivot table to analyze data and identify trends that will be highly valuable to stakeholders.

Using pivot tables in analysis

Pivot tables are a spreadsheet tool that lets you view data in multiple ways to find insights and trends.

Pivot tables allow you to make sense of large data sets by giving you tools to easily compare metrics, quickly perform calculations, and generate readable reports. You can create a pivot table to help you answer specific questions about your data. For example, if you were analyzing sales data, you could use pivot tables to answer questions like, “Which month had the most sales?” and “What products generated the most revenue this year?” When you need answers to questions about your data, pivot tables can help you cut through the clutter and focus on only the data you need.

Create your pivot table

Before you can analyze data with pivot tables, you will need to create a pivot table with your data. The following includes the steps for creating a pivot table in Google Sheets, but most spreadsheet programs will have similar tools.

First, you will open the Data menu from the toolbar; there will be an option for Pivot table.

This pop-up menu will appear:

Generally, you will want to create a new sheet for your pivot table to keep your raw data and your analysis separate. You can also store all of your calculations in one place for easy reference. Once you have created your pivot table, there will be a pivot table editor that you can access to the right of your data.

This is where you will be able to customize your pivot table, including what variables you want to include for your analysis.

Using your pivot table for analysis

You can perform a wide range of analysis tasks with your pivot tables to quickly draw meaningful insights from your data, including performing calculations, sorting, and filtering your data. Below is a list of online resources that will help you learn about performing basic calculations in pivot tables as well as resources for learning about sorting and filtering data in your pivot tables.

Perform calculations

| Microsoft Excel | Google Sheets |
|---|---|
| Calculate values in a pivot table: Microsoft Support’s introduction to calculations in Excel pivot tables. This is a useful starting point if you are learning how to perform calculations with pivot tables specifically in Excel. | Create and use pivot tables: This guide is focused on using pivot tables in Google Sheets and it provides instructions for creating calculated fields. This is a quick how-to guide you can save and reference as a quick reminder on how to add calculated fields. |
| Pivot table calculated field example: This resource includes a detailed example of a pivot table being used for calculations. This step-by-step process demonstrates how calculated fields work, and provides you with some idea of how they can be used for analysis. | All about calculated field in pivot tables: This is a comprehensive guide to calculated fields for Google Sheets. If you are working with Sheets and are interested in learning more about pivot tables, this is a great resource. |
| Pivot table calculated fields: step-by-step tutorial: This tutorial for creating your own calculated fields in pivot tables is a really useful resource to save and bookmark for when you start to apply calculated fields to your own spreadsheets. | Pivot tables in Google Sheets: This beginner’s guide covers the basics of pivot tables and calculated fields in Google Sheets and uses examples and how-to videos to help demonstrate these concepts. |

Sort your data

| Microsoft Excel | Google Sheets |
|---|---|
| Sort data in a pivot table or PivotChart: This is a Microsoft Support how-to guide to sorting data in pivot tables. This is a useful reference if you are working with Excel and are interested in checking out how sorting will appear in Excel specifically. | Customize a pivot table: This guide from Google Support focuses on sorting pivot tables in Google Sheets. This is a useful, quick reference if you are working on sorting data in Sheets and need a step-by-step guide. |
| Pivot tables- Sorting data: This tutorial for sorting data in pivot tables includes an example with real data that demonstrates how sorting in Excel pivot tables works. This example is a great way to experience the entire process from start to finish. | How to sort pivot table columns: This detailed guide uses real data to demonstrate how the sorting process for Google Sheet pivot tables will work. This is a great resource if you need a slightly more detailed guide with screenshots of the actual Sheets environment. |
| How to sort a pivot table by value: This source uses an example to explain sorting by value in pivot tables. It includes a video, which is a useful guide if you need a demonstration of the process. | Pivot table ascending and descending order: This 1-minute beginner’s guide is a great way to brush up on sorting in pivot tables if you are interested in a quick refresher. |

Filter your data

| Microsoft Excel | Google Sheets |
|---|---|
| Filter data in a pivot table: This resource from the Microsoft Support page provides an explanation of filtering data in pivot tables in Excel. If you are working in Excel spreadsheets, this is a great resource to have bookmarked for quick reference. | Customize a pivot table: This is the Google Support page on filtering pivot table data. This is a useful resource if you are working with pivot tables in Google Sheets and need a quick resource to review the process. |
| How to filter Excel pivot table data: This how-to guide for filtering data in pivot tables demonstrates the filtering process in an Excel spreadsheet with data and includes tips and reminders for when you start using these tools on your own. | Filter multiple values in pivot table: This guide includes details about how to filter for multiple values in Google Sheet pivot tables. This resource expands some of the functionality that you have already learned and sets you up to create more complex filters in Google Sheets. |

Format your data

| Microsoft Excel | Google Sheets |
|---|---|
| Design the layout and format of a PivotTable: This Microsoft Support article describes how to change the format of the PivotTable by applying a predefined style, banded rows, and conditional formatting. | Create and edit pivot tables: This Help Center article provides information about how to edit a pivot table to change its style, and group data. |

Pivot tables are a powerful tool that you can use to quickly perform calculations and gain meaningful insights into your data directly from the spreadsheet file you are working in! By using pivot table tools to calculate, sort, and filter your data, you can immediately make high-level observations about your data that you can share with stakeholders in reports.

But, like most tools we have covered in this course, the best way to learn is to practice. This was just a small taste of what you can do with pivot tables, but the more you work with pivot tables, the more you will discover.

Test your knowledge on using pivot tables

TOTAL POINTS 3

Question 1

The following is a sample pivot table from a furniture company's spreadsheet:

| product | SUM of purchase_price |
|---|---|
| bed | $799.99 |
| bookcase | $58.89 |
| chair | $234.50 |
| chaise | $399.95 |
| couch | $9,000.00 |
| desk | $509.85 |
| fan | $111.92 |
| lamp | $160.97 |
| mirror | $199.95 |
| ottoman | $299.99 |
| rug | $808.65 |
| vase | $19.98 |
| Grand Total | 12604.635 |

What is the purpose of the pivot table in this spreadsheet?

  • To find the average price of each product
  • To summarize data about each product
  • To calculate the sum of individual prices for each product type
  • To organize all of the data into a smaller format

Correct. The purpose of the pivot table is to calculate the sum of individual purchase prices for each product type. It also shows a grand total for all of the product purchases.

Question 2

| product | SUM of purchase_price |
|---|---|
| bed | $799.99 |
| bookcase | $58.89 |
| chair | $234.50 |
| chaise | $399.95 |
| couch | $9,000.00 |
| desk | $509.85 |
| fan | $111.92 |
| lamp | $160.97 |
| mirror | $199.95 |
| ottoman | $299.99 |
| rug | $808.65 |
| vase | $19.98 |
| Grand Total | 12604.635 |

How could the pivot table be adjusted to show the same data, but only for products categorized as beige?

  • Add a new column labeled beige
  • Sort the current row by product color
  • Summarize the values by product
  • Add a filter to show only beige products

Correct. To show the same data, but only for products categorized as beige, add a filter to show only beige products.

Question 3

| product | SUM of purchase_price | Calculated Field 1 |
|---|---|---|
| bed | $799.99 | $0.00 |
| bookcase | $58.89 | $0.00 |
| chair | $234.50 | $0.00 |
| chaise | $399.95 | $0.00 |
| couch | $9,000.00 | $0.00 |
| desk | $509.85 | $0.00 |
| fan | $111.92 | $0.00 |
| lamp | $160.97 | $0.00 |
| mirror | $199.95 | $0.00 |
| ottoman | $299.99 | $0.00 |
| rug | $808.65 | $0.00 |
| vase | $19.98 | $0.00 |
| Grand Total | 12604.635 | $0.00 |

Which spreadsheet tool should you use if you want to find an average value using values generated within a pivot table?

  • Data validation
  • A filter
  • A calculated field
  • Conditional formatting

Correct. To find an average value using values generated within a pivot table, use a calculated field. A calculated field is a new field within a pivot table that carries out certain calculations based on the values of other fields.
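The idea of a calculated field can be sketched in Python as a new value derived from fields the pivot table already holds. The purchase rows below are hypothetical (the quiz does not show the underlying data), and the field names `sum`, `count`, and `average_price` are illustrative.

```python
# Hypothetical purchase rows: (product, purchase_price). Illustrative data only.
purchases = [
    ("chair", 100.0),
    ("chair", 134.5),
    ("lamp", 60.0),
    ("lamp", 40.0),
    ("lamp", 50.0),
]

# Existing pivot fields: SUM and COUNT of purchase_price per product.
pivot = {}
for product, price in purchases:
    entry = pivot.setdefault(product, {"sum": 0.0, "count": 0})
    entry["sum"] += price
    entry["count"] += 1

# Calculated field: a new field computed from the values of the other fields.
for entry in pivot.values():
    entry["average_price"] = entry["sum"] / entry["count"]

print(pivot["chair"]["average_price"])  # 117.25
print(pivot["lamp"]["average_price"])   # 50.0
```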