7.3.5. Video quiz - quanganh2001/Google-Data-Analytics-Professional-Certificate-Coursera GitHub Wiki

R data frames

Question: Fill in the blank: A data frame is a collection of _____.

A. tibbles

B. data

C. columns

D. cells

The correct answer is C. columns. Explanation: A data frame is a collection of columns. It is similar to a table in a spreadsheet or in SQL.
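To make the "collection of columns" idea concrete, here is a minimal base R sketch. The pets data frame and its values are hypothetical and are not part of the course material.

```r
# Hypothetical example data; each named argument becomes one column.
pets <- data.frame(
  name = c("Rex", "Tom", "Kiwi"),
  species = c("dog", "cat", "bird"),
  age = c(4, 7, 2)
)

is.list(pets)   # a data frame is stored as a list of equal-length columns
length(pets)    # the length of that list is the number of columns
names(pets)     # the column names
pets$age        # a single column, extracted as a vector
```

Each column holds values of one type, much like a column in a spreadsheet table.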


Question: Which of the following are standards of tidy data? Select all that apply.

  • Observations are organized into rows
  • Variables are organized into columns
  • Each value has its own cell
  • Columns are named

Explanation: The first three options are correct. Variables are organized into columns, observations are organized into rows, and each value must have its own cell.
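As an illustration of these three standards, the sketch below rebuilds a small "wide" table in tidy form using only base R. The country, year, and value data are made up for the example.

```r
# Hypothetical "wide" table: the year variable is hidden in the column names.
wide <- data.frame(
  country = c("A", "B"),
  y2020 = c(10, 20),
  y2021 = c(11, 21)
)

# Tidy version: one row per (country, year) observation,
# one column per variable, and one value per cell.
tidy <- data.frame(
  country = rep(wide$country, times = 2),
  year = rep(c(2020, 2021), each = nrow(wide)),
  value = c(wide$y2020, wide$y2021)
)

print(tidy)
```

In the tidy version, year is a variable with its own column instead of being spread across two column names.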

Working with data frames

Correction: if you look closely at the bottom of the diamonds data set, you will see there are actually 53,940 entries (or observation rows) in total and not 100.

In order to get a much shorter and simpler overview of the data observations, we will use the head() function introduced next.

Question: Which R function should you use if you want to preview just the first six rows of a data frame?

A. str()

B. mutate()

C. colnames()

D. head()

The correct answer is D. head(). Explanation: The head() function provides a preview of the first six rows of a data frame. This is useful if you want to quickly check out the data, but don’t want to print the entire data frame.
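A short sketch of head() in use. The scores data frame is a hypothetical stand-in for a larger data set such as diamonds.

```r
# Hypothetical data frame with 20 rows.
scores <- data.frame(id = 1:20, score = seq(50, 88, by = 2))

# With no second argument, head() returns the first six rows.
preview <- head(scores)
print(preview)

# The n argument changes how many rows are returned.
first_three <- head(scores, n = 3)
print(first_three)
```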


Cleaning up with the basics

Question: Which of the following functions return a summary of the data frame, including the number of columns and rows? Select all that apply.

  • clean_names()
  • rename()
  • glimpse()
  • skim_without_charts()

Explanation: The skim_without_charts() and glimpse() functions both return a summary of the data frame, including the number of columns and rows.
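The sketch below shows how these summaries might be called. The flowers data frame is hypothetical. glimpse() is called from the dplyr package and skim_without_charts() from the skimr package, and each call is skipped if the package is not installed. The base R functions dim() and str() are included only as a comparison and are not among the quiz options.

```r
# Hypothetical data frame used only for illustration.
flowers <- data.frame(
  species = c("rose", "tulip", "lily", "iris"),
  petals = c(5, 6, 6, 3),
  scented = c(TRUE, FALSE, TRUE, FALSE)
)

# Base R comparison: dim() gives rows and columns, str() gives a compact overview.
dim(flowers)
str(flowers)

# glimpse() lists every column with its type and first values,
# along with the row and column counts.
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::glimpse(flowers)
}

# skim_without_charts() gives a more detailed per-column summary.
if (requireNamespace("skimr", quietly = TRUE)) {
  print(skimr::skim_without_charts(flowers))
}
```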

Question: The rename_with() function can be used to reformat column names to be upper or lower case.

A. True

B. False

The correct answer is A. True. Explanation: The rename_with() function can be used to reformat column names to be upper or lower case.
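A minimal sketch of rename_with() changing the case of column names. The trips data frame is hypothetical, and the base R branch is only an assumed fallback for machines where dplyr is not installed.

```r
# Hypothetical data frame with mixed-case column names.
trips <- data.frame(Trip_Id = 1:3, Start_Station = c("A", "B", "C"))

if (requireNamespace("dplyr", quietly = TRUE)) {
  # rename_with() applies a function (here toupper or tolower) to the column names.
  upper <- dplyr::rename_with(trips, toupper)
  lower <- dplyr::rename_with(trips, tolower)
} else {
  # Base R fallback, used only when dplyr is unavailable.
  upper <- setNames(trips, toupper(names(trips)))
  lower <- setNames(trips, tolower(names(trips)))
}

names(upper)
names(lower)
```

Only the column names change. The values in each column are left as they were.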


Organize your data

If you haven't already installed the palmerpenguins package in RStudio, refer to the palmerpenguins package installation instructions.


Transforming data

Note: If you are following along in your Posit (RStudio) project, the following code was pasted into the script editor:

Copy and paste the following code, which uses the c() (combine) function to define four vectors: id, first_name, last_name, and job_title:

id <- c(1:10)

first_name <- c("John", "Rob", "Rachel", "Christy", "Johnson", "Candace", "Carlson", "Pansy", "Darius", "Claudia")

last_name <- c("Mendes", "Stewart", "Abrahamson", "Hickman", "Harper", "Miller", "Landy", "Jordan", "Berry", "Garcia")

job_title <- c("Professional", "Programmer", "Management", "Clerical", "Developer", "Programmer", "Management", "Clerical", "Developer", "Programmer")

employee <- data.frame(id, first_name, last_name, job_title)

print(employee)

The employee variable is defined as a data frame with the following columns: id, first_name, last_name, and job_title.

Once this code is pasted into your editor, run the code, and continue with the video.

Same data, different outcome

Note: You will have a chance in later lessons to create more plots using the datasauRus and ggplot2 packages. If you are currently following along in RStudio, here is the code that will create the datasauRus plots shown in the video:

install.packages('datasauRus')

library('datasauRus')

ggplot(datasaurus_dozen,aes(x=x,y=y,colour=dataset))+geom_point()+theme_void()+theme(legend.position = "none")+facet_wrap(~dataset,ncol=3)

Run each line in turn. The ggplot() call assumes the ggplot2 package is already loaded.