8.2.1. Visualization Tools - sj50179/IBM-Data-Science-Professional-Certificate GitHub Wiki

Basic Visualization Tools

LATEST SUBMISSION GRADE 100%

Question 1

Area plots are unstacked by default.

  • True.
  • False.

Correct.
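
As a minimal sketch of the point behind this question: pandas area plots default to `stacked=True`, so each column is drawn on top of the previous one. The dataframe `df` and its values are made up for illustration, and the `Agg` backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Default behaviour: stacked=True, so B is drawn on top of A.
ax_stacked = df.plot(kind="area")

# Pass stacked=False to overlay the areas instead of stacking them.
ax_unstacked = df.plot(kind="area", stacked=False)

# The stacked plot has to reach A + B (9 here); the unstacked one only max(B) (6 here).
stacked_top = ax_stacked.get_ylim()[1]
unstacked_top = ax_unstacked.get_ylim()[1]
```

Comparing the two y-axis limits is just a quick way to see the difference without looking at the figures.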

Question 2

The following code will create a histogram of a pandas series, series_data, and align the bin edges with the horizontal tick marks.

count, bin_edges = np.histogram(series_data)
series_data.plot(kind='hist', xticks = count, bin_edges)
  • True.
  • False.

Correct.
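
The snippet in the question cannot run as written: `bin_edges` is passed as a positional argument after a keyword argument, which is a syntax error in Python, and `xticks` is given the counts rather than the bin edges. A version that should do what the question describes might look like the sketch below. The contents of `series_data` are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Hypothetical series, for illustration only.
series_data = pd.Series([1, 2, 2, 3, 3, 3, 4, 4, 5, 9])

# np.histogram uses 10 equal-width bins by default and returns
# the count per bin plus the 11 bin edges.
count, bin_edges = np.histogram(series_data)

# Pass the bin edges (not the counts) as the x-axis tick positions.
ax = series_data.plot(kind="hist", xticks=bin_edges)
```

This works because the pandas histogram also defaults to 10 bins over the same range, so its bars line up with `bin_edges`.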

Question 3

The following code will create a horizontal bar chart of the data in a pandas dataframe, question.

question.plot(type='bar', rot=90)
  • True.
  • False.

Correct.
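
For comparison, pandas selects the chart type with the `kind` parameter, and `kind='bar'` draws vertical bars while `kind='barh'` draws horizontal ones. A minimal sketch, assuming a small made-up `question` dataframe (the column name, index labels, and values are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical survey-style data, for illustration only.
question = pd.DataFrame(
    {"respondents": [3, 5, 2]},
    index=["Python", "R", "SQL"],
)

# kind='barh' produces a horizontal bar chart.
ax = question.plot(kind="barh")

# For horizontal bars, each rectangle's width is the plotted value.
bar_lengths = [float(p.get_width()) for p in ax.patches]
```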

Specialized Visualization Tools

Box Plots

  • Minimum (Q0 or 0th percentile): the lowest data point, excluding any outliers
  • First quartile (Q1 or 25th percentile): also known as the lower quartile, q_n(0.25); the median of the lower half of the dataset
  • Median (Q2 or 50th percentile): the middle value of the dataset
  • Third quartile (Q3 or 75th percentile): also known as the upper quartile, q_n(0.75); the median of the upper half of the dataset
  • Maximum (Q4 or 100th percentile): the largest data point, excluding any outliers
  • Outliers: individual points plotted beyond the lower and upper extremes
  • Interquartile range (IQR): the distance between the upper and lower quartiles
    • IQR = Q_3 - Q_1 = q_n(0.75) - q_n(0.25)
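
The quantities listed above can be computed by hand, as in the rough sketch below. Two assumptions go beyond these notes: outliers are flagged with the conventional 1.5 × IQR rule (also Matplotlib's default whisker setting), and quartiles come from `np.percentile`, which interpolates linearly and so can differ slightly from the "median of each half" definition. The data values are made up.

```python
import numpy as np

# Hypothetical data with one obvious outlier, for illustration only.
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])

# Quartiles (linear interpolation, NumPy's default method).
q1, median, q3 = np.percentile(data, [25, 50, 75])

# Interquartile range: IQR = Q3 - Q1.
iqr = q3 - q1

# Assumed outlier rule: anything beyond 1.5 * IQR from the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = (data >= lower_fence) & (data <= upper_fence)
outliers = data[~inside]

# Box plot "minimum" and "maximum" exclude the outliers.
box_min = data[inside].min()
box_max = data[inside].max()
```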

Specialized Visualization Tools

LATEST SUBMISSION GRADE 100%

Question 1

What do the letters in the box plot above represent?

  • A = Median, B = Third Quartile, C = Mean, D = Inter Quartile Range, E = Lower Quartile, and F = Outliers
  • A = Mean, B = Upper Mean Quartile, C = Lower Mean Quartile, D = Inter Quartile Range, E = Minimum, and F = Outliers
  • A = Mean, B = Third Quartile, C = First Quartile, D = Inter Quartile Range, E = Minimum, and F = Maximum
  • A = Median, B = Third Quartile, C = First Quartile, D = Inter Quartile Range, E = Minimum, and F = Outliers
  • A = Mean, B = Third Quartile, C = First Quartile, D = Inter Quartile Range, E = Minimum, and F = Outliers

Correct.

Question 2

What is the correct combination of function and parameter to create a box plot in Matplotlib?

  • Function = box, and Parameter = type with value = "plot"
  • Function = boxplot, and Parameter = type with value = "plot"
  • Function = plot, and Parameter = type with value = "box"
  • Function = plot, and Parameter = kind with value = "boxplot"
  • Function = plot, and Parameter = kind with value = "box"

Correct.
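
As a minimal sketch of the `plot` function with `kind='box'`: the call below is the pandas `plot` method, which draws through Matplotlib. The dataframe `df` and its single column are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"scores": [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]})

# Function = plot, parameter kind = 'box'.
ax = df.plot(kind="box")

# The box, whiskers, caps, and median are drawn as line artists on the axes.
n_artists = len(ax.lines)
```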

Question 3

Which of the lines of code below will create the following scatter plot, given the pandas dataframe, df_total?

import matplotlib.pyplot as plt

df_total.plot(kind='scatter', x='year', y='total')

plt.title('Total Immigrant population to Canada from 1980 - 2013')
plt.xlabel('Year')
plt.ylabel('Number of Immigrants')

Correct.