06 03 Visualizing a Categorical and a Quantitative Variable - HannaAA17/Data-Scientist-With-Python-datacamp GitHub Wiki

Count plots and bar plots

Categorical plots

  • Examples: count plots, bar plots
  • Involve a categorical variable
  • Comparisons between groups
  • Created with catplot(); the kind= argument selects the plot type
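
The notes mention count plots but show no call for one. A minimal sketch, assuming seaborn, pandas, and matplotlib are installed. The tips_small table is made-up illustration data, not the course dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import pandas as pd
import seaborn as sns

# Hypothetical data for illustration only
tips_small = pd.DataFrame({
    "day": ["Thur", "Fri", "Fri", "Sat", "Sat", "Sat"],
    "total_bill": [10.0, 12.5, 20.0, 15.0, 30.0, 22.0],
})

# A count plot needs only the categorical variable:
# each bar's height is the number of rows in that category
g = sns.catplot(x="day", data=tips_small, kind="count")

# The same counts computed directly, to check against the bars
counts = tips_small["day"].value_counts().to_dict()
```

Here counts holds 3 for Sat, 2 for Fri, and 1 for Thur, which is what the bars display.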

Bar plots

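  • Displays the mean of the quantitative variable for each category
  • Shows 95% confidence intervals for the mean by default
  • Turn the confidence intervals off with ci=None (errorbar=None in seaborn 0.12 and later)
    sns.catplot(x='day', y='total_bill', data=tips, kind='bar', ci=None)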
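
To make "mean per category" and "confidence interval" concrete, here is a rough sketch of the numbers a bar plot summarizes, using only the standard library. The data and the bootstrap_ci helper are hypothetical. seaborn also bootstraps its intervals by default, but this is an approximation of the idea, not seaborn's own code.

```python
import random
from statistics import mean

# Hypothetical bills per day (made-up numbers, not the tips dataset)
bills_by_day = {
    "Thur": [10.0, 14.0, 18.0, 22.0],
    "Fri": [12.0, 15.0, 21.0, 24.0, 28.0],
}

def bootstrap_ci(values, n_boot=1000, level=95, seed=0):
    """Percentile bootstrap interval for the mean (illustrative helper)."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample
    boot_means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    # Cut off the same number of resampled means from each tail
    cut = n_boot * (100 - level) // 200
    return boot_means[cut], boot_means[n_boot - 1 - cut]

# Bar height = mean of the quantitative variable within each category
bar_heights = {day: mean(bills) for day, bills in bills_by_day.items()}

# Error bar = interval around that mean
intervals = {day: bootstrap_ci(bills) for day, bills in bills_by_day.items()}
```

With these numbers the bar for Thur has height 16.0 and the bar for Fri has height 20.0. Each interval brackets its bar height.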

Creating a Box Plot

  • Show the distribution of quantitative data
  • See median, spread, skewness, and outliers
  • Facilitate comparisons between groups
    sns.catplot(x='time', y='total_bill', data=tips, kind='box')
  • Omit the outliers using sym="" (an empty string); showfliers=False has the same effect
  • Change the whiskers using whis=
    • By default, the whiskers extend to 1.5 times the interquartile range (IQR)
    • Make them extend to 2 times the IQR: whis=2.0
    • Show the 5th and 95th percentiles: whis=[5, 95]
    • Show min and max values: whis=[0, 100]
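
The whisker rule can be worked through by hand. The sketch below uses only the standard library. The whisker_ends helper and the bills list are hypothetical, and the quartiles use linear interpolation, which is assumed here to match what the plotting library computes.

```python
from statistics import quantiles

def whisker_ends(values, whis=1.5):
    """Return (lower whisker, upper whisker, outliers) for one box."""
    # Quartiles with linear interpolation between data points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Whiskers stop at the most extreme data points inside the fences
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    # Anything beyond the fences is drawn as an individual outlier point
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return min(inside), max(inside), outliers

# Hypothetical bills: Q1 = 12.75, Q3 = 16.5, IQR = 3.75
bills = [10, 12, 13, 14, 15, 16, 18, 23]

default_whiskers = whisker_ends(bills)         # fences at 7.125 and 22.125
wider_whiskers = whisker_ends(bills, whis=2)   # fences at 5.25 and 24.0
```

With the default of 1.5, the value 23 lies above the upper fence and is an outlier. With whis=2 the fence moves to 24.0, so the upper whisker reaches 23 and no outliers remain.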

Point plots

  • Points show mean of quantitative variable
  • Vertical lines show 95% confidence intervals
  • Categorical variable goes on the x-axis
    sns.catplot(x='age',
                y='masculinity_important',
                data=data,
                hue='feel_masculine',
                kind='point')
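  • To disconnect the points: join=False (linestyle='none' in seaborn 0.13 and later)
  • To display the median instead of the mean: estimator=median, after from numpy import median
    • The median is more robust to outliers
  • To add caps to the confidence intervals: capsize=0.2
  • To turn off the confidence intervals: ci=None (errorbar=None in seaborn 0.12 and later)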
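
The point plot options above can be combined in one call. A minimal sketch, assuming seaborn, pandas, numpy, and matplotlib are installed. The bills table is made-up illustration data. Only estimator= and capsize= are used, because the spelling of the other options depends on the seaborn version.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data; the 90.0 is a deliberate outlier in the Dinner group
bills = pd.DataFrame({
    "time": ["Lunch"] * 5 + ["Dinner"] * 5,
    "total_bill": [10.0, 12.0, 14.0, 16.0, 18.0,
                   20.0, 22.0, 24.0, 26.0, 90.0],
})

# Each point is the median per category; capsize adds caps to the intervals
g = sns.catplot(x="time", y="total_bill", data=bills, kind="point",
                estimator=np.median, capsize=0.2)

# Compare the two summaries directly
medians = bills.groupby("time")["total_bill"].median().to_dict()
means = bills.groupby("time")["total_bill"].mean().to_dict()
```

The outlier pulls the Dinner mean up to 36.4, while the Dinner median stays at 24.0. This is the sense in which the median is more robust.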