08 01 Seaborn Introduction - HannaAA17/Data-Scientist-With-Python-datacamp GitHub Wiki

Introduction to Seaborn

Distplot

  • Similar to the histogram
  • By default, overlays a Gaussian Kernel Density Estimate (KDE) on the histogram
  • sns.distplot(df['col_name'])
    • Automatic label on x axis
    • Muted color palette
    • KDE plot
    • Narrow bins

Using the distribution plot

  • Disable the KDE and specify the number of bins to use to plot a simple histogram
    • sns.distplot(df['alcohol'], kde=False, bins=10)
  • An alternative view of the distribution: the rug plot
    • sns.distplot(df['alcohol'], hist=False, rug=True)
  • The distplot function uses several underlying functions, including kdeplot and rugplot
    • It is possible to further customize a plot by passing arguments to the underlying function
    • sns.distplot(df['alcohol'], hist=False, rug=True, kde_kws={'shade':True})

Regression Plot in Seaborn

  • The regplot function generates a scatter plot with a regression line.
  • The data, x, and y arguments must be specified.
  • sns.regplot(x='alcohol', y='pH', data=df)
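A runnable version of the call above might look like this. The DataFrame is synthetic: the column names `alcohol` and `pH` come from the notes, while the values and the weak linear relationship between them are invented purely so the regression line has something to fit.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data with a made-up linear trend plus noise.
rng = np.random.default_rng(2)
alcohol = rng.normal(10.5, 1.0, 150)
df = pd.DataFrame({
    "alcohol": alcohol,
    "pH": 3.0 + 0.03 * alcohol + rng.normal(0.0, 0.1, 150),
})

fig, ax = plt.subplots()
# Scatter plot of pH against alcohol with a fitted regression line
# and a shaded confidence band around it.
sns.regplot(x="alcohol", y="pH", data=df, ax=ax)
fig.savefig("alcohol_vs_ph.png")
```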

lmplot() builds on top of the base regplot()

  • lmplot() adds faceting through parameters such as hue and col.
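The difference between `hue` and `col` faceting can be sketched as below. The data is synthetic, and the `type` column with its `red` and `white` labels is a hypothetical grouping variable introduced only for this illustration; it does not appear in the notes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data; "type" is a hypothetical grouping column.
rng = np.random.default_rng(3)
n = 200
alcohol = rng.normal(10.5, 1.0, n)
df = pd.DataFrame({
    "alcohol": alcohol,
    "pH": 3.0 + 0.03 * alcohol + rng.normal(0.0, 0.1, n),
    "type": np.where(np.arange(n) % 2 == 0, "red", "white"),
})

# hue: one set of axes, with a separate color and regression line per group.
g_hue = sns.lmplot(x="alcohol", y="pH", hue="type", data=df)

# col: one panel per group, arranged in columns.
g_col = sns.lmplot(x="alcohol", y="pH", col="type", data=df)
g_col.savefig("alcohol_vs_ph_by_type.png")
```

Unlike `regplot`, which draws onto a single Axes, `lmplot` returns a `FacetGrid` that owns its figure, which is why it can lay out several panels.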