Exhibit guide - ganong-noel/lab_manual GitHub Wiki

This document describes default settings. We deviate from these defaults in specific cases when a plot is unclear or ugly.

Capitalization (from gslab)

  • Titles of tables and figures are written in “title caps” (e.g., “Main Results” not “Main results”).
  • Axis labels, row and column headers in tables, etc. should only have the first word capitalized (e.g., “Log population” not “Log Population”).

Make plots in R

  • Legend positioning
    • Default is bottom-right c(1, 0) inside the plotting area
    • If the legend is within the plot and the plot is animated, set the justification to the top left c(0, 1) so the legend does not move as more items are added. An example of this is in the minimum working example for animation (discussed below)
    • Caveats:
      • Move legend so it doesn't block data. Instructions.
      • If there is no place within the plot region that doesn't block data, put the legend below the plot area using "bottom"
  • File format: plots are PNGs 8 inches wide by 4.5 inches tall
    • We prefer PNG to PDF since git can diff on PNGs, but not PDFs.
      • Caveat: sometimes if you switch to a new computer there will be subtle plot differences (example). When you take over someone else's code, you should first run it without making any changes. Ideally, this will lead to a commit with only these plot changes.
    • Use 300 dpi
  • Colors: use package RColorBrewer (nicely visualized at http://colorbrewer2.org)
    • For two colors or with an idea that we are moving around a single variable, we use Blues
    • For qualitative differences, we use Set2. If possible, add other info (e.g. shape aesthetic) so lines are distinguishable in black-and-white printing.
  • Scales: use scale_y_continuous(labels = scales::percent) for variables measured in percent and scale_y_continuous(labels = scales::dollar) for variables measured in dollars
  • Animations: save intermediate plots in a list and then use save_animation() to automatically save the graphs following the naming conventions. See MWE here, which includes tips on getting the legend not to move.
    • The argument show.legend can be added to a geom to ensure that it does not appear in the legend. This may be useful for adding text or shading without altering the legend from earlier plots.
    • The argument drop can be set to FALSE in scale_colour_manual() (and similar functions) to include an item on the scale for levels of a factor which are not currently on the plot. This is useful if for example different colors are being revealed at different times. Make sure that the variable being mapped to color (or similar) is explicitly a factor (not, for example, a logical) to get this to work.
    • Labels can be added to faceted plots using geom_text() rather than annotate(), which applies the label to all facets. The geom_text() call should take a separate data frame that includes your label, the location where you want it, and the facet it should appear in. More detail in this stack exchange post.
  • Histograms with two series
    • Use bars side-by-side to emphasize similarity (e.g. Figure 1 in our strategic default paper)
    • Solid vs hollow bars to emphasize difference (e.g. online appendix figure 19 in our UI paper)
  • Axis titles: we use subtitle to convey the y-axis label, not the vertical ylab. Be sure to specify theme(plot.title.position = "plot")
    • Note: an exception to this rule is for faceted plots. Here, use the vertical ylab
  • For social media
    • Extra wide: use 8 x 4 aspect ratio
    • Include a plot title with the lesson (e.g. "Two-thirds of unemployment recipients have benefits greater than earnings" not "Distribution of UI replacement rates")
    • Plots should be self-explanatory without reference to figure notes. This usually requires adding explicit annotations, like this.
    • For line plots, place labels for each series next to the series itself, like this
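
As a quick check on the file-format defaults above (8 by 4.5 inches at 300 dpi) and the extra-wide 8 x 4 social media variant, the sketch below converts inches to pixels. It is plain Python used only for the arithmetic; the helper name pixel_size is hypothetical and not part of any lab package.

```python
def pixel_size(width_in, height_in, dpi=300):
    """Convert a plot size in inches to pixel dimensions at a given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


default_px = pixel_size(8, 4.5)  # standard exhibit
social_px = pixel_size(8, 4)     # extra-wide plot for social media

print(default_px)  # (2400, 1350)
print(social_px)   # (2400, 1200)
```

If a committed PNG has other pixel dimensions, either the size or the dpi setting probably differs from the defaults.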

Make plots in Python

Refer to the plot guide for R above for style. We use the package plotnine to make plots in Python that look similar to ggplot2 plots in R. Note the following useful tips:

  • Colors can imitate ggplot using plotnine.scale_color_brewer(), but to get the same transparency as the default in RColorBrewer, set the alpha argument in geoms. Markers are the ones available in matplotlib (here). Linetypes likewise use matplotlib linestyles.

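    import plotnine as p9
    # `plot` is assumed to be an existing p9.ggplot object
    colors = p9.scale_color_brewer(type = 'qual', palette = 2)
    shapes = p9.scale_shape_manual(values = ['o', '^', 's', 'D'])  # matplotlib markers
    vert_line = p9.geom_vline(xintercept = 1.5, linetype = '--')   # matplotlib linestyle
    # Add the color scale too; the original snippet defined it but never used it
    plot = plot + p9.geom_line(alpha=0.5) + colors + shapes + vert_line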
    

Make tex tables

  • Tables should have a double horizontal line at the top and a single horizontal line at the bottom. The lines should be set using the package booktabs as follows:
    • Start table with \toprule \toprule. End table with \bottomrule.
    • Use \midrule (and, for shorter lines, \cmidrule) inside tables.
    • Do not use \hline or \cline for formatting tex tables. You can tell when tables are erroneously using \hline and \cline because these lines are not as bold as those set correctly with booktabs.
  • Tables should never extend into the margins of a paper. Strategies to reduce the width of tables include:
    • Reducing space between columns by modifying \setlength{\tabcolsep}{3pt}.
    • By default, we set the font size of the table content to \small. However, you may reduce this to \footnotesize.
  • Tables are permitted to be narrower than the width of a line of text. While you may increase the width of a table, the rule of thumb is that the font size of text within a table may never exceed the font size outside of a table.
  • Tables should have vertical space before the table note. We default to \vspace{0.2cm}.
  • Be sure to not wrap tables in a float. For example, using the stargazer package in R, set float = FALSE.
  • When reviewing all the tables in a paper, ensure that they are consistent when it comes to:
    • Space between the table and the table note
    • Whether the table note is text-aligned or table-aligned
    • Space between columns (unless necessary to reduce the width of a particularly large table)
    • Font size (unless necessary to reduce the width of a particularly large table)
  • Below is sample code that produces a simple table consistent with our guidelines:
\begin{table}[htbp]
\caption{Reasons for Unpaid Temporary Leave}
\label{tab: reason codes}
\centering
\setlength{\tabcolsep}{3pt}
\small
\begin{tabular}{llcc}
\toprule \toprule
Broad category & Reason                    & Share of spells & Share of spell-weeks \\
\midrule
Worker-driven  & Family and parental leave & 0.181           & 0.145                \\
Worker-driven  & Medical leave             & 0.155           & 0.120                \\
Worker-driven  & Education leave           & 0.131           & 0.155                \\
Worker-driven  & Total                     & 0.467           & 0.420                \\
\midrule
Firm-driven    & Work is slow              & 0.211           & 0.225                \\
Firm-driven    & Seasonal employment       & 0.125           & 0.145                \\
Firm-driven    & Contract on hold          & 0.050           & 0.065                \\
Firm-driven    & Total                     & 0.386           & 0.435                \\
\midrule
Other          & Miscellaneous             & 0.147           & 0.145                \\
\bottomrule
\end{tabular}
\begin{minipage}{\textwidth}
\vspace{0.2cm}
\footnotesize
Notes: This table shows the six most common reason codes for temporary unpaid leave within employment relationships in the PayrollCompany data. Within ``Miscellaneous'', COVID-related inactivity accounts for 8\% of spells and another 326 codes account for the remaining inactivity spells.
\end{minipage}
\end{table}
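
When a table is generated by code rather than typed by hand, the same conventions can be applied programmatically. The following is a minimal sketch in plain Python, not the lab's actual tooling: the function booktabs_table and its arguments are hypothetical names chosen for illustration. It emits the tabular and note without a float wrapper, so the output can be placed inside a table environment like the one in the sample.

```python
def booktabs_table(header, groups, colspec, note):
    """Return LaTeX for a table body that follows the conventions above.

    groups is a list of row groups; each group is a list of rows, and
    each row is a list of already-formatted cell strings.
    """
    def row(cells):
        return " & ".join(cells) + r" \\"

    lines = [
        r"\setlength{\tabcolsep}{3pt}",
        r"\small",
        r"\begin{tabular}{" + colspec + "}",
        r"\toprule \toprule",          # double line at the top
        row(header),
        r"\midrule",
    ]
    for i, group in enumerate(groups):
        if i > 0:
            lines.append(r"\midrule")  # single line between groups
        lines.extend(row(cells) for cells in group)
    lines += [
        r"\bottomrule",                # single line at the bottom
        r"\end{tabular}",
        r"\begin{minipage}{\textwidth}",
        r"\vspace{0.2cm}",             # space before the table note
        r"\footnotesize",
        "Notes: " + note,
        r"\end{minipage}",
    ]
    return "\n".join(lines)


table_tex = booktabs_table(
    header=["Broad category", "Reason", "Share of spells"],
    groups=[
        [["Worker-driven", "Medical leave", "0.155"]],
        [["Firm-driven", "Work is slow", "0.211"]],
    ],
    colspec="llc",
    note="Illustrative rows copied from the sample table.",
)
print(table_tex)
```

Keeping the rules in one helper makes it easier to hold spacing, font size, and note placement consistent across all the tables in a paper.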

Embed plots in a LyX document

  • Plot sizing
    • With two plots on a page, we set "scale graphics" to 80%
      • Deviation: Sometimes we will go down to 70% in order to fit two separate figures onto the same page.
    • With 4+ plots on a page, we set "scale graphics" to 50%
  • Plot titles are specified here, so there is no need to specify a title in R
    • It is fine to have titles in R on diagnostic plots or when we are iterating on a plot to convey a message to other team members.
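
To see what these scale settings mean on the page, the sketch below computes the printed size of a plot at each setting. It assumes the PNG was saved at the default 8 by 4.5 inches and that "scale graphics" scales relative to that natural size; both are assumptions, and the helper name scaled_size is hypothetical.

```python
def scaled_size(scale_pct, width_in=8.0, height_in=4.5):
    """Printed (width, height) in inches after scaling by scale_pct percent."""
    return (
        round(width_in * scale_pct / 100, 2),
        round(height_in * scale_pct / 100, 2),
    )


for pct in (80, 70, 50):
    w, h = scaled_size(pct)
    print(f"{pct}%: {w} x {h} in")
# 80%: 6.4 x 3.6 in
# 70%: 5.6 x 3.15 in
# 50%: 4.0 x 2.25 in
```

Under these assumptions, two stacked plots take 7.2 inches of height at 80% and 6.3 inches at 70%, which is why dropping to 70% can free enough room for captions and notes on the same page.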