Paper Production Guide - ganong-noel/lab_manual GitHub Wiki
Note: this guide is adapted from the gslab guide here.
Whenever we get ready to circulate or submit a draft of a paper, the following tasks need to be done. These will usually be allocated among the RAs with one person in charge of supervising a given task and pulling together the results in a single deliverable. Each task has a default time estimate and number of RAs assigned to it. If it looks like the time it will take to complete a task is much longer than the time estimate, PN/PG should be consulted.
- One or more RPs will be assigned to supervise production. The supervisor is in charge of implementing the process, which includes the following tasks:
- Create a main `paper_production` branch.
- Create github issues for each task and add to the project board. Note that a task should be a unit of work and a person. For example, when we split FACT to be done across two RAs, we should split the issue ticket in two.
- Keep track of each task and its completion status. It may be useful to have a "main" paper production issue with a table that tracks each task, issue number, task name, task lead and assignees, status, and priority.
- Each production task should branch off the `paper_production` branch using standard gnlab procedure. Naming follows the `issueXXX_taskname` convention.
- After a task is completed, the issue should be moved for PI review, and then the task branch `issueXXX_taskname` should be merged back into the `paper_production` branch. PG/PN might do this merge themselves during review.
- If you get a merge conflict and choose to resolve it in a text editor, it's really important to be careful with what text editor you use. Some editors delete trailing white spaces which will wreak havoc on the formatting in lyx. We suggest using Visual Studio Code.
- If you accidentally do delete trailing white spaces, it's helpful but not sufficient to use the diff in Github to see where errors might have been introduced. You should also look at the spacing around citation/exhibit references, sticky notes, quotation marks, italicized words, and math.
- DAILY TASK: Supervisor should send a daily update on Slack about the current status of paper production, and dependencies across tickets
- Most of the tasks are written around identifying problems. However, when you do identify a problem, please also propose a solution if possible.
- References below to “the paper” apply also to the online appendix and any other document that we are planning to circulate externally, unless otherwise noted.
- Attach deliverables (comments or PDFs) to the relevant GitHub task and notify PG/PN with an @ reference. If working on a tight deadline, please notify through slack or e-mail. Changes that PN/PG do not need to review should be linked to via a GitHub commit link.
- When deliverables may require PN/PG to make edits, please indicate in the GitHub issue comment the name of the branch in which these edits should be made.
- For any task with multiple RAs and a single PDF deliverable, create a separate github issue for each RA. The task lead is responsible for pulling together comments from different RAs into a single document and can exercise discretion combining / editing comments/eliminating duplicates in the process.
- When working on a task, if you discover issues relevant to a different task, raise and log them in the github issue for that task.
- For any clear typos/errors that you are 100% confident we would want to correct, coordinate and implement the corrections without PI review.
- RAs should only spend time making the final deliverable that PN/PG will review clear and concise. Documents circulated only among RAs should not be made "pretty."
- An entire round of production should take no more than one week of work for 3-4 RAs.
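The warning above about editors that strip trailing white spaces can be checked mechanically. The following is a minimal Python sketch, not part of the gnlab procedure: it assumes you have the text of a file before and after resolving a conflict, and the function name `trailing_whitespace_changes` is hypothetical. It reports lines whose only change is trailing white space, which are the lines to inspect in lyx.

```python
def trailing_whitespace_changes(before: str, after: str) -> list[int]:
    """Return 1-based line numbers where the versions differ only in trailing whitespace."""
    changed = []
    before_lines = before.split("\n")
    after_lines = after.split("\n")
    # zip stops at the shorter version; this sketch assumes no lines were added or removed.
    for number, (old, new) in enumerate(zip(before_lines, after_lines), start=1):
        if old != new and old.rstrip() == new.rstrip():
            changed.append(number)
    return changed


before = "Some text \nsee Table 1\nunchanged line\n"
after = "Some text\nsee Table 2\nunchanged line\n"
print(trailing_whitespace_changes(before, after))  # [1]
```

Line 2 changes real content, so it is left for the ordinary GitHub diff. Only line 1 is flagged as a white-space-only change.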
Work Allocation
- 2 hours (1 RA)
Tasks
- Turn on package management software (`checkpoint` for R, TBD for Python) and ensure that results are unchanged.
- Do this task before PRELIM and EXT
Work Allocation
- 4 hours (1 RA)
Tasks
- Make sure STABLE task is done
- Bootstraps and simulations run with sufficiently high number of draws
- Quadrature accuracy set sufficiently high
- Tolerances on solvers satisfactory
- Exit flags for solvers indicate convergence
- Ensure there are no xxx, ccc, ggg, nnn, or track changes remaining in the draft, editor letter, or referee response letters
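The last check in the list above lends itself to a quick scan. This is a minimal Python sketch under stated assumptions: the function name `find_leftovers` is hypothetical, and it assumes that LyX records tracked changes in the `.lyx` file with lines starting with `\change_inserted` or `\change_deleted`. Confirm that against your own files before relying on it.

```python
import re

# Placeholder tokens named in the checklist above (matched as whole words, any case).
PLACEHOLDERS = re.compile(r"\b(xxx|ccc|ggg|nnn)\b", re.IGNORECASE)
# Assumption: LyX marks tracked changes with these line prefixes in the .lyx source.
CHANGE_MARKERS = ("\\change_inserted", "\\change_deleted")


def find_leftovers(text: str) -> list[tuple[int, str]]:
    """Return (line number, reason) pairs for placeholders or tracked-change markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDERS.finditer(line):
            hits.append((number, f"placeholder {match.group(0)}"))
        if line.startswith(CHANGE_MARKERS):
            hits.append((number, "tracked change"))
    return hits


sample = "Results are in Table xxx.\n\\change_inserted 0 1600000000\nA clean line.\n"
print(find_leftovers(sample))  # [(1, 'placeholder xxx'), (2, 'tracked change')]
```

The same function can be run on the draft, the editor letter, and each referee response letter in turn.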
Work Allocation
- 8 hours (1 RA)
Goal
- codebase: confirm that results reported in paper would be unchanged if all external calls to Google Drive, Dropbox, etc. are pointed to the most recent version
- paper: update all the paths in Lyx from `input/needs_path` to their production versions in `out`. Compare the old compiled PDF and the new compiled PDF side-by-side to be sure that plots didn't change. Note that material produced for referee letters only by screen-capping figures in other people’s papers or figures in our other research projects can stay in `input`.
- Delete the `needs_path` directory and its contents
Deliverables
- A list of any external calls that are out of date and would impact the results.
- A list of any source/targets specified incorrectly and any directories that need to be re-run.
- A revised paper.lyx where the figure and table paths are updated
Work Allocation
- 8 hours (1 RA)
Goals
- All citations in the text match references in the bibliography.
- All references in the bibliography are cited somewhere in the text.
- Author names and years in text citations are correct and in-text citations follow the formatting guidance in the writing style guide including punctuation.
- If we are citing working papers, check the publication status. If they have been published update the reference accordingly.
- The references list is correct and uses a consistent style. (It is not important which style guide we follow, just that we are consistent.)
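The first two goals above (every citation has a bibliography entry, and every entry is cited) can be sketched as a set comparison. This Python sketch is illustrative only: the function names are hypothetical, and it assumes that LyX citation insets store their keys on a line of the form `key "smith2020,jones2019"` and that the bibliography is a BibTeX file.

```python
import re

# Matches BibTeX entries such as "@article{smith2020," and captures type and key.
BIB_ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# Assumption: LyX citation insets carry their keys on a line like: key "smith2020,jones2019"
LYX_CITE_KEYS = re.compile(r'^key\s+"([^"]*)"', re.MULTILINE)


def bib_keys(bib_text: str) -> set[str]:
    """Collect entry keys from BibTeX source, skipping @comment/@string/@preamble."""
    skip = {"comment", "string", "preamble"}
    return {key for kind, key in BIB_ENTRY.findall(bib_text) if kind.lower() not in skip}


def cited_keys(lyx_text: str) -> set[str]:
    """Collect every cite key used in the LyX source."""
    keys = set()
    for group in LYX_CITE_KEYS.findall(lyx_text):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def cross_check(lyx_text: str, bib_text: str) -> tuple[list[str], list[str]]:
    """Return (cited but missing from the bib file, in the bib file but never cited)."""
    cited, listed = cited_keys(lyx_text), bib_keys(bib_text)
    return sorted(cited - listed), sorted(listed - cited)


lyx = 'LatexCommand citet\nkey "smith2020,jones2019"\n'
bib = "@article{smith2020,\n  title={A}\n}\n@techreport{lee2018,\n  title={B}\n}\n"
print(cross_check(lyx, bib))  # (['jones2019'], ['lee2018'])
```

This only compares keys. Author names, years, and publication status still need the manual checks listed above.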
Notes/Standards/Useful Information
- Follow the citation guide.
- Working papers should be cited as:
- JMP/non-NBER: Authors. (year). Title. Working Paper
- NBER: Authors. (year). Title. Working Paper 12345, National Bureau of Economic Research
Suggested Workflow
- Go through the reference list, note down which papers have been updated.
- Make the necessary changes for the corresponding paper in the bib file.
- Now, when compiling the lyx document there might be a bunch of broken references, because the cite key has changed when updating the bib file.
- For each broken cite in the paper, reference a printout of the draft to see what the correct cite should be. Update accordingly, using the correct key in the new bib file.
Deliverables
- An updated paper.lyx file with corrected in-text citations and references. Use track changes in lyx
- A single pdf document that highlights (using Acrobat commenting tools) any changes that are substantive enough to require PI review.
Work Allocation
- Up to 16 hours (1 Academic and Research Specialist)
Input from gnlab
- A single pdf with entire paper including tables, figures, and appendices
- A lyx document with the input text.
Goals
- Follow rules outlined in writing style guide
- Typos, spelling/grammar errors, unclear wording, etc. corrected (don’t forget to include table notes, figure notes, footnotes, axis labels, appendices, etc.)
- Check that when a statistic in the text reports a number from a table it matches what is actually in the table
- Check that table/figure references refer to the correct table/figure, and that the table/figure presents the promised information (as far as you are able)
- Check that figures and tables are referenced in order
- Variable names, notation, and other concepts are used consistently; the same notation is never used to refer to two different things (this might be tricky if the notation isn't natural to follow for you, in which case skip this step)
- Suggest any other improvements to the writing you might have
Deliverables
Note: Use track changes in lyx
- A single lyx document with tracked changes. Any notes should be in yellow stickies, with your initials to GGG/NNN at the beginning, i.e. “[your initials] to GGG/NNN: I recommend a different word choice here for xx and yy reason”
- A single pdf document with any changes/comments that can’t be inputted in lyx clearly marked using Adobe Acrobat’s commenting tools.
- Upload both to github issue by dragging and dropping.
Remarks
- If you make the same edit in several places (e.g. U.S., semicolons, etc), it is fine to just have a sticky explaining this the first time. This should hopefully save you time.
- First review the PDF, then review the Lyx. However, please enter edits/comments preferentially in the Lyx document since it is a bit easier for us to address issues efficiently from the Lyx doc
Work Allocation
- Up to 16 hours (1 RA)
Goals
- Correct spelling, grammar, figure/table refs, etc. Follow the draft style guide
- Note: If there is a recurring issue, inform PIs immediately via comment and @ tag below, and provide examples. PI will decide how to handle. Possible solutions: (i) address the issue throughout; (ii) ignore; (iii) do not change but flag all instances in the final deliverable. PN/PG will also instruct on whether the Draft Style Guide section of the RA manual should be updated to clarify the issue.
- This task does not extend to (if extant) the cover letter and referee replies.
Tasks (identical to ARS)
Note: Use track changes in lyx
- Check text (including table/figure notes, footnotes, axis labels, appendices, etc.) for typos, spelling/grammar errors, unclear wording, etc.
- Check that when a statistic in the text reports a number from a table it matches what is actually in the table
- Check that table/figure references refer to the correct table/figure, and that the table/figure presents the promised information
- Check that figures and tables are referenced in order
- Titles of sections, tables, figures, etc. are clear, descriptive
- Variable names, notation, and other concepts are used consistently; the same notation is never used to refer to two different things
- Suggest any other improvements to the writing you might have
Tasks (unique to RA)
- Use lyx spellcheck
- Within text, check that vertical white space looks reasonable. If you see a problem adjacent to a section header right-click on each header and change the spacing to 1.5 until the issue is resolved. If you see a problem elsewhere, use custom vertical space (insert -> formatting -> vertical space -> custom) to adjust.
Deliverables
- A single lyx document with tracked changes. Any notes should be in yellow stickies, with your initials to GGG/NNN at the beginning, i.e. “[your initials] to GGG/NNN: I recommend a different word choice here for xx and yy reason”
- A single pdf document with any changes/comments that can’t be inputted in lyx clearly marked using Adobe Acrobat’s commenting tools.
Work Allocation
- 6 hours each for initial pass (2 RAs)
- 2 hours to address PI feedback after first pass
- Note: There should be separate (duplicate) tickets for each RA working on this
Goals
- Every quantitative/factual claim is supported by one of (i) Table entry (ii) Figure or (iii) Citation
- Note: For external citations, please save a copy of the referenced paper as a pdf in the `paper_slides/facts/` directory. Use the `(authors)_(yyyymmdd)_(Journal abbrev. Or WP)_(paper_title)` format. The sourcing statement/sticky/comment in lyx should reference the specific table/figure in the saved document that contains the fact.
Definition of a fact
A fact is:
- A direct quantitative statement. e.g “The parameter x is ”
- A statement of quantitative comparison. e.g “Model x has a better fit than model y”
- A statement that claims a date. e.g “The program started after September ”
Examples of qualitative statements that do not need to be checked
- “Becker and Friedman (1950) examine the role of markets for organs”
- “The model fits well”
Tasks
- Check that all facts are supported
- Most facts should already have a comment/sticky in the lyx document with a pointer to a source (i.e., table/figure either internal or external). RA should check that the sourced content actually contains the fact. If they cannot find the source, add UNSUPPORTED. If the source conflicts, add MAYBE WRONG to the sticky, followed by an explanation of the problem.
- If a fact lacks a comment/sticky, add UNSUPPORTED
- Process note for facts documented with "test_that" statements in code: The comment/sticky should refer to the name of the code file and include a snippet of the test_that statement so PIs and RAs can easily check the latest version of the file to confirm that the statement is correct. Example: "SOURCE test_that statement in representativeness_v2.R reading "xxx xxx""
Notes
- For ALL changes to lyx text, use track changes
- Please do all work (comments, new stickies, updates to existing stickies) within paper.lyx. This permits easy back-and-forth until all facts are correctly sourced.
- The above note does not preclude making additional comments/raising notifications in the github issue
- After you submit the deliverable, expect to iterate on the factcheck portion until all the unsupported or maybe wrong claims have been addressed and all track changes have been accepted or rejected
Deliverables
NOTE: For ALL changes to lyx text, use track changes
- factcheck: Produce a version of `paper.lyx` where every claim is either supported, marked unsupported, or marked maybe wrong
Paper_production DEF - Check sample definitions, variable definitions, empirical specification, model specification
Work Allocation
- 8 hours (1 RA)
Goals
- Main statements made in paper are consistent with code. This applies to sample definitions, variable definitions, empirical specification, and model specification (depending on which section of the paper you are working on).
- Checking every variable and sample definition can for some projects take a large amount of time. By default, this task should be limited to a single person-day of work. Unless specifically directed otherwise, you should focus on checking the definition of the main sample(s) in the paper and variables in the core specification(s), and either ignore or just spot-check robustness analyses, supplemental analyses in appendices, etc.
Deliverables
- A list of any inconsistencies between text and code
Paper_production AGGREGATION_AND_CR_IN_FIREWALL - Check aggregation standards and code review for Chase
Work Allocation
- 8 hours (1 RA, inside Chase)
Goals
- Aggregation standards document filled out. Details inside Chase. Must be turned in with pdf of paper for any PUBLIC disclosure review
- Any required code review documents also filled out. This document usually has one row for each pull request that has been approved. Note: see the confluence page "JPMCI Release Workflow" for details. Eventually we need to submit every item listed in "Initial Handoff".
Deliverables
- An excel file with the aggregation standards documented. File should be (a) emailed to PIs at Chase email and (b) saved in the project's repo
- Any required code review documents
Work Allocation
- 4 hours (1 RA)
Goals
- Format the paper's key plots for social media following guidelines here
Deliverables
- `png` files
Work Allocation
- 4 hours (1 RA)
Goals
- All data sources are cited following AEA [guidelines](https://www.aeaweb.org/journals/policies/sample-references)
- Any JPMCI data sources (should be in `analysis/input`, might be used in say a model) are also included in an appendix table
Deliverables
- Add missing dataset citations using Zotero
- A version of `paper.lyx` where these data sources are included in the bibliography. The citations do not need to be added to the text.
- A github comment listing the JPMCI data sources which are missing from appendix table
Work Allocation
- 4 hours (1 RA)
Goals
- Funding sources are acknowledged.
- Seminar participants and those who provided comments are acknowledged.
Notes
- The “meetings” folder on google drive will list comments from seminars, e-mails and conversations.
- Getting a list of funding sources is harder but a good place to start is acknowledgments on other recent papers. You can also ask PN/PG if NSF or other ongoing funding sources should be added.
Deliverables
- A list of unacknowledged sources of funding and comments.
Work Allocation
- 2 hours (1 RA)
Goals
- Floats (figures and tables) are in the correct order
Tasks
- Check that all tables/figures are referenced at least once in main text or appendix.
- Check that all content in the appendix is referenced at least once in the paper
- Within each section (tables, figures, discussion) of the online appendix, check that content appears in the order it is referenced in the paper.
- Exception to the above: any figure/table using Chase data that is in a referee or editor response (when applicable) but NOT in the regular paper still needs to be included at the end of the appendix to meet disclosure requirements. These figures/tables do not need to be referenced in the text.
Deliverables
- Produce a list of plots that are not referenced
- Produce a list of plots that are referenced in the wrong order, and a suggestion of where they should be
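The two deliverables above can be drafted from the text of the compiled paper. This is a minimal Python sketch, assuming exhibits are referenced as "Figure N" or "Table N" with plain integers (appendix labels such as "Figure A1" would need a different pattern). The function names are hypothetical.

```python
import re

REFERENCE = re.compile(r"\b(Figure|Table)\s+(\d+)")


def first_mentions(text: str, kind: str) -> list[int]:
    """Exhibit numbers of one kind ("Figure" or "Table"), in order of first mention."""
    seen = []
    for found_kind, number in REFERENCE.findall(text):
        n = int(number)
        if found_kind == kind and n not in seen:
            seen.append(n)
    return seen


def order_report(text: str, kind: str, total: int) -> tuple[list[int], list[int]]:
    """Return (exhibits never referenced, exhibits first referenced out of order)."""
    mentions = first_mentions(text, kind)
    unreferenced = [n for n in range(1, total + 1) if n not in mentions]
    # An exhibit is out of order if it is first mentioned after a higher-numbered one.
    out_of_order = []
    highest = 0
    for n in mentions:
        if n < highest:
            out_of_order.append(n)
        highest = max(highest, n)
    return unreferenced, out_of_order


text = "Figure 1 shows X. Table 1 reports Y. Figure 3 shows Z, unlike Figure 2."
print(order_report(text, "Figure", total=4))  # ([4], [2])
```

Here Figure 4 is never referenced and Figure 2 is first mentioned after Figure 3. The Chase-data exception noted above still has to be applied by hand.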
Work Allocation
- 8 hours (1 RA)
Goals
- Color plots print well in black and white;
- markers are sufficiently distinct
- Axis labels, row and column headers in tables, etc. should only have the first word capitalized
- Plots follow the guidelines in gslab Data Visualization - this in turn links to Schwabish (2014)
Deliverables
- Produce a list of plots that have issues in B&W, and propose fixes.
- Produce a list of plots that you think have legibility issues, fail to follow good guidelines, or are otherwise not acceptable
- Note: If it is easy to modify the plots, the ideal format for suggestions is graphics files attached to the relevant issue illustrating the proposed alternative coloring
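As a first pass on the black-and-white goal, you can estimate how each series color would look in grayscale. This Python sketch uses the standard Rec. 601 luma weights (0.299, 0.587, 0.114). The function names and the `min_gap` threshold of 40 are illustrative assumptions, not lab standards, and actual printers may convert colors differently, so a printed test page is still the final check.

```python
from itertools import combinations


def gray_level(hex_color: str) -> float:
    """Approximate grayscale value (0-255) using Rec. 601 luma weights."""
    value = hex_color.lstrip("#")
    red, green, blue = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.299 * red + 0.587 * green + 0.114 * blue


def hard_to_distinguish(colors: dict[str, str], min_gap: float = 40.0) -> list[tuple[str, str]]:
    """Return pairs of series whose gray levels differ by less than min_gap."""
    pairs = []
    for (name_a, color_a), (name_b, color_b) in combinations(colors.items(), 2):
        if abs(gray_level(color_a) - gray_level(color_b)) < min_gap:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical palette: pure red and a mid green look nearly identical in gray.
palette = {"treatment": "#ff0000", "control": "#008000", "placebo": "#000000"}
print(hard_to_distinguish(palette))  # [('treatment', 'control')]
```

Pairs flagged this way are candidates for distinct markers or line styles in addition to, or instead of, a color change.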
Work Allocation
- 8 hours each (2 RAs)
- Note: There should be separate (duplicate) tickets for each RA working on this
Goals
- All nontrivial mathematical claims in the paper are documented.
- By default, this does not include checking statements within proofs. However, PN/PG should be consulted at the beginning of this task to confirm the desired scope.
Deliverables
- A list of all theoretical claims that are made in the text of the paper that are not supported by
- (i) proofs in the main appendix
- (ii) discussion/proofs in the online appendix
- (iii) discussion/proofs in text.pdf. E.g., we may say "It is easy to show that equations A, B, and C together imply equation D." You should not include claims that are completely obvious. We’re looking for things where, if somebody came back and said "I don’t believe this is true," we would need to go back and do at least a couple of lines of algebra to confirm that we’re right
- A version of claims.pdf with comments noting, for each claim, either
- (i) a place it is referenced in the paper or online appendix, or
- (ii) that the claim does not appear to be referenced
task for a PI
- upload as NBER working paper
- send to BFI as working paper
Work Allocation
- 4 hours (1 RA)
Goals
- All figures and tables in the slide deck are up to date
Deliverables
- A revised slide deck with figures and tables as they appear in the paper
- A list of places where the interpretation of a figure or table appears to have changed (e.g. such that the sentence we would say in the voiceover is different)
This task done by an RA with Chase data access
Goals
- Clarify what code corresponds to the released document
- Clean up and allocate ownership correctly in `teams/gnlab/project_name`
Deliverables
- "tag" the final commit (bitbucket instructions here)
- verify that the `md` file listing all the builds is a complete record of what is saved to the drive and update `md` if needed. Then, propose a subset of builds to keep.
- after PI signs off, delete all the builds that are not explicitly kept.
- review all files and folders inside to verify that ownership status based on SID is Peter or Pascal (use `ls -l` at shell). Transfer ownership of all files.
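For a large project folder, reading `ls -l` output by eye is error-prone. This is a minimal Python sketch of the same ownership check, assuming a Unix-like system where the standard `pwd` module is available. The function names are hypothetical, and the set of allowed owners has to be filled in with the real SIDs, which are not given here.

```python
import os
import pwd


def owner_name(path: str) -> str:
    """Owner of a path, by name when the uid is known and as the numeric uid otherwise."""
    uid = os.lstat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def wrong_owner(root: str, allowed: set[str]) -> list[tuple[str, str]]:
    """Walk root and return (path, owner) for entries not owned by an allowed user."""
    flagged = []
    for folder, subfolders, files in os.walk(root):
        for name in subfolders + files:
            path = os.path.join(folder, name)
            owner = owner_name(path)
            if owner not in allowed:
                flagged.append((path, owner))
    return sorted(flagged)


# Hypothetical usage; replace the placeholder names with the PIs' actual SIDs:
# wrong_owner("teams/gnlab/project_name", {"pi_sid_1", "pi_sid_2"})
```

The sketch only reports ownership. Transferring ownership is left to whatever procedure applies inside the restricted environment.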
This task is often done more than once because some code lives in various restricted environments
Deliverables
- Update `README.md` following guidelines here
- Update date on `README.md`
- "tag" the final commit (github instructions here)
Work Allocation
- 8 hours each (2 RAs)
- Note: There should be separate (duplicate) tickets for each RA working on this
Goals
- Every question or comment by an editor or referee is addressed directly in the corresponding response letter.
- Every statement made in a cover letter to the editor or a reply to a referee is correct. If the statement refers to a change to the paper, the paper has changed as indicated since the previous submission. If the statement is a table or figure referenced but not shown in the paper, the table or figure presented to the editor/referee matches a supporting document.
- If a table or figure is included in the referee report and uses Chase data, it is also included in the paper. Even if it is not referenced in the paper, it should be included at the end of the online appendix.
- Conduct PROOF on the cover letter and replies, with a focus on clear errors or issues that would cause confusion. Consistency in style is not that key in replies since they will not be published and are intrinsically transient.
- Fill in correct exhibit numbers in paper when listed as xxx
- Check all figure and table numbers in referee reports to make sure they are correct
Deliverables
- A single pdf document combining the editor’s letter and all referee reports, with all unaddressed comments/questions clearly marked using Adobe Acrobat’s commenting tools.
- A single pdf document combining all responses to the editor/referees, with all unsupported/incorrect claims and the results from the PROOF clearly marked using Adobe Acrobat’s commenting tools. Please also note the specific table/figure/section number in which claims made in the responses are documented, unless these are already referenced by number.
Paper_production QUOTES - Update quotes in editor and referee letters to match the most recent version of the paper
Work Allocation
- 4 hours (1 RA)
Goals
- All quotes from the paper text in editor and referee letters reflect the most recent version of the paper as it appears after all proofreading has been completed
Deliverables
- A tracked changes version of editor, R1, R2, R3, and R4 letters (if they exist)
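Checking quotes against the latest paper text can be partly automated once the quotes are collected. This is a minimal Python sketch, assuming you have already pulled the quoted passages out of each letter by hand (how quotes are delimited in the letters is not specified here) and have the paper as plain text. The function names are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and unify curly quotes so line breaks do not cause false alarms."""
    for curly, straight in (("\u201c", '"'), ("\u201d", '"'), ("\u2018", "'"), ("\u2019", "'")):
        text = text.replace(curly, straight)
    return re.sub(r"\s+", " ", text).strip()


def stale_quotes(quotes: list[str], paper_text: str) -> list[str]:
    """Return the quotes that no longer appear verbatim in the paper text."""
    paper = normalize(paper_text)
    return [quote for quote in quotes if normalize(quote) not in paper]


# Made-up example text, for illustration only.
paper = "The effect is\nlarge in the first month."
letter_quotes = ["The effect is large", "The effect is small"]
print(stale_quotes(letter_quotes, paper))  # ['The effect is small']
```

Any quote returned here needs to be updated in the letter with track changes, as the deliverable above requires.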
Work Allocation
- Unsure
Goals
- All language in the paper and appendixes has been checked for similarity to other text. We haven't figured out the technology for how this will work yet, ideally there's an online platform where you can submit free text (or even better, a PDF) and check it for plagiarism. Given that Turnitin.com has existed for a long time, it seems likely that there's a good way to do this but not sure of the tech details yet.
Work Allocation
- Consult PG/PN
Goals
- We have flagged all errors in translating our submitted manuscript into the journal’s typeset format.
- Note: the goal here is not to proofread the paper (i.e., not to perform the above-listed steps). We presume we will have done that as of the last submission, so the only possible remaining errors are those from typesetting.
Deliverables
- A single pdf document that lists all "meaningful" discrepancies between our original typeset manuscript as of the last submission to the journal and the galley proofs. Construct this list by using Adobe Acrobat’s commenting tools to markup the galley proof.
- Pay special attention to the formatting of tables, figures, and equations, as these are where most discrepancies tend to arise.
- "meaningful" is tricky to define. Exclude recurring stylistic edits to make the writing consistent with the journal's style guide (e.g. removal of hyphens, changes in capitalization, change from "panel (a)" to "left panel"). Sometimes, this is a judgment call. If you aren't sure if the change is meaningful, do tell us about it.
- In the same pdf document, also comment on anything else that looks like a typo or error that you happen to come across. Note that you should be looking only for discrepancies with respect to the last submission, not doing any other form of proofreading. But if you do notice a probable error along the way it is best to flag it as it may still be possible to correct it.