Analyst, not a coder and common challenges - ganong-noel/lab_manual GitHub Wiki

Analyst, not a coder

(aka how to be an effective researcher)

Your job in gnlab is to analyze and answer questions about the world. Writing code is one small piece of answering questions. This might be a change from expectations in places where you have been a research assistant in the past or coursework you might have done. This page provides advice on the non-code aspects of being a successful analyst.

Understanding motivation

One of the hardest parts of being an RP is that it is often difficult to understand the bigger picture: Why am I doing this task? How does it fit into the broader project? It is essential that you know the answers to these questions before you start working on the task.
We try to communicate context in four ways:
- Objective at the top of github ticket
- Live discussion at standup
- Weekly meeting
- Practice runthroughs
If we haven't succeeded in explaining the motivation, please ask! A pet peeve of Peter’s from when he was an RP was being assigned work and not understanding how it fit into the broader picture.

Titling exhibits

Add a title to every table and figure you make with what you think the key lesson is.*

Think back to the motivation. Does your current draft of the exhibit answer the motivation? Why or why not? If not, consider mentioning this in your github comment or memo.

*The one exception is that if you know for certain that an exhibit is going in a slide deck or paper then omit the title. But this is true for < 10% of the exhibits that we make.

Getting stuck

It is normal to get stuck. At the same time, we put a high premium on working autonomously. When you are unsure between two choices, you can

1. ask us and wait for feedback
2. make a choice yourself and include a note in the GitHub comment:

I had a choice between (a) and (b) and it wasn't obvious what to do. I provisionally made choice (a) for the following reasons, but glad to revisit.

When possible, we prefer path (2) because this allows you to surface parts of your work that you are uncertain about while still pushing forward with the analysis.

Level of polish

A constant tradeoff in writing code and prose is that good style is time-consuming. How much time should you invest in style and polish?

For github comments and memos: more time for longer work products, more time where the subject matter is complex. Always proofread memos and material written for appendices of papers.
For code: writing sloppy code is called going into "technical debt". Code that is merged to main must follow the style guide (and therefore not be sloppy). Debt has pros and cons. Be thoughtful about whether to go into debt and communicate when you are going into debt.

Tips on Proofreading

Print a hard copy
Block external noise. Pascal has special headphones for this.
Go to a place where no one will bug you. Peter likes to go to the fourth floor of Harris.

Going in depth on hard problems

At times, we will give you hard problems to work on. The more you work independently and propose creative solutions the better. "Going deep" is obviously useful because it improves the research, but it also has professional development benefits. First, you will learn more by independently solving a problem. Second, if you are applying to grad school, our letter of recommendation will be more effective because we will have better and more concrete examples to point to.

Written output

We will not discuss this material at the retreat, but use it as a yardstick for your GitHub comments.

Your first comment

An effective ticket comment usually includes

Choices you made (see "getting stuck" above)
Statistical output (tables or figures)
Written interpretation of the output
- does this answer the assigned question?
- does it make sense given what you know about the project more broadly?
- does it make sense given what you know about the world? (compare to what you find via google)
Next steps (if ticket is not ready to close)
- if left to my own devices, this is what I would do next
- here are some other options that I considered

For longer comments, it should be easy to quickly get the main takeaway with a TLDR.

The "details" part of a comment should be easy to skim. Use section headers, bolding and underlines. This guide shows how to write skimmable comments.

Subsequent comments

We will invariably ask for additional iterations of output. This is challenging because it adds an additional variable: your answers might change. A good response includes the four items above, together with:

Does the answer exactly replicate?
- Did the numbers change a little bit, but the interpretation is the same? (we call this "numbers changing")
- Did the numbers change a lot? (we call this "sentences changing" or "substantively changing")
- If it isn't obvious already, why did the numbers change?
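
A first pass at the comparison can be made mechanical, as in the sketch below. It is only an illustration: the function `compare_estimates`, the `rel_tol` cutoff, and the example numbers are all made up here and are not lab conventions. Whether a change alters the interpretation remains a judgment call; the code only flags how far each number moved.

```python
import math


def compare_estimates(old, new, rel_tol=0.05):
    """Label each named estimate by how much it moved between two runs.

    rel_tol is a hypothetical cutoff chosen for illustration only.
    """
    report = {}
    for name in sorted(set(old) | set(new)):
        if name not in old or name not in new:
            # An estimate that appears in only one run needs an explanation.
            report[name] = "missing in one run"
        elif old[name] == new[name]:
            report[name] = "exact"
        elif math.isclose(old[name], new[name], rel_tol=rel_tol):
            report[name] = "small change"
        else:
            report[name] = "large change"
    return report


# Made-up numbers standing in for two iterations of the same output.
old_run = {"mean_income": 52000.0, "share_unemployed": 0.061}
new_run = {"mean_income": 52000.0, "share_unemployed": 0.083}

report = compare_estimates(old_run, new_run)
for name, label in report.items():
    print(f"{name}: {label}")
```

Anything labeled "large change" is a prompt to explain in the comment why the number moved and whether the sentences you would write about it change too.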

Troubleshooting bugs

When we ask you to find and/or fix a bug, please tell us how the bug arose and how you fixed it, rather than just that you fixed it.

Clarity and proofreading

Here are some "rules" we think are helpful for all types of writing: academic papers, markdown memos, and github comments.

Clarity: Rule #4: No great paper—no matter how well constructed, brilliant, and well written—first emerged from the author’s printer in that form. It was rewritten at least 10 times. Rewriting is the true art of writing.

Proofreading: Rule #2: The insights of your paper will first be judged by how you present them. If your paper is written in an unprofessional manner, your empirical work, mathematical proofs, and models will be viewed with initial skepticism.

From The Ten Most Important Rules of Writing Your Job Market Paper

Common challenges

Becoming an RP is a new and different challenge from what you have done before.

Here are a few common challenges:

When you are assigned a task in a class, the professor (hopefully) has a clear idea of what the answer is. In research, by definition we don't know the answer. Often we don't even know what the right question is! We hope you will ask clarifying questions about the task and propose an alternative question if you think we have assigned the wrong task to achieve the objective.
Sometimes you might be afraid to ask questions. We expect you to ask lots of questions as do your peer RAs; it is hard to do good work (at least in the collaborative approach that our lab takes) without asking questions.
Sometimes you might go too fast. We expect you to go slow and check your work. For example, write unit tests and communicate what those tests were in your readout.
Fixed costs. There are fixed costs to learning about the background that motivates the project and to the lab's highly-involved IT setup. You should feel good about time spent paying these fixed costs.
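
As a minimal sketch of the "write unit tests and communicate them" advice above: the helper `share_below`, the test names, and the `run_tests` runner are hypothetical and written for this example, not taken from lab code. The idea is that the list of checks that passed can be pasted directly into a ticket comment.

```python
def share_below(values, cutoff):
    """Hypothetical analysis helper: share of values strictly below cutoff."""
    if not values:
        raise ValueError("values must be non-empty")
    return sum(v < cutoff for v in values) / len(values)


def test_known_answer():
    # Hand-computed case: 1 and 2 are below 3, so the share is 2 / 4.
    assert share_below([1, 2, 3, 4], 3) == 0.5


def test_boundary_value_is_not_counted():
    # Values equal to the cutoff are not "below" it.
    assert share_below([3, 3], 3) == 0.0


def test_empty_input_is_an_error():
    try:
        share_below([], 3)
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")


TESTS = [
    test_known_answer,
    test_boundary_value_is_not_counted,
    test_empty_input_is_an_error,
]


def run_tests():
    """Run every test and return the names of those that passed."""
    passed = []
    for test in TESTS:
        test()  # raises AssertionError on failure
        passed.append(test.__name__)
    return passed


passed = run_tests()
print("Checks run: " + ", ".join(passed))
```

In a real project a test runner such as pytest would collect these functions automatically; the hand-written runner is used here only to keep the sketch self-contained.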

Bottom Line: So what makes a good RP

High Fidelity: we want to know we can trust the output you produce. This means taking the extra time to double check your code, output, and comments; being fluent in and able to defend the methodological choices you have made; and communicating places where numbers are soft and answers are uncertain.
Working toward the bigger picture: our jobs become easier and we think yours become more enjoyable when you are creatively answering research questions rather than completing tasks we define for you. In practice this means focusing on the ticket objective over its checkboxes, proposing and implementing next steps without PI input, and communicating the ways your work influences the larger project.
Being a team player: Sometimes you will be asked to work on things that are completely orthogonal to the project you are focused on but that benefit the lab as a whole. The lab works best when we see this work as equally important and view ourselves as part of a research team rather than a collection of individual researchers.