Research Paper Writing Notes - HealthRex/CDSS GitHub Wiki

“If it wasn’t published, it didn’t happen.” Advice from my undergrad department chair that I didn’t understand very well at the time. While we should always be doing good work for the sake of doing good work, you get very close to zero credit for anything in the University unless you can communicate it through a peer-reviewed publication. If you complete 80% of a research project, but don’t actually finish and publish it, your CV/resume will look an awful lot like you did nothing.

Tips and Guides

Writing scientific papers is a big topic, but you can find good tips such as in the links below.

Ten simple rules for writing research papers. PLoS Comput Biol. 2014
Ten simple rules for getting published. PLoS Comput Biol. 2005
Ten simple rules for scientists: Improving your writing productivity. PLoS Comput Biol. 2018

Collected below are summary tips that have helped me work through my own writing. See also notes I’ve collected from workshops and mentors on research grant proposal writing. I’ve found that getting better at proposal writing directly corresponds to writing better research paper sections Intro (Significance and Innovation), Objective (Specific Aim), and Methods (Research Strategy) sections.

Outline a Template

Like Test Driven Development, but for paper writing. Before you start writing down paragraphs, sketch out the basic outline and logical flow of the paper. You can find an existing paper you like and then copy their structure, but beware of different formats like Nature/Science that put the Methods section at the end. You should be able to do this at the proposal phase, before any research experiments have even been done. Bonus points if you create placeholder figures/tables that you can fill in with real data once your experiments are completed. This makes it much easier to drive your primary research experiments and development when you have defined fill-in-the-blanks to tackle.

Look for research reporting guidelines that match your study type. Particularly for clinically oriented journals, the checklists may even be required to reassure them you’ve done a valid study if they don’t have the relevant editorial expertise to fully understand it.

TRIPOD - Individual prediction models (closest for supervised machine learning)
RECORD - Observational research with routinely collected data (closest for EMR based clinical data studies)
STROBE - Observational research
SQUIRE - Quality improvement
CONSORT - Clinical trials

Objective/Hypothesis Statement ~ Specific Aim

Perhaps the most important sentence in the entire document. As a peer reviewer, I specifically search for this statement and evaluate the paper based on whether it achieved the stated Objective. While there can be fair debate about whether an Objective/Study is important, novel, or otherwise compelling, there should be no debate about whether a study's objective was completed or not. If it is debatable or unclear, that means either your objective is not well defined or the study is not properly or completely done.

End the Introduction with the Objective statement that boil downs WHAT you are doing in this study into one (maybe two) sentences. Don't describe background context, methods, interpretation or potential future directions here. Those other items are material for the Intro (WHY), Methods (HOW), and Discussion. The Objective statement should closely map to a plain Specific Aim statement.

Aim: Achievable state of the world or knowledge.
Specific: Defined well enough that an external observer can verify whether the aim was completed (even if they do not fully understand how).

Ideally phrased in hypothesis answering form, "Determine whether X is better than Y by measurement A." If more open-ended, data-mining, avoid sounding like a fishing expedition: "We will explore XYZ," by framing as: "Determine which of XYZ is best by measurement A."

Hypothesis answering format can be hard with computational work, which is often more about creating a function/capability than answering a question. The research question could be how well that created function works, but beware of how to define your benchmark for success.

High Jump vs. Long Jump: In a high jump, you either jump over the bar or you don’t. Danger. If your method, tool, intervention has to achieve X% performance to be useful, what if your best results are short by a few percent? Is your whole project dead in the water after you spent months on it? In a long jump, just jump as far as you can, and record how far you got. No matter what, you’ll have a number to report and learn from, though you’ll still want some practical benchmark to compare against.

Setting the Bar vs. Raising the Bar: In an established field with a defined competitive landscape (e.g., ImageNet, Kaggle competitions, CASP protein structure prediction, etc.) you are trying to raise the bar by squeezing out a couple percent more accuracy than the state-of-the-art. Danger. This quickly becomes very hard to do with diminishing asymptotic returns. If you miss the top score by a fraction of a percent, you get to be the first loser that no one remembers. You can instead try to set the bar by defining and tackling a new problem. No matter how good or bad your results are, you are the state-of-the-art, because no one else has even tried. Nice, but you’ll instead have to invest effort in your Introduction to justify why anyone cares about this new problem definition.

Examples: “Our objective is to...”

“Show that X is better than Y.”
- Danger. If you set out to “show” something, that means you’ve already decided on the conclusion. Most likely you’re biased in your interpretation (probably because you want your preferred method, tool, or invention to look great).
- If you really do know that X is better than Y, then you already know the answer. This is not research, this is a programming exercise.
- If needed, try instead: “Determine how much better X is than Y” or "Determine under which conditions X is better than Y."
- What if your study ends up showing Y is better than X? Would you still publish this? A good question is important regardless of which way the results emerge.
“Cure cancer” or “end poverty.”
- No, these are not achievable states.
“Develop and test an intervention to reduce cancer mortality” or to “increase access to healthcare without increasing costs.”
- Okay, these may be possible, but now make it specific…
“Develop models to predict patient mortality.”
- No, this is not specific. How can you (or an external observer) tell if you’ve done a good or bad job of this?
“Develop and validate models to predict inpatient mortality for patients with severe sepsis with superior accuracy and lead-time than existing standard-of-care prognosis scores (e.g., APACHE, SAPS).”
- Okay, I will be able to tell at the end whether you have successfully completed this objective or not, even though I still may be wondering whether it is worth answering this question → Introduction to motivate WHY this is worth doing.

Writing vs. Editing

Make a clear distinction between these two phases. Otherwise you can get stuck in writer’s block while your inner critic stops you from writing anything. I’ve tried writing Intro sections on a blank page, knowing that it must be compelling, and two hours later found that I’d written two sentences. This will not work. Just write down a semi-formed stream of consciousness with everything that might be important to fill out your template outline, knowing that you will come back in the next phase to repeatedly edit it back down into something compelling.

Writing Order

Rough order I try to tackle writing a paper:

Objective: WHAT is the study going to do?
Methods: HOW was the study done? Plain, boring instruction manual with appropriate level of detail for a reviewer to evaluate and possibly reproduce what you did. Better yet, think of it as an instruction manual written to your future self when you have to go back and redo experiments/revisions several months later. Minimal to no value judgements or justifications here (that goes in Intro and Discussion).
Results: Plain statement of the facts, figures, and findings of your study. Better if in graphical and tabular formats. Minimal to no interpretations here (that goes in Discussion).
Introduction: WHY should anyone care about your study question?
- I start with the prior sections (Methods and Results) because they are much easier plain reporting of what you did. Introduction is where more rhetorical structure and story-writing helps to motivate your work.
- Think Significance (and Innovation) sections of a research proposal. Lead with someone dead or dying within the first two sentences, point to the cause being an unsolved problem in the world, and connect to how your study directly addresses at least part of that problem.
- Manipulate your reader’s emotions from being outraged that such a terrible problem remains unsolved in the world, and then bring them to relief and joy that your study is directly addressing the problem.
Discussion:
- Lead with a summary of the most important findings in the Results followed by your interpretation of the implications of your findings. So you found that X is greater than Y. What does that mean for the world at large?
- Limitations: Essential section to illustrate an understanding of how issues could alter your study conclusions. Try to end with an up-spin of how you mitigated those risks with replication, sensitivity analysis, external references, or ongoing work. This is great to proactively intercept reviewer critiques, but don't try to address every possible critique in advance. There are hundreds of limitations and issues with every study, and you could spin your wheels trying to proactively address things that no one cares about. If you’re not sure what issues reviewers will care about, you can just ask them (i.e., submit the paper).
- Future Directions? Don’t bother. Nobody cares what you promise you might do in the future, unless it helps them understand the point of your current work. Just report on what you actually did do, and an intelligent reader will decide for themselves if there’s anything worth doing next.
Conclusion:
- End with a brief one to two sentence statement that directly answers the Objective statement/question.
- Do NOT overreach. Common reason to kill a paper in peer review is if the conclusions do not entail from the results.
- Examples:
  - “In conclusion, we find that our algorithm can diagnose X with 10% greater accuracy than a panel of human experts.” Fair.
  - “In conclusion, we find that our algorithm is smarter than humans, and therefore millions of lives can be saved if we replaced humans with our algorithm.” I don’t think so. Save that for the press release hype if you feel like it.
  - “In conclusion, computers have far more search and memory capacity than humans.” This is probably true, but is not a conclusion from this study’s results. This could go in the Discussion of general implications (preferably backed by citations), but restrict your Conclusion statement to only what directly follows from your own study results.
Abstract: Summarize all of the above in few words. I like structured abstracts to make sure you hit the key elements. I expect plain reporting of at least the primary outcome numbers here, and not just a vague conclusion statement.
Cover Letter: Journals may require some specific statements here on authorship contributions and conflicts of interest. Further opportunity to hook the editor with a couple paragraphs, otherwise the worst outcome is they throw away your paper before you even get a chance to receive external peer review feedback. Reference related papers in the same journal to illustrate how your study makes a directly valuable contribution that advances the discussion already being published in the journal. Example:

	Dear Editors: 
	Please find attached our research manuscript submitted for publication in the 
	International Journal of Medical Informatics, entitled 
	“Decaying Relevance of Clinical Data Towards Future Decisions in Data-Driven Inpatient Clinical Order Sets.” 

	The era of electronic health records enables the rapid development of clinical prediction rules and 
	risk models for clinical decision support. Our study illustrates a novel potential application of a 
	clinical order recommender system (analogous to Netflix or Amazon’s product recommenders). 
	More importantly, we now reveal how clinical prediction models systematically overestimate 
	their utility towards future medical practices, with a “half-life” of clinical data relevance 
	as brief as four months. The important theme for the medical informatics community is that results 
	from predictive models built on large clinical datasets (like electronic health records) may be fleeting. 
	Systems require frequent (automated) updating to reflect current practice and our dynamic understanding 
	of the relationship with improved outcomes. 

	This manuscript builds on preliminary work that was well received at the 
	Pacific Symposium of Biocomputing 2016 (particularly the first example result table), 
	but constitutes a substantial expansion with evaluation of the relationship between 
	months of historical training data and future prediction accuracy, as well as a numerical solution 
	to the problem with a decaying weighting scheme. A copy of the previous conference proceedings 
	are attached for full disclosure.

	We suggest the following peer reviewers based on the relevant expertise and familiarity with research 
	based on attendance at the conference session where preliminary results were reviewed.
	- XXX ([email protected])
	- XXX ([email protected])

	For potential conflict of interest, we request to exclude the following individuals as possible reviewers:
	- XXX
	- XXX

	[Corresponding Author Contact Info]

Writing like Coding

Every section and sentence in a paper should be like lines of code in a computer program. A deliberate logical flow of dependencies from one to another. You cannot just add or remove random lines of code in a program where they don’t make sense, nor can you include them out-of-order. Try to “execute” every sentence of the paper in your reader’s brain to precisely mind control them towards the specific conclusion you want.

Editing - Subtractive Design

Once you’ve got a whole draft down, go back and ruthlessly cut down and rewrite everything. Many specific tips you can pick up such as avoid adverbs (just state the facts), rewrite sentences with 2+ commas, use parallel construction to simplify repetitive phrases, and more from references like Strunk and White’s The Elements of Style for vigorous writing that has no unnecessary words or sentences. A simple unifying heuristic that helps me:

Rewrite as if you have to pay $1 for every word that ends up in your manuscript. (Not far from the truth given open access publication fees these days!) This will prompt you to make every important point you need to make, and nothing else, using the fewest words possible. Ruthlessly kill any fluff or tangents, and all that’s left will be highly concentrated with meaning. An odd analogy, but I enjoyed this article about video game design: http://www.sirlin.net/articles/subtractive-design

Evaluation Goals: Clear then Correct then Compelling

A nice video with general tips of emphasis for reviewing research papers (https://www.acpjournals.org/journal/aim/reviewers). When writing and reviewing code, I care less about cool functions or minor tweaks for efficiency while I care most about readability and maintainability. If no one can understand, use, or modify your code, then nothing else is going to matter when your codebase eventually stagnates and dies. By analogy, I evaluate papers and presentations in order by:

Clear: If a reader cannot understand what’s happening, then nothing else is going to matter. They won’t even be able to evaluate or comment on the other elements.
Correct: Regardless of whether or not it is important, is the paper at least valid? Do the conclusions derive from a sound argument based on the data and directly answer the objective statement?
Compelling: Double-check that you’ve nailed clarity and correctness. If you fail on either of those points, then I can’t give you any credit for potential impact. We already have plenty of blowhards in Silicon Valley pitching grand visions to change the world, but with nothing credible under the hood to back it up. Nobody needs you to become another one. Note that I call it compelling and not interesting. When you pitch a research idea and someone says, “that sounds interesting,” consider it a death knell. There are an unlimited number of ideas and questions that are intellectually interesting, but just because “there is little known about X,” it does not make it worthwhile to use the scarce resource of your time to wander around navel-gazing in there. Develop ideas that are intellectually interesting and whose answers will drive a call-to-action to change the way someone lives their life. Clinical journals in particular aspire to publish “practice-changing results.”

Picking a Publication Venue

Conference Paper vs. Journal Article

Strange cultural dynamics between disciplines. Cutting edge computer science work comes out in high-quality, peer-reviewed conference papers, because journal review cycles are too slow to keep up with a fast moving field. In contrast, most biomedical researchers barely understand what a “conference paper” is, thinking you mean a poster or 1-page abstract or something. For better or worse, many medical schools and the NIH will give much less credit for conference papers as compared to journal articles. General approaches I’ve adopted:

If a conference has a poster or abstract submission option, that can be a good excuse to attend while retaining full control to publish elsewhere.

Otherwise, a reasonable strategy is to submit good papers to interdisciplinary conferences (e.g., AMIA, Pacific Symposium in Biocomputing) to get peer reviewed work down and gain some feedback from the conference, but then still retain the option to follow-up with an enriched journal paper with at least “25%+ more results.” I make that transparent in the cover letter to journals about this and is usually not a problem.

If you already have something very complete and ready for a high impact journal now, I’d recommend just doing that rather than submitting as a conference paper. If instead you’ve got some good, but preliminary work, I’ve found these conferences helpful to get feedback from and still get some credit on your CV for a peer-reviewed, PubMed indexed paper while the expanded version is still developing.

For AMIA conferences, they have option to designate submissions as "Student Papers" that makes the work eligible for the Student Paper Competition and seems to improve acceptance rate a bit (https://academic.oup.com/jamia/article/28/9/1928/6310427). Post-docs and clinical fellows do still count as "students" for this purpose. You'll need your PI to add an accompany attestation letter similar to the below. https://docs.google.com/document/d/17HCIkgH5UUBXTt6NIbr3rXbLLVnCfWwi/edit

Targeting a Journal

The impact of your work matters in terms of how many people you reach and influence. This may be correlated with, but is not the same as, the Impact Factor of the journal you publish in. The real value I find of higher impact journals is that they are essentially an advertising mechanism, given that many people only read higher impact journals as an artificial information filtering mechanism to manage the overload of research publications. Rough strategy I use: Stack rank a list of journals that your paper may credibly fit into and then just start submitting from the top on down the line based on interest (but beware differing format requirements, word count restrictions, etc.). Some have referral networks that automate this, such as JAMA -> JAMA Internal Medicine -> JAMA Network Open. If you have the time, go ahead and aim a little higher than you might think you can hit for the first shot. Sometimes you’ll be surprised. If you’re never getting a rejection, you’re playing too safely.

Google Scholar journal ranking lists by category https://scholar.google.com/citations?view_op=top_venues&hl=en&vq=med_medgeneral

Nature, Science, Cell or die.

Okay, but I’ve seen many people die on this road. I also think this is a false dichotomy. Aim high and broad whenever you can, but just about any valid work done should be published somewhere. It's a disservice to science overall to leave results unpublished, even if negative or incremental, as that knowledge will be lost and condemn others to repeat it.

First, middle, last and “co-first authorships”

You primarily want first authorships. Your PI primarily wants last authorships. Any other middle authorship is credited substantially less (but also bad if you have zero middle authorships, because it makes you look like a jerk who doesn't work well with others).

"Co-first authorship" can help in some cases, but is somewhat fake (especially if list 3+ people as co-first). At the end of the day, the article is going to be referenced and indexed in PubMed as "Chen et al." With enough material, maybe you can produce two papers where the two co-first authors swap positions for a computational vs. clinical venue, but no guarantees. Being a co-first author on a paper is still strong for your own CV/resume, as you can describe it respectively on your job and promotion applications.

Either way, just make sure these choices and expectations are clear early on to minimize drama and disputes at the last minute. If I know I will be a middle author at the outset, I don't mind as long as I know, because then I will respectively throttle the amount of effort and attention I can credibly contribute to the project.

Guidance on who/what qualifies for authorship: http://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html

Completely made-up publication distribution that should work reasonably well:

50%+ primary research articles (first or last author)
25% middle author articles
<25% perspective, reviews, editorials, etc.

Gold open access publishing?

As long as your study is valid and you pay $3,000+ open access publishing fee, you should get published, without judgement on the impact or novelty of the study. In my evaluation notes above, I interpret this to mean that the paper must have Clarity and Correctness, even if it's not Compelling. Usually a last resort, but a backup option if there's no one left to drive the effort for a major overhaul of the project/paper and we just need to get the work already done out the door. E.g., PLoS ONE, Nature Scientific Reports, Medicine

Pre-Prints (https://www.biorxiv.org/, https://www.medrxiv.org/)

Overall I see this as the inevitable natural progression of open science in the internet age, with the rapid dissemination of information and potential for mass crowdsourced peer review. As of 2019 however, not everyone is on board yet: https://twitter.com/alisonkgerber/status/1164402584771338240?s=19
https://twitter.com/boehninglab/status/1002974942730866689?s=20

Flashy journals like to impose publication embargos and paywalls so that they can get the scoop on the first look for breaking news research. Just make sure the eventual journal/conference you're targeting understands and accepts pre-prints. If so, then okay to get your material online and timestamp your ideas.

Computational fields are way ahead of the curve here in adopting pre-prints as first publication forum, whereas (bio)medicine and healthcare publications are lagging. In 2019 it remains newsworthy that a major medical journal will even consider it acceptable to post pre-prints before consideration for peer-reviewed publication. https://twitter.com/IAmSamFin/status/1176835504739880962?s=20

Tools

Collaborative Editing [ ] Google Docs - Overleaf for Latex I've pretty much dropped Microsoft Word at this point because the collaborative editing and commenting features of Google Docs makes it so much easier to work with teams (even though it makes it harder to manage figures and line art).
Reference Management [ ] PaperPile, Mendeley, Papers, etc. (http://naepub.com/writing-basics/2016-26-4-4/) Setup an account with one of these ASAP. Don't ever waste your time trying to manually construct a bibliography. Whenever you read even the abstract of an article, just add it to your database. Very often I've otherwise been caught wanting to reference some number of finding I remember reading about a few months ago, but I just can't remember what the article name was.
Figure / Table Management [ ] Many journal submission formats want them as separate files [ ] Keep line art (e.g., graphs/charts) in vectorized graphics format (e.g., eps, svg, wmf/emf). This keeps them looking clean when printing out or zooming in.

Scientific Writing vs. Perspective/Opinion vs. Speaking/Sales Pitch

Colloquial phrases you can get away with in speech generally do not belong in scientific writing. Example: “Disease X causes horrendous loss of life with massive impacts on society.” No, just state the facts/numbers and provide references, and let the data speak for themselves. “Disease X causes 10,000 deaths and an estimated $100 billion in healthcare costs per year.[Reference 5,6]”

Scientific writing is not supposed to be creative writing. Don’t come up with new clever ways to describe the same concept over and over again. Decide on key phrases and stick to the same ones to maintain clarity of your message. If you’re writing a perspective or opinion piece, then feel free to open up to a more colorful writing style.

Benchmark Descriptions

Gold Standard = Best definition / measurement of the "truth." For clinical problems and diagnoses, often no one really knows the truth. In that case, I avoid using "gold standard" and instead just refer to the "reference standard."
Standard of care = What people are actually doing in regular clinical practice
State of the art = Best available in research. Such as the best accuracy achieved in ImageNet competition.

I would not consider these interchangeable, as they each apply to a different context.

Active Voice

Passive Voice: "It has been said...," "This was done..."
Active Voice: "Smith et. al said..., "We did this..."

I never really understood the complaint in college writing classes about passive voice being "boring to read," and doesn't elicit as much "action" as active voice. What made more explicit sense to me is that passive voice phrases are incomplete. They are hiding information about who the "they" are that did the action. To be complete and objective then, you must write in active form to identify who the subject is. This makes it okay to reference yourself ("We did this...").

Structured Communication Outline Suggestions

https://twitter.com/rskudesia/status/1120324415424585728?s=19

Suggested Reviewers

This is where meeting people at conferences and poster sessions can be valuable to see who’s interested in your work. Editors will still pick other people anyway, probably based on the papers you cite.

Responding to Reviews

http://matt.might.net/articles/peer-review-rebuttals/
Desk rejection by the Editor. This is worst case, as means you don't even get comments back to help. For high profile journals (e.g., NEJM, JAMA) at least they'll give you this rejection promptly (<1 week), so you don't lose too much time.
As long as you get some kind of comments from peer reviewers, even a Reject with Review, Russ says can always interpret as a "Major Revisions Necessary" if you're up for redoing a bunch of experiments and analysis to try submitting again vs. moving on to another journal.
When revising your manuscript, keep in "Track Changes" / "Comment" mode as some journals want both the final clean version and the track changes version.
Prepare a separate document that is the response to reviewers.
Through highlighting or blockquotes, alternate between each individual point raised in the review, and your response to how you address it in the revision.
Profusely thank the reviewer for each major point raised (I probably do so excessively), helping you identify areas for clarification and improvement in the manuscript quality for the community.
Even if you disagree with a point they make, social psychology technique: Lead with agreeing with them. "We thank the reviewer for highlighting this point, and agree on the importance of this aspect of a complex issue. To better clarify, we have revised the manuscript..."
Almost anything that you write in these responses should yield at least a one line edit in the manuscript. If you don't make a respective change in the manuscript, then you're just "arguing" with the reviewer in the response. Reference the respective section of the manuscript you revised.
Different styles, but I prefer not to copy the entire contents of the edited section, as it violates principle of "Do Not Repeat Yourself" that can get you into problems with version consistency if you keep editing the manuscript. Likewise if the same response applies to multiple comments, instead of copy-paste, I just "refer to response to Reviewer #1 on the same issue."
Even though you're answering the reviewer critiques, note that it technically remains the (Associate) Editor's decision what do with your manuscript. They just use the reviewers for advice. Means you don't necessarily have to agree with every reviewer suggestion, and even if you answer all of their suggestions, you still may be rejected.

Examples:

Responses to medical informatics paper addressing potential redundancy with conference paper and clinical relevance https://drive.google.com/file/d/0B01s8toiGe4NTE1fcmllOUd2TDA/view?usp=sharing
Responses to clinical research paper that had a hard time understanding informatics applications and contributions. https://drive.google.com/file/d/1ivCDAIRuRU5uy0bbAJDPecMqK7Yt02ja/view?usp=sharing

Publication Fees / Open Access

In general, if we have the budget, better to support Open Science, rapid dissemination, and more citations by paying Open Access publication fees.

Significant Digits

Don't over report. Usually no more than two sig digits, almost never more than three. "The area under receiver operating characteristic achieved was 0.7443 compared to the baseline of 0.6375." Ugh, that's just distracting, just say 0.74 vs. 0.64. Try to avoid reporting small decimals in general, as lay readers have a harder time following. If need, express as "event rate per 1,000 patients" or something.

To the best of our knowledge, this is the first study to...

I don't like this phrase at all. It's usually fake. Peer reviewer just has to find one counter-example, and you've got egg on your face. Just say this study is important because it does X. Unstated, but implied it's something new. If reviewer doesn't know any other counter-examples, then they should go along with you.

Press Release

After paper is accepted, consider contacting your School's press department (https://med.stanford.edu/mednews.html) with a copy of the accepted manuscript to tell them how cool the work is, and implications for a lay audience. Give them a handful of bullet points that will grab headlines for the lay press. This is NOT intended for other scientists or clinicians. This description is meant to get news reporters interested. Journals usually have an embargo period where you're not supposed to advertise or release study findings until the paper is published. It's okay to contact press department in the meantime. The whole point is so that news and press departments can get a heads up, prep a story, and know not to release the story until the embargo is lifted. Ideally, a news story wants to be timely, so they want to synchronize the story on the exact same day the article is published, which means they need to know about it beforehand.

Notes from discussion with Stanford OCPA School of Medicine press Options:

Scope Blog: Designed to be easy for anyone to read (100K unique monthly visitors + Tweet out)
Stanford Medicine Magazine (quarterly)
Inside Stanford Medicine
Press Release ~1 week prior to paper coming out. OCPA will pitch to external journalists to write about it based on their own database of contacts.

Usually just pass papers along with pitch notes to OCPA reps, then they'll figure out what to do. Looking for stories that "'sister" would respond to. Smart but of general interest. Could be hot, timely topic trying to feed news cycle, or expectation defying novel results. Deep mechanistic papers or incremental changes on top of things already written about are likely important for science, but less likely going to work for news. Can also be past a tipping point. E.g., Had a few hot papers on AI image recognition that got a lot of attention, but many of them now so no one is picking up those stories anymore.

Stanford HAI Blog is another popular venue that is often interested in writing up many of our pieces. Here's what their content manager said before:

Thanks for reaching out! I run the HAI blog and am always seeking great commentary on AI in medicine.

I’m happy to write about specific research that has interesting implications, or do more high level commentary (we’ve done stories recently about how to evaluate medical AI tools, how useful LLMs are in medicine, etc.). For new research papers, I usually assign a writer to interview a faculty member, and for commentary, I usually ask the faculty to pen the article.

Oh, and I often work with Hanae Armitage so we’re not duplicating stories (and am looking forward to seeing her upcoming Q&A with you).

Let me know, Shana Lynch

Example Message sent to a Press Office / Reporter

We have a paper just accepted to JAMA Network Open that could be of broad interest.
(Attached, but still going through copyedit phase with journal. Public release date pending.)
 
Basic idea: We use Machine Learning algorithms on Electronic Medical Record data to predict the results of hospital diagnostic tests BEFORE the tests are even ordered.
The point: There’s little reason to expose patients to harm or waste scarce healthcare resources checking tests when the results are highly predictable (unlikely to give new information).
Headline: Machine learning creates opportunity to systematically identify low-value medical testing
 
Narrative outline: 
-	Healthcare reform debates are a political dogfight over containing out of control healthcare costs.
-	Especially frustrating that up to 1/3 of the $3 trillion we spend on healthcare is estimated to be waste
-	We focus on the most common medical procedure, laboratory testing, with estimates that up to ½ of hospital lab tests are medically unnecessary
-	Our study identifies clearly wasteful activity, such as repetitive orders for Hemoglobin A1c (diabetes test), even when it’s impossible for the value to change that quickly
-	We generalize this idea using machine learning algorithms to identify hundreds of thousands of lab tests in the hospital that are likely wasteful when their results could be predicted in advance.
 
Keywords this will hit:
-	Healthcare costs
-	Choosing Wisely
-	Diagnostic Excellence
-	Machine Learning / Artificial Intelligence / Big Data
-	Multi-Institution Collaboration (we worked with UCSF and U Michigan on this one)
 
This could be a nice follow-up to an article in the Washington Post:
https://www.washingtonpost.com/news/posteverything/wp/2018/10/05/feature/doctors-are-surprisingly-bad-at-reading-lab-results-its-putting-us-all-at-risk/?utm_term=.1511624cddeb

Social Media

Can and should post your work as it comes out to social media (e.g., LinkedIn, X/Twitter) for additional exposure and potential to connect with others.

Can be as simple as just posting a link to the article when it's out, but you'll get much better response with an image (e.g., graphical abstract or key results figure) and statement of the key conclusion/statement. If have the time, even better to have a few items in a row to outline describe the paper (for those who will read for 3 minutes, but won't read the whole paper).

Some examples below (not perfect, but gets some ideas).

https://x.com/jonc101x/status/1668132695573155840

https://x.com/jyx_su/status/1791515982080675988

https://x.com/jonc101x/status/1779877478124986586

https://x.com/jonc101x/status/1760088082752401670

Acknowledgements

Be sure to acknowledge any funding sources in paper. They specifically look for their grant numbers when compiling reports on measurable outputs. Usually add a disclaimer here as well. For example:

Dr. Chen has received research funding support in part by

NIH/National Institute of Allergy and Infectious Diseases (1R01AI17812101)
NIH-NCATS-Clinical & Translational Science Award (UM1TR004921)
NIH/National Institute on Drug Abuse Clinical Trials Network (UG1DA015815 - CTN-0136)
Stanford Bio-X Interdisciplinary Initiatives Seed Grants Program (IIP) [R12] [JHC]
NIH/Center for Undiagnosed Diseases at Stanford (U01 NS134358)
Stanford Institute for Human-Centered Artificial Intelligence (HAI)
Gordon and Betty Moore Foundation (Grant #12409)

This research used data or services provided by STARR, STAnford medicine Research data Repository,” a clinical data warehouse containing live Epic data from Stanford Health Care (SHC), the University Healthcare Alliance (UHA) and Packard Children’s Health Alliance (PCHA) clinics and other auxiliary data from Hospital applications such as radiology PACS. The STARR platform is developed and operated by Stanford Medicine Research IT team and is made possible by Stanford School of Medicine Research Office.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH, Stanford Healthcare, or any other organization.

(Financial) Conflict of Interest Statements

Jonathan H. Chen

Co-founder of Reaction Explorer LLC that develops and licenses organic chemistry education software.
Paid medical expert witness fees from Sutton Pierce, Younker Hyde MacFarlane, Sykes McAllister, Elite Experts.
Paid consulting fees from ISHI Health.
Paid honoraria or travel expenses for invited presentations by insitro, General Reinsurance Corporation, Cozeva, and other industry conferences, academic institutions, and health systems.

Preparing IRB protocols

We have a general IRB protocol for data-mining medical records if people need access to potentially identifiable data.

If there is a need more for specific projects or interactions, such as user surveys, testing, clinical research deployments, etc., then individuals will need to draft their own IRB proposals on the planned work. See an example of IRB Protocol for data mining here and for user-testing here.