7.5.1. Develop documentation and reports in RStudio - sj50179/Google-Data-Analytics-Professional-Certificate GitHub Wiki

R Markdown

  • A file format for making dynamic documents with R

R Notebook

  • An interactive R Markdown option that lets users run the code and displays the graphs and charts that the code produces

R Markdown resources

R Markdown is a useful tool that allows you to save and execute code, and generate shareable reports for stakeholders. As you learn more about how to use it, it can be helpful to bookmark some resources to refer to later.

This reading explores some great online resources that will help you learn more about R Markdown and how to use it to document your analysis.

R Markdown documentation

RStudio's R Markdown documentation includes a series of tutorials that will help you learn about the main features of R Markdown, including code chunks, output formats, notebooks, interactive documents, and more. The tutorials include online lessons that you can complete directly in your RStudio Cloud workspace.

R Markdown reference materials

RStudio has developed a reference guide and a cheat sheet that you can bookmark and use whenever you practice writing R Markdown files.

  • The R Markdown Reference Guide contains three sections: Markdown syntax, knitr chunk options, and Pandoc options. The guide is super detailed and includes tons of examples and explanations so that you can easily find the exact information you need to customize your R Markdown documents.
  • The R Markdown Cheat Sheet is a convenient summary of the different steps and workflow processes for R Markdown. It also includes sections with abbreviated explanations of knitr chunk options and Pandoc options, and other useful information to review or look up while you work.

R for Data Science book

For a well-organized introduction to the basics of R Markdown, check out the Communicate section of the R for Data Science book. It covers the main features and functions of R Markdown, the various output formats, and the workflow for combining text and code to create an analysis notebook.

R Markdown: The Definitive Guide

If you want to really explore the capabilities of R Markdown in a systematic way, R Markdown: The Definitive Guide provides a comprehensive guide to the R Markdown ecosystem. This book contains four main parts:

  1. Part I explains how to install the relevant packages and offers an overview of R Markdown, including the syntax for Markdown and code chunks.
  2. Part II provides detailed documentation of the built-in output formats included in R Markdown, like document formats and presentation formats.
  3. Part III shares several R Markdown extension packages that allow you to build different applications or generate output documents with different styles.
  4. Part IV covers advanced topics in R Markdown.

Optional: Jupyter notebooks

Jupyter notebooks are documents that contain computer code and rich text elements – such as comments, links, or descriptions of your analysis and results. You will find them used in a variety of online tools, including Project Jupyter, Kaggle, and Google Colaboratory ("Colab" for short). These notebooks can be executable documents that you can run to perform an analysis.

Jupyter notebooks can come in handy with everything from data cleaning and transformation, to statistical modeling and visualizations. They are compatible with R, so you can consider them as an alternative to R Markdown. And just like R Markdown documents, you can easily share Jupyter notebooks with team members and stakeholders.

Jupyter notebooks in Kaggle

If you are working in Kaggle, there are two types of notebooks available: Jupyter notebooks and scripts (including R Markdown scripts). For more information, refer to the How to Use Kaggle Notebooks page.

Jupyter notebooks in Google Colab

Google Colab is a product from Google Research. Colab is a hosted Jupyter notebook service that requires no setup to use. For more information, refer to the Welcome to Colaboratory page.

Additional resources

To learn more about Jupyter notebooks, check out these resources:

  • Project Jupyter: This is the home of Jupyter notebooks, as well as JupyterLab – the web-based interactive development environment for Jupyter notebooks, code, and data.
  • Jupyter Notebook: An Introduction: This detailed introduction of Jupyter notebooks comes from the people at Real Python, a tutorial-based site devoted to all things Python. You can take a video course or follow the written tutorial to get started with Jupyter notebooks and learn about its features and capabilities.

And, just like R Markdown, Jupyter notebooks include basic formatting tools and rules that will help you keep your work organized and user-friendly. In fact, Jupyter uses Markdown as its language for writing and formatting text in a notebook.

To learn about basic formatting in Jupyter notebooks, check out these resources:

  • The Jupyter Notebook: This resource provides an overview of Jupyter notebooks, including information about the structure of the user interface and notebook document. You’ll also learn about the basic workflow for using a notebook document, along with information about keyboard shortcuts and other features that will help you format your work.
  • Using Jupyter Notebook for Writing: This resource focuses on how to use Markdown to format your writing in a Jupyter notebook. Use this as a guide to manage the syntax of your writing, including making titles and subtitles and adding links.
  • The Jupyter Notebook Formatting Guide: This resource includes a wide variety of formatting options for Jupyter notebooks. You’ll learn about the basics as well as some more advanced options, like embedding PDF documents and videos.

After you know how to apply basic formatting to your notebooks, you can start exploring more advanced options.

Question

The knit button can be used to save an R Markdown document as a shareable HTML report.

  • True
  • False

Correct. The knit button creates a shareable HTML report of the R Markdown file.

Question

A data analyst has code chunks in their R Markdown file. How do they appear in an HTML report?

  • HTML code
  • Generated output
  • Attachments
  • Plain text

Correct. The code chunks are run and the output appears in the HTML report.

Hands-On Activity: Your R Markdown notebook

Get started with R Markdown

R Markdown is a file format for making dynamic documents with R. These documents, also known as notebooks, are records of analysis that help you, your team members, and stakeholders understand what you did in your analysis to reach your conclusions. You can publish a notebook as an HTML, PDF, or Word file, or in another format like a slideshow.

At any point during this activity, you can consult the R Markdown Cheat Sheet. This resource is a reference guide for all things R Markdown: from opening a file to publishing a final report of your analysis.

Select and review your analysis

In this course, you’ve had the chance to practice and save files of your analysis in RStudio. To get started, open up an analysis that you have saved.

You can use Open File in the File menu, or you can use the Files tab in the bottom-right viewer pane.

Now, review the file you opened. Examine the data you pulled from and the functions you used to analyze it.

When you create an R Markdown notebook, you want to be able to share it with others so they can understand your process and conclusions. You may also want to keep it for your own records as a way to keep track of your progress using R for analysis.

Open an Rmd file

Now, you’ll transfer the code from the file you opened to a new R Markdown file so that you can write your own explanation of the steps you took. By doing this, you can create a more complete record of your overall thought process so that others will be able to understand it.

  1. Open a new R Markdown (Rmd) file to begin building the basic structure of your notebook. Select File -> New File -> R Markdown.

  2. In the dialog box that opens, add a title for your notebook. Name it something that will help you easily recognize what your analysis is about (e.g., “Penguins Plots”).

  3. Type your name in the Author field.

  4. For now, leave the file in the recommended HTML output format. When you render the file later, it will appear as an HTML report. You can always change it to a PDF or Word file later.

  5. Click OK. An R Markdown file will appear in a new tab in the script viewer pane. You should now have two tabs: one for the new Rmd file and one for your analysis. You can toggle back and forth between them by clicking the tab you want to access.

Format your notebook

The first part of your notebook is the YAML header section. YAML is a human-readable language for structured data, and the YAML header holds information about the document, such as its title, author, date, and output format. RStudio automatically populates this section with the information you provide and other general information, such as the date you create the file.
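To make the header fields concrete, here is a minimal sketch in R that writes a small Rmd file and reads its YAML header back. The author and date values are placeholders, and the temporary file is only for illustration. It assumes the rmarkdown package is installed.

```r
# Sketch: write a tiny Rmd file, then read its YAML header back.
# The author and date values are placeholders, not from a real analysis.
library(rmarkdown)

rmd_lines <- c(
  "---",
  'title: "Penguins Plots"',
  'author: "Your Name"',
  'date: "2021-06-01"',
  "output: html_document",
  "---",
  "",
  "Body text goes here."
)

# Save the lines to a temporary .Rmd file
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_path)

# yaml_front_matter() returns the header fields as a named list
header <- yaml_front_matter(rmd_path)
print(header$title)
print(header$output)
```

Editing a value between the two `---` lines and reading the header again is one way to confirm that a change was picked up.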

You can change the information in this section at any time by adding text or by typing over the current text. Notice that each line has a number associated with it. That makes it easy to reference a location in the notebook and also for you to track where you make changes in the notebook.

The next section with the gray background is a code chunk. You encountered these each time you ran a chunk of code during the activities in this course.

Again, RStudio automatically populates the notebook with this formatted default code chunk. In the default template, this chunk sets an option so that your code will be shown in your final report when you render it.
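As a rough illustration, the default chunk in recent RStudio templates typically contains a single knitr call like the one below. Treat the exact contents as an assumption and compare them with your own new file, since templates can differ between versions.

```r
# Contents of the default "setup" chunk in the RStudio template (assumed;
# check your own new file, since templates can vary by version).
# echo = TRUE tells knitr to print each chunk's code in the rendered report.
knitr::opts_chunk$set(echo = TRUE)

# Confirm the current default that later chunks will inherit
knitr::opts_chunk$get("echo")
```

Setting `echo = FALSE` instead would hide the code in the report while still showing its output.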

All code chunks begin and end with delimiters. To start a code chunk, type three tick marks (backticks) followed by a lowercase “r” in curly brackets: ```{r}

To end it, type just the three tick marks: ```

There are two shortcuts for adding a code chunk. On your keyboard, you can press Ctrl + Alt + I (PC) or Cmd + Option + I (Mac). Or you can click the Add Chunk command in the editor toolbar.

To add a code chunk to your Rmd file, follow these steps:

  1. Click the end of the last line of your Rmd file. Use either of the previously-mentioned shortcuts to create a code chunk.

  2. Press Enter (Windows) or Return (Mac) two or three times after the default code chunk to create space between the existing code chunk and the next code chunk you will add.

  3. Copy the code from the analysis file you opened earlier and paste it in the gray area between the beginning and ending delimiters.

  4. Select the rest of the template content in the file and delete it. This gives you a blank space to work in to help avoid potential errors from mixing your own comments and code with the pre-existing ones in the template.
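As an example of step 3, the code pasted between the delimiters might look like the sketch below. It is a hypothetical analysis, not the one from your saved file, and it assumes the ggplot2 and palmerpenguins packages are installed.

```r
# Hypothetical analysis code to paste between the chunk delimiters.
# Assumes the ggplot2 and palmerpenguins packages are installed.
library(ggplot2)
library(palmerpenguins)

# Scatterplot of flipper length against body mass, colored by species
penguin_plot <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm,
                           y = body_mass_g,
                           color = species))

# Printing the plot object draws it below the chunk
penguin_plot
```

When the chunk runs, the plot appears directly beneath it in the editor and later in the rendered report.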

The white background is where you will type plain text with markdown syntax. As you learned earlier in this course, markdown is a syntax for formatting plain text files. Using markdown makes it easier to write and format text in your notebook.

Here are some basic formatting options:

  • To start a new line, end a line with two spaces; to start a new paragraph, leave a blank line
  • To apply italics to a word or phrase, place an asterisk at the beginning and at the end of the word or phrase, for example, *italics*
  • To apply bold to a word or phrase, place two asterisks at the beginning and at the end of the word or phrase, for example, **bold**
  • To create a header, type a hashtag (#) followed by a space and your text, for example: # Getting Started with R Markdown

When creating headers keep the following in mind:

  • Headers will appear in blue
  • A single hashtag is the largest header
  • The more hashtags you add (up to six), the smaller the header
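The formatting rules above can be sketched with a few lines of Markdown, stored here as R strings so they can be printed and inspected from the console. The subheader wording is placeholder text.

```r
# Sketch: Markdown text as it would be typed in the white area of an Rmd file.
# The wording is placeholder text, stored as strings so it can be printed from R.
markdown_text <- c(
  "# Getting Started with R Markdown",
  "",
  "## Load the data",
  "",
  "### Notes",
  "",
  "This phrase is *italic* and this one is **bold**."
)

# Print the raw Markdown exactly as it would appear in the editor
writeLines(markdown_text)

# Count the hashtags on each header line: more hashtags means a smaller header
header_lines <- grep("^#{1,6} ", markdown_text, value = TRUE)
header_levels <- nchar(sub(" .*$", "", header_lines))
print(header_levels)
```

In an actual Rmd file, you would type only the text inside the quotation marks.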

To format comments in your notebook, follow these steps:

  1. Click in a line above the code chunk you added but below the YAML section.

  2. Type a main header for your report using a single hashtag. You might want to restate the title in the YAML in a different way or add to it with a short description.

  3. Add a smaller header below that to label the first part of your programming. Follow that with a description of the code chunk that you added.

Tick marks format text to appear as code even when the text is not in a code chunk. For example, placing tick marks around the package names tidyverse and palmerpenguins in a description creates a gray background behind those words.

Continue formatting

Keep working on your formatting until you have at least three levels of headers and more descriptions for your analysis. At any point, you can click Knit in the script pane to render the file.

When you render your file, you can preview how it will look in the format you selected when you opened the file. In this example, you will preview an HTML file.
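Rendering can also be done from the R console. The sketch below builds a small Rmd file in a temporary folder and renders it with `rmarkdown::render()`, which is roughly what happens when you click Knit. The file name and contents are illustrative only. It assumes the rmarkdown package and Pandoc are available (RStudio bundles Pandoc).

```r
# Sketch: render an Rmd file from the console, similar to clicking Knit.
# Assumes rmarkdown and Pandoc are available (RStudio bundles Pandoc).
library(rmarkdown)

# Build the chunk delimiters from single backticks
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  'title: "Penguins Plots"',
  "output: html_document",
  "---",
  "",
  "# Summary",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",   # built-in dataset, so no extra packages are needed
  fence
)

# Hypothetical file name in a temporary folder
rmd_path <- file.path(tempdir(), "penguins-plots.Rmd")
writeLines(rmd_lines, rmd_path)

# render() runs the chunks and returns the path of the finished report
html_path <- render(rmd_path, output_format = "html_document", quiet = TRUE)
file.exists(html_path)
```

Changing `output_format` to `"word_document"` would produce a Word file instead, and `"pdf_document"` would produce a PDF if a LaTeX installation is available.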

Test your knowledge about documentation and reports


Question 1

Fill in the blank: Markdown is a _____ for formatting plain text files.

  • guide
  • syntax
  • coding language
  • file application

Correct. Markdown is a syntax for formatting plain text files.

Question 2

A data analyst creates an interactive version of their R Markdown document to share with other users that allows them to execute code the analyst wrote. What did they create?

  • An HTML report
  • A code chunk
  • An R notebook
  • A markdown

Correct. They created an R notebook, which is an interactive R Markdown option. It lets users run code from the R Markdown document and displays charts and graphs to visualize that code.

Question 3

A data analyst wants to convert their R Markdown file into another format. What are their options? Select all that apply.

  • Dashboard
  • JPEG, PNG, and GIF
  • HTML, PDF, and Word
  • Slide presentation

Correct. R Markdown files can be converted into HTML, PDF and Word, slideshow presentations, or dashboards.

Question 4

A data analyst has finished editing their R Markdown file and wants to save it as an HTML report. What tool will they use?

  • Knit
  • Hashtags
  • Save
  • Output

Correct. The knit button will produce a report containing all text, code, and results from the R Markdown file.