7.5.1. Develop documentation and reports in RStudio
R Markdown is a useful tool that allows you to save and execute code, and generate shareable reports for stakeholders. As you learn more about how to use it, it can be helpful to bookmark some resources to refer to later.
This reading explores some great online resources that will help you learn more about R Markdown and how to use it to document your analysis.
RStudio's R Markdown documentation includes a series of tutorials that will help you learn about the main features of R Markdown, including code chunks, output formats, notebooks, interactive documents, and more. The tutorials include online lessons that you can complete directly in your RStudio Cloud workspace.
RStudio has developed a reference guide and a cheat sheet that you can bookmark and use whenever you practice writing R Markdown files.
- The R Markdown Reference Guide contains three sections: Markdown syntax, knitr chunk options, and Pandoc options. The guide is super detailed and includes tons of examples and explanations so that you can easily find the exact information you need to customize your R Markdown documents.
- The R Markdown Cheat Sheet is a convenient summary of the different steps and workflow processes for R Markdown. It also includes sections with abbreviated explanations of knitr chunk options and Pandoc options, and other useful information to review or look up while you work.
For a well-organized introduction to the basics of R Markdown, check out the Communicate section of the R for Data Science book. It covers the main features and functions of R Markdown, the various output formats, and the workflow for combining text and code to create an analysis notebook.
If you want to really explore the capabilities of R Markdown in a systematic way, R Markdown: The Definitive Guide provides a comprehensive guide to the R Markdown ecosystem. This book contains four main parts:
- Part I explains how to install the relevant packages and offers an overview of R Markdown, including the syntax for Markdown and code chunks.
- Part II provides detailed documentation of the built-in output formats included in R Markdown, like document formats and presentation formats.
- Part III shares several R Markdown extension packages that allow you to build different applications or generate output documents with different styles.
- Part IV covers advanced topics in R Markdown.
Jupyter notebooks are documents that contain computer code and rich text elements – such as comments, links, or descriptions of your analysis and results. You will find them used in a variety of online tools, including Project Jupyter, Kaggle, and Google Colaboratory ("Colab" for short). These notebooks can be executable documents that you can run to perform an analysis.
Jupyter notebooks can come in handy with everything from data cleaning and transformation, to statistical modeling and visualizations. They are compatible with R, so you can consider them as an alternative to R Markdown. And just like R Markdown documents, you can easily share Jupyter notebooks with team members and stakeholders.
If you are working in Kaggle, there are two types of notebooks available: Jupyter notebooks and scripts (including R Markdown scripts). For more information, refer to the How to Use Kaggle Notebooks page.
Google Colab is a product from Google Research. Colab is a hosted Jupyter notebook service that requires no setup to use. For more information, refer to the Welcome to Colaboratory page.
To learn more about Jupyter notebooks, check out these resources:
- Project Jupyter: This is the home of Jupyter notebooks, as well as JupyterLab – the web-based interactive development environment for Jupyter notebooks, code, and data.
- Jupyter Notebook: An Introduction: This detailed introduction of Jupyter notebooks comes from the people at Real Python, a tutorial-based site devoted to all things Python. You can take a video course or follow the written tutorial to get started with Jupyter notebooks and learn about its features and capabilities.
And, just like R Markdown, Jupyter notebooks include basic formatting tools and rules that will help you keep your work organized and user-friendly. In fact, Jupyter uses Markdown as its language for writing and formatting text in a notebook.
To learn about basic formatting in Jupyter notebooks, check out these resources:
- The Jupyter Notebook: This resource provides an overview of Jupyter notebooks, including information about the structure of the user interface and notebook document. You’ll also learn about the basic workflow for using a notebook document, along with information about keyboard shortcuts and other features that will help you format your work.
- Using Jupyter Notebook for Writing: This resource focuses on how to use Markdown to format your writing in a Jupyter notebook. Use this as a guide to manage the syntax of your writing, including making titles and subtitles and adding links.
- The Jupyter Notebook Formatting Guide: This resource includes a wide variety of formatting options for Jupyter notebooks. You’ll learn about the basics as well as some more advanced options, like embedding PDF documents and videos.
After you know how to apply basic formatting to your notebooks, you can start exploring more advanced options.
Earlier in this course, you worked on activities that were presented in an R Markdown (Rmd) file format. Data analysts use this file format to make dynamic documents—called notebooks—with R. In this activity, you’ll copy the analysis you did in a past activity into your own R Markdown notebook.
By the time you complete this activity, you will know how to create R Markdown documents to record your analysis in R. This will allow you to keep track of your data analysis process and share your work with others.
R Markdown is a file format for making dynamic documents with R. These documents, also known as notebooks, are records of analysis that help you, your team members, and stakeholders understand what you did in your analysis to reach your conclusions. You can publish a notebook as an HTML, PDF, or Word file, or in another format like a slideshow.
At any point during this activity, you can consult the R Markdown Cheat Sheet. This resource is a reference guide for all things R Markdown: from opening a file to publishing a final report of your analysis.
In this course, you’ve had the chance to practice and save files of your analysis in RStudio. To get started, open up an analysis that you have saved.
You can use Open File in the File menu.
Now, review the file you opened. Examine the data you pulled from and the functions you used to analyze it.
When you create an R Markdown notebook, you want to be able to share it with others so they can understand your process and conclusions. You may also want to keep it for your own records as a way to keep track of your progress using R for analysis.
Now, you’ll transfer the code from the file you opened to a new R Markdown file so that you can write your own explanation of the steps you took. By doing this, you can create a more complete record of your overall thought process so that others will be able to understand it.
- Open a new R Markdown (Rmd) file to begin building the basic structure of your notebook. Select File -> New File -> R Markdown.
- In the dialog box that opens, add a title for your notebook. Name it something that will help you easily recognize what your analysis is about (e.g., “Penguins Plots”).
- Type your name in the Author field.
- For now, leave the file in the recommended HTML output format. When you render the file later, it will appear as an HTML report. You can always change it to a PDF or Word file later.
- Click OK. An R Markdown file will appear in a new tab in the script viewer pane. You should now have two tabs: one for the new Rmd file and one for your analysis. You can toggle back and forth between them when you need to by clicking on the tab you want to access.
The first part of your notebook is the YAML header section. YAML is a human-readable data format, and the YAML header holds information about the document, such as its title, author, date, and output format. RStudio automatically populates this section with the information you provide and other general information, such as the date you create the file.
You can change the information in this section at any time by adding text or by typing over the current text. Notice that each line has a number associated with it. That makes it easy to reference a location in the notebook and also for you to track where you make changes in the notebook.
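As a rough illustration, the YAML header that RStudio generates for a new notebook typically resembles the sketch below. The title matches the example used in this activity; the author name and date are placeholders, and the exact date format may differ depending on your RStudio version.

```markdown
---
title: "Penguins Plots"
author: "Your Name"      # placeholder: whatever you typed in the Author field
date: "2024-01-15"       # placeholder: RStudio fills in the creation date
output: html_document    # can later be changed to pdf_document or word_document
---
```

The three dashes on the first and last lines mark where the header begins and ends.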
The next section with the gray background is a code chunk. You encountered these each time you ran a chunk of code during the activities in this course.
Again, RStudio automatically populates the notebook with this formatted default code chunk. It sets an option so that your code will be shown in your final report when you render it.
All code chunks begin and end with delimiters. To start a code chunk, you can type three tick marks followed by a lowercase “r” in curly brackets: `` ```{r} ``
To end it, type just the three tick marks: `` ``` ``
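For illustration, here is roughly how the default chunk in a new RStudio template and a minimal chunk of your own look in the source file. The `summary(cars)` line is only a stand-in for real analysis code; `cars` is a dataset built into R.

````markdown
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r}
summary(cars)
```
````

In the first chunk, `echo = TRUE` tells knitr to show code in the rendered report, while `include=FALSE` keeps the setup chunk itself out of the report.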
There are two shortcuts to adding code. On your keyboard, you can press Ctrl + Alt + I (PC) or Cmd + Option + I (Mac). Or you can click the Add Chunk command in the editor toolbar.
To add a code chunk to your Rmd file, follow these steps:
- Click the end of the last line of your Rmd file. Use either of the previously-mentioned shortcuts to create a code chunk.
- Press Enter (Windows) or Return (Mac) two or three times after the default code chunk to create space between the existing code chunk and the next code chunk you will add.
- Copy the code from the analysis file you opened earlier and paste it in the gray area between the beginning and ending delimiters.
- Select the rest of the template content in the file and delete it. This gives you a blank space to work in to help avoid potential errors from mixing your own comments and code with the pre-existing ones in the template.
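After these steps, the pasted chunk might look something like the sketch below. The plot is a hypothetical stand-in for whatever analysis you saved earlier, and it assumes the tidyverse and palmerpenguins packages are already installed.

````markdown
```{r}
# Load the packages used in the analysis
library(tidyverse)
library(palmerpenguins)

# Scatterplot of flipper length against body mass
ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
```
````

Everything between the two delimiter lines runs as R code when the chunk is executed or the file is rendered.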
The white background is where you will type plain text with markdown syntax. As you learned earlier in this course, markdown is a syntax for formatting plain text files. Using markdown makes it easier to write and format text in your notebook.
Here are some basic formatting options:
- To start a new paragraph, leave a blank line between blocks of text; to break a line within a paragraph, end the line with two spaces
- To apply italics to a word or phrase, place an asterisk at the beginning and at the end of the word or phrase, for example, *italics works*
- To apply bold to a word or phrase, place two asterisks at the beginning and at the end of the word or phrase, for example, **bold is useful**
- To create a header, type a hashtag (#) followed by a space and your text, for example: # Getting Started with R Markdown
When creating headers keep the following in mind:
- Headers will appear in blue
- A single hashtag is the largest header
- The more hashtags you add (up to six), the smaller the header
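The formatting options and header rules above can be sketched together in a short piece of notebook text. The wording is made up for illustration; only the symbols matter.

```markdown
# Getting Started with R Markdown

## A second-level header

### A third-level header

This sentence has *italics* and **bold** text.

The blank line above starts a new paragraph.
```

Each extra hashtag moves the header down one level, so the third header here renders smaller than the second.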
To format comments in your notebook, follow these steps:
- Click in a line above the code chunk you added but below the YAML section.
- Type a main header for your report using a single hashtag. You might want to restate the title in the YAML in a different way or add to it with a short description.
- Add a smaller header below that to label the first part of your programming. Follow that with a description of the code chunk that you added.
Tick marks format the text to appear as code even though the text is not in a code chunk. For example, placing a tick mark on each side of “tidyverse” and “palmerpenguins” in a description gives each word a gray background.
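One possible layout for the top of the notebook, with a main header, a smaller header, and a description that uses tick marks, is sketched below. The header and description wording is hypothetical.

````markdown
# Penguins Plots: Exploring the Palmer Penguins Data

## Setting up my environment

Notes: setting up my R environment by loading the `tidyverse` and `palmerpenguins` packages.

```{r}
library(tidyverse)
library(palmerpenguins)
```
````

The single tick marks around the package names affect only how the text looks; unlike the chunk below them, they do not run any code.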
Keep working on your formatting until you have at least three levels of headers and more descriptions for your analysis. At any point, you can click Knit in the script pane to render the file.
When you render your file, you can preview how it will look in the format you selected when you opened the file. In this example, you will preview an HTML file.
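The Knit button calls `rmarkdown::render()`, so the same step can be sketched from the R console. The file name and notebook content below are hypothetical stand-ins, and the render step is skipped if the rmarkdown package or Pandoc is not available.

```r
# Three tick marks, built programmatically so they can be written into the file
fence <- strrep("`", 3)

# Write a tiny stand-in notebook to a temporary folder (hypothetical content)
rmd_path <- file.path(tempdir(), "penguins_plots.Rmd")
writeLines(c(
  "---",
  "title: \"Penguins Plots\"",
  "output: html_document",
  "---",
  "",
  "# Penguins Plots",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_path)

# Render only if the rmarkdown package and Pandoc are both available
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, output_format = "html_document", quiet = TRUE)
  print(html_path)  # location of the rendered HTML report
}
```

Changing `output_format` to `"pdf_document"` or `"word_document"` would request a different file type, although PDF output also needs a LaTeX installation.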
Suppose you include the header ## Conclusion in your R Markdown Notebook. How can you change this header to make it smaller?
A. Add another hashtag
B. Remove a hashtag
C. Add another space between the hashtags and the title
D. Remove the space between the hashtags and the title
The correct answer is A. Add another hashtag. Explain: To make a header smaller, add more hashtags before the title. For example, ### Conclusion would be smaller than ## Conclusion. Going forward, you can use R Markdown syntax to format your notebooks. This gives you creative freedom in how you want to present your analysis when you share it with others.
In this activity, you made your own R Markdown notebook. In the text box below, write 2-3 sentences (40-60 words) in response to each of the following questions:
- How can you make use of R Markdown notebooks in the future?
- What formatting did you use in this R Markdown notebook that will make it easier for others to understand your analysis?
Explain: Congratulations on completing this hands-on activity! A good response would include that R Markdown notebooks are valuable resources for creating records and reports of your analysis in R and formatting your work to share with others.
Creating an R Markdown notebook is a useful way to keep track of your analyses for your own purposes. You can also use notebooks to create final reports of your analysis to share with others. Going forward, you can take advantage of resources like the R Markdown Cheat Sheet to make your notebooks more effective tools.
Fill in the blank: Markdown is a _____ for formatting plain text files.
A. syntax
B. coding language
C. guide
D. file application
The correct answer is A. syntax. Explain: Markdown is a syntax for formatting plain text files.
A data analyst creates an interactive version of their R Markdown document to share with other users that allows them to execute code the analyst wrote. What did they create?
A. A markdown
B. An HTML report
C. An R notebook
D. A code chunk
The correct answer is C. An R notebook. Explain: They created an R notebook, which is an interactive R Markdown option. It lets users run code from the R Markdown document and displays charts and graphs to visualize that code.
A data analyst wants to convert their R Markdown file into another format. What are their options? Select all that apply.
- Dashboard
- HTML, PDF, and Word
- Slide presentation
- JPEG, PNG, and GIF
The correct answers are Dashboard; HTML, PDF, and Word; and Slide presentation. Explain: R Markdown files can be converted into HTML, PDF, and Word files, slideshow presentations, or dashboards.
A data analyst has finished editing their R Markdown file and wants to save it as an HTML report. What tool will they use?
A. Output
B. Hashtags
C. Save
D. Knit
The correct answer is D. Knit. Explain: The Knit button will produce a report containing all text, code, and results from the R Markdown file.