Course 7‐5 - Forestreee/Data-Analytics GitHub Wiki

Google Data Analytics Professional

[Data Analysis with R Programming]

WEEK5 - Documentation and reports

When you’re ready to save and present your analysis, R has different options to consider. In this part of the course, you’ll explore R Markdown, a file format for making dynamic documents with R. You’ll find out how to format and export R Markdown, including how to incorporate R code chunks in your documents.

Learning Objectives

  • Demonstrate an understanding of how to export R Markdown notebooks
  • Incorporate R code chunks into R Markdown notebooks
  • Use basic formatting in R Markdown to create structure and emphasize content
  • Describe the R Markdown notebooks and their use to document R programming code
  • Create and outline a structure for an R Markdown notebook
  • Access and use a customized R Markdown template included in an R package
  • Demonstrate an understanding of the uses of R Markdown templates

Develop documentation and reports in RStudio

Documentation and reports

Hi, and welcome back. We've covered a lot in our time working in R. We've learned the ins and outs of R and RStudio including how to analyze and visualize your data.

Now, you'll learn how to document and report your work using R Markdown. R Markdown is a file format for making dynamic documents with R. You can use an R Markdown file as a code notebook to save, organize, and document your analysis using code chunks, comments, and other features. When you finish your data cleaning and exploration, you can create a report in R Markdown to summarize your findings for stakeholders.

The core of the work we do in my department involves analytics. When my team started getting bigger, we noticed we didn't have a shared language for data analysis. So, there was an effort for everyone to learn R so we could collaborate more easily. Now, everyone speaks the same programming language. We can review each other's code, which has led to more consistency and collaboration and better analysis. R Markdown reports are great for sharing knowledge. These reports let anyone from a small group of online users to a large company share and reproduce analysis.

In this course, we'll start with an overview of R Markdown, and then we'll learn how to install R Markdown in RStudio. After that, we'll check out how to create an R Markdown document. We'll also explore the structure and components of the document, so you'll have an idea of how to use them to record and report your analysis. Next, we'll show you how to insert and edit pieces of code called chunks into your document. Finally, we'll check out the process of exporting your documentation. It's always good to have a report of the analysis you've done, both for yourself and your stakeholders. After that, we'll wrap up our time with R. Of course, your time with R can keep going and you can keep practicing.

I hope you'll put R to good use in your future job as a data analyst too. It's a huge advantage in your career. Speaking of your career, when you've completed everything in our program, you'll have the chance to add to or start building your portfolio by completing a case study. This is a great way to showcase all the skills you've learned so far, and stand out to future employers. We'll talk more about this project later.

In the meantime, let's get back into the R groove. See you soon.

Overview of R Markdown

Hi again. As a data analyst, you'll need to refer to your analysis at a moment's notice. You might need to share it with your fellow team members or a stakeholder might ask about one of your conclusions. Documenting your work makes it easy to quickly share your analysis with anyone, and that's where R Markdown comes in.

Earlier, we learned that R Markdown is a file format for making dynamic documents with R. R Markdown lets you create a record of your analysis and conclusions in a document. It ties together your code and your report so you can share every step of your analysis. The best part is, you don't even have to leave RStudio to do this. This document will help stakeholders and team members understand what you did in your analysis to reach your conclusions. Their feedback will also help you improve your analysis. R Markdown documents are written in Markdown. Markdown is a syntax for formatting plain text files. Using Markdown makes it easier to write and format text in your document. Markdown is also easy to read and to learn. For example, if you want to italicize a word or phrase in Markdown, just add a single underscore or asterisk right before and after the word. When you create a report of the document, the Markdown formatting is no longer visible, just the word or phrase in italics. We'll show you more formatting options soon, but they're all similar to this example.

Basically, they're simple enough to let you focus on descriptions and explanations of your analysis without having to think too much about how to format them. Besides text, R Markdown also includes an interactive option called an R Notebook that lets users run your code and show the graphs and charts that visualize the code. Any R Markdown document can be used as a notebook. This creates a clear overall picture of your analysis and conclusions.

R Markdown lets you convert your files into lots of different formats too. You can create HTML, PDF, and Word documents, or you can convert to a slide presentation or dashboard. Having these options makes it easy to share the same analysis in a variety of ways, depending on your audience. The Markdown language was originally designed for HTML output. HTML is the set of markup symbols and codes used to create a web page. R Markdown has the most available features for this format, but you can get good results in any of the formats. While R Markdown is a great way to record and share your analysis, there are other options too.

Notebooks like Jupyter, Kaggle, and Google Colab do a lot of the same things as an R Markdown notebook, including the interactive elements. You'll read more about these options in a little bit.

Coming up, we'll create an R Markdown document. You'll get to see this effective analysis tool in action. See you soon.

Question:

R Markdown is a file format for making dynamic documents with R.

Question2:

R notebooks are an interactive R Markdown option that allows other users to run your code and display graphs and charts that visualize that code.

R Markdown resources (Reading)

Optional: Jupyter notebooks (Reading)

Using R Markdown in RStudio

Welcome back. Exploring the different tools available for analysis is one of the more fun parts of being a data analyst. By now, you've had the chance to try out tools like spreadsheets, BigQuery or other SQL tools and Tableau.

Now we'll check out a tool you can use in RStudio, R Markdown. As a reminder, R Markdown is a great tool for documenting your analysis at any stage, but especially when you've completed a project. Let's open up RStudio and get started with R Markdown. Feel free to follow along with the video and try it out on your own later. Or go ahead and join us now in your own RStudio account. We'll first install the R Markdown package by using the install.packages() function with "rmarkdown" in quotation marks inside the parentheses. As a reminder, installing packages can take some time. Bright red text may show up in your console as it installs. That's all perfectly normal. Okay, let's open up a new R Markdown or RMD file using the File menu. If you're working along with us and you're prompted to install packages that you'll need to open your file, go ahead and click Yes. Right away, you might notice some of the outputs available in R Markdown. For now, we'll use the default HTML and document options. The other output options will also be available later. We'll add a file name and author, and then open our file. Next, we'll save it so we can use it later.
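
The install step described above can be sketched as follows. This is a minimal example, assuming an internet connection and a configured CRAN mirror. The requireNamespace() check is an addition that is not in the video; it only skips the download when the package is already present.

```r
# Install the rmarkdown package only if it is not already available.
if (!requireNamespace("rmarkdown", quietly = TRUE)) {
  install.packages("rmarkdown")
}

# Confirm which version is installed.
packageVersion("rmarkdown")
```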

So now we have an RMD file filled with metadata at the top and chunks of code in the gray sections. There's text in between for explaining the code and adding comments on your analysis and conclusions. This R Markdown document's in its original format. It's definitely useful and can be edited and added to, but if we want to produce a report containing all text, code and results, we need to click the knit button. Now we've got the report. It's an HTML file that you can share with others.
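
The Knit button has a console equivalent: it calls rmarkdown::render() behind the scenes. The sketch below assumes the file was saved as R_Markdown_Intro.Rmd in the working directory; that file name is hypothetical, so substitute the name you used.

```r
# "R_Markdown_Intro.Rmd" is a hypothetical file name; use the name you saved.
# render() runs every code chunk and writes the report next to the .Rmd file.
output_path <- rmarkdown::render(
  input = "R_Markdown_Intro.Rmd",
  output_format = "html_document"
)

# render() returns the path of the file it created (invisibly).
print(output_path)
```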

Let's compare the original .rmd file with the HTML report. You can tell that the text has been transformed into a more viewer-friendly format. Also, the code chunks have all been run. And we now have their output: both the columns of data and the plot from an analysis on the cars dataset.

The report's clear and formatted in a way that's easy to follow and understand.

Note:

With the HTML report shown alongside the .RMD file, notice that headings in the report are created when you include one or more hashtags (#) before the heading text, such as ## Including Plots. The more hashtags used, the smaller the heading font. # Including Plots creates a Header 1 style heading whereas ## Including Plots creates a Header 2 style heading.
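
As a small illustration of the note above, the three lines below would appear in the white text area of an .Rmd file and knit to a Header 1, Header 2, and Header 3 style heading, each smaller than the one before. The heading text is taken from the default template; repeating it three times is only for comparison.

```markdown
# Including Plots

## Including Plots

### Including Plots
```

Each line needs the space after the last hashtag, otherwise it is treated as ordinary text.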

We could share it with stakeholders even if they've got no experience in R. R Markdown files are definitely an effective way to complete the data analysis process. You can start your analysis in R and create a report, complete with code and visualizations, all in the same workspace.

Coming up, we'll show you more examples of how to use R Markdown to make your documentation even more effective. Bye for now.

Question:

Hashtags are used for headers; for example, ### Results indicates that Results is a Header 3 style heading because there are three hashtags.

Question2:

The knit button creates a shareable HTML report of the R Markdown file.

Question3:

The code chunks are run and the output appears in the HTML report.

Hands-On Activity: Your R Markdown notebook (Practice Quiz)

Test your knowledge about documentation and reports (Practice Quiz)

Create R Markdown documents

Structure of markdown documents

Hello there. Earlier, we showed you how to get started using R Markdown. We created a Markdown document called an RMD file, which is super useful for making and saving a final report that summarizes your data exploration and analysis findings.

In this video, we'll check out the structure of the text in an RMD file and how you could format it to better organize and emphasize your findings. Let's go into RStudio and open the file we saved earlier called R Markdown Intro.

(If you're working along with us and don't have a file saved, you can open up a new R Markdown or RMD file using the file menu. If you're prompted to install packages, go ahead and click Yes. Click Okay to open with the default options and then save your file.)

Now let's dig deeper into this file. We'll start at the top. This is the YAML header section. YAML is a language for data that translates it, so it's readable. Fun fact: YAML originally stood for yet another markup language. This section is called out using three dashes on the first and last lines. This syntax automatically creates the YAML header section when it's used in an RMD file. In an RMD file, this section's basically for metadata or the data about the data in the rest of the file. The title, author, date and file type of an output are automatically included when you create a new file. There's lots of different functions and formatting options in this section.

For now, just make sure you have at least the four details we've got in our file now. You can use the template that appears when you open the file and just edit over it. Or you can start from scratch using the three dashes to create the YAML section and the rest of the content in the file. We'll cover these steps over the next few videos and in the other program resources.
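
A YAML header with the four details mentioned above might look like the sketch below. The title, author, and date values are placeholders for illustration, not the values used in the video.

```yaml
---
title: "R Markdown Intro"
author: "Your Name"
date: "2024-01-15"
output: html_document
---
```

The three dashes on the first and last lines mark where the header starts and ends. Everything between them is metadata, and everything after the closing dashes is the body of the report.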

Next, let's check out the text in the white areas of our file. Think of the text as a way to comment on and explain your code and analysis and any visualizations you're including. You can format the text to include links, ordered lists, equations and more. The text is formatted using Markdown, the syntax we introduced earlier. We've included a reading that shows you all of the ways to format text, as well as lots of other great Markdown tips and tricks. You'll also learn some more formatting examples in the next video.

For now, let's try some examples that are in this file. On line 12, there's two hashtags and a space before the words R Markdown. Hashtags are used for headers. The more hashtags, the smaller the header. The space is important as well. Otherwise RStudio won't recognize that this is a header. Let's knit our file again. There's the R Markdown header in the HTML file. If we add two more hashtags in the dot RMD file and click knit once again, the output changes. The header's now smaller. We'll change it back because the original format made sense. Since this header is introducing information about R Markdown that comes in the next two paragraphs, we want to emphasize it. In the first paragraph of this section, there's a brief summary of Markdown.

There's a link in the text, and it's formatted using angle brackets. Using these brackets results in a clickable link in the output. That's a handy feature if you want to refer to any helpful links or include them as sources for your analysis.

In the next paragraph, knit is set off with two asterisks on either side of the word. This bolds the word. Using one asterisk on either side would instead italicize the word.

Let's scroll down to the last paragraph. Here we've got some in-line code, which can be inserted directly into the text of a dot RMD file. The code appears in a gray box like the code chunks we'll talk about soon. Using in-line code like this lets you refer to the code directly as you explain it.
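
The formatting covered in the last few paragraphs can be sketched in a few lines of .Rmd text. The link, the bolded word, and the in-line code come from the default template that RStudio generates; the surrounding sentences are paraphrased for illustration.

```markdown
For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

Click the **Knit** button to generate the report. A single asterisk gives *italics*.

The `echo = FALSE` parameter prevents printing of the R code that generated the plot.
```

After knitting, the angle brackets, asterisks, and backticks are no longer visible: the report shows a clickable link, a bold word, an italic word, and a short piece of code in a gray box.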

Let's knit our file one more time. All the formatting works together to make a well designed, readable file that's easy to share with stakeholders and team members. That's it for now. But there's lots more to learn about creating your own reports. Stay tuned.

Question:

Hashtags are used for headers; the more hashtags, the smaller the header.

Question2:

Using R Markdown notebooks (Discussion Prompt)

Meg: Programming is empowering

I'm Meg and I'm a product lead at Google. As a product lead, I work with designers and web developers to build features that our users will love. Specifically, I work for Kaggle, which is an online data science community for people who are learning data science and machine learning. We get to build exciting features that help people learn from data and advance in their careers. I work with designers and researchers to do studies to understand what our users want and what they need in the product, and I work with engineers to figure out exactly how to write out those requirements for the features that we decide to build.

Learning any programming language is really empowering because your only limits are your creativity and your curiosity. It's really the curiosity that I have about the world that led me to research and data analysis, especially with R. I just felt it was really freeing to be able to ask a question about the world and know how I could work with data to get an answer to that question. The second thing that I think is really exciting and empowering about knowing a programming language is the transferable skills that it gives you. Then the last thing that I think is really exciting is really the community and ecosystem that comes along with it. R is really no exception when it comes to that. In fact, I think R's community is really outstanding. To have the community and that public ecosystem of resources right at your fingertips is really going to supercharge what you can do with data as a data analyst and I think that's super exciting.

It's completely normal to feel intimidated or confused or stuck as you're first learning R. There are things that are quirky about the language, and it's not your fault. You just have to get past those things and I can promise that things will make a lot more sense, especially once you're able to start using the tidyverse. I would say stick with it. The other piece of advice I'd have is try to connect with the R community as early as possible in your R-learning journey. The thing that's just fantastic about R is the fact that its community is really vibrant, it's really welcoming and you're going to find things like people who are expert practitioners in R who are sharing their mistakes and they're willing to share their learning journey as well. I think it'll really help you see that you're not alone. I definitely had these moments where I was pretty frustrated when I was first learning R. When it really clicked for me, was when I had the opportunity to use R to answer my own personal research questions. That is when you feel you have a personal vested interest in the outcome of your analysis. That feeling of reward and satisfaction when you push through is something that can help really build that momentum to keep learning.

Even more document elements

Great to see you again. As we've explored R Markdown, we've learned how the interactive elements work. When you change something in your RMD file, RStudio automatically applies it to your report.

In this video, we'll show you a few more formatting options for making your report more complete and dynamic. Let's go back once again to our dot RMD file called R Markdown Intro. We'll make some edits, add some elements, and then convert it to an HTML document. We'll start by adding bullets. Just like in a standard document, bullets can help you organize your content. In our file, we'll turn the list of the documents that R Markdown can author into bullet points. We use asterisks in the dot RMD file to create the bullets in the output document. We'll revise this section so that the bullet points are set up in the right way.
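
A bulleted version of that list might look like the sketch below. The three document types come from the default template; the introductory line is an assumption added for illustration.

```markdown
R Markdown can be used to author:

* HTML documents
* PDF documents
* MS Word documents
```

The blank line before the first asterisk and the space after each asterisk matter. Without them, the lines can run together as one paragraph in the output.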

Now let's check out a different way to include a link in the file. When you're writing a report, you might want to link to the company website or to pages you used for research. Right now, in the first paragraph, we have the website URL inside of angle brackets. To embed the link in text instead, let's change the formatting. We'll start by changing the text to better fit the embedded link. Then we'll add square brackets and the word we want to embed the link in, and we'll change the angle brackets around the URL to parentheses. Both types of links work great, but some URLs are long and might clutter your report. In those cases, embedding links saves space and can look cleaner.
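
The two link styles can be compared side by side. The URL is the one from the default template; the link text in square brackets is a hypothetical choice.

```markdown
Angle brackets show the full URL in the report: <http://rmarkdown.rstudio.com>

Square brackets plus parentheses embed it in text: [R Markdown website](http://rmarkdown.rstudio.com)
```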

Let's say you want to embed an image too. Images are good for showing your workflow or displaying visualizations you want to reference in your report. Or maybe you just want to add a GIF or another fun image. That's great. Just make sure your stakeholders will appreciate it. Let's go ahead and have some fun with this example. We'll embed an image with the plot example. Then we'll add an exclamation point and a caption for the image in square brackets. Next, we'll copy the URL for our image and paste it inside parentheses. Now let's check out our finished product.
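
The image syntax is the link syntax with an exclamation point in front. In the sketch below, both the caption and the URL are placeholders, since the image used in the video isn't given here.

```markdown
![Caption for the image](https://example.com/plot.png)
```

A path to a local file, such as an image saved in the same folder as the .Rmd file, can go inside the parentheses in place of the URL.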

We'll view it in our browser. That way when we click the embedded link, it opens up in a new tab in the same browser. Here's our bullet points and our image and caption.

And here's our embedded link, which we can click to open the website. There's lots more ways to format an RMD file to get it ready to turn into a complete, organized, and effective report. While reports are a key part of presenting your analysis to stakeholders, they can be just as important for your own learning. You can use dot RMD documents to keep track of your learning by including notes and linking to online resources. You can embed useful images and add bullet points too.

Speaking of learning, let's find out more about code chunks and RMD files in the next video. See you.

Question:

Question2: The correct syntax for adding an image with the caption example 1 to an R Markdown file is ![example 1](URL), where URL stands for the image's URL or file path. The syntax is an exclamation mark followed by the caption in brackets and the image URL or pathway in parentheses.

Test your knowledge about creating R Markdown documents (Practice Quiz)

Understand code chunks and exports

Code chunks

Hi again, we've learned a lot about R Markdown or RMD Files and how they can be formatted and converted into reports for stakeholders. We've also explored the YAML header and the comments, descriptions, and explanations for the analysis shown in the report.

Now comes the heart of an RMD File. The code. Code added to an RMD File is usually called a code chunk. We've shown you these code chunks in some of the RStudio sessions we've been running. You may have noticed code chunks if you've been practicing along with us. Now we'll show you what they're all about.

Earlier, we worked with the PalmerPenguins dataset. We ran some code to analyze the data and create visualizations. After this step, we'd probably want to set up a practice report with notes about our analysis. We'll do that here and incorporate some code and visuals. Let's start by opening up RStudio and then the script with our R programming, the ggplot_hook file.

Note: If you would like to follow along with the video, then complete the steps below to access the ggplot_hook.R file:

To start, log into your RStudio (Posit) Cloud account in a new tab. Open the project that you have been working with this link. If you haven't gone through this process already, at the top right portion of the screen you will see a "red stamp" indicating this project as a Temporary Copy. Click on the adjacent button, Save a Permanent Copy, and the project will be saved in your main dashboard for use with future lessons. Once that is completed, navigate to the file explorer in the bottom right and click through to access the R script file: Course 7 -> Week 5 -> ggplot_hook.R. Once opened, you may follow along with the rest of the video to create an R Markdown (.rmd) file for sharing purposes.

We could share this file directly with others, but it's not very effective. It's hard to read and doesn't always include any takeaways, so we'll create a new .Rmd file instead. We'll add a title and author. Now we have two tabs in our script pane. We can toggle between them, just like with browser tabs. We'll save our new .Rmd file.

This file is in the template format. We'll delete everything except the header section and start our own. We already set up the header section. We don't need to make any changes there.

Instead of deleting, we could make edits section by section, adding our own comments and code along the way. But deleting the file contents gives us a blank space to work in and helps us avoid potential errors from mixing our own comments and code with the pre-existing ones in the template. Before we add any code, we want to describe its purpose. In the first code chunk, we'll set up our R environment by loading our packages using the library function. On a new line, we'll type two hashtags to format a header for this section, followed by the header text setting up my environment. Then we'll add a note about the code. We've added apostrophes before and after tidyverse and palmerpenguins because they're the names of the packages which makes them part of the actual code.

Now we'll add our code. RStudio has a handy code menu that we can use to insert a code chunk. There's also a button in our script pane that lets us add code chunks. This creates a gray section in our file and chunk delimiters. A delimiter is a character that indicates the beginning or end of a data item. You could also type these delimiters directly in the file. Three backticks, followed by an r in braces to begin the code chunk and three backticks to end it. Or you could use the keyboard shortcuts, Control plus Alt plus I on a PC or Chromebook, and Command plus Option plus I on a Mac. Since we're in RStudio, the code menu works just fine. Inside the first brace, we'll label our code chunk. After the R we'll add a space and then type loading packages. This adds another layer to our organization. We can now easily find this code chunk and its label using the contents menu at the bottom of our script pane.

Next, between our delimiters, we'll add our first code chunk, which we use to load our two packages. Even if they're already loaded, loading them again in the file makes sure the packages are available whenever the document is knit, because the Knit button runs the code in a fresh R session. We can type starting on the line after the first delimiter.
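
Put together, the header, the note, and the first code chunk described above might look like the sketch below. The wording of the note is an assumption. The label is written as loading-packages with a hyphen, a substitution for the space typed in the video, because the knitr documentation recommends avoiding spaces in chunk labels.

````markdown
## Setting up my environment

Notes: setting up my R environment by loading the 'tidyverse' and 'palmerpenguins' packages.

```{r loading-packages}
# Load the packages used in the rest of the report.
library(tidyverse)
library(palmerpenguins)
```
````

The three backticks followed by r in braces open the chunk, and the three backticks on their own line close it. Both packages need to be installed already for the chunk to run.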

But since we also have our programming file available, we'll copy and paste from there. Now we can run our code right in the file to check it for errors. There's the code output. We can also change our code chunk options. There's options for changing the output and for turning off warnings and messages. These are useful for when you're ready to create a final report for stakeholders. You'll have control over what you'll show them in the report. For example, if you get warnings with your output that don't impact your findings, you can turn off the warnings for stakeholders.
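
Chunk options go inside the braces, after the label, separated by commas. The sketch below is an assumed example rather than the exact code from the video: it plots two columns of the penguins data, which contain a few missing values, so ggplot2 would normally print a warning about removed rows.

````markdown
```{r penguin-plot, warning=FALSE, message=FALSE}
# warning=FALSE and message=FALSE keep warnings and package start-up
# messages out of the knitted report; the plot itself still appears.
library(ggplot2)
library(palmerpenguins)

ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species))
```
````

The label penguin-plot is hypothetical. Turning a warning off only hides it in the report, so it's worth reading the warning first to confirm it doesn't affect the findings.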

Question:

Code added to a .rmd file is usually called a code chunk.

So the code chunks are the key to making this report a good learning tool and eventually a document worth presenting to stakeholders.

Note: You may have noticed that the instructor had created an .html version from the R Markdown document for sharing purposes.

In order to generate a highly structured and quality R Markdown document, you may want to review the video lecture on Structure of markdown documents before moving to the next activity.

When you have code embedded in a file and can show its output, you can provide evidence for your findings and share your sources. If you need more evidence for how R Markdown can help you document your analysis we've got it coming next. See you then.

Question2:

Exporting documentation

Hi again. One of the most powerful things about an R Markdown file is that you can convert it to different output types to create shareable reports. We've been focusing on HTML documents, but there are other options we can explore.

Let's start by opening our earlier report. We created this report as a learning document to help you think through your code and analysis.

For this video, imagine it's a report you need to share with stakeholders. The file is currently in the dot RMD format, but as we've shown, it can be converted using the Knit button. The Knit drop-down menu includes three main options: HTML, PDF, and a Word document. You can use Knit to convert your file to any of these types whenever you want, but it's good practice to wait to convert to a PDF or Word doc. Instead, stick with HTML while you're working. HTML doesn't have page breaks, so you can focus on generating content for your report and not its appearance.

The Knit button isn't the only option for converting your file. The YAML can be edited to change your metadata or incorporate more details. For example, we'll change our output in this file to PDF. When we click the Knit button to render the file and run the code, the output's in a PDF format. Now you know how changing the metadata can have an effect on the whole report.
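
Changing the output type in the YAML header is a one-line edit. In the sketch below, the title and author are placeholders; only the output line differs from an HTML report.

```yaml
---
title: "R Markdown Intro"
author: "Your Name"
output: pdf_document
---
```

Knitting to PDF depends on a LaTeX installation being available on the machine. If one is missing, the knit fails with an error, and a distribution such as TinyTeX is one common way to add it.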

If you need to create a certain type of document over and over or if you want to customize the appearance of your final report, you can create a template. If it's a monthly or annual report that you're creating for stakeholders, you can just run one line of code to update your data, and your report's ready. We won't cover creating a template here, but it's something you might want to learn more about on your own as you get more experience in R.
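
Using a template that ships with an R package, as opposed to creating one, can be sketched with rmarkdown::draft(). The example assumes the github_document template bundled with the rmarkdown package, and monthly_report.Rmd is a hypothetical file name. Neither appears in the video.

```r
# Create a new .Rmd file from a template that ships with an R package.
# "monthly_report.Rmd" is a hypothetical file name chosen for illustration.
rmarkdown::draft(
  file = "monthly_report.Rmd",
  template = "github_document",
  package = "rmarkdown",
  edit = FALSE  # create the file without opening it in the editor
)
```

The same templates are listed in RStudio under File, New File, R Markdown, From Template, so the menu is an alternative to calling draft() directly.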

We've now explored a big part of R Markdown and R documentation. We've explained what R Markdown is and how to use it in RStudio to create dot RMD files. We've checked out the structure of these files and how you can format them to make reports. We've shown you what code chunks are and how to include them in your documentation. We've shown you how to take all of that analysis plus your explanation of that analysis and transform it from a dot RMD file into a report that you can use as a learning document or share with stakeholders.

It's a great way to wrap up the data analytics process in R and RStudio. It's nearly time for you to wrap things up in this program, but if you want to review any concepts or practice some more in RStudio, feel free to rewatch the videos anytime for some extra help. See you soon.

Output formats in R Markdown (Reading)

Hands-On Activity: Exporting your R Markdown notebook (Practice Quiz)

Hands-On Activity: Using R Markdown templates (Practice Quiz)

Test your knowledge on code chunks (Practice Quiz)

Module 5 challenge


Course 7 Module 5 Glossary

Course wrap-up

Course challenge