4. Set up R script - clarelgibson/tableau-public-autorefresh GitHub Wiki

◀︎ Previous step

In this step you will set up an R Project and create the R scripts necessary to generate data, connect to Google and export the data to a Google Sheet every time the script is run.

For this project, you will use the renv package. If you haven't used it before, renv is a package that helps you to create reproducible environments for your R projects. There is some good documentation available on the Posit website if you need more guidance. You will need renv on this project to make running your scripts in the cloud easier.

Set up an R project

If you haven't already done so, clone the GitHub repo that you created in the first step to your local computer and create a new R Project in the cloned directory.

RStudio new project wizard

Initialise renv for the project

In your console, run the following line of code.

If you don't already have the renv package installed, you will need to install it using install.packages().

renv::init()

Write code to generate data

You will need to create a new R script file and write some code to generate data that will change over time. In a real-world scenario, this might be extracting data from an API using the httr and jsonlite packages, or it could be reading data from a database.

For this example project, if you don't yet have a use case in mind, or just want to start simple, you can generate some dummy data in R itself that will be different every time the script is run.

The GitHub repository for this project contains some scripts in the R folder that will do everything you need for this step. You can choose to copy these files and use them as they are or you can create your own script from scratch.

To help you understand what is going on in the R scripts, the sections below explain each part of the code.

This script contains custom functions needed to run the prepData.R script.

Make test data

This simple function returns a dataframe containing three columns: timestamp contains the date and time that the code was executed, label contains a numerical value from 1 to 10 and value contains a random number which will be different each time to code is executed.

make_test_data <- function() {
  df <- data.frame(
    timestamp = Sys.time(),
    label = 1:10,
    value = rnorm(10)
  )
  
  return(df)
}

Google authentication

This function will perform authentication with both Google Drive (using the GOOGLE_AUTHENTICATION_CREDENTIALS token if found in the environment variables; user-inputted credentials if not defined in environment variables) and with Google Sheets.

google_auth <- function() {
  require(googledrive)
  require(googlesheets4)
  
  drive_auth(path = Sys.getenv("GOOGLE_AUTHENTICATION_CREDENTIALS"))
  gs4_auth(token = drive_token())
}

Create directory in Google Drive

This function will first check if a directory already exists with the name name. If it does not exist, the function creates the directory and provides access to anyone with a link.

make_gdrive_folder <- function(name) {
  require(googledrive)
  
  # Check if folder already exists (0 = FALSE, >0 = TRUE)
  dir_flg <- nrow(drive_find(name))
  
  if (dir_flg == 0) {
    drive_mkdir(name) %>% 
      drive_share_anyone()
    cat("Folder", name, "created successfully.")
  } else {
    cat("Folder", name, "already exists. No need to recreate.")
  }
}

Create a Google Sheets document

This function will first check if a Sheet already exists with the name name. If it does not exist, the function creates the Sheet at the path specified.

make_gdrive_sheet <- function(name, path) {
  require(googledrive)
    
  # Check if file already exists (0 = FALSE, >0 = TRUE)
  sheet_flg <- nrow(drive_find(name))
  
  if (sheet_flg == 0) {
    drive_create(
      name = name,
      path = path,
      type = "spreadsheet"
    )
    cat("Spreadsheet", name, "created successfully.")
  } else {
    cat("Spreadsheet", name, "already exists. No need to recreate.")
  }
}

This script generates new data and saves it into a Google Sheet document. The comments explain what is going on in the script.

# SETUP ########################################################################
# > Packages ===================================================================
library(googlesheets4)

# > Scripts ====================================================================
source("./R/utils.R")

# > Variables ==================================================================
# Define a name for the Google Drive folder to be used for the project.
gdrive_dir <- "tableau-public-autorefresh"
gdrive_sheet <- paste0(gdrive_dir, "-data")

# GET DATA #####################################################################
df <- make_test_data()

# > Connect to Google ==========================================================
# Calling this function will open browser to complete authentication
google_auth()

# > Make project folder ========================================================
make_gdrive_folder(gdrive_dir)

# > Make google sheet file =====================================================
make_gdrive_sheet(
  name = gdrive_sheet,
  path = gdrive_dir
)

# > Store id of new file =======================================================
gdrive_sheet_id <- as_dribble(gdrive_sheet)$id

# > Append data into sheet =====================================================
sheet_write(
  data = df,
  ss = as_dribble(gdrive_sheet),
  sheet = "Sheet1"
)

Take a snapshot of your project

Once you are happy with your code, in your console, run the following line of code.

renv::snapshot()

This line will save the state of your project library to the lockfile (called renv.lock) which was created when you called renv::init().

You will need to repeat this call to renv::snapshot() any time that you install, load or remove an R package within your project.

▶︎ Next step

⚠️ **GitHub.com Fallback** ⚠️