4. Set up R script - clarelgibson/tableau-public-autorefresh GitHub Wiki
In this step you will set up an R Project and create the R scripts necessary to generate data, connect to Google and export the data to a Google Sheet every time the script is run.
For this project, you will use the renv
package. If you haven't used it before, renv
is a package that helps you to create reproducible environments for your R projects. There is some good documentation available on the Posit website if you need more guidance. You will need renv
on this project to make running your scripts in the cloud easier.
If you haven't already done so, clone the GitHub repo that you created in the first step to your local computer and create a new R Project in the cloned directory.
In your console, run the following line of code.
If you don't already have the
renv
package installed, you will need to install it usinginstall.packages()
.
renv::init()
You will need to create a new R script file and write some code to generate data that will change over time. In a real-world scenario, this might be extracting data from an API using the httr
and jsonlite
packages, or it could be reading data from a database.
For this example project, if you don't yet have a use case in mind, or just want to start simple, you can generate some dummy data in R itself that will be different every time the script is run.
The GitHub repository for this project contains some scripts in the R
folder that will do everything you need for this step. You can choose to copy these files and use them as they are or you can create your own script from scratch.
To help you understand what is going on in the R scripts, the sections below explain each part of the code.
This script contains custom functions needed to run the prepData.R
script.
This simple function returns a dataframe containing three columns: timestamp
contains the date and time that the code was executed, label
contains a numerical value from 1 to 10 and value
contains a random number which will be different each time to code is executed.
make_test_data <- function() {
df <- data.frame(
timestamp = Sys.time(),
label = 1:10,
value = rnorm(10)
)
return(df)
}
This function will perform authentication with both Google Drive (using the GOOGLE_AUTHENTICATION_CREDENTIALS
token if found in the environment variables; user-inputted credentials if not defined in environment variables) and with Google Sheets.
google_auth <- function() {
require(googledrive)
require(googlesheets4)
drive_auth(path = Sys.getenv("GOOGLE_AUTHENTICATION_CREDENTIALS"))
gs4_auth(token = drive_token())
}
This function will first check if a directory already exists with the name name
. If it does not exist, the function creates the directory and provides access to anyone with a link.
make_gdrive_folder <- function(name) {
require(googledrive)
# Check if folder already exists (0 = FALSE, >0 = TRUE)
dir_flg <- nrow(drive_find(name))
if (dir_flg == 0) {
drive_mkdir(name) %>%
drive_share_anyone()
cat("Folder", name, "created successfully.")
} else {
cat("Folder", name, "already exists. No need to recreate.")
}
}
This function will first check if a Sheet already exists with the name name
. If it does not exist, the function creates the Sheet at the path
specified.
make_gdrive_sheet <- function(name, path) {
require(googledrive)
# Check if file already exists (0 = FALSE, >0 = TRUE)
sheet_flg <- nrow(drive_find(name))
if (sheet_flg == 0) {
drive_create(
name = name,
path = path,
type = "spreadsheet"
)
cat("Spreadsheet", name, "created successfully.")
} else {
cat("Spreadsheet", name, "already exists. No need to recreate.")
}
}
This script generates new data and saves it into a Google Sheet document. The comments explain what is going on in the script.
# SETUP ########################################################################
# > Packages ===================================================================
library(googlesheets4)
# > Scripts ====================================================================
source("./R/utils.R")
# > Variables ==================================================================
# Define a name for the Google Drive folder to be used for the project.
gdrive_dir <- "tableau-public-autorefresh"
gdrive_sheet <- paste0(gdrive_dir, "-data")
# GET DATA #####################################################################
df <- make_test_data()
# > Connect to Google ==========================================================
# Calling this function will open browser to complete authentication
google_auth()
# > Make project folder ========================================================
make_gdrive_folder(gdrive_dir)
# > Make google sheet file =====================================================
make_gdrive_sheet(
name = gdrive_sheet,
path = gdrive_dir
)
# > Store id of new file =======================================================
gdrive_sheet_id <- as_dribble(gdrive_sheet)$id
# > Append data into sheet =====================================================
sheet_write(
data = df,
ss = as_dribble(gdrive_sheet),
sheet = "Sheet1"
)
Once you are happy with your code, in your console, run the following line of code.
renv::snapshot()
This line will save the state of your project library to the lockfile (called renv.lock
) which was created when you called renv::init()
.
You will need to repeat this call to renv::snapshot()
any time that you install, load or remove an R package within your project.