R Basics - bcb420-2025/Clare_Gillis GitHub Wiki
Notes on Unit 2 - R Basics
Time expected: 4 hours
Actual time: 5 hours
Chapter 2 - Installing R and R Studio
- Tasks 1, 2 & 3 - Already have R, RStudio, and Docker installed
- An image is a template for building a container (system, packages, etc.)
- A container is a running instance of an image (like an environment)
- Anything written inside the container is lost when the container is removed (which happens on stop if it was started with --rm)
- A volume lets you attach a directory on your file system to the container
- This maps the current directory (PWD) on your computer to /home/rstudio/projects in the container:
-v ${PWD}:/home/rstudio/projects
- Task 4 - Done!
- search() - show all packages loaded
- ls("package:<package>")
- Task 5 - Done!
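A quick sketch of the search() and ls() calls noted above. Using "package:base" here is just an assumption for illustration, since base is always attached; any attached package name would work the same way.

```r
# Environments and packages currently attached to the search path
attached <- search()
print(attached)

# Names of the objects in one attached package;
# "package:base" is used here only as an example
baseObjects <- ls("package:base")
head(baseObjects)
```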
- Task 6 - Done!
- Task 7 - Done!
- Task 8 - Done!
- Task 9 - Done!
- Task 10 - Done!
- Task 11 - Done!
- Task 12 - Done!
- You can set alternating bar colours by passing a vector of colours, which is recycled across the bars:
stripes <- c("red", "grey")
hist(rnorm(200), col=stripes)
Chapter 7 - R scalars and vectors
- Scalar - single value
- Vector - sequence of scalars with same type
- Matrix - vectors with defined (often more than 1) dimensions
- Data frame - like a spreadsheet with a bunch of vector columns
- List - general collection of items (scalars, vectors, matrices...)
- typeof() and class() to get the type and class of an object
- is.<type>() to check if an object is a certain type (ex. list, null...)
- Task 13 - Done!
- Vector of mixed type coerces everything to a general consistent type (often character - most general)
- Negative indices exclude that index
- To append a 4 to the end of a 3-long vector do x[4] <- 4 or x <- c(x, 4)
- Task 14 - We're appending the sum of the second-last number in myVec (myVec[length(myVec) - 1]) and the last number in myVec (myVec[length(myVec)]) to the end of myVec - Done!
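A small sketch of the vector notes above: coercion, negative indices, appending, and the "sum of the last two elements" pattern from Task 14. The starting values of x and myVec are made up for illustration and are not the ones from the exercise.

```r
# Mixed types are coerced to the most general type present
mixed <- c(1, TRUE, "a")
typeof(mixed)                 # "character"

# A negative index excludes that position
x <- c(10, 20, 30)
withoutSecond <- x[-2]        # 10 30

# Two ways to append a 4 to a 3-long vector
x[4] <- 4                     # x is now 10 20 30 4
y <- c(c(10, 20, 30), 4)      # same result

# Append the sum of the second-last and last elements
myVec <- c(1, 1, 2, 3)        # hypothetical starting values
myVec <- c(myVec, myVec[length(myVec) - 1] + myVec[length(myVec)])
myVec                         # 1 1 2 3 5
```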
- SEMICOLON MEANS NEWLINE! (it separates statements written on the same line)
- dim(a) <- c(x, y) - x is rows, y is columns
- rbind - row-bind. cbind - column-bind
- Task 15 - Done!
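To make the dim(), rbind() and cbind() notes concrete, here is a minimal sketch with made-up values. Note that setting dim() fills the matrix column by column.

```r
# Turn a vector into a 2-row, 3-column matrix (filled column by column)
a <- 1:6
dim(a) <- c(2, 3)
a
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# rbind stacks vectors as rows, cbind places them side by side as columns
byRow <- rbind(1:3, 4:6)   # 2 rows, 3 columns
byCol <- cbind(1:3, 4:6)   # 3 rows, 2 columns
```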
- Task 16 & 17 - file.path("data_files","plasmidData.tsv") doesn't seem to exist in R_Exercise-BasicSetup
- Task 18 - Done!
- Remember lapply functions, they're helpful
- Task 19 - Done!
- Task 20 - remember grep. Done!
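- any() and all() collapse a whole logical vector or matrix into a single value - need to use apply() to get a result per row or column
- In regular expressions, ^ = starts with. $ = ends with.
- Put parentheses around an assignment to assign and print the result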
- Task 21 - Done!
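A short sketch tying together the grep, regex anchor, any()/all() and parenthesised-assignment notes above. The gene names and the logical matrix are invented for illustration.

```r
genes <- c("TP53", "BRCA1", "BRCA2", "ATP5")   # hypothetical names

# ^ anchors the pattern at the start of the string
startsWithBR <- grep("^BR", genes)              # indices 2 3

# $ anchors the pattern at the end; value = TRUE returns the matches
endsWith5 <- grep("5$", genes, value = TRUE)    # "ATP5"

# all() on a matrix gives one value; apply() gives one value per row
m <- matrix(c(TRUE, FALSE, TRUE, TRUE), nrow = 2)
allOverall <- all(m)                            # FALSE
allPerRow <- apply(m, 1, all)                   # TRUE FALSE
anyPerRow <- apply(m, 1, any)                   # TRUE TRUE

# Parentheses around an assignment assign and print in one step
(nGenes <- length(genes))                       # prints 4
```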
- Use || and && in control structures (| and & are vectorized)
- ifelse() is vectorized
- Task 22 - Done!
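The difference between the scalar operators (&&, ||) and the vectorized ones (&, |, ifelse()) is easier to see with a tiny example. The values below are arbitrary.

```r
x <- c(-2, 0, 3)

# & works element by element and returns a logical vector
inRange <- x > -1 & x < 2                        # FALSE TRUE FALSE

# ifelse() is also vectorized: one result per element
signLabel <- ifelse(x < 0, "negative", "non-negative")

# && expects single TRUE/FALSE values, which is what if () needs
status <- "not checked"
if (length(x) > 0 && is.numeric(x)) {
  status <- "ok"
}
```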
- Task 23 - Done!
- Task 24 - Done!
- can do missing(<variable>) to check if argument is missing
- Task 25 - Done!
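A minimal sketch of missing() inside a function. The function greet() and its argument are hypothetical, made up only to show the check.

```r
# greet() is a made-up function for illustration
greet <- function(name) {
  # missing() is TRUE when the caller did not supply the argument
  if (missing(name)) {
    return("Hello, stranger")
  }
  paste0("Hello, ", name)
}

greet()        # "Hello, stranger"
greet("R")     # "Hello, R"
```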
- Task 26 - Done!
- stripchart() for histograms is equivalent to rug() for bivariate data
- can add text() and points() to a histogram (ex. to add counts and a normal distribution)
- Task 27 - Done!
- plot(), rug(), barplot(), hist(), boxplot()
- See here for colour palettes, colour names, fonts, point types, and other plotting style info
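One possible way to combine the plotting notes above: a histogram with counts added via text(), a normal curve overlaid via points(), and a rug() of the raw values. The data and seed are arbitrary, and scaling the density by n times the bin width is my own choice for putting it on the count axis.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rnorm(200)

# hist() returns the bin midpoints, counts and breaks invisibly
h <- hist(x, col = c("red", "grey"))

# Write each bin's count just above its bar
text(h$mids, h$counts, labels = h$counts, pos = 3, cex = 0.7)

# Expected counts under a standard normal: density * n * bin width
binWidth <- diff(h$breaks)[1]
points(h$mids, dnorm(h$mids) * length(x) * binWidth, pch = 19)

# Mark the individual observations along the x axis
rug(x)
```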
- MAKE CODE EASY TO READ - be explicit, comment, indent comments to align properly
- <- not = for assignment
- use for (i in seq_along(x)) (or seq(along=x)) not for (i in 1:length(x)) - if x is NULL, 1:length(x) is c(1, 0), so the loop runs twice with invalid indices instead of being skipped
- use <package>::<function>() notation for functions from packages
- Don't change global state
- 80 chars per line
- Use headers
- Use parentheses for math - don't rely on default behaviour
- 'CamelCaseStyle'
- isX or hasX = boolean variables
- findX, doX, getX (ie. verbs) = simple functions
- Pre-allocate objects to the right size (growing objects is slow)
- Explicitly assign values to crucial arguments (don't rely on default behaviour)
- use [END]
- Task 28 - Done!
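A few of the style points above, sketched in code: looping with seq_along(), pre-allocating a result vector, and calling a function with explicit package notation. Variable names and values are invented for illustration.

```r
# With an empty object, 1:length(x) is 1:0, i.e. c(1, 0),
# so the loop body still runs; seq_along(x) is empty and skips it
x <- NULL
badIterations <- 0
for (i in 1:length(x)) {
  badIterations <- badIterations + 1
}
goodIterations <- 0
for (i in seq_along(x)) {
  goodIterations <- goodIterations + 1
}

# Pre-allocate the result to its final size instead of growing it
n <- 5
squares <- numeric(n)
for (i in seq_len(n)) {
  squares[i] <- i^2
}

# Explicit package::function() notation
squareMean <- base::mean(squares)
```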