R Language - gizotso/R GitHub Wiki

R Operators

Assignment Operators

  • <-, <<-, =: leftward assignment
    • x <- c(1,2,3)
    • (x = 1): wrapping an assignment in parentheses also prints the assigned value
    • in very old R releases, _ was also accepted as an assignment operator (it no longer is)
  • ->, ->>: rightward assignment
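The difference between <- and <<- only shows up inside a function. A minimal sketch, assuming illustrative names (counter, bump, shadow) that are not part of the original page:

```r
counter <- 0

# <<- assigns to an existing variable in an enclosing environment
bump <- function() {
  counter <<- counter + 1
  invisible(counter)
}

# <- inside a function only creates a local variable
shadow <- function() {
  counter <- 100
  counter
}

bump()
bump()
local_value <- shadow()

counter       # 2: changed by bump(), untouched by shadow()
local_value   # 100

# rightward assignment puts the target name on the right
3 -> three
```

<<- is convenient for small counters like this one, but it hides state changes, so plain return values are usually easier to follow.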

Arithmetic Operators

  • +, -, *, /
  • ^ : exponentiation, e.g. 2^3 > [1] 8
  • factorial(): factorial function. factorial(4) > [1] 24
  • %%: modulo (remainder). 7%%3 > [1] 1
  • %/% : integer division. 7%/%3 > [1] 2
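%/% and %% are linked by the identity x == (x %/% y) * y + x %% y. The sketch below checks it on a few values, including a negative one, where R rounds the quotient down and the remainder takes the sign of the divisor. Variable names are illustrative.

```r
x <- c(7, -7, 7.5)
y <- 3

quotient  <- x %/% y   # 2 -3  2
remainder <- x %% y    # 1  2  1.5

# the two results always recombine to the original values
recombined <- quotient * y + remainder
all.equal(recombined, x)   # TRUE
```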

Logical Operators

  • <, >, <=, >=, ==, !=
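  • ! x: logical NOT
  • x & y : logical AND (element-by-element comparison)
  • x && y : logical AND for single values; evaluates left to right and stops as soon as the result is known. Older R versions silently used only the first element of longer vectors; since R 4.3.0 an operand of length greater than one is an error
  • x | y : logical OR (element-by-element comparison)
  • x || y : logical OR for single values, with the same short-circuit and length rules as &&
  • xor(x, y) : exclusive OR (element-by-element)
  • any(): is at least one value TRUE?
  • all(): are all values TRUE?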

Set Operators

Set operators take vectors x and y (of the same mode), each treated as a set: a sequence of items that conceptually contains no duplicated values. e is the element (or vector of elements) tested for membership.

  • union(x, y)
  • intersect(x, y)
  • setdiff(x, y)
  • setequal(x, y): TRUE if x and y contain the same elements, FALSE otherwise
  • is.element(e, x) <=> e %in% x
x <- c(1:5)
y <- c(3:8)
union(x, y)
## [1] 1 2 3 4 5 6 7 8
intersect(x, y)
## [1] 3 4 5
setdiff(x, y)     # elements of x that are not in y
## [1] 1 2
setequal(x, y)
## [1] FALSE

1 %in% x
## [1] TRUE
c(1, 2) %in% c(2, 3, 4, 5)
##[1] FALSE  TRUE

z <- c("Monday", "Tuesday", "Friday")
#Is every value in z a day of the week?
all(z %in% c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday","Saturday", "Sunday"))

Comparisons

  • identical(x, y): tests whether two objects are exactly the same (same type, values and attributes)
  • all.equal(x, y): tests approximate equality, within a numeric tolerance
x=1:3; y=1:3
x==y
##[1] TRUE TRUE TRUE
x==3
##[1] FALSE FALSE  TRUE

1 > c(0, 1, 2)
## [1]  TRUE FALSE FALSE

c(1, 2, 3) == c(3, 2, 1)
##[1] FALSE  TRUE FALSE

Warning with &&: it is meant for single values. Older R versions compared only the first element of each vector; R 4.3.0 and later raise an error instead.

x = c(TRUE, FALSE, TRUE)
y = c(TRUE, TRUE,  TRUE)

x == y  ##[1] TRUE FALSE  TRUE
x &  y  ##[1] TRUE FALSE  TRUE
# this is probably what we expect.
x && y  ##[1] TRUE in older R versions (first elements only); an error since R 4.3.0

Warning with number comparisons: numbers are coerced to logical (0 -> FALSE, any non-zero number -> TRUE)

x=1:3; y=c(1,2,4)
x & y  ##[1] TRUE  TRUE TRUE

identical(x,y)
##[1] FALSE
all.equal(x, y)
##[1] "Mean relative difference: 0.3333333"
0.9 == (1 - 0.1)         ##[1] TRUE
identical(0.9, 1 - 0.1)  ##[1] TRUE
all.equal(0.9, 1 - 0.1)  ##[1] TRUE
0.9 == (1.1 - 0.2)       ##[1] FALSE
identical(0.9, 1.1 - 0.2) ##[1] FALSE
all.equal(0.9, 1.1 - 0.2) ##[1] TRUE
all.equal(0.9, 1.1 - 0.2, tolerance = 1e-16) ##[1] "Mean relative difference: 1.233581e-16"
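Because all.equal() returns a character string rather than FALSE when the values differ, it is usually wrapped in isTRUE() before being used in a condition. A small sketch of that pattern, where nearly_equal is a hypothetical helper name and the tolerance default is an assumption:

```r
a <- 0.9
b <- 1.1 - 0.2

a == b                       # FALSE: the two doubles differ in the last bits
isTRUE(all.equal(a, b))      # TRUE: equal within the default tolerance

# hypothetical helper: always returns a single TRUE or FALSE
nearly_equal <- function(x, y, tolerance = 1e-8) {
  isTRUE(all.equal(x, y, tolerance = tolerance))
}

if (nearly_equal(a, b)) message("a and b are equal for practical purposes")
nearly_equal(1, 1.1)         # FALSE, not a character string
```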

is between

x <- c(5, 15)
# 10 < x < 20
10 < x & x <20
(10 < x) & (x < 20)
## [1] FALSE  TRUE

eval

x = 2
eval(1 < x & x < 3)
## [1] TRUE

& vs &&: && does not evaluate the second test in a pair of tests if the first test already determines the result. It is used mainly in conditions (if, while).

x = c(1,1,3)
x[1]==x[2] & x[1]==x[3]
## [1] FALSE

x[1]==x[2] && x[1]==x[3]
## [1] FALSE

if (length(x) > 0 && any(is.na(x))) { do.something() }
any( c(0, 1, 2) > 1) ##[1] TRUE
all( c(0, 1, 2) > 1) ##[1] FALSE
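The short-circuit behaviour can be made visible with a function that has a side effect. This is only a sketch: noisy_true and the calls counter are illustrative names.

```r
calls <- 0

# illustrative function that records every time it is evaluated
noisy_true <- function() {
  calls <<- calls + 1
  TRUE
}

r1 <- FALSE && noisy_true()    # right-hand side is skipped
calls_after_and_and <- calls   # still 0

r2 <- FALSE & noisy_true()     # right-hand side is always evaluated
calls_after_and <- calls       # now 1

r3 <- TRUE || noisy_true()     # skipped as well: TRUE already decides the result
calls_after_or_or <- calls     # still 1
```

This is why a guard such as length(x) > 0 is placed on the left of &&: the right-hand test only runs when the guard passes.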

Conditional statements

cf R help: ?Control

v = c(1,2,3)
ifelse(v >1, 1, 0)

v = 1:10
ifelse(v<5 | v>8, v, 0)
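ifelse() is vectorized, whereas the if / else statement described in ?Control tests a single condition. A short sketch contrasting the two (sign_label is an illustrative name):

```r
v <- c(-2, 0, 3)

# ifelse(): one result per element
labels_vec <- ifelse(v > 0, "positive", "not positive")

# if / else: a single TRUE or FALSE decides which branch runs
sign_label <- function(value) {
  if (value > 0) {
    "positive"
  } else if (value < 0) {
    "negative"
  } else {
    "zero"
  }
}

# apply the scalar function to each element
labels_scalar <- vapply(v, sign_label, character(1))

labels_vec      # "not positive" "not positive" "positive"
labels_scalar   # "negative" "zero" "positive"
```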

Condition Handling and Recovery: try, tryCatch

try is a wrapper to run an expression that might fail and allow the user's code to handle error-recovery.

options(show.error.messages = FALSE)
try(log("a"))
print(.Last.value)
## [1] "Error in log(\"a\") : non-numeric argument to mathematical function\n"
## attr(,"class")
## [1] "try-error"
## attr(,"condition")
## <simpleError in log("a"): non-numeric argument to mathematical function>

options(show.error.messages = TRUE)
print(try(log("a"), TRUE))
tryCatch(1, finally = print("Hello"))
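tryCatch() becomes more useful once handlers are supplied. The sketch below is one possible way to use it: safe_log is a hypothetical helper that returns NA instead of stopping, and the messages are illustrative.

```r
# hypothetical helper: returns NA instead of failing or warning
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) {
      message("warning caught: ", conditionMessage(w))
      NA_real_
    },
    error = function(e) {
      message("error caught: ", conditionMessage(e))
      NA_real_
    },
    finally = message("safe_log() finished")   # runs in every case
  )
}

ok   <- safe_log(1)     # 0
warn <- safe_log(-1)    # NA: log(-1) signals a "NaNs produced" warning
err  <- safe_log("a")   # NA: a non-numeric argument is an error

# with try(), test the class of the returned object
res    <- try(log("a"), silent = TRUE)
failed <- inherits(res, "try-error")   # TRUE
```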

loops

Loops are rarely the best tool in R; think in terms of vectorized operations.

v = c(1:5)
for (i in v) print(i * 10)
for (k in 1:5){
	print(k)
}

for (value in c("My", "second", "for", "loop")) {
  print(value)
}
g = 0
while (g < 1){
 	g <- rnorm(1)
 	cat(g, "\n")
}
# break can also be used in a while

repeat {
	g <- rnorm(1)
	if (g > 1.0) {break}
	cat(g, "\n")
}

This style is not in the spirit of R; use vectorized operations whenever possible.

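# z = x + y, written as an explicit loop
z <- numeric(length(x))
for (i in seq_along(z)) z[i] <- x[i] + y[i]
x=1:10
y <- numeric(length(x))   # pre-allocate the result
for (i in seq_along(x)) {
   if (x[i] %% 2) {       # a non-zero remainder (odd number) counts as TRUE
      y[i] <- 0
   }
   else y[i] <- 1
}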

This is more efficient with `ifelse(x%%2 == 0, 1, 0)`

x=c(1:5, 5:1)
y <- numeric(length(x))
for (i in 1:length(x)) if (x[i] %% 2) y[i] <- 0 else y[i] <- 1
y
## [1] 0 1 0 1 0 0 1 0 1 0
# <=> y[x%%2 == 0] = 1; y[x%%2 != 0] = 0
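To check that the loop and the vectorized forms really agree, they can be run side by side. A minimal sketch with illustrative variable names:

```r
x <- c(1:5, 5:1)

# loop version: seq_along(x) is safer than 1:length(x),
# which yields c(1, 0) when x is empty
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- if (x[i] %% 2 == 0) 1 else 0
}

# vectorized versions
y_ifelse <- ifelse(x %% 2 == 0, 1, 0)
y_arith  <- as.numeric(x %% 2 == 0)

all(y_loop == y_ifelse)      # TRUE
identical(y_loop, y_arith)   # TRUE
```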

Functions

x = c(1,2,3,0,10,10,5,4)
max(c(x,NA))        # [1] NA : if a vector contains any NA, the result is NA
max(x, na.rm=TRUE)  # [1] 10 : na.rm=TRUE removes NA values first

<=>

max(x[which(is.na(x)==FALSE)])

  • length(x): number of elements of x
  • sum(x), prod(x)
    • prod(x^2), prod(1:10)
  • min(x), max(x)
  • range(x): c(min(x), max(x))
  • which
    • which(x ==10): index of elements for which logical evaluation returns true
    • which(x>5)
    • which.max(x) : index of the maximum of x
    • which.min(x): index of the minimum of x
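  • match(x, table, nomatch = NA_integer_, incomparables = NULL): positions of the first matches of x in table
    • x = 1:10; y = 7:20
    • match(x, y, nomatch = 0) = [1] 0 0 0 0 0 0 1 2 3 4
    • intersect_x_y = y[match(x, y, nomatch = 0)] = [1] 7 8 9 10 # cf. base function intersect(x, y)
    • x %in% table <=> match(x, table, nomatch = 0) > 0
    • 1:5 %in% c(2,4) = [1] FALSE TRUE FALSE TRUE FALSE
  • rank(x): ranks of the elements of x; tied values get their average rank by default. For x = c(1,2,3,0,10,10,5,4): rank(x) = [1] 2.0 3.0 4.0 1.0 7.5 7.5 6.0 5.0 and rank(x, ties.method = "min") = [1] 2 3 4 1 7 7 6 5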
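A short sketch of the lookup functions listed above, run on the same example vector (the result variable names are illustrative):

```r
x <- c(1, 2, 3, 0, 10, 10, 5, 4)

idx_tens  <- which(x == 10)   # 5 6: positions where the test is TRUE
first_max <- which.max(x)     # 5: only the first maximum is reported
first_min <- which.min(x)     # 4

# match() gives the position in the second argument, NA when absent
pos <- match(c(10, 99), x)    # 5 NA

# %in% is the TRUE/FALSE form of the same lookup
present <- c(10, 99) %in% x   # TRUE FALSE
```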

Numeric Functions

  • abs(x)
  • sqrt(x)
  • round(x, n): round the elements of x to n decimal places
    • round(c(pi, exp(1), 2.999), 2) = [1] 3.14 2.72 3.00
  • log(x) <=> log(x, base=exp(1)): natural logarithm (ln), which has the number e (≈ 2.718) as its base
  • log10(x) <=> log(x, 10): common (base 10) logarithm
  • exp(x)
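The logarithm functions differ only in their base, and exp() undoes log(). A small sketch checking those relations with all.equal() rather than ==, since the results are floating-point numbers:

```r
x <- c(1, 10, 100, 1000)

ln_x    <- log(x)            # natural logarithm, base e
log10_x <- log10(x)          # base 10
log2_8  <- log(8, base = 2)  # any base via the base argument

# change of base: log_b(x) = log(x) / log(b)
all.equal(log10_x, log(x) / log(10))   # TRUE

# exp() is the inverse of log()
all.equal(exp(ln_x), x)                # TRUE

# round() to 2 decimal places
round(ln_x, 2)                         # 0.00 2.30 4.61 6.91
```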

http://www.statmethods.net/management/functions.html

Statistical Functions

  • mean(x), median(x)
  • var(x): variance of the elements of x (computed with the n-1 denominator). If x is a matrix or data.frame, var(x) and cov(x) return the variance-covariance matrix
  • cor(x): correlation matrix if x is a matrix or data.frame
  • var(x, y) | cov(x, y): covariance between x and y, or between the columns of x and the columns of y for matrices and data.frames
  • cor(x, y): linear correlation between x and y, or correlation matrix for matrices and data.frames
  • scale(x): centers and scales (standardizes) the data. center = FALSE|TRUE, scale = FALSE|TRUE. With the defaults, x_scaled = (x-mean(x))/sd(x)
  • cumsum(x), cumprod(x), cummin(x), cummax(x): cumulative functions
    • cumsum(x): vector whose ith element is the sum of x[1:i]
  • pmin(x, y, ...), pmax(x, y, ...): 'parallel' minima (or maxima) of the argument vectors. Vector whose ith element is the min (resp. max) of x[i], y[i], … pmax(5:1, 3.14) = [1] 5.00 4.00 3.14 3.14 3.14
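The formulas behind var() and scale() can be verified by hand. A minimal sketch on an arbitrary illustrative vector:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# variance with the n - 1 denominator, as var() computes it
var_manual <- sum((x - mean(x))^2) / (n - 1)
all.equal(var_manual, var(x))              # TRUE

# standardizing by hand matches scale(), which returns a one-column matrix
z_manual <- (x - mean(x)) / sd(x)
all.equal(z_manual, as.vector(scale(x)))   # TRUE

# cumulative and parallel functions
running_total <- cumsum(x)                 # 2 6 10 14 19 24 31 40
capped        <- pmin(x, 5)                # 2 4 4 4 5 5 5 5
```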

Manipulation Functions

  • sort(x): sorts the elements of x in ascending order. For descending order: sort(x, decreasing = TRUE) or rev(sort(x)). sort(c('red','blue','green')) = [1] "blue" "green" "red"
  • rev(x): reverse order of x elements. rev(1:3) = [1] 3 2 1
  • unique(x): unique(c(1,3,4,5,1,NA)) = [1] 1 3 4 5 NA
  • na.omit(x): removes NA values from a vector, or entire rows of a matrix/data.frame when any column of that row is NA
    • na.omit(c(1,2,3,NA))
    • na.omit(matrix(c(1:5, NA), 3,2))
  • na.fail(x): signals an error if x contains one or more NA
    • na.fail(c(1,2,NA)): Error in na.fail.default(c(1, 2, NA)) : missing values in object
  • subset(x, ...): Subsetting Vectors, Matrices and Data Frames
    • subset(airquality, Temp > 80, select = c(Ozone, Temp))
    • subset(airquality, Day == 1, select = -Temp) #"-" to un-select a column
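A short sketch of how these functions treat missing values, using small illustrative objects (df, clean, warm) plus the built-in airquality data set already used above:

```r
x <- c(3, 1, NA, 2, 1)

sort(x)                      # 1 1 2 3: sort() drops NA by default
sort(x, decreasing = TRUE)   # 3 2 1 1
unique(x)                    # 3 1 NA 2

# na.omit() on a vector keeps the non-missing values
clean <- as.vector(na.omit(x))   # 3 1 2 1

# on a data.frame it drops every row with at least one NA
df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
complete_rows <- na.omit(df)     # only the first row remains

# subset() with a row condition and a column selection
warm <- subset(airquality, Temp > 90, select = c(Ozone, Temp))
```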

samples

  • sample(x, size, replace=FALSE): random samples and permutations. Takes a sample of the specified size from the elements of x with or without replacement.
    • set.seed(1); sample(1:6, 10, replace=TRUE) # roll a die 10 times
    • sample(c(0,1), 100, replace = TRUE) # 100 Bernoulli trials
  • rnorm: random draws from a normal distribution, N(0,1) by default
    • rnorm(n)
    • rnorm(1, mean=280, sd=10)
  • runif: uniform distribution
    • runif(n, min=0, max=1): runif(10) returns 10 uniform random numbers on [0, 1]
  • choose(n, k): number of combinations of k elements from n elements. Combinations without repetition/replacement, e.g. drawing k cards at once from a deck of n cards. See Combinatoire @Wikiversity
  • combn(x, m): generates all combinations of the elements of x, taken m at a time
    • combn(1:4, 2)
    • choose(4,2) = 6; in general choose(n, k) = factorial(n) / (factorial(k) * factorial(n-k))
  • perm <- function(n, k) { factorial(n) / factorial(n-k)} # permutations without repetition (ordered draws of k among n); with repetition the count is n^k
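The counting functions can be cross-checked against each other. A small sketch with n = 5 and k = 3 chosen arbitrarily; perm is the helper defined above:

```r
n <- 5
k <- 3

# ordered draws of k among n, without repetition
perm <- function(n, k) factorial(n) / factorial(n - k)

combos <- combn(n, k)       # one combination per column
ncol(combos)                # 10
choose(n, k)                # 10: same count, without listing them
perm(n, k)                  # 60 = choose(n, k) * factorial(k)
n^k                         # 125 ordered draws with replacement

# set.seed() makes random draws reproducible
set.seed(1); first  <- sample(1:6, 10, replace = TRUE)
set.seed(1); second <- sample(1:6, 10, replace = TRUE)
identical(first, second)    # TRUE
```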

Random data with R: http://www.math.csi.cuny.edu/Statistics/R/simpleR/stat007.html

Programming Functions

function (formal arguments) body

Whenever you create a function, it gets a reference to the environment in which you created it. This reference is a built-in property of that function. When you evaluate expressions at the top level of an R session, you are working in the global environment. For functions defined in a package with a namespace, that namespace environment plays the role that the global environment plays for your own functions.
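The environment attached to a function can be inspected with environment(). A minimal sketch, where make_adder and add2 are illustrative names:

```r
# a function factory: the inner function keeps a reference to the
# environment in which it was created
make_adder <- function(n) {
  function(x) x + n
}

add2 <- make_adder(2)
add2(5)                                   # 7

# the captured variable lives in the function's environment
environment(add2)$n                       # 2

# functions from a package are enclosed by that package's namespace
environmentName(environment(stats::sd))   # "stats"
```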

rnorm2 <- function(n,mean,sd) { mean+sd*scale(rnorm(n)) }
add <- function(x, y) {
   x+y # by default, the value of the last evaluated expression is returned
}

add(1,2)
## [1] 3
square <- function(x){
	x2 <- x^2
	return(x2)
}
square(x = 2)
CV <- function(x) sd(x)/mean(x)
CV(c(1,2,4))
## [1] 0.6546537
foo <- function() print(x)
x = 1

foo()       
##[1] 1

foo("hello")
## Error in foo("hello") : unused argument ("hello")

Recursive Function

fib <- function(n) if (n>2) c(fib(n-1),sum(tail(fib(n-1),2))) else if (n>=0) rep(1,n)

fib(10)
##[1]  1  1  2  3  5  8 13 21 34 55
factn <-function(n) if (n == 0) 1 else n*factn(n-1)

factn(10)
##[1] 3628800
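The recursive fib above calls fib(n-1) twice at each level, so its cost grows quickly with n. An iterative sketch that should return the same sequence (fib_iter is an illustrative name, not part of the original page):

```r
fib_iter <- function(n) {
  if (n <= 0) return(numeric(0))
  out <- rep(1, n)                 # the first two terms are 1
  if (n > 2) {
    # each later term is the sum of the two previous ones
    for (i in 3:n) out[i] <- out[i - 1] + out[i - 2]
  }
  out
}

fib_iter(10)
## [1]  1  1  2  3  5  8 13 21 34 55
```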

Function as Objects

R functions can be treated as objects

a <- function(n) function(a) runif(a)
b <- a(1)
b(10)

This can be useful when wanting to make many different kinds of functions

a <- list()
b <- function(i){ i; function() runif(i)}
for (i in 1:10) a[[i]] <- b(i)
a[[1]]()
## [1] 0.2617396
a[[2]]()
## [1] 0.8822248 0.3374574
a[[3]]()
## [1] 0.0348156 0.4212788 0.6107646
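The bare i at the start of b forces the argument to be evaluated when the function is created. R evaluates arguments lazily, so without that step every generated function would look up i only when it is finally called. A sketch of the difference using force(), the more explicit way to write the same thing (lazy and eager are illustrative names):

```r
lazy  <- function(i) function() i
eager <- function(i) { force(i); function() i }

lazy_fns  <- list()
eager_fns <- list()
for (k in 1:3) {
  lazy_fns[[k]]  <- lazy(k)
  eager_fns[[k]] <- eager(k)
}

eager_fns[[1]]()   # 1: i was fixed when eager(1) was called
lazy_fns[[1]]()    # 3: i is only evaluated now, when k is already 3
```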