R Language - gizotso/R GitHub Wiki
- <-, <<-, =: leftward assignment, e.g. x <- c(1,2,3)
- (x = 1): with parentheses around the expression, the result is also displayed
- in previous R releases, _ was supported as an assignment operator
- ->, ->>: rightward assignment
- +, -, *, /
- ^ : exponentiation, e.g. 2^3
- factorial(): factorial function. factorial(4) > [1] 24
- %%: modulo. 7%%3 > [1] 1
- %/% : integer division. 7%/%3 > [1] 2
- <, >, <=, >=, ==, !=
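- ! x: logical NOT
- x & y: logical AND (element-by-element comparison)
- x && y: scalar AND, evaluated left to right with short-circuiting. Only length-one operands are valid since R 4.3.0; earlier versions silently examined only the first element of each vector
- x | y: logical OR (element-by-element comparison)
- x || y: scalar OR, with the same left-to-right short-circuit rule as &&
- xor(x, y): exclusive OR (element by element)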
- any(): are some values TRUE
- all(): are all values TRUE
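As a quick check of the operators listed above, here is a minimal sketch. All variable names (a, b, p, s, counter, increment) are illustrative and not part of the original page.

```r
# Integer division and modulo recombine into the original value
a <- 7; b <- 3
q <- a %/% b            # 2
r <- a %% b             # 1
recombined <- q * b + r # 7

# Element-wise logical operators work on whole vectors
p <- c(TRUE, FALSE, TRUE)
s <- c(TRUE, TRUE, FALSE)
p_and_s <- p & s        # TRUE FALSE FALSE
p_or_s  <- p | s        # TRUE TRUE TRUE
p_xor_s <- xor(p, s)    # FALSE TRUE TRUE

# any() / all() reduce a logical vector to a single value
any(p_and_s)            # TRUE
all(p_or_s)             # TRUE

# <<- assigns to a variable in an enclosing environment
counter <- 0
increment <- function() counter <<- counter + 1
increment(); increment()
counter                 # 2
```

Note that R's exclusive OR is the lowercase function xor(); there is no XOR operator symbol.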
set operators (x, y, e): x and y are vectors (of the same mode) treated as sets, i.e. sequences of items with (conceptually) no duplicated values; e is a candidate element.
- union(x, y)
- intersect(x, y)
- setdiff(x, y)
- setequal(x, y): TRUE if x and y contain the same elements, FALSE otherwise
- is.element(e, x) <=> e %in% x
x <- c(1:5)
y <- c(3:8)
union(x, y)
intersect(x, y)
setdiff(x, y)
setequal(x, y)
1 %in% x
## [1] TRUE
c(1, 2) %in% c(2, 3, 4, 5)
##[1] FALSE  TRUE
z <- c("Monday", "Tuesday", "Friday")
#Is every value in z a day of the week?
all(z %in% c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday","Saturday", "Sunday"))
- identical(x, y): tests whether two objects are exactly equal (same type and same values)
- all.equal(x, y): tests approximate equality, within a numeric tolerance
x=1:3; y=1:3
x==y
##[1] TRUE TRUE TRUE
x==3
1 > c(0, 1, 2)
## [1] TRUE FALSE FALSE
c(1, 2, 3) == c(3, 2, 1)
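##[1] FALSE  TRUE FALSE
Warning with &&: it does not compare element by element. Before R 4.3.0 only the first element of each vector was used; since R 4.3.0, operands of length greater than one raise an error.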
x = c(TRUE, FALSE, TRUE)
y = c(TRUE, TRUE, TRUE)
x == y ##[1] TRUE FALSE TRUE
x & y ##[1] TRUE FALSE TRUE
# this is probably what we expect.
x && y ##[1] TRUE in R < 4.3.0 (first elements only); an error since R 4.3.0
Warning with number comparisons: numbers are cast to logical (0 -> FALSE, any non-zero number -> TRUE)
x=1:3; y=c(1,2,4)
x & y ##[1] TRUE TRUE TRUE
identical(x,y)
##[1] FALSE
all.equal(x, y)
##[1] "Mean relative difference: 0.3333333"
0.9 == (1 - 0.1) ##[1] TRUE
identical(0.9, 1 - 0.1) ##[1] TRUE
all.equal(0.9, 1 - 0.1) ##[1] TRUE
0.9 == (1.1 - 0.2) ##[1] FALSE
identical(0.9, 1.1 - 0.2) ##[1] FALSE
all.equal(0.9, 1.1 - 0.2) ##[1] TRUE
all.equal(0.9, 1.1 - 0.2, tolerance = 1e-16) ##[1] "Mean relative difference: 1.233581e-16"
is between
x <- c(5, 15)
# 10 < x < 20
10 < x & x <20
(10 < x) & (x < 20)
## [1] FALSE  TRUE
eval
x = 2
eval(1 < x & x < 3)
## [1] TRUE
& vs &&: && does not evaluate the second test in a pair of tests if the first test already determines the result. It is mainly used when evaluating conditions (if, while).
x = c(1,1,3)
x[1]==x[2] & x[1]==x[3]
## [1] FALSE
x[1]==x[2] && x[1]==x[3]
## [1] FALSE
if (length(x) > 0 && any(is.na(x))) { do.something() }
any( c(0, 1, 2) > 1) ##[1] TRUE
all( c(0, 1, 2) > 1) ##[1] FALSE
cf R help: ?Control
v = c(1,2,3)
ifelse(v >1, 1, 0)
v = 1:10
ifelse(v<5 | v>8, v, 0)
try is a wrapper that runs an expression that might fail and lets the user's code handle error recovery.
options(show.error.messages = FALSE)
try(log("a"))
print(.Last.value)
## [1] "Error in log(\"a\") : non-numeric argument to mathematical function\n"
## attr(,"class")
## [1] "try-error"
## attr(,"condition")
## <simpleError in log("a"): non-numeric argument to mathematical function>
options(show.error.messages = TRUE)
print(try(log("a"), TRUE))
tryCatch(1, finally = print("Hello"))
Loops are not your best friend in R; think in terms of vector processing instead.
v = c(1:5)
for (i in v) print(i * 10)
for (k in 1:5){
print(k)
}
for (value in c("My", "second", "for", "loop")) {
print(value)
}
g = 0
while (g < 1){
g <- rnorm(1)
cat(g, "\n")
}
# break can also be used in a while
repeat {
g <- rnorm(1)
if (g > 1.0) {break}
cat(g, "\n")
}
This is not in the spirit of R: use vector operations whenever possible.
# z = x + y
z <- numeric(length(x))
for (i in 1:length(z)) z[i] <- x[i] + y[i]
x=1:10
for (i in 1:length(x)) {
if (x[i] %% 2) {
y[i] <- 0
}
else y[i] <- 1
}
This is more efficient with `ifelse(x%%2 == 0, 1, 0)`
x=c(1:5, 5:1)
y <- numeric(length(x))
for (i in 1:length(x)) if (x[i] %% 2) y[i] <- 0 else y[i] <- 1
y
## [1] 0 1 0 1 0 0 1 0 1 0
# <=> y[x%%2 == 0] = 1; y[x%%2 != 0] = 0;
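The loop and its vectorised equivalents can be compared directly. This is a minimal sketch, assuming the same parity rule as the loop above (odd values map to 0, even values to 1); the names y_loop, y_vec and y_cast are illustrative.

```r
x <- c(1:5, 5:1)

# Loop version: fill the result element by element
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i] %% 2 == 1) y_loop[i] <- 0 else y_loop[i] <- 1
}

# Vectorised versions: one expression over the whole vector
y_vec  <- ifelse(x %% 2 == 0, 1, 0)
y_cast <- as.numeric(x %% 2 == 0)   # logical converted to 0/1

identical(y_loop, y_vec)    # TRUE
identical(y_loop, y_cast)   # TRUE
```

seq_along(x) is used instead of 1:length(x) because it yields an empty sequence when x has length zero, whereas 1:0 would iterate over c(1, 0).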
x = c(1,2,3,0,10,10,5,4)
max(c(x,NA)) # [1] NA : if a vector contains any NA, the result is NA
max(x, na.rm=TRUE) # [1] 10
max(x[which(is.na(x)==FALSE)])
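Missing values propagate through most summary functions, not only max(). A short sketch of the usual ways to handle them; x_na is an illustrative name, built from the same values as x plus one NA.

```r
x_na <- c(1, 2, 3, 0, 10, 10, 5, 4, NA)

max(x_na)                   # NA: the missing value propagates
max(x_na, na.rm = TRUE)     # 10
max(x_na[!is.na(x_na)])     # 10, same result by explicit filtering
sum(is.na(x_na))            # 1: number of missing values
mean(x_na, na.rm = TRUE)    # 4.375
```

Filtering with !is.na(x) is equivalent to the which(is.na(x) == FALSE) form above and is the more common idiom.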
- length(x): number of elements of x
- sum(x), prod(x)
- prod(x^2), prod(1:10)
- min(x), max(x)
- range(x): c(min(x), max(x))
- which
- which(x ==10): index of elements for which logical evaluation returns true
- which(x>5)
- which.max(x) : index of the maximum of x
- which.min(x): index of the minimum of x
- match(x, table, nomatch = NA_integer_, incomparables = NULL): positions of the (first) matches of x in table
- x = 1:10; y = 7:20; match(x, y, nomatch = 0) = [1] 0 0 0 0 0 0 1 2 3 4
- intersect_x_y = y[match(x, y, nomatch = 0)] = [1] 7 8 9 10 # cf base function intersect(x,y)
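- %in%: x %in% table is defined as match(x, table, nomatch = 0) > 0. 1:5 %in% c(2,4) = [1] FALSE TRUE FALSE TRUE FALSE
- rank(x): ranks of the elements of x; tied values get the average rank by default. rank(x) = [1] 2.0 3.0 4.0 1.0 7.5 7.5 6.0 5.0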
- abs(x)
- sqrt(x)
- round(x, n): round x elements with n digits
- round(c(pi, exp(1), 2.999), 2) = [1] 3.14 2.72 3.00
- log(x) => log(x, base=exp(1)): natural logarithm (ln), which has the number e (≈ 2.718) as its base
- log10(x) => log(x, 10): common logarithm
- exp(x)
http://www.statmethods.net/management/functions.html
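The index helpers from the list above (which, match, %in%, rank) are easiest to compare side by side. A minimal sketch using the same x as before; pos and found are illustrative names.

```r
x <- c(1, 2, 3, 0, 10, 10, 5, 4)

which(x == 10)        # 5 6: positions where the test is TRUE
which.max(x)          # 5: position of the first maximum
x[which(x > 5)]       # 10 10, same as x[x > 5]

# match() returns positions in the table, %in% only tests membership
pos   <- match(c(10, 99), x)   # 5 NA: first match only, NA when absent
found <- c(10, 99) %in% x      # TRUE FALSE

# rank() gives tied values their average rank by default
rank(x)               # 2.0 3.0 4.0 1.0 7.5 7.5 6.0 5.0
```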
- mean(x), median(x)
- var(x): variance of the elements of x (computed with an n-1 denominator). If x is a matrix or data.frame, var(x) and cov(x) both return the variance-covariance matrix
- cor(x): correlation matrix if x is a matrix or data.frame
- var(x, y) | cov(x, y): covariance between x and y, or between the columns of x and the columns of y for matrices and data.frames
- cor(x, y): linear correlation between x and y, or the correlation matrix between their columns for matrices and data.frames
- scale(x): centers and scales (standardizes) the data. center = FALSE|TRUE, scale = FALSE|TRUE. With the defaults, x_reduced = (x-mean(x))/sd(x)
- cumsum(x), cumprod(x), cummin(x), cummax(x): cumulative functions
- cumsum(x): vector whose ith element is the sum of the elements from x[1] to x[i], i.e. sum(x[1:i])
- pmin(x, y, ...), pmax(x, y, ...): 'parallel' minima (or maxima) of the argument vectors. The ith element of the result is the min (resp. max) of x[i], y[i], ... pmax(5:1, 3.14) = [1] 5.00 4.00 3.14 3.14 3.14
- sort(x): sort x elements in ascending order. for desc order : rev(sort(x)). sort(c('red','blue','green')) = [1] "blue" "green" "red"
- rev(x): reverse order of x elements. rev(1:3) = [1] 3 2 1
- unique(x): unique(c(1,3,4,5,1,NA)) = [1] 1 3 4 5 NA
- na.omit(x): removes NA values from a vector; for a matrix or data.frame, removes every row that has an NA in any column
- na.omit(c(1,2,3,NA))
- na.omit(matrix(c(1:5, NA), 3,2))
- na.fail(x): returns an error message if x contains 1 or more NA
- na.fail(c(1,2,NA)): Error in na.fail.default(c(1, 2, NA)) : missing values in object
- subset(x, ...): Subsetting Vectors, Matrices and Data Frames
- subset(airquality, Temp > 80, select = c(Ozone, Temp))
- subset(airquality, Day == 1, select = -Temp) #"-" to un-select a column
- sample(x, size, replace=FALSE): random samples and permutations. Takes a sample of the specified size from the elements of x with or without replacement.
- set.seed(1); sample(1:6, 10, replace=TRUE) # Roll Die 10 times
- sample(c(0,1), 100, replace = TRUE) # 100 Bernoulli trials
- rnorm: random draws from a normal distribution (N(0,1) by default)
- rnorm(n)
- rnorm(1, mean=280, sd=10)
- runif: uniform distribution
- runif(n, min=0, max=1): runif(10) returns 10 uniform random numbers on [0, 1]
- choose(n, k): number of combinations of k elements chosen from n elements (combinations without repetition or replacement), e.g. drawing k cards at once from a deck of n cards. Combinatoire @Wikiversity
- combn(x, m): generates all combinations of the elements of x, taken m at a time.
- combn(1:4, 2)
- choose(4,2) = 6 = factorial(n) / (factorial(k) * factorial(n-k))
- perm <- function(n, k) { factorial(n) / factorial(n-k)} # k-permutations of n elements (ordered, without repetition)
Random data with R: http://www.math.csi.cuny.edu/Statistics/R/simpleR/stat007.html
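The counting functions from the list above can be cross-checked against each other. This is a minimal sketch; n, k, first and second are illustrative names, and perm is the helper defined above.

```r
n <- 4; k <- 2

# Combinations: order does not matter, no repetition
choose(n, k)                                       # 6
factorial(n) / (factorial(k) * factorial(n - k))   # 6
ncol(combn(1:n, k))                                # 6: one column per combination

# k-permutations: order matters, no repetition
perm <- function(n, k) factorial(n) / factorial(n - k)
perm(n, k)                                         # 12

# Ordered draws with repetition allowed: n^k
n^k                                                # 16

# set.seed() makes random draws reproducible
set.seed(42); first  <- sample(1:6, 10, replace = TRUE)
set.seed(42); second <- sample(1:6, 10, replace = TRUE)
identical(first, second)                           # TRUE
```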
function (formal arguments) body
Whenever you create a function, it gets a reference to the environment in which it was created. This reference is a built-in property of the function. When you evaluate expressions at the command level of an R session, you are working in the global environment. For functions defined in a package with a namespace, that namespace environment takes the place of the global environment.
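The environment attached to a function can be inspected with environment(). A minimal sketch of the idea; f, make_adder and add2 are hypothetical names introduced only for illustration.

```r
# A function defined at top level refers to the environment it was created in
f <- function() NULL
environment(f)        # <environment: R_GlobalEnv> when run at the top level

# A function created inside another function captures that call's environment
make_adder <- function(n) {
  function(x) x + n   # n is looked up in the enclosing environment
}
add2 <- make_adder(2)
add2(5)                                 # 7
get("n", envir = environment(add2))     # 2

# Package functions are enclosed by their namespace, not the global environment
environmentName(environment(sd))        # "stats"
```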
rnorm2 <- function(n,mean,sd) { mean+sd*scale(rnorm(n)) }
add <- function(x, y) {
x+y # by default the value of the last line is returned.
}
add(1,2)
## [1] 3
square <- function(x){
x2 <- x^2
return(x2)
}
square(x = 2)
CV <- function(x) sd(x)/mean(x)
CV(c(1,2,4))
## [1] 0.6546537
foo <- function() print(x)
x = 1
foo()
##[1] 1
foo("hello")
## Error in foo("hello") : unused argument ("hello")
fib <- function(n) if (n>2) c(fib(n-1),sum(tail(fib(n-1),2))) else if (n>=0) rep(1,n)
fib(10)
##[1] 1 1 2 3 5 8 13 21 34 55
factn <-function(n) if (n == 0) 1 else n*factn(n-1)
factn(10)
##[1] 3628800
R functions can be treated as objects
a <- function(n) function(a) runif(a)
b <- a(1)
b(10)
This can be useful when you want to create many different kinds of functions
a <- list()
b <- function(i){ i; function() runif(i)}
for (i in 1:10) a[[i]] <- b(i)
a[[1]]()
[1] 0.2617396
a[[2]]()
[1] 0.8822248 0.3374574
a[[3]]()
[1] 0.0348156 0.4212788 0.6107646
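The bare `i;` in b above matters: R evaluates arguments lazily, so touching i inside the factory makes sure each returned function keeps its own value. A deterministic sketch of the same pattern, using force() to make that step explicit; make_power, powers and powers2 are hypothetical names for illustration.

```r
# Factory returning a function that remembers its argument.
# force(p) evaluates the argument immediately, like the bare `i;` above.
make_power <- function(p) {
  force(p)
  function(x) x^p
}

powers <- list()
for (p in 1:3) powers[[p]] <- make_power(p)

powers[[2]](4)                        # 16
powers[[3]](2)                        # 8
sapply(powers, function(f) f(2))      # 2 4 8

# lapply() builds the same list without an explicit loop
powers2 <- lapply(1:3, make_power)
powers2[[3]](2)                       # 8
```

Using a deterministic body instead of runif() makes it easy to check that each stored function kept its own argument.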