R Data Manipulation - gizotso/R GitHub Wiki

Data Manipulation in R

Vector

v = 1:5

[1] 1 2 3 4 5

accessing elements

v[2]: 2nd element

[1] 2

v[ c(1,3) ]: 1st and 3rd elements

[1] 1 3

v[2:3]: 2nd and 3rd elements

[1] 2 3

accessing named elements

names(v) = paste('e',1:5,sep="")
v['e4']
e4
 4
w = c(i=1, j=2)
w[1]  # same element as w['i']
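A few more name-based lookups, as a minimal sketch. It assumes v is the 1:5 vector defined at the top of the page; paste0(), unname() and %in% are base R functions.

```r
v <- 1:5
names(v) <- paste0('e', 1:5)   # paste0() is shorthand for paste(..., sep = "")

v[c('e1', 'e3')]               # several names at once
## e1 e3 
##  1  3 

unname(v['e4'])                # drop the name, keep the value
## [1] 4

'e9' %in% names(v)             # check that a name exists before using it
## [1] FALSE
```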

Logical Indexing

v>2

[1] FALSE FALSE TRUE TRUE TRUE

v[v>2]

[1] 3 4 5

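v[v>2] <- 0: assignment through a logical index (every element greater than 2 is set to 0; the examples that follow use the original v = 1:5 again)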

[1] 1 2 0 0 0

v[v %% 2 == 0]

[1] 2 4

v[c(TRUE, FALSE)]: recycling happens here

[1] 1 3 5
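The recycling of a short logical index can be made visible by expanding it by hand. This is only an illustrative sketch: the explicit rep() call is not needed in practice, R does the expansion itself.

```r
v <- 1:5

# Expand the short logical index to the length of v
idx <- rep(c(TRUE, FALSE), length.out = length(v))
idx
## [1]  TRUE FALSE  TRUE FALSE  TRUE

# Both forms select the elements in odd positions
identical(v[idx], v[c(TRUE, FALSE)])
## [1] TRUE
```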

x = runif(10) #10 numbers between 0 and 1
x[x<.5]

x = rpois(40, lambda=5) # Poisson random numbers
# select even integers
x[x %% 2 == 0]

Indexing

which(v>2)

[1] 3 4 5

v[which(v>2)]

[1] 3 4 5
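Logical indexing and which() give the same result on the vector above, but they differ once missing values are present. A small sketch, using a made-up vector that contains an NA:

```r
v <- c(1, NA, 3, 5)

v > 2
## [1] FALSE    NA  TRUE  TRUE

# Logical indexing keeps an NA placeholder for the unknown comparison
a <- v[v > 2]
a
## [1] NA  3  5

# which() returns only the positions that are TRUE, so the NA is dropped
b <- v[which(v > 2)]
b
## [1] 3 5
```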

remove elements

x = c(1,2,3)
x[-1] # remove first element
## [1] 2 3
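Negative indices also accept a vector of positions. The sketch below uses an illustrative vector; note that the original is only modified when the result is assigned back.

```r
x <- c(10, 20, 30, 40)

x[-c(1, 3)]        # drop the 1st and 3rd elements
## [1] 20 40

x[-length(x)]      # drop the last element
## [1] 10 20 30

# x itself is unchanged until the result is assigned back
x <- x[-1]
x
## [1] 20 30 40
```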

Operations

Arithmetic Operations (beware of vector lengths and recycling)

1 + 1:5

[1] 2 3 4 5 6

x = 1:6 ##[1] 1 2 3 4 5 6  [int]
y = x-3 ##[1] -2 -1  0  1  2  3  [num]
z = 2*y/3 ##[1] -1.3333333 -0.6666667  0.0000000  0.6666667  1.3333333  2.0000000  [num]
t = sqrt(abs(z)^3)
sqrt(x)

## modulo operation (remainder of integer division; %/% gives the integer quotient)
x%%2 ##[1] 1 0 1 0 1 0

x = 1:4; y = rep(1,4); z=5:6
x+1 ##[1] 2 3 4 5 scalar addition
x+y ##[1] 2 3 4 5
x+z ##[1] 6 8 8 10  #z has been recycled to match length
z=5:7 ## length is 3
x+z ##[1]  6  8 10  9
	# Warning message:
	# In x + z : longer object length is not a multiple of shorter object length
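One way to avoid being surprised by recycling is to check the lengths first. The helper below is hypothetical (recycles_cleanly is not a base R function) and assumes both vectors are non-empty.

```r
x <- 1:4
z <- 5:7

# Hypothetical helper: TRUE when the longer length is an exact multiple
# of the shorter one, i.e. recycling happens without a warning.
# Assumes both vectors have length > 0.
recycles_cleanly <- function(a, b) {
  max(length(a), length(b)) %% min(length(a), length(b)) == 0
}

recycles_cleanly(x, 5:6)
## [1] TRUE
recycles_cleanly(x, z)
## [1] FALSE

# Writing the recycling out explicitly gives the same values, no warning
x + rep(z, length.out = length(x))
## [1]  6  8 10  9
```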

Logical Operations (<, >, <=, >=, ==, !=)

x=1:6
y=c(1,4,2,5,4,3)
x < y ## [1] FALSE  TRUE FALSE  TRUE FALSE FALSE

# &: element-wise AND (TRUE where both conditions hold)
(x<=3) & (y>3)
# |: element-wise OR (TRUE where at least one condition holds)
(x<=3) | (y>3)
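For reference, here are the results of the two combined conditions, together with any(), all() and sum(), which are common ways to summarise a logical vector. The variable names both and either are only illustrative.

```r
x <- 1:6
y <- c(1, 4, 2, 5, 4, 3)

both   <- (x <= 3) & (y > 3)
either <- (x <= 3) | (y > 3)

both
## [1] FALSE  TRUE FALSE FALSE FALSE FALSE
either
## [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE

any(both)     # at least one position satisfies both conditions
## [1] TRUE
all(either)   # the last position satisfies neither condition
## [1] FALSE
sum(both)     # TRUE counts as 1, so this counts the matches
## [1] 1
```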

vector multiplication (inner product)

x = c(1,2,3,4)
x%*%x
##       [,1]
## [1,]   30
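Note that %*% returns a 1 x 1 matrix rather than a plain number. A short sketch of two ways to get the scalar, using only base R functions:

```r
x <- c(1, 2, 3, 4)

ip <- x %*% x      # inner product as a 1 x 1 matrix
dim(ip)
## [1] 1 1

sum(x * x)         # same value as a plain number
## [1] 30

drop(ip)           # drop() removes the matrix dimensions
## [1] 30
```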

Outer multiplication

x%o%x # outer(x, x)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    2    4    6    8
[3,]    3    6    9   12
[4,]    4    8   12   16
outer(1:5, 1:2)
     [,1] [,2]
[1,]    1    2
[2,]    2    4
[3,]    3    6
[4,]    4    8
[5,]    5   10
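outer() is not limited to multiplication: its FUN argument accepts any vectorised function of two arguments. A minimal sketch (pow is just an illustrative name):

```r
# Addition table instead of a multiplication table
outer(1:3, 1:3, FUN = "+")
##      [,1] [,2] [,3]
## [1,]    2    3    4
## [2,]    3    4    5
## [3,]    4    5    6

# Rows are the bases 1..3, columns are the exponents 0..2
pow <- outer(1:3, 0:2, FUN = function(a, b) a^b)
pow
##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    1    2    4
## [3,]    1    3    9
```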

Matrix Operations

  • t(M): transpose of matrix M
m1 = matrix(1, nrow = 2, ncol = 2)
m2 = matrix(1:4, nrow = 2, ncol = 2)

m1%*%m2 # matrix multiplication
m1*m2   # element-wise multiplication
t(m2)   # transpose
  • solve(X): matrix inverse
X =  matrix(c(2,2,3,1,7,4,5,5,5), ncol=3)
##      [,1] [,2] [,3]
## [1,]    2    1    5
## [2,]    2    7    5
## [3,]    3    4    5

X_1 = solve(X)
X%*%X_1 # identity matrix, up to floating-point rounding
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1  
  • qr(): QR decomposition
  • eigen(): computes eigenvalues and eigenvectors
  • svd(): singular value decomposition
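The functions listed above can be tried on the same matrix X. This is a sketch rather than a reference: the right-hand side b and the symmetric matrix S are made up for the example, and each all.equal() call compares a reconstruction with the original up to floating-point tolerance.

```r
X <- matrix(c(2, 2, 3, 1, 7, 4, 5, 5, 5), ncol = 3)
b <- c(1, 2, 3)                      # illustrative right-hand side

# Solve X %*% beta = b directly, without forming the inverse
beta <- solve(X, b)
all.equal(as.vector(X %*% beta), b)

# QR decomposition: Q %*% R rebuilds X (columns in pivot order)
dec <- qr(X)
all.equal(qr.Q(dec) %*% qr.R(dec), X[, dec$pivot])

# Eigen decomposition of the symmetric matrix S = t(X) %*% X
S <- crossprod(X)
e <- eigen(S)
all.equal(e$vectors %*% diag(e$values) %*% t(e$vectors), S)

# Singular value decomposition: X = U %*% diag(d) %*% t(V)
s <- svd(X)
all.equal(s$u %*% diag(s$d) %*% t(s$v), X)
```

Using solve(X, b) instead of solve(X) %*% b is usually preferred when only the solution of the linear system is needed.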

Note: inverse of a 2x2 matrix A (valid when det(A) = ad - bc is non-zero)

[a b] ^-1 = 1/det(A) * [ d -b]
[c d]                  [-c  a]
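The 2x2 formula can be checked against solve(). The matrix A below is an arbitrary example, and the name c_ is used instead of c only to keep it distinct from the c() function.

```r
A <- matrix(c(4, 2, 7, 6), nrow = 2)   # rows: (4, 7) and (2, 6)
a  <- A[1, 1]; b <- A[1, 2]
c_ <- A[2, 1]; d <- A[2, 2]

detA <- a * d - b * c_                 # 4*6 - 7*2 = 10

# matrix() fills column by column: first column (d, -c), second (-b, a)
A_inv <- matrix(c(d, -c_, -b, a), nrow = 2) / detA

all.equal(detA, det(A))
all.equal(A_inv, solve(A))
all.equal(A %*% A_inv, diag(2))
```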

Data Frame

data.frame(id=c(1,2,3,NA), char=c('a','b','c','d'))

subset()
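A minimal sketch of subset() on the data frame above. The filter id > 1 and the variable names df and kept are only illustrative. subset() treats an NA condition as FALSE, whereas bracket indexing keeps a row of NAs for it.

```r
df <- data.frame(id = c(1, 2, 3, NA), char = c('a', 'b', 'c', 'd'))

# Keep rows where id > 1; the row with id = NA is dropped
kept <- subset(df, id > 1)
kept
##   id char
## 2  2    b
## 3  3    c

# select= restricts the columns that are returned
subset(df, id > 1, select = char)
##   char
## 2    b
## 3    c

# Bracket indexing returns one extra row (all NA) for the NA comparison
nrow(df[df$id > 1, ])
## [1] 3
```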
