Use cases for the rpostgisLT package - mablab/rpostgisLT GitHub Wiki
At the core, we are dealing with trajectories, which are loosely
defined objects, but essentially a sequence of points built into
successive steps. That's the starting point. Now, together PostgreSQL
and R provides tools to process, manage and analyse trajectories. Good
practice and the strengths of both tools would tend towards using
PostgreSQL for processing and management, and R for analysis, but
there is no strict border, and both allow some (or all) of the other
processes as well. In R, adehabitatLT
actually allows for pretty
much everything using the ltraj
class, which is already formally
defined. On top of this thus comes rpostgisLT
: the essence of this
package is to allow for bidirectional transfer, at any times between
PostgreSQL and R, with no data loss or data alteration. This works by
establishing the corresponding structure of data in PostgreSQL
(pgtraj
objects, see the
vignette of the package)
that stores (non destructively) all information from ltraj
objects
in the database, and allows for the use of PostgreSQL tools (notably
PostGIS) on the data.
A typical session will start by initializing the connection to the
database, using PostgreSQL credentials through the RPostgreSQL
package (note that loading rpostgisLT
automatically loads
RPostgreSQL
too):
library(rpostgisLT)
con <- dbConnect("PostgreSQL", dbname = <dbname>, host = <host>, user = <user>, password = <password>)
The next step will be to check whether the intended pgtraj schema is
ready to use with the function pgtrajSchema
(note that by default,
the function checks and/or create the schema "traj", which can be
changed with the schema
argument):
pgtrajSchema(con)
If it is successful, the function should return TRUE
, together with
a message like:
The pgtraj schema 'traj' was successfully created in the database.
Or in the case of an existing schema:
The schema 'traj' already exists in the database, and is a valid pgtraj schema.
The most basic feature demonstrates a simple transfer from R to PostgreSQL and back to R: the resulting object should equal the original.
data(ibexraw)
ibexraw
is.regular(ibexraw)
## FALSE
## Note that there is an issue with the time zone. In 'ibexraw', the
## time zone is not set:
attr(ld(ibexraw)$date, "tzone")
## This means that it is assumed to be UTC, and is thus converted to
## local time zone on display (EDT2EST for me):
head(ld(ibexraw)$date) # Note that the first timestamp
# should be '2003-06-01
# 00:00:56'
## We need to fix that upfront:
ibex <- ld(ibexraw)
attr(ibex$date, "tzone") <- "Europe/Paris"
ibex <- dl(ibex)
ltraj2pgtraj(con, ibex) # Default should be in schema
# 'traj' and use ltraj name
# ('ibex') as pgtraj name.
ibexTest <- pgtraj2ltraj(con, "ibex") # Default should look into
# 'traj' schema.
all.equal(ibex, ibexTest)
## TRUE
Note that changes were implemented to the ltraj
data structure in adehabitatLT
v0.3.21: 1) row names are character strings; 2) there is an additional attribute proj4string in an ltraj that stores the projection reference. The adehabitatLT package must be updated to that version in order to install rpostgisLT
; however, old ltraj
s that were created with a previous version of adehabitatLT
should still work with rpostgisLT
, and can be manually updated to include a proj4string using a valid CRS
object as follows:
attr(ltraj, "proj4string") <- CRS("+proj=longlat +datum=WGS84")
Now there are many ways to alter a trajectory (in R or PostgreSQL), but each of them should run smoothly on either side while being transferable at any time to the other side. Each of these modifications should end up with the same test, i.e. that all.equal
between the original R object and the one that has been stored in PostgreSQL and retrieved in R returns TRUE
.
An ltraj
can include NAs in their sequence (i.e. missing relocations), but still provide a record of them with their timestamp but no coordinates (as is the case with the example dataset puechcirc
). On the other hand, ibexraw
only provides a record when coordinates are available. We can add missing relocations using setNA
:
refda <- strptime("2003-06-01 00:00", "%Y-%m-%d %H:%M", tz = "Europe/Paris")
(ibex <- setNA(ibex, refda, 4, units = "hour"))
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## TRUE
The next logical step is to regularize the trajectory, by "rounding" timestamps to their expected values using sett0'
:
(ibex <- sett0(ibex, refda, 4, units = "hour"))
ibex.ref <- ibex # At this stage, 'ibex' is our
# reference data
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## TRUE
Two types of interpolation can be computed: in space, i.e. rebuilding a trajectory based on a given step length; and in time, i.e. linearly interpolate missing data.
## 1. In space
summary(ld(ibex)$dist)
(ibex <- redisltraj(ibex, 400))
ibex <- removeinfo(ibex)
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## R uses fractional seconds (PostGIS doesn't), so dates are not exactly equal
## 2. In time
ibex <- ibex.ref
(ibex <- redisltraj(na.omit(ibex), 14400, type = "time"))
ibex <- removeinfo(ibex)
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## TRUE
In practice, there are two ways to subset a trajectory: by querying its parameters (or infolocs) on a specific condition, or by sub-sampling the trajectory at regular intervals.
## 1. Subset on given parameters
ibex <- ibex.ref
## We work on the data frame from the trajectory, which we subset, and
## then rebuild the ltraj without recomputing trajectory parameters;
## this is essentially what 'hab::subset' does.
## Note that the steps are not continuous any more.
ibex <- ld(ibex)
ibex <- droplevels(ibex[ibex$dist < 400 & !is.na(ibex$dist), ])
dlfast <- function(x) {
trajnam <- c("x", "y", "date", "dx", "dy", "dist", "dt",
"R2n", "abs.angle", "rel.angle")
idd <- tapply(as.character(x$id), x$burst, unique)
traj <- split(x[, names(x) %in% trajnam], x$burst)
names(traj) <- NULL
class(traj) <- c("ltraj", "list")
attr(traj, "typeII") <- TRUE
attr(traj, "regular") <- is.regular(traj)
for (i in (1:length(traj))) {
attr(traj[[i]], "id") <- as.character(idd[i])
attr(traj[[i]], "burst") <- names(idd[i])
}
return(traj)
}
ibex <- dlfast(ibex)
head(ibex[[1]])
attr(ibex, "proj4string") <- CRS()
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## 2. Subsample on the temporal sequence
ibex <- ibex.ref
(ibex <- subsample(ibex, 14400*2))
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
Sometimes, it is useful to cut a trajectory into sub-bursts, based on a given condition assessed on the trajectory parameter. For instance, we may want to cut into different bursts when steps are too large:
## 1. Cut if there is a step greater than 3000 m
ibex <- ibex.ref
(ibex <- cutltraj(ibex, "dist > 3000"))
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
The opposite process is to bind bursts from a unique individual into a single burst:
## 2. Bind back by individual:
(ibex <- bindltraj(ibex)) # Note that this adds "infolocs" to the ltraj
# which are also stored in the pgtraj data structure
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
The structure of a ltraj allows to combine different ltraj objects (or selection of bursts); in other word, a ltraj is a collection of bursts (technically a list
), which can be manipulated just as any R list
:
ibex <- ibex.ref
ibex2 <- ibex
burst(ibex2) <- paste(burst(ibex2), "2", sep = "-")
(ibex <- c(ibex, ibex2)[order(id(c(ibex, ibex2)))])
attr(ibex, "proj4string") <- CRS()
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)