Use cases for the rpostgisLT package

At the core, we are dealing with trajectories, which are loosely defined objects, but essentially a sequence of points joined into successive steps. That is the starting point. Together, PostgreSQL and R provide tools to process, manage and analyse trajectories. Good practice and the strengths of each tool favour PostgreSQL for processing and management, and R for analysis, but there is no strict border, and each can handle some (or all) of the other's tasks as well. In R, adehabitatLT covers pretty much everything through the ltraj class, which is already formally defined.

rpostgisLT builds on top of this: the essence of the package is to allow bidirectional transfer between PostgreSQL and R, at any time, with no data loss or alteration. It works by establishing a corresponding data structure in PostgreSQL (pgtraj objects, see the package vignette) that stores all information from ltraj objects non-destructively in the database, and allows PostgreSQL tools (notably PostGIS) to be used on the data.

Initialization

A typical session will start by initializing the connection to the database, using PostgreSQL credentials through the RPostgreSQL package (note that loading rpostgisLT automatically loads RPostgreSQL too):

library(rpostgisLT)
con <- dbConnect("PostgreSQL", dbname = <dbname>, host = <host>, user = <user>, password = <password>)

The next step is to check whether the intended pgtraj schema is ready to use, with the function pgtrajSchema (note that by default, the function checks and/or creates the schema "traj", which can be changed with the schema argument):

pgtrajSchema(con)

If it is successful, the function should return TRUE, together with a message like:

The pgtraj schema 'traj' was successfully created in the database.

Or in the case of an existing schema:

The schema 'traj' already exists in the database, and is a valid pgtraj schema.

Basic transfer

The most basic feature demonstrates a simple transfer from R to PostgreSQL and back to R: the resulting object should equal the original.

data(ibexraw)
ibexraw
is.regular(ibexraw)
## FALSE

## Note that there is an issue with the time zone. In 'ibexraw', the
## time zone is not set:
attr(ld(ibexraw)$date, "tzone")
## This means that it is assumed to be UTC, and is thus converted to
## local time zone on display (EDT2EST for me):
head(ld(ibexraw)$date)                    # Note that the first timestamp
                                          # should be '2003-06-01
                                          # 00:00:56'
## We need to fix that upfront:
ibex <- ld(ibexraw)
attr(ibex$date, "tzone") <- "Europe/Paris"
ibex <- dl(ibex)

ltraj2pgtraj(con, ibex)                   # Default should be in schema
                                          # 'traj' and use ltraj name
                                          # ('ibex') as pgtraj name.
ibexTest <- pgtraj2ltraj(con, "ibex")     # Default should look into
                                          # 'traj' schema.
all.equal(ibex, ibexTest)
## TRUE
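
The time zone fix above changes how timestamps are displayed, not the instants they represent. A minimal base R sketch of that behaviour, using an illustrative instant chosen for this example (it is not read from 'ibexraw'):

```r
## Illustrative instant (an assumption for this sketch, not taken from the data)
t_utc <- as.POSIXct("2003-05-31 22:00:56", tz = "UTC")

## Same instant, displayed in another time zone
t_paris <- t_utc
attr(t_paris, "tzone") <- "Europe/Paris"

format(t_utc, "%Y-%m-%d %H:%M:%S %Z")
format(t_paris, "%Y-%m-%d %H:%M:%S %Z")
## "2003-06-01 00:00:56 CEST"

## The underlying number of seconds since the epoch is unchanged
as.numeric(t_utc) == as.numeric(t_paris)
```

Setting the attribute explicitly, as done for 'ibex', should make the display independent of the local time zone of the machine running the session.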

Note that changes were implemented to the ltraj data structure in adehabitatLT v0.3.21: 1) row names are character strings; 2) there is an additional attribute proj4string in an ltraj that stores the projection reference. The adehabitatLT package must be updated to that version in order to install rpostgisLT; however, old ltrajs that were created with a previous version of adehabitatLT should still work with rpostgisLT, and can be manually updated to include a proj4string using a valid CRS object as follows:

attr(ltraj, "proj4string") <- CRS("+proj=longlat +datum=WGS84")

Now there are many ways to alter a trajectory (in R or PostgreSQL), but each of them should run smoothly on either side while being transferable at any time to the other side. Each of these modifications should end up with the same test, i.e. that all.equal between the original R object and the one that has been stored in PostgreSQL and retrieved in R returns TRUE.
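
Since every use case ends with the same check, it can be convenient to wrap it in a small helper. The sketch below is hypothetical: 'roundtrip_equal', 'write_mem', 'read_mem' and 'write_lossy' are names invented for illustration, and an in-memory environment stands in for the database so that the example runs without a connection.

```r
## Hypothetical helper: 'write' stores an object, 'read' returns it.
roundtrip_equal <- function(x, write, read) {
    write(x)
    isTRUE(all.equal(x, read()))
}

## In-memory stand-in for the database (illustration only)
store <- new.env()
write_mem <- function(x) assign("obj", x, envir = store)
read_mem <- function() get("obj", envir = store)

## A lossless transfer passes the test
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
ok <- roundtrip_equal(df, write_mem, read_mem)

## A transfer that alters the data (here, rounding) fails it
write_lossy <- function(x) assign("obj", round(x), envir = store)
lossy <- roundtrip_equal(data.frame(x = c(1.4, 2.6)), write_lossy, read_mem)

c(ok = ok, lossy = lossy)
```

With a live connection, 'write' and 'read' could be closures around ltraj2pgtraj and pgtraj2ltraj. Because ltraj2pgtraj uses the ltraj name as the default pgtraj name, such a wrapper would need to take care that the object is stored and retrieved under the same name.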

Missing steps [seq, t, dt]

An ltraj can include NAs in its sequence (i.e. missing relocations), while still keeping a record of them with a timestamp but no coordinates (as is the case with the example dataset puechcirc). On the other hand, ibexraw only provides a record when coordinates are available. We can add missing relocations using setNA:

refda <- strptime("2003-06-01 00:00", "%Y-%m-%d %H:%M", tz = "Europe/Paris")
(ibex <- setNA(ibex, refda, 4, units = "hour"))
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## TRUE
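
To make the idea of missing relocations concrete, here is a base R sketch that places a few invented timestamps on a 4-hour grid and flags the slots without a fix. It only illustrates the concept and is not how setNA is implemented; all values are made up.

```r
## Reference date and expected lag (4 hours, in seconds)
ref <- as.POSIXct("2003-06-01 00:00:00", tz = "UTC")
step <- 4 * 3600

## Three invented fixes, close to 00:00, 04:00 and 12:00 (08:00 is missing)
obs <- ref + c(56, 4 * 3600 + 10, 12 * 3600 - 30)

## Index of the nearest expected slot for each fix
slot <- round(as.numeric(difftime(obs, ref, units = "secs")) / step)

## Full sequence of expected slots, and whether each one has a fix
all_slots <- seq(min(slot), max(slot))
has_fix <- all_slots %in% slot
data.frame(date = ref + step * all_slots, has_fix = has_fix)
```

The slot flagged FALSE corresponds to the kind of record that is added with a timestamp but no coordinates.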

Regularize [t, dt]

The next logical step is to regularize the trajectory, by "rounding" timestamps to their expected values using sett0:

(ibex <- sett0(ibex, refda, 4, units = "hour"))
ibex.ref <- ibex                        # At this stage, 'ibex' is our
                                        # reference data
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## TRUE
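
The "rounding" can be pictured in base R as snapping each timestamp to the nearest multiple of the lag, counted from the reference date. This is a sketch with invented timestamps, not the implementation of sett0:

```r
ref <- as.POSIXct("2003-06-01 00:00:00", tz = "UTC")
step <- 4 * 3600                       # 4 hours, in seconds

## Invented timestamps, each a few seconds off its expected value
obs <- ref + c(56, 14410, 28790)

## Snap to the nearest expected timestamp
offset <- as.numeric(difftime(obs, ref, units = "secs"))
rounded <- ref + step * round(offset / step)

## After rounding, the lag between successive timestamps is constant
lags <- diff(as.numeric(rounded))
lags
## 14400 14400
```

A constant lag is what makes the trajectory regular in the sense tested by is.regular.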

Interpolate [seq, geom, t, dt]

Two types of interpolation can be computed: in space, i.e. rebuilding a trajectory based on a given step length; and in time, i.e. linearly interpolating missing data.

## 1. In space
summary(ld(ibex)$dist)
(ibex <- redisltraj(ibex, 400))
ibex <- removeinfo(ibex)
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## R uses fractional seconds (PostGIS doesn't), so dates are not exactly equal

## 2. In time
ibex <- ibex.ref
(ibex <- redisltraj(na.omit(ibex), 14400, type = "time"))
ibex <- removeinfo(ibex)
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
## TRUE
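
Linear interpolation in time can be illustrated with base R's approx, applied separately to each coordinate. The values below are invented, and the sketch only shows the principle, not what redisltraj does internally:

```r
## Invented relocations: time in seconds, with the fix at 28800 s missing
t_obs <- c(0, 14400, 43200)
x_obs <- c(0, 100, 400)
y_obs <- c(0, 50, 50)

## Regular 4-hour sequence on which to interpolate
t_out <- seq(0, 43200, by = 14400)

## Interpolate each coordinate linearly over time
x_int <- approx(t_obs, x_obs, xout = t_out)$y
y_int <- approx(t_obs, y_obs, xout = t_out)$y

data.frame(t = t_out, x = x_int, y = y_int)
## The interpolated relocation at t = 28800 is (250, 50)
```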

Subset the trajectory [seq, (geom), (dt)]

In practice, there are two ways to subset a trajectory: by querying its parameters (or infolocs) on a specific condition, or by sub-sampling the trajectory at regular intervals.

## 1. Subset on given parameters
ibex <- ibex.ref
## We work on the data frame from the trajectory, which we subset, and
## then rebuild the ltraj without recomputing trajectory parameters;
## this is essentially what 'hab::subset' does.
## Note that the steps are not continuous any more.
ibex <- ld(ibex)
ibex <- droplevels(ibex[ibex$dist < 400 & !is.na(ibex$dist), ])
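## Rebuild an ltraj from a data frame (as returned by 'ld') without
## recomputing the trajectory parameters.
dlfast <- function(x) {
    ## Columns that make up each burst of the ltraj
    trajnam <- c("x", "y", "date", "dx", "dy", "dist", "dt",
        "R2n", "abs.angle", "rel.angle")
    ## Animal id associated with each burst
    idd <- tapply(as.character(x$id), x$burst, unique)
    ## One data frame per burst
    traj <- split(x[, names(x) %in% trajnam], x$burst)
    names(traj) <- NULL
    class(traj) <- c("ltraj", "list")
    attr(traj, "typeII") <- TRUE
    attr(traj, "regular") <- is.regular(traj)
    ## Restore the 'id' and 'burst' attributes of each burst
    ## ('seq_along' is safe even if there is no burst left)
    for (i in seq_along(traj)) {
        attr(traj[[i]], "id") <- as.character(idd[i])
        attr(traj[[i]], "burst") <- names(idd[i])
    }
    return(traj)
}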
ibex <- dlfast(ibex)
head(ibex[[1]])
attr(ibex, "proj4string") <- CRS()
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)

## 2. Subsample on the temporal sequence
ibex <- ibex.ref
(ibex <- subsample(ibex, 14400*2))
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
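
The difference between the two approaches shows up in the time lags. The base R sketch below uses an invented, regular 4-hour sequence stored in a plain data frame (not an ltraj): subsetting on a condition leaves irregular gaps, while taking every second relocation keeps a constant, doubled lag.

```r
ref <- as.POSIXct("2003-06-01 00:00:00", tz = "UTC")
step <- 4 * 3600

## Invented regular sequence with a step length for each relocation
traj <- data.frame(date = ref + step * 0:5,
                   dist = c(120, 650, 80, 300, 900, NA))

## 1. Subset on a parameter: steps are not continuous any more
keep <- traj[!is.na(traj$dist) & traj$dist < 400, ]
lag_subset <- diff(as.numeric(keep$date))
lag_subset
## 28800 14400

## 2. Subsample every second relocation: the lag stays constant
sub <- traj[seq(1, nrow(traj), by = 2), ]
lag_subsample <- diff(as.numeric(sub$date))
lag_subsample
## 28800 28800
```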

Cut, bind bursts [burst]

Sometimes, it is useful to cut a trajectory into sub-bursts, based on a given condition assessed on the trajectory parameters. For instance, we may want to cut into different bursts when steps are too large:

## 1. Cut if there is a step greater than 3000 m
ibex <- ibex.ref
(ibex <- cutltraj(ibex, "dist > 3000"))
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)

The opposite process is to bind bursts from a unique individual into a single burst:

## 2. Bind back by individual:
(ibex <- bindltraj(ibex))   # Note that this adds "infolocs" to the ltraj
                            # which are also stored in the pgtraj data structure
ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
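
The logic of cutting and binding can be sketched in base R on a plain data frame. The step lengths are invented, and the rule used here (a new burst starts after each flagged relocation) is an assumption made for the illustration, not a description of how cutltraj and bindltraj are implemented:

```r
## Invented step lengths (m) for one burst
steps <- data.frame(step = 1:7,
                    dist = c(200, 500, 3500, 100, 250, 4000, 300))

## Flag the steps that are too long
too_long <- steps$dist > 3000

## Start a new burst after each flagged step
burst <- cumsum(c(FALSE, head(too_long, -1))) + 1
burst
## 1 1 1 2 2 2 3

## Cut into sub-bursts, then bind them back together
pieces <- split(steps, burst)
bound <- do.call(rbind, pieces)
rownames(bound) <- NULL
all.equal(bound, steps)
```

In this sketch, binding the pieces back in order recovers the original sequence of steps.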

Combine trajectories or bursts [burst, traj]

The structure of an ltraj allows different ltraj objects (or selections of bursts) to be combined; in other words, an ltraj is a collection of bursts (technically a list), which can be manipulated just like any R list:

ibex <- ibex.ref
ibex2 <- ibex
burst(ibex2) <- paste(burst(ibex2), "2", sep = "-")
(ibex <- c(ibex, ibex2)[order(id(c(ibex, ibex2)))])
attr(ibex, "proj4string") <- CRS()

ltraj2pgtraj(con, ibex, overwrite = TRUE)
ibexTest <- pgtraj2ltraj(con, "ibex")
all.equal(ibex, ibexTest)
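
The list mechanics behind this can be tried on plain R lists. In the sketch below, 'make_burst' is a hypothetical helper, the ids "A" and "B" are invented, and the lists are not real ltraj objects (ltraj methods may behave differently). It also shows that c() on plain lists drops list-level attributes, which is one plausible reason to set proj4string again after combining.

```r
## Hypothetical stand-in for a burst: a data frame with 'id' and 'burst'
make_burst <- function(id, burst) {
    structure(data.frame(x = c(0, 1)), id = id, burst = burst)
}

traj1 <- list(make_burst("A", "A"), make_burst("B", "B"))
attr(traj1, "proj4string") <- "placeholder"

## Copy with renamed bursts
traj2 <- lapply(traj1, function(b) {
    attr(b, "burst") <- paste(attr(b, "burst"), "2", sep = "-")
    b
})

## Combine, then order the bursts by id
combined <- c(traj1, traj2)
ids <- vapply(combined, function(b) attr(b, "id"), character(1))
combined <- combined[order(ids)]

bursts <- vapply(combined, function(b) attr(b, "burst"), character(1))
bursts
## "A" "A-2" "B" "B-2"

## The list-level attribute did not survive c()
is.null(attr(combined, "proj4string"))
```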
