Segmentor3IsBack - rstats-gsoc/gsoc2018 GitHub Wiki
The Segmentor3IsBack package provides a fast functional pruning algorithm for computing the optimal changepoints using several likelihood models (normal heteroscedastic, normal homoscedastic, Poisson, etc).
However, there are several issues with the current code:
- it ERRORs on Solaris.
- it has no tests.
- packages that use it have been known to crash on Windows. For example, penaltyLearning uses Segmentor in its vignette, which crashes when rebuilt on Windows.
The changepoint package provides cpt.mean for the normal homoscedastic model and cpt.var for the normal heteroscedastic model, but it does not provide a solver for the other models (e.g. Poisson).
The jointseg package provides the Fpsn function, which computes the normal homoscedastic model.
Get Segmentor3IsBack passing all CRAN checks on Solaris and Windows.
- Set up a GitHub repo for Segmentor3IsBack, with Travis CI for GNU/Linux testing, AppVeyor for Windows testing, and Coveralls for code coverage.
- Add some trivial edge cases as tests, for example Segmentor(c(1,2,2), Kmax=3, model=1) should return a valid 3-segment model, but it currently does not.
- Write some extensive test cases using library(testthat) and library(neuroblastoma). Goal: 100% coverage in both R and C++ code by the end of summer.
- Windows testing can be done via win-builder.
- Compile the package with GCC 6 and fix the warnings reported by the CRAN checks.
- Figure out a way to access a Solaris machine for testing.
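The trivial edge case listed above could be turned into a test along the following lines. This is only a sketch: `is_valid_segmentation` and `edge_case_is_valid` are hypothetical names, and the definition of "valid" used here (strictly increasing segment ends, the last one equal to the number of data points) is an assumption rather than something the package documents. It also assumes that `getBreaks()` returns a matrix whose row k holds the segment ends of the k-segment model.

```r
# Hypothetical helper: does `ends` describe k segments of n data points?
# Assumed definition of "valid": k strictly increasing integer segment ends,
# each between 1 and n, with the last end equal to n.
is_valid_segmentation <- function(ends, n, k) {
  ends <- as.numeric(ends)
  length(ends) == k &&
    !anyNA(ends) &&
    all(ends == round(ends)) &&
    all(ends >= 1 & ends <= n) &&
    all(diff(ends) > 0) &&
    ends[k] == n
}

# Sketch of the edge-case check from the task list. It is defined but not
# called here, because the source says this case currently does not work.
edge_case_is_valid <- function() {
  fit <- Segmentor3IsBack::Segmentor(c(1, 2, 2), Kmax = 3, model = 1)
  # Assumption: row 3 of the breaks matrix holds the 3 segment ends.
  breaks <- Segmentor3IsBack::getBreaks(fit)
  is_valid_segmentation(breaks[3, ], n = 3, k = 3)
}
```

In a testthat file this could be used as `test_that("3 points, 3 segments", expect_true(edge_case_is_valid()))`, which would be expected to fail until the edge case is fixed.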
This project will increase the portability and test coverage of the Segmentor3IsBack package.
- Alice Cleynen, one of the authors of Segmentor3IsBack who knows its R/C++ code.
- Toby Dylan Hocking is a user of Segmentor3IsBack, is familiar with Travis/Coveralls, and can suggest some tests.
MENTORS: please think of some tests for prospective students.
- Easy: create an Rmd web page in which you demonstrate how Segmentor can be used to find change points in different types of data (e.g. normal model for real-valued data, Poisson model for count data). For some example data sets, see data(neuroblastoma, package="neuroblastoma") for real-valued data, and data(chr11ChIPseq, package="PeakSegDP") for count data. Plot the data and the segmentation models.
- Medium: copy the Segmentor3IsBack package to one of your GitHub repos. Set up Travis, AppVeyor, and Coveralls, and create a README with badges.
- Hard: Write a test that fails on Windows. Show that the package fails on both AppVeyor and win-builder.
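Before working with the real data sets named in the Easy test, it may help to try Segmentor on simulated counts with known change points. The sketch below is not part of the package: `simulate_counts` and `plot_poisson_fit` are hypothetical helpers, the segment lengths and means are arbitrary, and the code assumes that `model = 1` selects the Poisson model and that row k of `getBreaks()` holds the segment ends of the k-segment model.

```r
# Hypothetical helper: simulate Poisson counts with a piecewise constant mean.
# `seg_lengths` and `seg_means` are illustrative, one entry per segment.
simulate_counts <- function(seg_lengths, seg_means, seed = 1) {
  stopifnot(length(seg_lengths) == length(seg_means))
  set.seed(seed)
  list(
    counts = rpois(sum(seg_lengths), lambda = rep(seg_means, times = seg_lengths)),
    true_ends = cumsum(seg_lengths)  # last position of each simulated segment
  )
}

sim <- simulate_counts(seg_lengths = c(50, 30, 40), seg_means = c(2, 20, 5))

# Hypothetical plotting helper; defined but not called, since it needs
# Segmentor3IsBack to be installed.
plot_poisson_fit <- function(sim, k = 3, Kmax = 5) {
  # Assumption: model = 1 is the Poisson model.
  fit <- Segmentor3IsBack::Segmentor(sim$counts, model = 1, Kmax = Kmax)
  ends <- Segmentor3IsBack::getBreaks(fit)[k, seq_len(k)]
  plot(sim$counts, xlab = "position", ylab = "count")
  # Estimated change points (solid red) and simulated ones (dashed black).
  abline(v = ends[-k] + 0.5, col = "red")
  abline(v = sim$true_ends[-length(sim$true_ends)] + 0.5, lty = 2)
  invisible(ends)
}
```

Calling `plot_poisson_fit(sim)` should draw the counts with both sets of change points, which gives a quick visual check before repeating the exercise on real data.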
Students, please post a link to your test results here.