Segmentor3IsBack - rstats-gsoc/gsoc2018 GitHub Wiki

Background

The Segmentor3IsBack package provides a fast functional pruning algorithm for computing the optimal changepoints using several likelihood models (normal heteroscedastic, normal homoscedastic, Poisson, etc.).

However, the current code has several issues.

Related work

The changepoint package provides cpt.mean for the normal homoscedastic model and cpt.var for the normal heteroscedastic model, but does not provide a solver for the other models (e.g. Poisson).

The jointseg package provides the Fpsn function which computes the normal homoscedastic model.

Details of your coding project

Get Segmentor3IsBack passing all CRAN checks on Solaris and Windows.

  • Set up a GitHub repo for Segmentor3IsBack, with TravisCI for GNU/Linux testing, Appveyor for Windows testing and Coveralls for code coverage.
  • Add some trivial edge cases as tests. For example, Segmentor(c(1,2,2), Kmax=3, model=1) should return a valid 3-segment model, but it currently does not.
  • Write some extensive test cases using library(testthat) and library(neuroblastoma). Goal: 100% coverage in both R and C++ code by the end of summer.
  • Test on Windows via win-builder.
  • Compile the package with GCC 6, and fix the warnings reported on CRAN.
  • Figure out a way to access a Solaris machine for testing.
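The edge case mentioned in the list above could be written as a testthat test along the following lines. This is only a sketch under stated assumptions: is_valid_segmentation is a hypothetical helper (it is not part of Segmentor3IsBack), and the code assumes that getBreaks returns a Kmax x Kmax matrix whose row k holds the k segment ends, padded with zeros. Both package calls are guarded so the helper can be tried on its own.

```r
# Hypothetical helper (not part of Segmentor3IsBack): TRUE when `ends`
# describes a valid split of n_data points into k non-empty segments.
is_valid_segmentation <- function(ends, n_data, k) {
  length(ends) == k &&
    all(ends >= 1) &&        # every segment end is a real data index
    all(diff(ends) > 0) &&   # ends strictly increase, so no empty segment
    ends[k] == n_data        # the last segment ends at the last data point
}

if (requireNamespace("testthat", quietly = TRUE) &&
    requireNamespace("Segmentor3IsBack", quietly = TRUE)) {
  library(testthat)
  library(Segmentor3IsBack)
  test_that("three data points admit a 3-segment Poisson model", {
    fit <- Segmentor(c(1, 2, 2), Kmax = 3, model = 1)  # model = 1: Poisson
    # Assumption: row 3 of the breaks matrix holds the 3 segment ends.
    ends <- getBreaks(fit)[3, 1:3]
    expect_true(is_valid_segmentation(ends, n_data = 3, k = 3))
  })
}
```

According to the description above, this test is expected to fail against the current package and to pass once the edge case is fixed. In a package it would live in a file under tests/testthat/ without the requireNamespace guard.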

Expected impact

This project will increase the portability and test coverage of the Segmentor3IsBack package.

Mentors

  • Alice Cleynen <[email protected]>, one of the authors of Segmentor3IsBack who knows its R/C++ code.
  • Toby Dylan Hocking <[email protected]> is a user of Segmentor3IsBack, is familiar with Travis/Coveralls, and can suggest some tests.

Tests

MENTORS: please think of some tests for prospective students.

  • Easy: create an Rmd web page in which you demonstrate how Segmentor can be used to find changepoints in different types of data (e.g. normal model for real-valued data, Poisson model for count data). For some example data sets, see data(neuroblastoma, package="neuroblastoma") for real-valued data, and data(chr11ChIPseq, package="PeakSegDP") for count data. Plot the data and the segmentation models.
  • Medium: copy the Segmentor3IsBack package to one of your GitHub repos. Set up Travis, Appveyor and Coveralls, and create a README with badges.
  • Hard: write a test that fails on Windows. Show that the package fails on both Appveyor and win-builder.
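As a possible starting point for the Easy test, the sketch below fits the Poisson model to simulated count data and marks the changepoint of the 2-segment model. It is not a solution: the data are simulated rather than taken from the data sets named above, and the code assumes that row 2 of the matrix returned by getBreaks holds the two segment ends of the 2-segment model.

```r
set.seed(1)
# Simulated counts (hypothetical data): the Poisson mean jumps from 2 to 10
# after position 50.
counts <- c(rpois(50, lambda = 2), rpois(50, lambda = 10))

if (requireNamespace("Segmentor3IsBack", quietly = TRUE)) {
  library(Segmentor3IsBack)
  fit <- Segmentor(counts, model = 1, Kmax = 5)  # model = 1: Poisson
  # Assumption: row 2 holds the ends of the two segments, in data order.
  ends <- getBreaks(fit)[2, 1:2]
  plot(counts, xlab = "position", ylab = "count")
  # Draw the changepoint between the last point of segment 1 and the next.
  abline(v = ends[1] + 0.5, col = "red")
}
```

For real-valued data the same calls would apply with a normal model instead of model = 1.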

Solutions of tests

Students, please post a link to your test results here.
