Summer of Code 2015 proposal - wlevine/nmatrix GitHub Wiki

Motivation

NMatrix is too hard to install. The installation page is long, with too many steps. On my laptop (Ubuntu), I needed to do this weird dependency dance to build it. If I had tried to install it on my work machine (Red Hat), I would have had no guidance. Other projects are reluctant to use NMatrix because of the installation process (https://github.com/jekyll/classifier-reborn/issues/14). One big issue in the installation process is the installation of ATLAS. So I want to remove the dependency on ATLAS without removing any features.

Summary

I propose two different ideas. The first idea is to separate NMatrix into two gems, one of which contains basic math functionality and has no external dependencies on ATLAS or any other linear algebra package. The second gem (let's call it nmatrix-atlas) would contain more advanced linear algebra functions and would make use of an external linear algebra package. This would make installation easier for users who are not interested in advanced features. The second idea is to allow NMatrix to run with any implementation of liblapack and libblas, rather than being limited to ATLAS. This would simplify installation by allowing users to use whatever version of LAPACK is easily available from their OS or package manager. At the same time, it would allow users who are concerned about performance to use whatever tuned version of LAPACK works best for them.

Part 1 (nmatrix-atlas gem) details

The plain nmatrix gem should include basic math functions like matrix multiplication, inversion, and determinants. The nmatrix-atlas gem will provide additional functions provided by LAPACK.

I started to write some code to see what issues I would run into when I tried to separate the ATLAS-dependent code from the rest of nmatrix: see https://github.com/wlevine/nmatrix/tree/test_two_gems

I didn't get to the point of actually packaging two separate gems, but I was able to remove the ATLAS dependencies from nmatrix.so and make a separate nmatrix_atlas.so that extended NMatrix to implement one of the ATLAS functions (getri).

Here are the lessons I learned from this: nmatrix_atlas will need access to the nmatrix header files. As a quick solution, I just symlinked all the headers from ext/nmatrix to ext/nmatrix_atlas. This is probably not a good idea in the long run. Other solutions are possible; most likely, building plain nmatrix should install these headers somewhere that nmatrix_atlas (or other potential extensions) can see them.

The nmatrix header files contain some information that shouldn't be exposed to external libraries. For example, nmatrix.h defines a lot of important structs and macros which will be needed in nmatrix_atlas to read data, but it also defines Init_nmatrix() and some other functions which are irrelevant. I don't know the full extent of this issue, but I may need to break up the header files into parts that are relevant to external code and parts that should only be exposed internally.

All I needed to get nmatrix_atlas working was the header files from nmatrix, none of the C files. This is a good thing.

I plan to develop both gems in the same repository so that they don't get out of sync. Code in ext/nmatrix makes the nmatrix.so library, while code in ext/nmatrix_atlas will make the nmatrix_atlas.so extension. The Ruby files for the two gems can live in two different subdirectories of lib/.

It should be possible to generate two gems from the same repository: see http://opensoul.org/2012/05/30/releasing-multiple-gems-from-one-repository/
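
As a rough sketch of what the specification for the second gem might look like when both gems share one repository: the version number, the summary and author strings, the lib/nmatrix/atlas subdirectory, and the extconf.rb path below are all placeholders I am assuming for illustration, not settled decisions.

```ruby
require "rubygems"

# Placeholder version shared by both gems so they cannot drift apart.
NMATRIX_VERSION = "0.0.0"

# Hypothetical gemspec for the add-on gem, kept next to the main gemspec
# at the root of the shared repository.
def atlas_gemspec
  Gem::Specification.new do |s|
    s.name    = "nmatrix-atlas"
    s.version = NMATRIX_VERSION
    s.summary = "ATLAS-backed linear algebra functions for NMatrix"
    s.authors = ["NMatrix contributors"]

    # Only the files that belong to this gem: its own extension directory
    # and its own subdirectory of lib/ (assumed layout).
    s.files      = Dir["ext/nmatrix_atlas/**/*", "lib/nmatrix/atlas/**/*.rb"]
    s.extensions = ["ext/nmatrix_atlas/extconf.rb"]

    # Pin the add-on to the exactly matching release of the plain gem.
    s.add_dependency "nmatrix", "= #{NMATRIX_VERSION}"
  end
end

puts atlas_gemspec.name
```

Pinning the dependency to the same version is one way to keep the two gems in lockstep; a looser requirement would also work if the header interface turns out to be stable.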

If only nmatrix is built, the LAPACK-specific specs should not be run. If nmatrix-atlas is built, any function that is reimplemented by nmatrix-atlas should be tested twice, with and without nmatrix-atlas.
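
This rule could be expressed as a small helper that decides which groups of specs to run. This is only a sketch: the group names (:shared, :lapack_only) and backend names (:plain, :atlas) are made up for illustration, and checking for an installed gem is just one possible way to detect whether nmatrix-atlas was built.

```ruby
require "rubygems"

# Returns true if at least one version of the named gem is installed.
def gem_installed?(name)
  Gem::Specification.find_all_by_name(name).any?
end

# Builds the list of [spec group, backend] pairs to run.
#   :shared      - functions in plain nmatrix that nmatrix-atlas reimplements
#   :lapack_only - functions that exist only in nmatrix-atlas
def spec_plan(atlas_built)
  plan = [[:shared, :plain]]
  if atlas_built
    plan << [:shared, :atlas]       # second pass over the shared specs
    plan << [:lapack_only, :atlas]  # specs skipped when only nmatrix is built
  end
  plan
end

spec_plan(gem_installed?("nmatrix-atlas")).each do |group, backend|
  puts "run #{group} specs with the #{backend} backend"
end
```

The two passes over the shared specs would probably need to run in separate processes, since loading nmatrix-atlas replaces the plain implementations.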

I also will need to update all the build instructions and documentation to explain the situation.

I don't have access to an OS X machine, so I will need help testing anything that changes the build process.

Part 2 (allow alternatives to ATLAS) details

Currently, if an NMatrix user wants to use an external library to provide BLAS/LAPACK functions, ATLAS is the only choice. It would be better if nmatrix interfaced directly with liblapack and libblas, so that the user could use any implementation they preferred (ATLAS, OpenBLAS, Intel, etc.). This is how numpy works (http://www.scipy.org/scipylib/building/linux.html).

The main obstacle here seems to be a C interface for LAPACK. NMatrix currently uses CLAPACK as provided by ATLAS. I propose replacing this with LAPACKE (http://www.netlib.org/lapack/lapacke.html).

Again, I've done a little work to investigate feasibility: see https://github.com/wlevine/nmatrix/tree/test_no_atlas

I got everything building without ATLAS and reimplemented one function (sgetri) using LAPACKE instead of CLAPACK. I tested that it works using three standard packages from the Ubuntu repositories: liblapack3 (= the reference implementation), libatlas3, and libopenblas.

The approach I'm currently taking is to copy a lot of code from LAPACKE into the nmatrix repository. Normally, I wouldn't think this is a good idea, but I think it's okay here for several reasons:

  1. Compatible license. LAPACKE is under 3-clause BSD license.
  2. The code is stable. Not a lot of worry about continually having to update the code.
  3. LAPACKE is just a thin layer that provides an interface from C to the underlying LAPACK functions that are provided elsewhere. So it shouldn't be hard to build, and there are no heavy-duty functions that need to be highly optimized.
  4. Not copying the code would result in an additional dependency that wouldn't really have a benefit.

I think replacing CLAPACK with LAPACKE should be fairly straightforward. Again I will need to update the documentation and I will need help testing on other platforms.

Testing and Travis-CI

Travis-CI allows you to set up different build configurations and add them to the "build matrix" to be tested. Different configurations are specified by different environment variables (see http://docs.travis-ci.com/user/build-configuration/#The-Build-Matrix ). Also, Travis runs on Ubuntu, so it should be possible to use the update-alternatives command to painlessly switch between different versions of LAPACK. So it should be possible to test a single gem with multiple, different external libraries. It would work something like:

Replace env section of .travis.yml:

env:
  - USE_ATLAS=1
  - USE_OPENBLAS=1
  - NO_EXTERNAL_LIB=1

Install all additional external dependencies in before_install section.

Use script command to launch external script instead of launching tests directly:

script: ./travis.sh

Where travis.sh contains something like:

# Build and test against ATLAS.
if [ -n "$USE_ATLAS" ]
then
  sudo update-alternatives --set liblapack.so.3 /usr/lib/atlas-base/atlas/liblapack.so.3
  # [set up other stuff if necessary]
  bundle exec rake compile && bundle exec rake spec
fi

# Build and test against OpenBLAS.
if [ -n "$USE_OPENBLAS" ]
then
  sudo update-alternatives --set liblapack.so.3 /usr/lib/openblas-base/liblapack.so.3
  # [...]
  bundle exec rake compile && bundle exec rake spec
fi

# [other cases]

If we need to test multiple gems providing the same functions, I think the strategy would be pretty much the same. Set up several configurations, each one building and testing a single gem. The one complication I can think of is that the spec will need to know exactly what file to require, whether it should require nmatrix-atlas or nmatrix-whatever. This could be specified by the tester, or we could test all available gems.
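
A minimal sketch of the "specified by the tester" option, assuming the choice is passed through an environment variable. The variable name NMATRIX_BACKEND and the require paths in the table are hypothetical; nothing like this exists in nmatrix today.

```ruby
# Hypothetical mapping from a backend name to the file a spec should require.
BACKEND_REQUIRES = {
  "plain" => "nmatrix",
  "atlas" => "nmatrix/atlas",
}.freeze

# Picks the require path from an environment-like hash.
# NMATRIX_BACKEND is an assumed variable name, not an existing setting.
# Defaults to the plain gem when the variable is not set.
def backend_require_path(env = ENV)
  name = env.fetch("NMATRIX_BACKEND", "plain")
  BACKEND_REQUIRES.fetch(name) do
    raise ArgumentError, "unknown backend #{name.inspect}"
  end
end

puts backend_require_path({ "NMATRIX_BACKEND" => "atlas" })
```

A spec helper could then call `require backend_require_path` once at the top, and each Travis configuration would only need to set the variable. Failing loudly on an unknown name avoids silently testing the wrong gem.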