Converting to the new GPU back-end (gpuarray)
MILA will stop developing Theano.
This page describes how to use the new GPU back-end (gpuarray) instead of the old one.
Installation:
- We strongly recommend that you use conda/anaconda to install Theano and pygpu, especially on Windows.
- Both are available with conda:

```
conda install theano pygpu
```

- RECOMMENDED: you can install the latest beta, release candidate or release like this:

```
conda install -c mila-udem -c mila-udem/label/pre theano pygpu
```
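To confirm that both packages are importable after the install, a minimal check like this can help (nothing is assumed beyond the install above):

```python
# Minimal post-install sanity check: both imports must succeed for the
# new back-end to be usable.
import pygpu
import theano
print(theano.__version__)
```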
Windows cleanup:
- Remove any previous install of gcc/mingw.
- Remove Visual C++ for python (or any MSVC that you installed for Theano).
- Remove previous installs of Theano and Python.
Note that we only support a clean install with conda/anaconda on Windows. You are welcome to try other configurations, but we won't help you make them work.
Code changes:
- If you use `conv3d2d` or `dnn_conv3d`, replace it with the new 3d abstract conv: `theano.tensor.nnet.conv3d()` (see the conv sketch after this list).
- If you use `dnn_conv2d`, replace it with the 2d abstract conv: `theano.tensor.nnet.conv2d()`.
- If you use `dnn_pool`, replace it with the new 3d pooling interface (it wasn't useful for 2d pooling): `theano.tensor.signal.pool.pool_{2d,3d}` (see the pooling sketch after this list).
- If you use `dnn_batch_normalization_train()` or `dnn_batch_normalization_test()`, use `theano.tensor.nnet.bn.batch_normalization_{train,test}` instead (see the batch-norm sketch after this list).
- grep for `import theano.sandbox.cuda` in your files. If you find such imports, they will need to be converted. In many cases, you can stop using a GPU interface and use the CPU interface instead; this will make your code work with both CPU and GPU back-ends.
  - All convolutions are now available on the CPU.
  - All pooling operations are now available on the CPU.
  - If there are others, check the CPU interface first; otherwise, you can probably change `theano.sandbox.cuda` to `theano.gpuarray`.
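For illustration, here is a minimal sketch of the abstract conv interface mentioned above; the variable names and shapes are illustrative, not part of the API. Theano's optimizer selects the back-end implementation (e.g. the cuDNN one) at compilation time.

```python
# Sketch of the 2d/3d abstract conv interface (replaces dnn_conv2d,
# dnn_conv3d and conv3d2d).
import theano.tensor as T
from theano.tensor.nnet import conv2d, conv3d

imgs2d = T.tensor4("imgs2d")    # (batch, channels, rows, cols)
kerns2d = T.tensor4("kerns2d")  # (filters, channels, k_rows, k_cols)
out2d = conv2d(imgs2d, kerns2d, border_mode="valid", subsample=(1, 1))

imgs3d = T.tensor5("imgs3d")    # (batch, channels, depth, rows, cols)
kerns3d = T.tensor5("kerns3d")  # (filters, channels, k_depth, k_rows, k_cols)
out3d = conv3d(imgs3d, kerns3d, border_mode="valid", subsample=(1, 1, 1))
```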
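Likewise, a sketch of the pooling interface; the window sizes here are illustrative:

```python
# Sketch of the pooling interface that replaces dnn_pool.
import theano.tensor as T
from theano.tensor.signal import pool

x2d = T.tensor4("x2d")  # (batch, channels, rows, cols)
p2d = pool.pool_2d(x2d, (2, 2), ignore_border=True, mode="max")

x3d = T.tensor5("x3d")  # (batch, channels, depth, rows, cols)
p3d = pool.pool_3d(x3d, (2, 2, 2), ignore_border=True, mode="max")
```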
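And a sketch of the batch-normalization replacement; the per-activation axes and the row-shaped parameters are illustrative choices:

```python
# Sketch of the interface replacing dnn_batch_normalization_{train,test}.
import theano.tensor as T
from theano.tensor.nnet import bn

x = T.matrix("x")                            # (batch, features)
gamma, beta = T.row("gamma"), T.row("beta")  # broadcastable over the batch

# Training mode: normalizes with mini-batch statistics.
out, batch_mean, batch_invstd = bn.batch_normalization_train(
    x, gamma, beta, axes="per-activation")

# Inference mode: normalizes with given (e.g. running) statistics.
mean, var = T.row("mean"), T.row("var")
out_test = bn.batch_normalization_test(
    x, gamma, beta, mean, var, axes="per-activation")
```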
Config changes:
- The following Theano config keys/sections don't have any effect on the new back-end and should be removed:
  - `nvcc.*`
  - `cuda.root`
  - `lib.cnmem` (replaced by `gpuarray.preallocate`). Important: the default changed to be faster, but it causes more memory fragmentation. To keep the speed and remove the fragmentation, use the flag `gpuarray.preallocate=1` (or any value greater than 0; see the doc). To get the old Theano default, use the flag `gpuarray.preallocate=-1` (flag examples follow this list).
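For example, you can pass the flag on the command line (the script name below is illustrative):

```
THEANO_FLAGS=device=cuda,gpuarray.preallocate=1 python train.py
```

or set it permanently in your `.theanorc`:

```
[gpuarray]
preallocate = 1
```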
Safety checks:
- Check that it still trains and has the same speed (we don't expect problems, but it is better to be safe!)
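A minimal timing sketch for the speed check, using a stand-in compiled function `f` (replace it with your own model's function):

```python
# Time a compiled Theano function to compare old vs. new back-end speed.
import time
import numpy as np
import theano
import theano.tensor as T

x = T.matrix("x")
f = theano.function([x], T.nnet.sigmoid(x).sum())  # stand-in for your model

data = np.random.rand(1000, 1000).astype(theano.config.floatX)
f(data)  # warm-up call (compilation, first GPU transfer)
t0 = time.time()
for _ in range(100):
    f(data)
print("mean call time: %f s" % ((time.time() - t0) / 100))
```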
What to expect?
- Maybe a small runtime speed-up (0-10%).
- Maybe a runtime slowdown if you use one of the ops not yet ported (98+% have been ported).
- A compilation speed-up.
- Support for multiple dtypes, including float16 for many ops.
- A cuDNN RNN wrapper (you need to use it manually).
- float16 for storage (computation is done in float32 for now, so it works even on non-Pascal GPUs); see https://github.com/Theano/Theano/issues/2908 for the exact status (a short float16 sketch follows).
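A hedged float16 storage sketch, assuming the new back-end is active (e.g. `THEANO_FLAGS=device=cuda`); the shapes and names are illustrative:

```python
# float16 storage; per the status above, computation happens in float32
# for now, so this works even on non-Pascal GPUs.
import numpy as np
import theano
import theano.tensor as T

w = theano.shared(np.zeros((128, 128), dtype="float16"), name="w")
x = T.matrix("x", dtype="float16")
f = theano.function([x], T.dot(x, w))
print(f(np.ones((4, 128), dtype="float16")).dtype)  # float16
```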