Faster Caffe Training - BVLC/caffe GitHub Wiki
Faster Caffe Training
Stanford University CS231n Lecture 11 gives useful tips for speeding up your training process:
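- Use LMDB. Reading from a single database reduces disk seek times, and images can be stored already decoded, so no image decompression (e.g. JPEG) is needed during training.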
- Use cuBLAS. This is NVIDIA's CUDA implementation of BLAS and is generally faster than a CPU-optimized BLAS.
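- Use cuDNN, NVIDIA's library of GPU-accelerated primitives for deep networks (build Caffe with `USE_CUDNN := 1` in `Makefile.config`).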
- Use 32-bit floating-point precision (when writing new layers), as single-precision arithmetic computes faster than double precision.
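
One reason the last tip helps is that a 32-bit float takes half the memory of a 64-bit double, so half as many bytes move between host, GPU memory, and compute units (many GPUs also have higher single-precision arithmetic throughput). The following is a minimal sketch in plain Python, not Caffe code: the helper `blob_bytes` and the batch shape are hypothetical, chosen only to illustrate the size difference.

```python
from array import array
from math import prod


def blob_bytes(shape, typecode):
    """Return the bytes needed for a dense blob of the given shape.

    typecode 'f' is a C float (32-bit), 'd' is a C double (64-bit).
    """
    return prod(shape) * array(typecode).itemsize


# Hypothetical input batch laid out as N x C x H x W.
shape = (64, 3, 227, 227)

single = blob_bytes(shape, "f")
double = blob_bytes(shape, "d")

print(f"float32: {single / 2**20:.1f} MiB")
print(f"float64: {double / 2**20:.1f} MiB")
```

The same factor of two applies to every intermediate blob and parameter array in the network, which is why the choice of data type matters most in layers that process large activations.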