Faster Caffe Training - BVLC/caffe GitHub Wiki

Faster Caffe Training

Stanford University CS231n Lecture 11 gives useful tips for speeding up your training process:

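Use LMDB. Reading from a single database avoids the per-file seek times of loading individual images, and storing raw pixels avoids image decompression (e.g. JPEG) during training.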
Use cuBLAS. This is NVIDIA's CUDA implementation of BLAS and is generally faster than CPU-optimized BLAS for the large matrix operations used in training.
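Use cuDNN, NVIDIA's GPU-accelerated library of deep neural network primitives such as convolution and pooling.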
Use 32-bit floating-point precision (when writing new layers), as single-precision operations compute faster than double-precision ones.