Backlog - ParallelComputing2017/CNN GitHub Wiki

CNN

  • Create abstract factory for layer creation
  • Refactor Tensor to class
  • MNIST class
  • Replace vector with Neural Network class
  • Implement Downpour SGD
  • Convert to a Makefile project
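
One possible shape for the abstract factory item above is sketched below. It assumes that each backend (sequential, OpenMP, and so on) supplies its own layer implementations behind a shared interface. Every name here (`Layer`, `LayerFactory`, `createConv`, `createPool`, and the concrete classes) is hypothetical and not taken from the repository.

```cpp
#include <memory>
#include <string>

// Hypothetical layer interface; the project's real layer type may differ.
class Layer {
public:
    virtual ~Layer() = default;
    virtual std::string describe() const = 0;
};

// Illustrative concrete layers for two backends (placeholders only).
class SequentialConvLayer : public Layer {
public:
    std::string describe() const override { return "conv/sequential"; }
};

class SequentialPoolLayer : public Layer {
public:
    std::string describe() const override { return "pool/sequential"; }
};

class OpenMPConvLayer : public Layer {
public:
    std::string describe() const override { return "conv/openmp"; }
};

class OpenMPPoolLayer : public Layer {
public:
    std::string describe() const override { return "pool/openmp"; }
};

// Abstract factory: one creation method per layer kind.
class LayerFactory {
public:
    virtual ~LayerFactory() = default;
    virtual std::unique_ptr<Layer> createConv() const = 0;
    virtual std::unique_ptr<Layer> createPool() const = 0;
};

// One concrete factory per backend.
class SequentialLayerFactory : public LayerFactory {
public:
    std::unique_ptr<Layer> createConv() const override {
        return std::make_unique<SequentialConvLayer>();
    }
    std::unique_ptr<Layer> createPool() const override {
        return std::make_unique<SequentialPoolLayer>();
    }
};

class OpenMPLayerFactory : public LayerFactory {
public:
    std::unique_ptr<Layer> createConv() const override {
        return std::make_unique<OpenMPConvLayer>();
    }
    std::unique_ptr<Layer> createPool() const override {
        return std::make_unique<OpenMPPoolLayer>();
    }
};
```

Network-building code would then accept a `const LayerFactory&` and call `createConv()` or `createPool()` without knowing which backend is in use, so adding a backend would mean adding one factory rather than touching the builder.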

Pthreads

  • Parallel testing

OpenMP

  • Parallel testing

CUDA

  • Create array wrapper (size 1 to n)
  • Use unified memory (see CUDA introduction)
  • Create device pointers in conv layer constructor
  • Free device memory in conv layer destructor
  • Update parameters in activation function of conv layer
  • Full conv layer on device
  • Allocate device memory in cudaTensor constructor

OpenCL

  • Prepare program arguments
  • Implement Conv layer activation
  • Read kernel from file

MPI

  • Run the sequential version on each host
  • Run with mini-batches
  • Receive the trained model from each host

Experiment

  • Test error vs. weight updates (iterations)
  • Test error vs. epochs
  • CUDA vs OpenCL
  • Max speedup by implementation
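
For the speedup comparison above, a minimal sketch of the usual definitions follows: speedup is sequential time divided by parallel time, and efficiency is speedup divided by the number of workers. The function names and the use of wall-clock seconds are assumptions for illustration; the backlog does not say how timings will be collected.

```cpp
#include <stdexcept>

// Speedup S = T_sequential / T_parallel.
// Hypothetical helper; both times are assumed to be wall-clock seconds.
double speedup(double sequentialSeconds, double parallelSeconds) {
    if (sequentialSeconds <= 0.0 || parallelSeconds <= 0.0) {
        throw std::invalid_argument("timings must be positive");
    }
    return sequentialSeconds / parallelSeconds;
}

// Efficiency E = S / p, where p is the number of threads, devices, or hosts.
double efficiency(double speedupValue, int workers) {
    if (workers <= 0) {
        throw std::invalid_argument("workers must be positive");
    }
    return speedupValue / static_cast<double>(workers);
}
```

For example, a run that takes 12 s sequentially and 3 s in parallel has a speedup of 4; on 8 workers that corresponds to an efficiency of 0.5.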

Paper

  • Add link to source code repository
  • Review more references (8/10)