Data Formats - AshokBhat/ml GitHub Wiki
| Notation | Mapping |
|---|---|
| N | Batch |
| C | Channels / Feature Map |
| D | Depth |
| H | Height |
| W | Width |
| Format | Usage |
|---|---|
| NCHW | Optimal format for cuDNN on NVIDIA GPUs; default in PyTorch |
| NHWC | Default format in TensorFlow; sometimes faster on CPUs |
| nChw(x)c | Blocked format used by oneDNN, e.g. nChw16c for AVX-512 and nChw8c for SSE4.1 |
| CHWN | Used by cuda-convnet and Nervana Neon |
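Converting between NCHW and NHWC is a pure axis permutation, no values change. A minimal sketch using NumPy (NumPy is an assumption here; frameworks perform the equivalent transpose internally):

```python
import numpy as np

# NCHW <-> NHWC is just an axis permutation.
x_nchw = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)  # N=2, C=3, H=4, W=5
x_nhwc = x_nchw.transpose(0, 2, 3, 1)                  # NCHW -> NHWC
x_back = x_nhwc.transpose(0, 3, 1, 2)                  # NHWC -> NCHW
assert x_nhwc.shape == (2, 4, 5, 3)
assert np.array_equal(x_back, x_nchw)
```

Note that a transpose only changes the logical axis order; producing a tensor that is physically contiguous in the new layout requires a copy (e.g. `np.ascontiguousarray`).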
Blocking format used by oneDNN
- To achieve better vectorization and cache reuse, oneDNN introduces blocked layouts that split one or several dimensions into blocks of a fixed size.
- The rationale is described in the paper "Distributed Deep Learning Using Synchronous Stochastic Gradient Descent".
| Format | Usage |
|---|---|
| nChw(x)c | Generic blocked format used by oneDNN |
| nChw16c | Blocked format used by oneDNN for AVX-512 |
| nChw8c | Blocked format used by oneDNN for SSE4.1 |
- nChw16c and nChw8c
  - nChw16c on AVX-512+ systems: channels are blocked with a block size of 16
  - nChw8c on SSE4.1+ systems: channels are blocked with a block size of 8
- Memory order (taking nChw8c as the example):
  - A block of 8 channels is kept contiguous in memory.
  - The spatial domain (H × W) is then covered pixel by pixel.
  - The next slice covers the subsequent 8 channels (i.e. moving from c=0..7 to c=8..15).
  - Once all channel blocks are covered, the next image in the batch follows.
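The memory order above can be reproduced with a reshape plus transpose, and the linear offset of any element can be computed directly. A NumPy sketch for nChw8c; the dimensions and the helper name `nchw8c_offset` are illustrative, and C is assumed to be a multiple of the block size:

```python
import numpy as np

B = 8                      # channel block size (8 for SSE4.1, 16 for AVX-512)
N, C, H, W = 2, 16, 4, 4   # C is assumed to be a multiple of B

def nchw8c_offset(n, c, h, w):
    """Linear offset of element (n, c, h, w) in the nChw8c layout."""
    cb, ci = divmod(c, B)  # channel-block index, position within the block
    return (((n * (C // B) + cb) * H + h) * W + w) * B + ci

# Reorder an NCHW tensor into nChw8c: split C into (C//B, B) blocks,
# then move the block-of-8 axis innermost so it is contiguous in memory.
x = np.arange(N * C * H * W).reshape(N, C, H, W)
blocked = x.reshape(N, C // B, B, H, W).transpose(0, 1, 3, 4, 2)
flat = blocked.reshape(-1)  # memory order: n, C/B, h, w, 8c

# The offset formula agrees with the reordered array.
assert flat[nchw8c_offset(1, 10, 2, 3)] == x[1, 10, 2, 3]
```

Because the innermost axis is the block of 8 channels, a vectorized kernel can load one SIMD register per pixel and sweep the spatial domain with unit stride.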
See also: oneDNN | TensorFlow | cuDNN
