2016 08 01 - HenglinShi/LSTM_LIP_READING GitHub Wiki
2016-08-01
- One problem about the data size.
Currently the data has been split into two parts: 80% training data (about 42 persons by 30 sequences per person) and 20% testing data (about 10 persons by 30 sequences per person).
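A minimal sketch of how such a person-level 80/20 split could be done, assuming the data is held in a dict mapping person id to that person's list of sequences (the name `sequences_by_person` and the helper itself are illustrative, not from the project code):

```python
import numpy as np

def split_by_person(sequences_by_person, train_fraction=0.8, seed=0):
    """Split sequences into train/test so no person appears in both sets."""
    rng = np.random.RandomState(seed)
    persons = sorted(sequences_by_person)
    rng.shuffle(persons)  # shuffle person ids, not individual sequences
    n_train = int(len(persons) * train_fraction)
    train_persons = persons[:n_train]
    test_persons = persons[n_train:]
    train = [s for p in train_persons for s in sequences_by_person[p]]
    test = [s for p in test_persons for s in sequences_by_person[p]]
    return train, test
```

Splitting by person (rather than by sequence) keeps all sequences of one speaker on the same side of the split, which avoids leaking speaker identity between train and test.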
However, a problem arises:
- Assuming we set batch_size to 80, then 4 sequences are needed to feed the network for one iteration.
- But we only have 42 by 30 sequences, which is about 1200.
- So we can have at most about 300 iterations.
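The iteration budget above can be checked with a quick calculation. One assumption (mine, not stated in the note) makes the numbers line up: batch_size counts frames, and each sequence has about 20 frames, so an 80-frame batch holds 4 sequences:

```python
batch_size = 80            # frames fed per iteration (assumed unit)
frames_per_sequence = 20   # assumed: 80 frames / 4 sequences per batch
n_sequences = 42 * 30      # training sequences, about 1260

sequences_per_batch = batch_size // frames_per_sequence  # 4 sequences
max_iterations = n_sequences // sequences_per_batch      # about 315
print(sequences_per_batch, max_iterations)
```

Under this reading, one pass over the training set yields roughly 300 iterations, matching the estimate above.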
- The loss does not converge
The problem could be that the training samples were not shuffled enough, which means that first we fed the network a bunch of samples with label 1, then label 2, and so on.
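A minimal sketch of shuffling samples and labels together with a single permutation, so the label-ordered batches described above are broken up (the names `X` and `y` are illustrative; this assumes parallel NumPy arrays):

```python
import numpy as np

def shuffle_together(X, y, seed=None):
    """Apply one random permutation to both samples and labels."""
    rng = np.random.RandomState(seed)
    perm = rng.permutation(len(X))  # same index order for X and y
    return X[perm], y[perm]
```

Using one shared permutation keeps each sample aligned with its label; shuffling the two arrays independently would destroy the pairing.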