2. Convolutional Neural Network

I quickly realized that I needed to look at others' code to learn how to get a working solution. I saw this post, which stepped me through how to get a functioning implementation in Keras. The first step was to convert the images into RGB format by adding a third band; I used the average of the first two bands for this, as sketched below.
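A minimal version of that band stacking, assuming the flattened 75x75 bands from the competition data (the helper name here is mine, not necessarily what the repo uses):

import numpy as np

# Each sample has two flattened 75x75 radar bands (band_1, band_2).
# Stack them, plus their average as a synthetic third band, into an RGB-style image.
def to_rgb(band_1, band_2, img_size=75):
    b1 = np.array(band_1).reshape(img_size, img_size)
    b2 = np.array(band_2).reshape(img_size, img_size)
    b3 = (b1 + b2) / 2.0                # third band = average of the first two
    return np.dstack((b1, b2, b3))      # shape (img_size, img_size, 3)

Next, the following model architecture was implemented: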

# Imports from the Keras API of the time (standalone Keras with the TensorFlow backend)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Activation
from keras.optimizers import Adam

# Building the model
gmodel = Sequential()

# Conv Layer 1
# g.imgsize is the image side length (defined elsewhere in the repo)
gmodel.add(Conv2D(64, kernel_size=(3, 3), activation='relu',
                  input_shape=(g.imgsize, g.imgsize, 3)))
gmodel.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
gmodel.add(Dropout(0.2))

# Conv Layer 2
gmodel.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
gmodel.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
gmodel.add(Dropout(0.2))

# Conv Layer 3
gmodel.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
gmodel.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
gmodel.add(Dropout(0.2))

# Conv Layer 4
gmodel.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
gmodel.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
gmodel.add(Dropout(0.2))

# Flatten the data for the upcoming dense layers
gmodel.add(Flatten())

# Dense Layer 1
gmodel.add(Dense(512))
gmodel.add(Activation('relu'))
gmodel.add(Dropout(0.2))

# Dense Layer 2
gmodel.add(Dense(256))
gmodel.add(Activation('relu'))
gmodel.add(Dropout(0.2))

# Sigmoid output layer for binary classification
gmodel.add(Dense(1))
gmodel.add(Activation('sigmoid'))

mypotim = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
gmodel.compile(loss='binary_crossentropy',
               optimizer=mypotim,
               metrics=['accuracy'])

As we can see, there are four convolution and pooling layers, each followed by 20% dropout, two dense layers also with 20% dropout, and a final sigmoid layer for binary classification. I also implemented callbacks for early stopping and for saving the best weights based on validation accuracy, using the following:

from keras.callbacks import EarlyStopping, ModelCheckpoint

def get_callbacks(filepath, patience=8):
    # Stop training once validation accuracy stops improving, and keep only the best weights
    es = EarlyStopping('val_acc', patience=patience, mode="max")
    msave = ModelCheckpoint(filepath, monitor='val_acc', save_best_only=True, save_weights_only=True)
    return [es, msave]
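For completeness, here is roughly how these callbacks plug into training; the file path, batch size, and epoch count below are illustrative placeholders rather than the exact values from my run:

# Hypothetical usage: exact hyperparameters and file names differ in the actual repo
file_path = 'model_weights.hdf5'
callbacks = get_callbacks(filepath=file_path, patience=8)

gmodel.fit(Xtrain, ytrain,
           batch_size=24,
           epochs=50,
           verbose=1,
           validation_data=(Xtest, ytest),
           callbacks=callbacks)

# Reload the best weights (saved by ModelCheckpoint) before evaluating
gmodel.load_weights(file_path)
score = gmodel.evaluate(Xtest, ytest, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1])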

And for a more graphic representation:

[Diagram of the model architecture]

With all this, I was able to get 88.25% accuracy, which is pretty close to what the author of the post got (89.03%). I should probably mention that I divided my data as 1000 examples for training and 604 for testing, whereas he and most others used a 75% / 25% split. Switching to his split would probably gain me a percent or two. The reason I haven't switched yet is that a larger testing set gives me more reliable accuracy estimates, which will help me identify the best model and data augmentation methods before I start maximizing the score.
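For reference, the two splits look roughly like this in code (a sketch; variable names are illustrative, and the exact way the 1000 training samples are selected is in the repo):

from sklearn.model_selection import train_test_split

# Current split: 1000 samples for training, the remaining 604 for testing
Xtrain, ytrain = X[:1000], y[:1000]
Xtest, ytest = X[1000:], y[1000:]

# The 75% / 25% split used in the post I followed (and by most others)
# Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.25)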