Lab 4 - Gnkhakimova/CS5542-BigData_LabAssignments GitHub Wiki

1. Spark Programming:

  • The data set contains images of cats, pandas, stairs, lighthouses, dogs, strawberries, helicopters, zebras, and other categories. It is divided into two parts: training data, which includes 70% of the images, and test data, which includes the remaining 30%.
  • In image classification, an image is classified according to its visual content, for example, whether or not it contains a lighthouse. An important application is image retrieval: searching through an image data set to obtain the images with particular visual content.
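
The 70/30 split described above can be sketched in plain Python. This is only an illustration, not the lab's actual code: the function name `split_train_test`, the file names, the fixed seed, and the choice to split each class separately (so every category appears in both parts) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_train_test(labeled_paths, train_fraction=0.7, seed=42):
    """Split (path, label) pairs into train and test lists, class by class."""
    # Group image paths by their class label.
    by_label = defaultdict(list)
    for path, label in labeled_paths:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        # The first 70% of each shuffled class goes to train, the rest to test.
        cut = round(len(paths) * train_fraction)
        train += [(p, label) for p in paths[:cut]]
        test += [(p, label) for p in paths[cut:]]
    return train, test


# Hypothetical file names: 10 images for each of three categories.
images = [
    (f"{label}/{i:03d}.jpg", label)
    for label in ("cat", "panda", "lighthouse")
    for i in range(10)
]
train, test = split_train_test(images)
print(len(train), len(test))  # 21 9
```

Splitting per class rather than over the whole list keeps the class proportions the same in both parts, which matters when some categories have far fewer images than others.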