404. SageMaker Image Classification 101 - qyjohn/AWS_Tutorials GitHub Wiki

This walkthrough follows the tutorial linked below, but runs it in Python on an EC2 instance with Ubuntu 18.04. Note that I have Python 2.7.12 installed on the operating system.

Image Classification 1st Format

Make sure that you have the AWS SDK for Python (boto3), the SageMaker Python SDK, MXNet, and OpenCV installed:

sudo pip install boto3
sudo pip install sagemaker
sudo pip install mxnet
sudo pip install opencv-python
sudo apt install libsm-dev
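
Before moving on, it can help to confirm that the packages are importable from the Python interpreter you plan to use. The following is a minimal sketch, not part of the original tutorial: the helper name missing_modules is hypothetical, and it assumes that OpenCV is imported under the module name cv2 (which is what the opencv-python package provides).

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is absent, or one of its native dependencies
            # (for example a shared library) could not be loaded.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names for the packages installed above (cv2 is OpenCV).
    required = ["boto3", "sagemaker", "mxnet", "cv2"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules: %s" % ", ".join(missing))
    else:
        print("All required modules can be imported.")
```

If cv2 shows up as missing even though opencv-python is installed, a missing system library is a likely cause, which is presumably why libsm-dev is installed above.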

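Then run the following script to download the Caltech 256 dataset and split it into a training set (60 randomly chosen images per class) and a validation set (the remaining images), with a list file generated for each: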

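#!/bin/bash
# Download and unpack the Caltech 256 dataset.
wget http://www.vision.caltech.edu/Image_Datasets/Caltech256/256_ObjectCategories.tar
tar -xf 256_ObjectCategories.tar
# Fetch the MXNet tool that builds image list (.lst) files.
wget https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/im2rec.py
# Move 60 randomly chosen images per class into the training directory.
mkdir -p caltech_256_train_60
for i in 256_ObjectCategories/*; do
    c=$(basename "$i")
    mkdir -p "caltech_256_train_60/$c"
    for j in $(ls "$i"/*.jpg | shuf | head -n 60); do
        mv "$j" "caltech_256_train_60/$c/"
    done
done

# Build the list files: the images moved above form the training set,
# and the images left in 256_ObjectCategories form the validation set.
python im2rec.py --list --recursive caltech-256-60-train caltech_256_train_60/
python im2rec.py --list --recursive caltech-256-60-val 256_ObjectCategories/
# Save the first three training entries as a sample of the list format.
head -n 3 ./caltech-256-60-train.lst > example.lst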
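
Each line of a list file written by im2rec.py is tab-separated: an integer index, the class label written as a float, and the image path relative to the root directory. As a rough check that the split worked, the sketch below counts the entries per label. The function names parse_lst_line and count_images_per_label are hypothetical helpers introduced for illustration, and the code assumes a single label column, which is what the commands above produce.

```python
import os
from collections import Counter


def parse_lst_line(line):
    """Split one .lst line into (index, label, relative_path)."""
    fields = line.rstrip("\n").split("\t")
    index = int(fields[0])
    label = float(fields[1])   # im2rec.py writes labels as floats
    path = fields[-1]          # the image path is the last column
    return index, label, path


def count_images_per_label(lines):
    """Count how many list entries belong to each integer label."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, label, _ = parse_lst_line(line)
        counts[int(label)] += 1
    return counts


if __name__ == "__main__":
    lst_path = "caltech-256-60-train.lst"
    if os.path.exists(lst_path):
        with open(lst_path) as f:
            counts = count_images_per_label(f)
        print("Classes: %d" % len(counts))
        print("Fewest images in a class: %d" % min(counts.values()))
        print("Most images in a class: %d" % max(counts.values()))
```

For the training list, every class should report at most 60 images, since the script moves up to 60 files per class directory.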