404. SageMaker Image Classification 101 - qyjohn/AWS_Tutorials GitHub Wiki

This page follows the following tutorial, but the steps are done in Python on an EC2 instance running Ubuntu 18.04. Note that I have Python 2.7.12 on the operating system.

Make sure that the AWS SDK for Python (boto3), the SageMaker Python SDK, MXNet, and OpenCV are installed:

sudo pip install boto3
sudo pip install sagemaker
sudo pip install mxnet
sudo pip install opencv-python
sudo apt install libsm-dev
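
As a quick sanity check after the installation, the sketch below tries to import each module and reports the ones that fail. The helper name `missing_modules` and the list `REQUIRED_MODULES` are made up for this illustration and are not part of any SDK. Note that the opencv-python package is imported under the name `cv2`. The check only catches `ImportError`, which is an assumption about how a broken install shows up.

```python
import importlib

# Import names for the packages installed above. These are assumptions for
# this sketch; the only non-obvious one is opencv-python, imported as cv2.
REQUIRED_MODULES = ["boto3", "sagemaker", "mxnet", "cv2"]


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both a missing package and a missing shared library
            # that is reported as an ImportError at import time.
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` in a Python shell should return an empty list when all four modules are importable; any name in the result points at the package to reinstall.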

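Then run the following script to download the Caltech 256 dataset and split it into a training set (60 randomly chosen images per category) and a validation set (the remaining images), then generate the .lst files for both with im2rec.py: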

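#!/bin/bash
# Download and unpack the Caltech 256 dataset
wget http://www.vision.caltech.edu/Image_Datasets/Caltech256/256_ObjectCategories.tar
tar -xf 256_ObjectCategories.tar
# Fetch the MXNet tool used to generate the .lst files
wget https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/im2rec.py
mkdir -p caltech_256_train_60
for i in 256_ObjectCategories/*; do
    c=$(basename "$i")
    mkdir -p "caltech_256_train_60/$c"
    # Move 60 randomly chosen images of this category into the training set;
    # the images left behind in 256_ObjectCategories/ become the validation set
    for j in $(ls "$i"/*.jpg | shuf | head -n 60); do
        mv "$j" "caltech_256_train_60/$c/"
    done
done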

python im2rec.py --list --recursive caltech-256-60-train caltech_256_train_60/
python im2rec.py --list --recursive caltech-256-60-val 256_ObjectCategories/
head -n 3 ./caltech-256-60-train.lst > example.lst
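
To check the generated list files, it can help to read them back in Python. The sketch below assumes the layout that im2rec.py writes in list mode: one image per line, tab-separated, with an integer index first, a floating-point label next, and the relative image path last. The function names and the sample lines are hypothetical and only illustrate that layout; they are not real output from the script above.

```python
from collections import Counter


def parse_lst_line(line):
    """Split one .lst line into (index, label, relative_path)."""
    fields = line.rstrip("\n").split("\t")
    index = int(fields[0])
    label = float(fields[1])
    path = fields[-1]
    return index, label, path


def images_per_label(lines):
    """Count how many list entries carry each (integer) label."""
    counts = Counter()
    for line in lines:
        if line.strip():  # skip blank lines
            _, label, _ = parse_lst_line(line)
            counts[int(label)] += 1
    return counts


# Hypothetical sample lines, made up to illustrate the layout.
sample = [
    "0\t0.000000\t001.ak47/001_0001.jpg\n",
    "1\t0.000000\t001.ak47/001_0002.jpg\n",
    "2\t1.000000\t002.american-flag/002_0001.jpg\n",
]
```

For a real file, passing `open("caltech-256-60-train.lst")` to `images_per_label` should report 60 entries for each label if the split worked as intended.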