404. SageMaker Image Classification 101 - qyjohn/AWS_Tutorials GitHub Wiki
This walkthrough is based on the following tutorial, but uses Python on an EC2 instance running Ubuntu 18.04. Note that the Python version on the operating system is 2.7.12.
Make sure that the AWS SDK for Python (boto3), the SageMaker Python SDK, MXNet, and OpenCV are installed, along with the libsm-dev system package:
sudo pip install boto3
sudo pip install sagemaker
sudo pip install mxnet
sudo pip install opencv-python
sudo apt install libsm-dev
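Before moving on, it can help to confirm that the packages above actually import. The following is a minimal sketch, not part of the original tutorial: the helper name `missing_modules` is hypothetical, and the only assumption about the packages is that opencv-python is imported under the module name `cv2`.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both a package that is not installed and a package
            # whose native shared library cannot be loaded.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumption: opencv-python is exposed as the module `cv2`.
    required = ["boto3", "sagemaker", "mxnet", "cv2"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules can be imported.")
```

If `cv2` shows up as missing even though opencv-python is installed, a missing system library is a likely cause, which is presumably why libsm-dev is installed above.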
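Then run the following script to download the Caltech 256 dataset, move 60 randomly chosen images per category into a training folder, and generate the training and validation list files: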
#!/bin/bash
wget http://www.vision.caltech.edu/Image_Datasets/Caltech256/256_ObjectCategories.tar
tar -xf 256_ObjectCategories.tar
wget https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/im2rec.py
mkdir -p caltech_256_train_60
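# Move 60 randomly chosen images per category into the training folder;
# the images left behind in 256_ObjectCategories/ are used for validation.
for i in 256_ObjectCategories/*; do
    c=$(basename "$i")
    mkdir -p "caltech_256_train_60/$c"
    for j in $(ls "$i"/*.jpg | shuf | head -n 60); do
        mv "$j" "caltech_256_train_60/$c/"
    done
done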
python im2rec.py --list --recursive caltech-256-60-train caltech_256_train_60/
python im2rec.py --list --recursive caltech-256-60-val 256_ObjectCategories/
head -n 3 ./caltech-256-60-train.lst > example.lst
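The list files written by im2rec.py can be checked from Python before they are used for training. The sketch below assumes the usual im2rec.py layout of one image per line with tab-separated fields: an integer index, a floating-point label, and a relative image path. The helper names `parse_lst_line` and `images_per_label` are hypothetical, and the sample lines are made up for illustration rather than copied from a real run.

```python
from collections import Counter


def parse_lst_line(line):
    """Split one .lst line into (index, label, relative_path)."""
    fields = line.rstrip("\n").split("\t")
    index = int(fields[0])
    label = float(fields[1])
    path = fields[-1]  # the path is always the last field
    return index, label, path


def images_per_label(lines):
    """Count how many images are listed for each integer label."""
    counts = Counter()
    for line in lines:
        if line.strip():  # skip blank lines
            _, label, _ = parse_lst_line(line)
            counts[int(label)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative lines only; real files come from the im2rec.py calls above.
    sample = [
        "12\t0.000000\t001.ak47/001_0001.jpg\n",
        "40\t0.000000\t001.ak47/001_0002.jpg\n",
        "7\t1.000000\t002.american-flag/002_0001.jpg\n",
    ]
    for label, count in sorted(images_per_label(sample).items()):
        print("label %d: %d images" % (label, count))
```

To check the real training list, pass the lines of caltech-256-60-train.lst to `images_per_label`, for example via `open("caltech-256-60-train.lst").readlines()`. Each category that had at least 60 .jpg files before the split should then report 60 images.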