Play with OpenFace 2.0
What is OpenFace?
OpenFace is an open-source face recognition library built on Torch, dlib and OpenCV. It uses a deep neural network (inspired by Google's FaceNet) to map a face image to a 128-dimensional representation, so that distances between representations reflect how similar two faces are.
What are the differences in OpenFace 2.0?
- For more details, see the blog post here
- Accuracy improves from 76.1% to 92.9%
Why customization?
- The OpenFace API handles face alignment, cropping, representation and classification/comparison in one go.
- However, these steps can be isolated to make the pipeline more efficient.
- The openface module in simdat lets you calculate the representations of face images and save them in a JSON db.
- Several JSON dbs can be combined into one representation database, which makes classification/comparison easier.
Major Dependencies
- CUDA (optional)
- docker (optional)
- dlib
- OpenCV
- scikit-learn, scikit-image
- torch
Dockerfile
To make life easier, it is recommended to run OpenFace in docker. The customized Dockerfile can be found here; the only difference is the CUDA/GPU support. Follow the instructions of nvidia-docker to launch your container with CUDA support. The following steps assume that you use a docker image built from the cudnn-openface Dockerfile.
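For example, a container can be launched like this (a minimal sketch: the image tag cudnn-openface and the mounted source directory are assumptions, adjust them to your build):

nvidia-docker run -it -v ~/SOURCES:/SOURCES cudnn-openface /bin/bash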
Setup
Some steps are required before using simdat and openface.
- Edit ~/.bashrc and set PYTHONPATH properly by adding the following line
export PYTHONPATH=${UPPER_DIR_OF_simdat}:${SOURCE_OF_openface}:$PYTHONPATH
For example, assuming simdat and openface are both in ~/SOURCES,
UPPER_DIR_OF_simdat=~/SOURCES
SOURCE_OF_openface=~/SOURCES/openface
- Copy matplotlibrc:
cp $SOURCE_OF_simdat/core/matplotlibrc ~/.config/matplotlib/
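To verify the paths, check that both packages import cleanly (a quick sanity check, assuming the layout above):

python -c "import simdat, openface; print('OK')"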
That's all!
Calculate representations
It is recommended that you calculate representations of ALL face images first. You can think of this step as creating a local database for your images.
- Go to the directory which includes the face images
- Call simdat.openface.oftools.OpenFace.get_reps. You can find examples in simdat/examples/openface_demo.py
- Results will be saved in result.json, with each image converted into a dictionary entry. The key is the md5sum of the image, and the class of the image is decided automatically from the parent directory in the image path.
{"88e751fedd20712c4dd452dcb9e70f2d": {"path": "/tammy/www/database/face_30/training/original/18/20160201_gnfwvkffbphp72l8oqp54kuigzpjj2.jpg", "rep": [0.072481222450733, 0.035279173403978, -0.0061078974977136, 0.10955177247524, 0.055204018950462, 0.05951888486743, 0.066506460309029, 0.040445055812597, 0.049749683588743, -0.0072148903273046, 0.1999783217907, 0.030023569241166, 0.08411843329668, -0.092975914478302, -0.015142595395446, -0.013621752150357, 0.02025786228478, 0.18571685254574, -0.21004873514175, 0.080910600721836, 0.13924884796143, -0.09523206949234, 0.045325726270676, 0.16569949686527, 0.0064035002142191, -0.24062170088291, -0.096905738115311, -0.11416623741388, -0.046716339886189, 0.037295438349247, -0.00092278135707602, 0.014558205381036, -0.0031084765214473, 0.12721459567547, -0.090774521231651, 0.022244231775403, -0.082555264234543, -0.076722465455532, -0.14420263469219, 0.061461009085178, 0.068978041410446, -0.035017032176256, -0.086971439421177, 0.059624295681715, -0.10437279194593, -0.082797534763813, 0.0059043993242085, -0.0030142012983561, -0.051937859505415, 0.056615620851517, 0.064181081950665, -0.054000616073608, -0.18801526725292, -0.031473025679588, -0.07335414737463, 0.052137400954962, -0.049266632646322, -0.033435340970755, 0.045479472726583, 0.11156196892262, -0.04114655405283, -0.011207944713533, -0.035342574119568, -0.071747973561287, 0.0045371535234153, -0.14897377789021, -0.01110427454114, -0.010491293855011, -0.056608375161886, 0.11968926340342, 0.037683088332415, 0.041804388165474, 0.020289255306125, -0.060640808194876, -0.045285277068615, -0.0062034870497882, -0.0094162663444877, 0.13307191431522, -0.11814764887094, -0.011314208619297, -0.12936660647392, -0.016838409006596, -0.0647267177701, -0.029260950163007, -0.11791966110468, 0.27000239491463, 0.065477751195431, -0.15051570534706, -0.13305127620697, 0.0030921890866011, 0.014698518440127, -0.14670325815678, -0.081282749772072, -0.11348696798086, -0.054417010396719, -0.053645230829716, -0.17789220809937, -0.07947962731123, -0.019784811884165, 0.034701380878687, 0.023038975894451, -0.011150492355227, 0.058300741016865, 0.07286561280489, 0.11993412673473, 0.0029367518145591, -0.032502721995115, 0.14177139103413, -0.027466395869851, -0.03957999125123, -0.12508596479893, 0.014171047136188, -0.00013356539420784, -0.032780714333057, -0.015531815588474, 0.012248504906893, -0.012924991548061, -0.051655564457178, 0.041887577623129, 0.14672385156155, -0.025783395394683, 0.040888536721468, -0.1793325394392, 0.22483262419701, 0.041527006775141, -0.013345115818083, -0.03429989144206, 0.0260079652071], "class": "18", "pos": [-30, 74, 177, 260], "dim": 96}
Once the representation step is done, you can calculate the Euclidean distance between two faces, classify faces and make predictions. simdat.openface.oftools.OpenFace.pick_reps(dbs) can be used to pick the representations of the local images.
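For instance, here is a minimal sketch of comparing two faces directly from a result.json db (assuming it holds at least two entries with the schema shown above):

import json
import numpy as np

# Load the representation db written by get_reps.
with open('result.json') as f:
    db = json.load(f)

# Each 'rep' is a 128-dimensional face representation.
key_a, key_b = list(db)[:2]
rep_a = np.array(db[key_a]['rep'])
rep_b = np.array(db[key_b]['rep'])

# A smaller Euclidean distance means the two faces are more alike.
print(np.linalg.norm(rep_a - rep_b))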
Train SVM
simdat.core.ml provides GridSearch for some basic machine learning methods. The simplest way to use the representations is to train an SVM model for classification. Here is some example code:
from simdat.core import ml
from simdat.openface import oftools

of = oftools.OpenFace()

def pick_images():
    # Combine several representation dbs into one DataFrame.
    root = '/www/database/db/'
    dbs = [root + 'face_30_training_original.json',
           root + 'face_30_val_original.json',
           root + 'face_others.json',
           root + 'face_30_val_roi.json']
    return of.pick_reps(dbs)

df = pick_images()
res = of.read_df(df, dtype='train', group=False)
mf = ml.run(res['data'], res['target'])
The model will be saved to SVC.pkl. Use simdat.core.ml.test or simdat.core.ml.predict to validate or to predict. See simdat/examples/openface_demo.py for more examples.
Step-by-step instructions
Calculate representations
Store your face images (for both training and testing) in per-category directories
root@XX:/www/database# ls
001 002
Execute the demo script to calculate representations and generate the db
root@XX:/www/database# python openface_demo.py -a reps
Processing /www/database/001/A758POL02-02_1448618921829_432601_ver1.0.jpg
Processing /www/database/001/600_phpENf47R.jpg
Processing /www/database/001/C1431415478650.jpg
...
Copy the dbs to a place where the demo script can find them during training and testing
root@XX:/www/database# cp result.json /www/database/db
Training
Go to the directory with all training images you are going to use.
root@XX:/www/database/training# ls
001 002
Make sure the representations were generated beforehand and the db files were copied to /www/database/db. Execute the demo script to train.
root@XX:/www/database/training# python openface_demo.py -a train --dbpath="/www/database/db"
[TOOLS] DataFrame is written to ./picked_rep.json
Map of target - int is written to ./mapping.json
[ML] GridSearchCV for: [{'kernel': ['rbf'], 'C': [0.1, 1, 10, 100, 1000], 'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1]}]
0.42888 seconds to find best parameters - train
[ML] Best parameters are: {'kernel': 'rbf', 'C': 1, 'gamma': 0.1}
             precision    recall  f1-score   support

          0       0.95      0.91      0.93        22
          1       0.93      0.96      0.95        27

avg / total       0.94      0.94      0.94        49
[ML] Accuracy: 0.93878 (+/- 0.07070)
[ML] Re-fit model with the full dataset
[ML] Model is saved to ./SVC.pkl
simdat.core.ml will automatically split the data into a training set (70%) and a validation set (30%). After the best parameters are found with GridSearchCV on the training and validation sets, the whole dataset is used to re-fit the model with those parameters.
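Roughly, this corresponds to the following scikit-learn logic (a sketch built around the parameter grid printed above, not simdat's actual implementation; the dummy X and y stand for the representations and labels that read_df returns):

import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Dummy data: in practice X, y come from res['data'], res['target'] above.
rng = np.random.RandomState(0)
X, y = rng.rand(49, 128), rng.randint(0, 2, 49)

param_grid = [{'kernel': ['rbf'], 'C': [0.1, 1, 10, 100, 1000],
               'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1]}]

# 70/30 train/validation split, as simdat does automatically.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3)

search = GridSearchCV(SVC(), param_grid).fit(X_train, y_train)
print(search.score(X_val, y_val))  # validation accuracy

# Re-fit on the full dataset with the best parameters found.
clf = SVC(**search.best_params_).fit(X, y)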
mapping.json stores a dictionary that maps the integer labels used during training to the classes fetched from the image paths. To re-use the trained model, keep SVC.pkl and mapping.json somewhere safe. You can also choose classifiers other than SVM.SVC.
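A minimal sketch of re-using the saved model outside the demo script (this assumes SVC.pkl is joblib/pickle-compatible and that mapping.json maps the label integers back to class names; inspect both files to confirm the exact format):

import json
import joblib
import numpy as np

clf = joblib.load('SVC.pkl')
with open('mapping.json') as f:
    mapping = json.load(f)

# Placeholder input: replace with a real 128-d representation from a db.
rep = np.zeros((1, 128))
label = int(clf.predict(rep)[0])
print(mapping[str(label)])  # the predicted class name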
Testing
Demo.act_test will try to find all the faces in one image and return the calculated recall and precision rates. Before you start testing, again, make sure the representation db has been generated for the testing images.
Go to the directory with all testing images you are going to use.
root@XX:/www/database/testing# ls
001 002
Execute the demo script to test.
root@XX:/www/database/testing# python openface_demo.py -a test --dbpath="/www/database/db" --mpf='/www/database/training/mapping.json' --model-path="/www/database/training/" -p 0.3
[openface_demo] Recall = 0.99
[openface_demo] Precision = 0.97
[openface_demo] Images with roi are saved to /www/experiments/20160316
Prediction
There is no need to pre-generate facenet representations for the face images used in prediction. It is assumed that the number of images fed into the demo script for prediction is not large (use -a test if you have a big batch of images).
Put all face images you are going to predict in one directory.
root@XX:/www/database/prediction# ls
490dc1b86a952.jpg
5020842581465.jpg
Execute the demo script to predict.
root@XX:/www/database/testing# python openface_demo.py -a predict --mpf='/www/database/training/mapping.json' --model-path="/www/database/training/" --workdir='/www/database/prediction'
[oftools] Calculating rep for /www/database/test/490dc1b86a952.jpg
[oftools] Calculating rep for /www/database/test/5020842581465.jpg
0.00025 seconds to predict 128 data entries
[openface_demo] Parsing /www/database/test/490dc1b86a952.jpg
2
0.00017 seconds to predict 128 data entries
[openface_demo] Parsing /www/database/test/5020842581465.jpg
2
Compare face images
There are two ways to compare face images: compare all images in a given directory (method 1) or compare two images with given paths (method 2).
[method 1] Store face images you want to compare in a directory
root@XX:/www/database/compare# ls
490dc1b86a952.jpg
5020842581465.jpg
root@XX:/www/database/compare# python openface_demo.py -a compare_dir --workdir='/www/database/compare'
[method 2]
python openface_demo.py -a compare --img1='/www/fun_image_1.jpg' --img2='/www/fun_image_2.jpg'
Use profiles
If you follow the step-by-step instructions described above, you might see some warnings which remind you of missing profiles, such as
[Args] WARNING: File ml.json does not exist
[Args] WARNING: File openface.json does not exist
These profiles are designed to further customize scikit-learn or openface, and it is totally safe to ignore the warnings: simdat is designed to ship with sensible default parameters for machine learning and data analysis. However, if you are interested in further customization, check the OFArgs class in simdat/openface/oftools.py for setting openface arguments, and SVMArgs (or ${CLASSIFIER}Args for other classifiers) in simdat/core/ml.py for setting scikit-learn arguments.
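For example, ml.json could override the SVM parameter grid along these lines (a purely hypothetical profile: the keys actually accepted are defined by SVMArgs in simdat/core/ml.py, so check that class before relying on this sketch):

{
    "kernel": ["rbf"],
    "C": [1, 10, 100],
    "gamma": [0.001, 0.01]
}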