Play with OpenFace 2.0
What is OpenFace?
OpenFace is an open-source face recognition library built on Torch, dlib and OpenCV. It uses a deep neural network (inspired by Google's FaceNet) to map a face image to a 128-dimensional representation, so that distances between representations reflect how similar two faces are.
What are the differences in OpenFace 2.0?
- For more details, see the blog post here
- Accuracy improves from 76.1% to 92.9%
Why customization?
- The OpenFace API handles face alignment, cropping, representation and classification/comparison in one go.
- However, these steps can be isolated to make the pipeline more efficient.
- The openface module in simdat lets you calculate the representations of face images and save them in a JSON db.
- Several JSON dbs can be combined into one representation database, which makes classification/comparison easier.
Major Dependencies
- CUDA (optional)
- docker (optional)
- dlib
- OpenCV
- scikit-learn, scikit-image
- torch
Dockerfile
To make life easier, it is recommended to run OpenFace in docker. The customized Dockerfile can be found here; the only difference is the CUDA/GPU support. Follow the instructions of nvidia-docker to launch your container with CUDA support. The following steps assume that you use a docker image built from the cudnn-openface Dockerfile.
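For example, a container can be launched like this (a minimal sketch: the image tag cudnn-openface and the mounted source directory are assumptions, adjust them to your build):

nvidia-docker run -it -v ~/SOURCES:/SOURCES cudnn-openface /bin/bash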
Setup
Some steps are required before using simdat and openface.
- Edit ~/.bashrc and set PYTHONPATH properly by adding the following line
export PYTHONPATH=${UPPER_DIR_OF_simdat}:${SOURCE_OF_openface}:$PYTHONPATH
For example, assuming simdat and openface are both in ~/SOURCES,
UPPER_DIR_OF_simdat=~/SOURCES
SOURCE_OF_openface=~/SOURCES/openface
- Copy matplotlibrc:
cp $SOURCE_OF_simdat/core/matplotlibrc ~/.config/matplotlib/
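To verify the paths, check that both packages import cleanly (a quick sanity check, assuming the layout above):

python -c "import simdat, openface; print('OK')"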
That's all!
Calculate representations
It is recommended that you calculate representations of ALL face images first. You can think of this step as creating a local database for your images.
- Go to the directory which includes the face images
- Call simdat.openface.oftools.OpenFace.get_reps. You can find examples in simdat/examples/openface_demo.py
- Results will be saved in result.json, with each image converted into a dictionary entry. The key is the md5sum of the image, and the class of the image is decided automatically from the parent directory in the image path.
{"88e751fedd20712c4dd452dcb9e70f2d": {"path": "/tammy/www/database/face_30/training/original/18/20160201_gnfwvkffbphp72l8oqp54kuigzpjj2.jpg", "rep": [0.072481222450733, 0.035279173403978, -0.0061078974977136, 0.10955177247524, 0.055204018950462, 0.05951888486743, 0.066506460309029, 0.040445055812597, 0.049749683588743, -0.0072148903273046, 0.1999783217907, 0.030023569241166, 0.08411843329668, -0.092975914478302, -0.015142595395446, -0.013621752150357, 0.02025786228478, 0.18571685254574, -0.21004873514175, 0.080910600721836, 0.13924884796143, -0.09523206949234, 0.045325726270676, 0.16569949686527, 0.0064035002142191, -0.24062170088291, -0.096905738115311, -0.11416623741388, -0.046716339886189, 0.037295438349247, -0.00092278135707602, 0.014558205381036, -0.0031084765214473, 0.12721459567547, -0.090774521231651, 0.022244231775403, -0.082555264234543, -0.076722465455532, -0.14420263469219, 0.061461009085178, 0.068978041410446, -0.035017032176256, -0.086971439421177, 0.059624295681715, -0.10437279194593, -0.082797534763813, 0.0059043993242085, -0.0030142012983561, -0.051937859505415, 0.056615620851517, 0.064181081950665, -0.054000616073608, -0.18801526725292, -0.031473025679588, -0.07335414737463, 0.052137400954962, -0.049266632646322, -0.033435340970755, 0.045479472726583, 0.11156196892262, -0.04114655405283, -0.011207944713533, -0.035342574119568, -0.071747973561287, 0.0045371535234153, -0.14897377789021, -0.01110427454114, -0.010491293855011, -0.056608375161886, 0.11968926340342, 0.037683088332415, 0.041804388165474, 0.020289255306125, -0.060640808194876, -0.045285277068615, -0.0062034870497882, -0.0094162663444877, 0.13307191431522, -0.11814764887094, -0.011314208619297, -0.12936660647392, -0.016838409006596, -0.0647267177701, -0.029260950163007, -0.11791966110468, 0.27000239491463, 0.065477751195431, -0.15051570534706, -0.13305127620697, 0.0030921890866011, 0.014698518440127, -0.14670325815678, -0.081282749772072, -0.11348696798086, -0.054417010396719, -0.053645230829716, -0.17789220809937, -0.07947962731123, -0.019784811884165, 0.034701380878687, 0.023038975894451, -0.011150492355227, 0.058300741016865, 0.07286561280489, 0.11993412673473, 0.0029367518145591, -0.032502721995115, 0.14177139103413, -0.027466395869851, -0.03957999125123, -0.12508596479893, 0.014171047136188, -0.00013356539420784, -0.032780714333057, -0.015531815588474, 0.012248504906893, -0.012924991548061, -0.051655564457178, 0.041887577623129, 0.14672385156155, -0.025783395394683, 0.040888536721468, -0.1793325394392, 0.22483262419701, 0.041527006775141, -0.013345115818083, -0.03429989144206, 0.0260079652071], "class": "18", "pos": [-30, 74, 177, 260], "dim": 96}
Once the representation step is done, you can calculate the Euclidean distance between two faces, classify faces and make predictions. simdat.openface.oftools.OpenFace.pick_reps(dbs) can be used to pick the representations of the local images.
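For instance, here is a minimal sketch of comparing two faces directly from a result.json db (assuming it holds at least two entries with the schema shown above):

import json
import numpy as np

# Load the representation db written by get_reps.
with open('result.json') as f:
    db = json.load(f)

# Each 'rep' is a 128-dimensional face representation.
key_a, key_b = list(db)[:2]
rep_a = np.array(db[key_a]['rep'])
rep_b = np.array(db[key_b]['rep'])

# A smaller Euclidean distance means the two faces are more alike.
print(np.linalg.norm(rep_a - rep_b))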
Train SVM
simdat.core.ml provides GridSearch for some basic machine learning methods. The simplest way to use the representations is to train an SVM model for classification. Here is some example code:
from simdat.core import ml
from simdat.openface import oftools

of = oftools.OpenFace()

def pick_images():
    # Combine several representation dbs into one DataFrame.
    root = '/www/database/db/'
    dbs = [root + 'face_30_training_original.json',
           root + 'face_30_val_original.json',
           root + 'face_others.json',
           root + 'face_30_val_roi.json']
    return of.pick_reps(dbs)

df = pick_images()
res = of.read_df(df, dtype='train', group=False)
mf = ml.run(res['data'], res['target'])
The model will be saved to SVC.pkl. Use simdat.core.ml.test or simdat.core.ml.predict to validate or to predict. See simdat/examples/openface_demo.py for more examples.
Step-by-step instructions
Calculate representations
Store your face images (for both training and testing) in per-category directories
root@XX:/www/database# ls
001 002
Execute the demo script to calculate representations and generate the db
root@XX:/www/database# python openface_demo.py -a reps
Processing /www/database/001/A758POL02-02_1448618921829_432601_ver1.0.jpg
Processing /www/database/001/600_phpENf47R.jpg
Processing /www/database/001/C1431415478650.jpg
...
Copy the dbs to a place where the demo script can find them during training and testing
root@XX:/www/database# cp result.json /www/database/db
Training
Go to the directory with all training images you are going to use.
root@XX:/www/database/training# ls
001 002
Make sure the representations were generated beforehand and the db files were copied to /www/database/db. Execute the demo script to train.
root@XX:/www/database/training# python openface_demo.py -a train --dbpath="/www/database/db"
[TOOLS] DataFrame is written to ./picked_rep.json
Map of target - int is written to ./mapping.json
[ML] GridSearchCV for: [{'kernel': ['rbf'], 'C': [0.1, 1, 10, 100, 1000], 'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1]}]
0.42888 seconds to find best parameters - train
[ML] Best parameters are: {'kernel': 'rbf', 'C': 1, 'gamma': 0.1}
             precision    recall  f1-score   support

          0       0.95      0.91      0.93        22
          1       0.93      0.96      0.95        27

avg / total       0.94      0.94      0.94        49
[ML] Accuracy: 0.93878 (+/- 0.07070)
[ML] Re-fit model with the full dataset
[ML] Model is saved to ./SVC.pkl
simdat.core.ml will automatically split the data into a training set (70%) and a validation set (30%). After the best parameters are found with GridSearchCV on the training and validation sets, the whole dataset is used to re-fit the model with those parameters.
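Roughly, this corresponds to the following scikit-learn logic (a sketch built around the parameter grid printed above, not simdat's actual implementation; the dummy X and y stand for the representations and labels that read_df returns):

import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Dummy data: in practice X, y come from res['data'], res['target'] above.
rng = np.random.RandomState(0)
X, y = rng.rand(49, 128), rng.randint(0, 2, 49)

param_grid = [{'kernel': ['rbf'], 'C': [0.1, 1, 10, 100, 1000],
               'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1]}]

# 70/30 train/validation split, as simdat does automatically.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3)

search = GridSearchCV(SVC(), param_grid).fit(X_train, y_train)
print(search.score(X_val, y_val))  # validation accuracy

# Re-fit on the full dataset with the best parameters found.
clf = SVC(**search.best_params_).fit(X, y)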
mapping.json stores a dictionary that maps the integer labels used during training to the classes fetched from the image paths. To re-use the trained model, keep SVC.pkl and mapping.json somewhere safe. You can also choose classifiers other than SVM.SVC.
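A minimal sketch of re-using the saved model outside the demo script (this assumes SVC.pkl is joblib/pickle-compatible and that mapping.json maps the label integers back to class names; inspect both files to confirm the exact format):

import json
import joblib
import numpy as np

clf = joblib.load('SVC.pkl')
with open('mapping.json') as f:
    mapping = json.load(f)

# Placeholder input: replace with a real 128-d representation from a db.
rep = np.zeros((1, 128))
label = int(clf.predict(rep)[0])
print(mapping[str(label)])  # the predicted class name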
Testing
Demo.act_test will try to find all the faces in one image and return the calculated recall and precision rates. Before you start testing, again, make sure the representation db has been generated for the testing images.
Go to the directory with all testing images you are going to use.
root@XX:/www/database/testing# ls
001 002
Execute the demo script to test.
root@XX:/www/database/testing# python openface_demo.py -a test --dbpath="/www/database/db" --mpf='/www/database/training/mapping.json' --model-path="/www/database/training/" -p 0.3
[openface_demo] Recall = 0.99
[openface_demo] Precision = 0.97
[openface_demo] Images with roi are saved to /www/experiments/20160316
Prediction
There is no need to pre-generate facenet representations for the face images used in prediction. It is assumed that the number of images fed into the demo script for prediction is not large (use -a test if you have a big batch of images).
Put all face images you are going to predict in one directory.
root@XX:/www/database/prediction# ls
490dc1b86a952.jpg
5020842581465.jpg
Execute the demo script to predict.
root@XX:/www/database/testing# python openface_demo.py -a predict --mpf='/www/database/training/mapping.json' --model-path="/www/database/training/" --workdir='/www/database/prediction'
[oftools] Calculating rep for /www/database/test/490dc1b86a952.jpg
[oftools] Calculating rep for /www/database/test/5020842581465.jpg
0.00025 seconds to predict 128 data entries
[openface_demo] Parsing /www/database/test/490dc1b86a952.jpg
2
0.00017 seconds to predict 128 data entries
[openface_demo] Parsing /www/database/test/5020842581465.jpg
2
Compare face images
There are two ways to compare face images: compare all images in a given directory (method 1) or compare two images with given paths (method 2).
[method 1] Store face images you want to compare in a directory
root@XX:/www/database/compare# ls
490dc1b86a952.jpg
5020842581465.jpg
root@XX:/www/database/compare# python openface_demo.py -a compare_dir --workdir='/www/database/compare'
[method 2]
python openface_demo.py -a compare --img1='/www/fun_image_1.jpg' --img2='/www/fun_image_2.jpg'
Use profiles
If you follow the step-by-step instructions described above, you might see some warnings which remind you of missing profiles, such as
[Args] WARNING: File ml.json does not exist
[Args] WARNING: File openface.json does not exist
These profiles are designed to further customize scikit-learn or openface, and it is totally safe to ignore the warnings: simdat is designed to ship with sensible default parameters for machine learning and data analysis. However, if you are interested in further customization, check the OFArgs class in simdat/openface/oftools.py for setting openface arguments, and SVMArgs (or ${CLASSIFIER}Args for other classifiers) in simdat/core/ml.py for setting scikit-learn arguments.
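For example, ml.json could override the SVM parameter grid along these lines (a purely hypothetical profile: the keys actually accepted are defined by SVMArgs in simdat/core/ml.py, so check that class before relying on this sketch):

{
    "kernel": ["rbf"],
    "C": [1, 10, 100],
    "gamma": [0.001, 0.01]
}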