Machine Learning

Practitioner

MLPractitioner.com

We are your companion on the path to a practice in Machine Learning and Artificial Intelligence.

Consider MLP your resource and information site for news, tools, and techniques to maintain your skillset.

Facial Emotion Detection

I found the idea of facial recognition controversial in itself, but facial emotion detection takes that even further. With only a slight stretch of the imagination, we can see how a captured, categorized emotion, associated with buying behavior, could take an already over-surveilled populace into a brave new world indeed.

There are commercial cloud services, 'Emotion as a Service' (EaaS for short), offering APIs and back-end AI services to accomplish this within the e-commerce chain today. If micro-targeting creeped you out, try adding emotion-augmented micro-targeting for grins.

How AI Does This

If you wanted to do this from scratch, you could take a dataset of face imagery, have humans label each image manually, and, once you had enough, use it to train a neural network to model and predict one of the seven standard emotions from an unlabeled, unseen test image:

anger, disgust, fear, happiness, sadness, surprise, neutral
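These seven categories map onto the integer labels used in FER2013; a minimal lookup (the ordering follows the FER2013 labeling convention, and the names here are the ones used in this post) could look like:

```python
# FER2013 encodes each face's emotion as an integer in 0..6.
EMOTIONS = {
    0: "anger", 1: "disgust", 2: "fear", 3: "happiness",
    4: "sadness", 5: "surprise", 6: "neutral",
}

def label_to_emotion(label):
    """Translate a FER2013 integer label into its emotion name."""
    return EMOTIONS[label]

label_to_emotion(6)  # "neutral"
```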


How this works

Different approaches are being taken, but most endeavors appear to use the FER2013 dataset as a starting point, as it consists of roughly 35,000 labeled face images. The labels follow the seven standard emotions above. Right now there is no easy way to use unsupervised approaches to auto-label face data by emotion, so the fastest way to get started is to construct and train a chosen model on an existing, labeled face dataset.
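Concretely, FER2013 ships as a single CSV whose 'pixels' column holds each 48x48 grayscale face as a space-separated string of values. A small sketch of unpacking one cell follows; the synthetic cell stands in for a real row, and the pandas line in the comment assumes a local copy named fer2013.csv:

```python
import numpy as np

# FER2013 columns: 'emotion' (integer label), 'pixels' (space-separated
# string of 48*48 grayscale values), 'Usage' (Training / PublicTest / PrivateTest)
WIDTH, HEIGHT = 48, 48

def parse_pixels(pixel_string):
    """Turn one FER2013 'pixels' cell into a 48x48 grayscale image array."""
    values = np.array(pixel_string.split(), dtype=np.uint8)
    return values.reshape(HEIGHT, WIDTH)

# Synthetic cell standing in for a real row; with the actual file you would do:
#   pd.read_csv("fer2013.csv")["pixels"].apply(parse_pixels)
fake_cell = " ".join(["128"] * (WIDTH * HEIGHT))
image = parse_pixels(fake_cell)
```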

Deep Learning Approach

The type of neural network, the choice of activation functions, layer widths, and so on have been subjects of intense discussion on Kaggle and other data science communities. We have not made an exhaustive study, but we found commonalities among the highest-scoring competitors in the Kaggle Facial Expression Recognition Challenge. We selected a couple of those for evaluation and were able to run them successfully on Colab. All use Keras, a deep convolutional neural network, and the FER2013 dataset. Thanks to Colab's free GPU, it didn't take long to train these models.

Test Data Preparation

An unlabeled test set undergoes CRNO: Convert, Reshape, Normalize, One-hot encode. (The one-hot step applies to the labels, so it only comes into play when the data is labeled.) The resulting images are then submitted to a trained model that has seen many different facial expressions and scored reasonably well in its accuracy and validation passes. CRNO is simply a name for several pre-processing steps commonly combined when preparing images for deep-learning models.
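The CRNO steps can be sketched in a few lines of numpy; the function name and defaults below are mine, and the one-hot step only fires when labels are supplied:

```python
import numpy as np

def crno(images, labels=None, num_classes=7, width=48, height=48):
    """Convert, Reshape, Normalize; One-hot encode labels when present."""
    x = np.asarray(images, dtype=np.float32)   # Convert raw pixels to floats
    x = x.reshape(-1, height, width, 1)        # Reshape: add a channel axis
    x /= 255.0                                 # Normalize from [0, 255] to [0, 1]
    if labels is None:
        return x, None
    # One-hot encode the integer labels (7 emotion classes by default)
    y = np.eye(num_classes, dtype=np.float32)[np.asarray(labels)]
    return x, y

# Five fake 48x48 grayscale faces with labels 0..4
x, y = crno(np.random.randint(0, 256, size=(5, 48, 48)), labels=[0, 1, 2, 3, 4])
```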

Architecture of the Deep Learning Model

The Convolutional Neural Network (CNN) architecture is a Keras Sequential model (from Kaggler Lx Yuan):

# Imports and hyperparameters (values inferred from the model summary below:
# 48x48 grayscale inputs, num_features = 64, seven output classes)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                                     Activation, Flatten, Dense)
from tensorflow.keras.optimizers import Adam

num_features = 64
width, height = 48, 48
num_classes = 7

model = Sequential()

#module 1
model.add(Conv2D(2*2*num_features, kernel_size=(3, 3), 
input_shape=(width, height, 1), data_format='channels_last'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Conv2D(2*2*num_features, kernel_size=(3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

#module 2
model.add(Conv2D(2*num_features, kernel_size=(3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Conv2D(2*num_features, kernel_size=(3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

#module 3
model.add(Conv2D(num_features, kernel_size=(3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Conv2D(num_features, kernel_size=(3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

#flatten
model.add(Flatten())

#dense 1
model.add(Dense(2*2*2*num_features))
model.add(BatchNormalization())
model.add(Activation('relu'))

#dense 2
model.add(Dense(2*2*num_features))
model.add(BatchNormalization())
model.add(Activation('relu'))

#dense 3
model.add(Dense(2*num_features))
model.add(BatchNormalization())
model.add(Activation('relu'))

#output layer
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7),
              metrics=['accuracy'])

model.summary()
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 46, 46, 256)       2560      
_________________________________________________________________
batch_normalization_1 (Batch (None, 46, 46, 256)       1024      
_________________________________________________________________
activation_1 (Activation)    (None, 46, 46, 256)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 46, 46, 256)       590080    
_________________________________________________________________
batch_normalization_2 (Batch (None, 46, 46, 256)       1024      
_________________________________________________________________
activation_2 (Activation)    (None, 46, 46, 256)       0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 23, 23, 256)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 23, 23, 128)       295040    
_________________________________________________________________
batch_normalization_3 (Batch (None, 23, 23, 128)       512       
_________________________________________________________________
activation_3 (Activation)    (None, 23, 23, 128)       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 23, 23, 128)       147584    
_________________________________________________________________
batch_normalization_4 (Batch (None, 23, 23, 128)       512       
_________________________________________________________________
activation_4 (Activation)    (None, 23, 23, 128)       0         
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 11, 11, 128)       0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 11, 11, 64)        73792     
_________________________________________________________________
batch_normalization_5 (Batch (None, 11, 11, 64)        256       
_________________________________________________________________
activation_5 (Activation)    (None, 11, 11, 64)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 11, 11, 64)        36928     
_________________________________________________________________
batch_normalization_6 (Batch (None, 11, 11, 64)        256       
_________________________________________________________________
activation_6 (Activation)    (None, 11, 11, 64)        0         
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 1600)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               819712    
_________________________________________________________________
batch_normalization_7 (Batch (None, 512)               2048      
_________________________________________________________________
activation_7 (Activation)    (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 256)               131328    
_________________________________________________________________
batch_normalization_8 (Batch (None, 256)               1024      
_________________________________________________________________
activation_8 (Activation)    (None, 256)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 128)               32896     
_________________________________________________________________
batch_normalization_9 (Batch (None, 128)               512       
_________________________________________________________________
activation_9 (Activation)    (None, 128)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 7)                 903       
=================================================================
Total params: 2,137,991
Trainable params: 2,134,407
Non-trainable params: 3,584
____________________________________________________________


Evaluation


This is not bad for a CNN with batch normalization; in the competition this model scored .66.

Facial emotion recognition falls right into the 'squishy' zone of predictive analytics and is highly subjective. However, Kaggle contestants have reached .71 accuracy, which could be considered usable, and scores will most likely improve as models undergo further optimization.

Prediction of Unlabeled Faces

I have been able to use Priya Dwivedi's emotion CNN to run some Trump faces through for detection. Thankfully, her model was saved at its highest accuracy, so I could load it into a Colab notebook very easily without having to re-train it. At .56 accuracy, it got some of Trump's facial expressions right. I had to convert the faces to grayscale, then resize and reshape them before the model would accept them. The featured image of this post shows the images I submitted.

My notebook (a fork of Priya's base) for the Trump emotions test is on GitHub here. I will say that you need to work out a good way to submit images for prediction when working with this on Colab. There is a well-known mechanism to mount your Google Drive within a notebook instance (yes, it's temporary). I did that, and also unzipped the image archive into a folder whose path is defined in the code.

If you wish to run my code in Colab, click the CO button below. The Trump images (16 of them, zipped) are on this page in the 'Data' section of this site. You will have to upload the archive to your Google Drive, mount the drive in Colab, and !unzip trum_16_faces2.zip in a cell before using them. You will also need Priya's emotion-detection CNN model, which is used to find targets for new images. It is also in the Data section of this site; unzip it in a cell and then load it as follows:

from keras.models import load_model  # model_path is defined earlier in the notebook

model = load_model(model_path + "model_v6_23.hdf5")

Colab Notebook

Trump’s Face

Since POTUS is everywhere and has a reasonably small set of facial expressions, representing an even smaller set of emotions, he makes a good subject for trying to predict emotion from expression. A Kaggler (Muhammed Buyukkinaci) was kind enough to collect these faces, though not to label them, so I picked 16 and cleaned them up to pass through Priya's model.

One can see that we are sitting at around 50% accuracy, which explains why some of the predicted targets are off.
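For reference, turning the model's seven-way softmax output into a readable verdict is a small post-processing step; a minimal decoder (the names are mine, and the list order follows the FER2013 label convention) looks like:

```python
import numpy as np

# FER2013 label order
EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "sadness", "surprise", "neutral"]

def decode_prediction(probs):
    """Map a 7-way softmax output vector to (emotion name, confidence)."""
    probs = np.asarray(probs, dtype=np.float64)
    idx = int(np.argmax(probs))
    return EMOTIONS[idx], float(probs[idx])

emotion, confidence = decode_prediction([0.05, 0.05, 0.10, 0.60,
                                         0.10, 0.05, 0.05])
```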

Extending The Facial Emotion Recognition Experiment

Perhaps a deeper neural network would bump the accuracy up to a more usable level, as was done by Kaggler Lx Yuan. Although his network is denser than the one I used to predict the emotions on Trump's face, it uses fewer epochs and is built from three banks of cascading feature-density inputs. I ran this training and saved the model, but when I submitted the same facial data as before, it failed due to pre-formatting issues. I hope to figure this out soon.
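Saving a model at its best validation accuracy, as mentioned above, is usually done with a Keras ModelCheckpoint callback. A minimal sketch follows; the toy one-layer model, random data, and the 'best_model.keras' filename are stand-ins for the real CNN and FER2013:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import ModelCheckpoint

# Toy stand-in: a single softmax layer over flattened 48x48 faces.
# The real network is the deep CNN shown earlier in this post.
model = Sequential([Dense(7, activation="softmax", input_shape=(48 * 48,))])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])

x = np.random.rand(32, 48 * 48).astype("float32")
y = np.eye(7, dtype="float32")[np.random.randint(0, 7, size=32)]

# Keep only the weights from the best validation epoch
checkpoint = ModelCheckpoint("best_model.keras", monitor="val_accuracy",
                             save_best_only=True, verbose=0)
model.fit(x, y, validation_split=0.25, epochs=2,
          callbacks=[checkpoint], verbose=0)
```

The saved file can then be reloaded with load_model, as shown earlier, without re-training.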

Commercial Implementations

Facial emotion detection using deep learning is amusing to experiment with. Some approaches work to a degree and can produce repeatable results. Many commercial and governmental applications would benefit from doing this right, and a handful of companies offer it as a service: Neurodata Lab, Affectiva, Microsoft Face Rec, and Amazon Rekognition (yes, more 'Emotion as a Service' in the cloud).

The practice has attained some notoriety from an ethical standpoint due to questions about the accuracy of labeling based on the 'emojification' of facial expressions. Of course, this is another A.I. technology that could easily be abused, so we have to take good care.

Periscopic, a socially conscious data visualization firm, created a feather plot of the emotions of inaugural speeches from Reagan to Trump which is quite illustrative of the changing sentiment of the orators through this period. Emotions were derived using the Microsoft Emotion API.

They also used the MS Emotion API to create the interactive Trump Emoto-Coaster infographic which is very entertaining.

You can interactively slide the emotion scale and it will cue up the video segment that shows the emotion in question.

What’s Next?

If we take this down its logical path, we are going to see much more use by corporate recruiters during remote interviews. We are already starting to see audience response to marketing campaigns measured with emotion recognition. Governments and law enforcement will make use of real-time facial recognition with emotion understanding to try to prevent terrorist acts or mass shootings.

Read this paper to see how emotion detection is coming under theoretical challenge right now. For more head-scratching on the subject, this engaging article in the Washington Post peels back some of the layers of doubt surrounding it.



Emotion Detection

by | Nov 21, 2019 | Data-Stories, StoriesFeat
