Anyone who watched their fair share of action flicks in the ’90s will surely remember Face/Off, the one where John Travolta and Nicolas Cage get their faces switched, leading to adrenaline-pumping mayhem. In the movie, the face swap was the product of some kind of plastic surgery wizardry. This time around, a new technology can seemingly achieve similar results without any invasive procedures. We say similar, not the same, because this tech from Stanford University transfers facial expressions from one face to another instead of swapping the physical features entirely. Fittingly, the software is named ‘Real-time Expression Transfer for Facial Reenactment’.
In the video (shown below), the experiment is demonstrated with two actors: the source actor on the right, who talks and emotes, and the target actor on the left, who stays relatively calm. Follow the source actor on the right and you will notice how expressive he is. Now look at the left screen, where the avatar of the target actor mimics that same emotional intensity and those same gestures. The result looks seamless even though the actual target actor is not making any noticeable facial movements of his own.
The really interesting part is that the actual target actor is not sitting perfectly still. His face is not wholly static, as he moves his head and even shifts in his chair. In essence, the ‘Real-time Expression Transfer for Facial Reenactment’ works even when the subject is subtly moving, which means the software has to stay synchronized with the target actor’s movements in real time. The result is a fluid virtual animation that combines one person’s emotional state (via facial expressions) with another person’s body.
In terms of how it works, this composite animation starts with a preliminary mapping of the two actors’ faces with the aid of a commodity RGB-D sensor. The technology describes each face through attributes such as identity, expression, and even skin reflectance, and matches these attributes against the captured data on color, depth and lighting. As a result, the software can render a near-photorealistic representation of the target in real time while replicating the expressions of the source actor. As the scientists at Stanford University put it:
A major challenge is the convincing re-rendering of the synthesized target face into the corresponding video stream. This requires a careful consideration of the lighting and shading design, which both must correspond to the real-world environment. We demonstrate our method in a live setup, where we modify a video conference feed such that the facial expressions of a different person (e.g., translator) are matched in real-time.
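For the curious, the split into identity, expression and skin reflectance described above can be illustrated with a toy sketch. This is not the researchers’ code: the `FaceParams` class, the `transfer_expression` and `reconstruct` functions, and the simple linear face model are all assumptions made for illustration. The idea is simply that the target keeps its own identity and reflectance while borrowing the source’s expression values.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class FaceParams:
    """Hypothetical per-frame face description (names are illustrative)."""
    identity: List[float]     # who the person is (face shape)
    expression: List[float]   # what the face is doing right now
    reflectance: List[float]  # skin appearance


def transfer_expression(source: FaceParams, target: FaceParams) -> FaceParams:
    """Keep the target's identity and reflectance, borrow the source's expression."""
    return replace(target, expression=list(source.expression))


def reconstruct(mean: List[float],
                id_basis: List[List[float]],
                expr_basis: List[List[float]],
                params: FaceParams) -> List[float]:
    """Assumed linear model: shape = mean + sum(identity_i * id_basis_i)
    + sum(expression_j * expr_basis_j)."""
    shape = list(mean)
    for weight, vec in zip(params.identity, id_basis):
        for k, value in enumerate(vec):
            shape[k] += weight * value
    for weight, vec in zip(params.expression, expr_basis):
        for k, value in enumerate(vec):
            shape[k] += weight * value
    return shape


# Toy data: one identity direction and one expression direction in 2-D.
source = FaceParams(identity=[0.2], expression=[0.9], reflectance=[0.5])
target = FaceParams(identity=[-0.4], expression=[0.0], reflectance=[0.7])

reenacted = transfer_expression(source, target)
shape = reconstruct(mean=[0.0, 0.0],
                    id_basis=[[1.0, 0.0]],
                    expr_basis=[[0.0, 1.0]],
                    params=reenacted)
print(reenacted)
print(shape)
```

In this toy setup the reenacted face keeps the target’s identity and reflectance values and only its expression values change. The real system additionally has to estimate these values from the RGB-D stream and re-render the face under matching lighting, which the sketch leaves out.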