AI for Synthetic Expression Synthesis


For synthetic expression synthesis, our aim is to develop groundbreaking tools for creating realistic facial expressions. We propose advanced algorithms that leverage AI to generate realistic-looking expressions for virtual environments, human-computer interaction, and entertainment.

a. Facial Expression Synthesis In-The-Wild on Pencil Sketches, Animal Faces and Statue Faces

Many recently proposed conditional GANs exhibit excellent image-to-image translation results when the test samples are close to the data distributions learned during training. As the test samples move away from these learned distributions, the performance of such GANs degrades. In this work, we propose to use regression to produce an intermediate representation, which significantly enhances the generalization of the GAN. Our proposed Regression GAN (RegGAN) is trained for facial expression synthesis on real human face datasets. A ridge regression loss-based layer in the generator network is trained over local receptive fields by minimizing the least-squares error, while the rest of the network is trained via adversarial loss optimization. RegGAN shows excellent results on in-dataset images, while on out-of-dataset or in-the-wild images it performs significantly better than current state-of-the-art GANs including Pix2Pix, CycleGAN, and StarGAN. Our wild test set consists of human facial images, pencil sketches, wild and domestic animal faces, and statue faces. The results of the proposed RegGAN and existing state-of-the-art GANs are compared both quantitatively and qualitatively, and significant performance gains are observed over the wild test sets in both types of comparison.
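The ridge regression component mentioned above can be illustrated with a minimal sketch. This is not the published RegGAN layer; it is an illustrative closed-form ridge regression mapping flattened input face patches to target expression patches, the kind of least-squares intermediate representation the abstract describes. The regularization strength `lam` and the synthetic data are assumptions for demonstration only.

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression: W = argmin ||XW - Y||^2 + lam ||W||^2.

    X: (n_samples, d_in) flattened input patches
    Y: (n_samples, d_out) flattened target-expression patches
    """
    d = X.shape[1]
    # Solve the regularized normal equations (X^T X + lam I) W = X^T Y.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Toy demonstration with a synthetic linear mapping plus small noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 64))                 # 200 flattened 8x8 patches
A = rng.standard_normal((64, 64))                  # ground-truth mapping
Y = X @ A + 0.01 * rng.standard_normal((200, 64))  # targets with noise
W = fit_ridge(X, Y, lam=0.1)
mse = float(np.mean((X @ W - Y) ** 2))
print(W.shape, mse)
```

In the full model, such a regression output would serve only as an intermediate representation; the adversarially trained layers refine it into the final synthesized expression.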

b. Masked Linear Regression for Facial Expression Synthesis

This work introduces a novel approach to facial expression synthesis that specifically addresses the challenge of learning a high-dimensional mapping, especially for large-resolution images. The proposed method exploits the sparsity and local correlation inherent in facial expressions. The key innovation is a constrained ridge regression model, termed masked regression, which significantly reduces the number of parameters. The masked regression model allows efficient training on larger image sizes, surpassing existing regression- and GAN-based methods in terms of mean squared error, visual quality, and computational complexity. This approach offers a more effective solution for facial expression synthesis, particularly for high-resolution images.
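To see where the parameter reduction comes from, consider a hedged sketch of the masking idea: a binary mask restricts each output pixel to depend only on input pixels inside a small local window, so the weight matrix of a full linear mapping is constrained to a sparse pattern. The window size `k` and image size below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def receptive_field_mask(h, w, k=3):
    """Binary mask M of shape (h*w, h*w): entry M[out, inp] is True only if
    input pixel `inp` lies in the k x k window around output pixel `out`.
    Training would then keep the weights constrained as W * M."""
    M = np.zeros((h * w, h * w), dtype=bool)
    r = k // 2
    for i in range(h):
        for j in range(w):
            out = i * w + j
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        M[out, ii * w + jj] = True
    return M

h = w = 8                       # toy 8x8 image
M = receptive_field_mask(h, w, k=3)
full_params = (h * w) ** 2      # unconstrained linear mapping
masked_params = int(M.sum())    # parameters surviving the mask
print(full_params, masked_params)
```

For a full linear mapping the parameter count grows as the square of the number of pixels, whereas the masked model grows only linearly (roughly k² parameters per output pixel), which is what makes training at larger image sizes tractable.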

Related Publications