Variational autoencoder for anime face reconstruction

This repository is an exploratory example of training a variational autoencoder (VAE) to extract meaningful feature representations from anime girl face images.

The code architecture is mostly borrowed and adapted from Yann Dubois’s disentangling-vae repository, which provides a nice summary and comparison of recently proposed VAE models.

Dataset

The Anime Face Dataset contains 63,632 anime faces, all rescaled to 64×64 for training.

Model

The model used is the one proposed in the paper Understanding disentangling in β-VAE, which is summarized below:

[Figure: model architecture]

I used a Laplace distribution as the target distribution for calculating the reconstruction loss. Yann’s code suggests that Bernoulli would generally be a better choice, but it appeared to converge slowly in my case. (I didn’t run a fair comparison, so this is not conclusive.)
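To make the two options concrete, here is a minimal sketch of the per-image reconstruction terms. It assumes pixel values in [0, 1] and a fixed Laplace scale of 1; the function names are hypothetical, and plain Python lists stand in for the tensors the training code would use. With a fixed scale, the Laplace negative log-likelihood is an L1 loss plus a constant, while the Bernoulli choice gives binary cross-entropy.

```python
import math


def laplace_recon_loss(x, x_hat, scale=1.0):
    """Negative log-likelihood of x under Laplace(x_hat, scale), summed over pixels.

    The scale is an assumed constant; with it fixed, this is an L1 loss
    plus a term that does not depend on the reconstruction.
    """
    return sum(abs(a - b) / scale + math.log(2.0 * scale) for a, b in zip(x, x_hat))


def bernoulli_recon_loss(x, x_hat, eps=1e-7):
    """Binary cross-entropy summed over pixels; x_hat is clamped to avoid log(0)."""
    total = 0.0
    for a, p in zip(x, x_hat):
        p = min(max(p, eps), 1.0 - eps)
        total -= a * math.log(p) + (1.0 - a) * math.log(1.0 - p)
    return total


# Toy "image" of three pixels and its reconstruction.
x = [0.0, 0.5, 1.0]
x_hat = [0.1, 0.5, 0.8]
lap = laplace_recon_loss(x, x_hat)
bce = bernoulli_recon_loss(x, x_hat)
```

The two losses live on different scales, so comparing their raw values across runs says little; a fair comparison would look at reconstruction quality instead.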

The loss function used is β-VAE_H from β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework.
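As a rough sketch, the β-VAE_H objective adds a β-weighted KL term to the reconstruction loss, where the KL divergence is taken between the diagonal Gaussian posterior and a standard normal prior. The function names below are hypothetical, and `beta=4.0` is only an illustrative placeholder, not the value used in this repository.

```python
import math


def kl_to_standard_normal(mean, logvar):
    """KL( N(mean, exp(logvar)) || N(0, 1) ), summed over latent dimensions."""
    return sum(-0.5 * (1.0 + lv - m * m - math.exp(lv)) for m, lv in zip(mean, logvar))


def beta_vae_h_loss(recon_loss, mean, logvar, beta=4.0):
    """Reconstruction loss plus beta times the KL term.

    beta=4.0 is an assumed placeholder; beta=1 recovers the standard VAE loss.
    """
    return recon_loss + beta * kl_to_standard_normal(mean, logvar)


# A posterior equal to the prior contributes no KL penalty.
loss_at_prior = beta_vae_h_loss(2.0, [0.0, 0.0], [0.0, 0.0])
# Shifting one mean to 1.0 adds beta * 0.5 to the loss.
loss_shifted = beta_vae_h_loss(2.0, [1.0, 0.0], [0.0, 0.0])
```

Larger β pushes the posterior toward the prior, which tends to encourage disentangled latents at the cost of reconstruction quality.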

Result

The number of latent features is set to 20 (10 Gaussian means and 10 Gaussian log-variances). The VAE model is trained for 100 epochs. All data is used for training; no validation or test split is held out.
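The 20 values can be read as the parameters of a 10-dimensional diagonal Gaussian. A minimal sketch of how such an encoder output might be split and sampled with the reparameterization trick follows; the ordering (means first, then log-variances) and the function names are assumptions made for illustration.

```python
import math
import random

LATENT_DIM = 10  # 10 means + 10 log-variances = 20 encoder outputs


def split_encoder_output(stats):
    """Split a flat list of 20 values into (means, log-variances).

    Assumes the means come first; the actual layout may differ.
    """
    if len(stats) != 2 * LATENT_DIM:
        raise ValueError("expected %d values, got %d" % (2 * LATENT_DIM, len(stats)))
    return stats[:LATENT_DIM], stats[LATENT_DIM:]


def reparameterize(mean, logvar, rng):
    """Sample z = mean + std * eps with eps ~ N(0, 1) and std = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, logvar)]


rng = random.Random(0)
# Toy encoder output: means 0..9, log-variances so small the noise is negligible.
stats = [float(i) for i in range(LATENT_DIM)] + [-100.0] * LATENT_DIM
mean, logvar = split_encoder_output(stats)
z = reparameterize(mean, logvar, rng)
```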

Prior space traversal

Based on how the reconstructed faces change while traversing the latent space, we may speculate that each latent controls the following generative property:

  1. Hair shade
  2. Hair length
  3. Face orientation
  4. Hair color
  5. Face rotation
  6. Bangs, face color
  7. Hair glossiness
  8. Unclear
  9. Eye size & color
  10. Bangs
[Figure: prior traversals on test images]
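A prior traversal of this kind can be sketched as follows: vary one latent at a time over a range while holding the others at zero, then decode each vector. The range of ±2, the seven steps, and the function name are assumptions for illustration; the decoder call is omitted.

```python
def prior_traversal(latent_dim=10, n_steps=7, max_value=2.0):
    """Build a grid of latent vectors: one row per latent, one column per step.

    Row d sweeps latent d from -max_value to +max_value while all other
    latents stay at 0 (the prior mean). Requires n_steps >= 2.
    """
    step = 2.0 * max_value / (n_steps - 1)
    values = [-max_value + i * step for i in range(n_steps)]
    grid = []
    for dim in range(latent_dim):
        row = []
        for v in values:
            z = [0.0] * latent_dim
            z[dim] = v
            row.append(z)
        grid.append(row)
    return grid


grid = prior_traversal()
# Each z in grid[d] would be passed to the decoder to render one face.
```

Reading along a row of decoded images then shows what a single latent changes, which is how labels like those in the list above can be guessed.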

Original faces clustering

Original anime faces are clustered based on their latent features: for each selected feature, the images shown have a value either below the 1st percentile (left 5) or above the 99th percentile (right 5) across all data points, while their remaining latent features are close to each other. Visualizing the original images mostly confirms the speculation above.

[Figure: original-image traversals on test images]
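One possible reading of this selection, sketched in plain Python: take the points in the bottom and top 1% of the chosen feature, then keep the five in each tail whose other features lie closest to the dataset mean, so that the shown images differ mainly in the chosen feature. The "closest to the mean" criterion and the function name are assumptions; the original procedure may differ.

```python
import statistics


def select_extremes(latents, feature, n_show=5):
    """Return (low, high) index lists for one latent feature.

    low: points below the 1st percentile of the feature,
    high: points above the 99th percentile,
    each ranked by how close the remaining features are to their means.
    """
    values = [z[feature] for z in latents]
    cuts = statistics.quantiles(values, n=100)  # 99 percentile cut points
    low_cut, high_cut = cuts[0], cuts[-1]

    others = [d for d in range(len(latents[0])) if d != feature]
    center = [statistics.fmean(z[d] for z in latents) for d in others]

    def spread(i):
        # Squared distance of the other features from their dataset means.
        return sum((latents[i][d] - c) ** 2 for d, c in zip(others, center))

    low = sorted((i for i, v in enumerate(values) if v < low_cut), key=spread)[:n_show]
    high = sorted((i for i, v in enumerate(values) if v > high_cut), key=spread)[:n_show]
    return low, high


# Toy data: feature 0 increases with the index, feature 1 cycles through 0..4.
latents = [[float(i), float(i % 5)] for i in range(1000)]
low, high = select_extremes(latents, feature=0)
```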

Latent feature diagnosis

The learned latent features are all close to a standard normal distribution and show minimal correlation with each other.

[Figure: latent diagnosis]
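A diagnosis like this can be approximated numerically by checking each latent's mean and standard deviation and the largest absolute pairwise Pearson correlation. The sketch below uses hypothetical function names and random stand-in data rather than latents encoded from the dataset.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def diagnose(latents):
    """Return per-feature means, per-feature stds, and the max |correlation|."""
    columns = list(zip(*latents))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    max_abs_corr = max(
        abs(pearson(columns[i], columns[j]))
        for i in range(len(columns))
        for j in range(i + 1, len(columns))
    )
    return means, stds, max_abs_corr


# Stand-in data: independent standard normal samples, not real encoder outputs.
rng = random.Random(0)
fake_latents = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(5000)]
means, stds, max_abs_corr = diagnose(fake_latents)
```

Means near 0, standard deviations near 1, and a small maximum correlation would be consistent with the claim above, though they do not by themselves prove the marginals are Gaussian.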

Source: https://github.com/Minzhe/VAE_animeface?ref=pythonawesome.com
