Variational autoencoder (VAE) for Lego minifig faces. I have a writeup with more detail here: https://www.echevarria.io/blog/lego-face-vae/
This repo contains a training set of images of Lego minifig faces and code to train a VAE on them.
Much of the code defining the VAE model is derived from David Foster's excellent book *Generative Deep Learning: Teaching Machines to Paint, Write, Compose, and Play* and from the book's accompanying repository.
NOTE (2021-12-23): This notebook no longer works out of the box on Colab because of issues with dependencies. Such is the joy of Python package management.
Google Colab is a free environment to run Jupyter Notebooks with the option of using GPU/TPU instances.
To run the notebook in Colab, first go to https://colab.research.google.com/github/iechevarria/lego-face-VAE/blob/master/VAE_colab.ipynb.
Next, run the following commands that appear in the section "Set up Colab environment":
```
!git clone https://github.com/iechevarria/lego-face-VAE
cd lego-face-VAE
!unzip dataset.zip
```
It should now be possible to run all the sections in the notebook. If you want to experiment with the pretrained model included in the repo, skip to the notebook section titled "Do the fun stuff with the VAE". If you want to train your own VAE, run the cells in the section titled "Train VAE on Lego faces".
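If you'd rather poke at the pretrained model outside the notebook, the flow looks roughly like the sketch below. This is only a sketch: the constructor arguments and the weight filename under `trained_model/` are assumptions, so defer to the notebook cells for the exact calls.

```python
# Sketch only: the constructor args and the weights filename are assumptions,
# not the repo's exact API -- see VAE_colab.ipynb for the real calls.
from ml.variational_autoencoder import VariationalAutoencoder

vae = VariationalAutoencoder(
    input_dim=(128, 128, 3),  # assumed to match the 128x128 training images
    z_dim=200,                # assumed latent dimension
)
vae.load_weights("trained_model/weights.h5")  # path and filename assumed
```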
The following table lays out directories/files and their purposes:
| directory/file | description |
| --- | --- |
| `dataset_scripts/` | Scripts to pull and process dataset images |
| `ml/utils.py` | Utilities for loading data and for making plots |
| `ml/variational_autoencoder.py` | Defines the VAE model (sketched below) |
| `trained_model/` | Pretrained model params and weights |
| `VAE_colab.ipynb` | Notebook to train and evaluate models |
| `dataset.zip` | Zipped directory of training images |
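For orientation, the sketch below shows the general shape of a convolutional VAE like the one `ml/variational_autoencoder.py` defines. It is illustrative only: the filter counts, latent size, and loss weighting are assumptions rather than the repo's actual architecture, and it's written against `tf.keras` (TF 2.x), where `model.add_loss` accepts a symbolic tensor.

```python
import tensorflow as tf
from tensorflow.keras import layers

LATENT_DIM = 200  # assumed latent size

def sampling(args):
    """Reparameterization trick: z = mu + sigma * epsilon, epsilon ~ N(0, I)."""
    mu, log_var = args
    epsilon = tf.random.normal(tf.shape(mu))
    return mu + tf.exp(0.5 * log_var) * epsilon

# Encoder: 128x128x3 image -> (mu, log_var) -> sampled latent vector z
encoder_input = layers.Input(shape=(128, 128, 3))
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(encoder_input)
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Flatten()(x)
mu = layers.Dense(LATENT_DIM)(x)
log_var = layers.Dense(LATENT_DIM)(x)
z = layers.Lambda(sampling)([mu, log_var])
encoder = tf.keras.Model(encoder_input, z, name="encoder")

# Decoder: latent vector -> reconstructed 128x128x3 image
decoder_input = layers.Input(shape=(LATENT_DIM,))
x = layers.Dense(32 * 32 * 64, activation="relu")(decoder_input)
x = layers.Reshape((32, 32, 64))(x)
x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
decoder_output = layers.Conv2DTranspose(3, 3, padding="same", activation="sigmoid")(x)
decoder = tf.keras.Model(decoder_input, decoder_output, name="decoder")

# End-to-end model: reconstruction loss (MSE) plus the KL divergence term
vae = tf.keras.Model(encoder_input, decoder(z), name="vae")
kl_loss = -0.5 * tf.reduce_mean(
    tf.reduce_sum(1 + log_var - tf.square(mu) - tf.exp(log_var), axis=1)
)
vae.add_loss(kl_loss)
vae.compile(optimizer="adam", loss="mse")
```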
The following is a plot of randomly selected images and their VAE reconstructions. The top images are the inputs and the bottom images are the VAE's reconstructions:
The following are plots of decoded intermediate vectors between pairs of face encodings, i.e. face morphs:
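The morphs are just linear interpolation in the latent space: encode two faces, walk along the line between their latent vectors, and decode each point. A sketch of the idea, assuming `encoder` and `decoder` models like the ones sketched above (the plotting helpers in `ml/utils.py` may implement this differently):

```python
import numpy as np

def interpolate_faces(encoder, decoder, face_a, face_b, steps=10):
    """Decode evenly spaced points on the line between two face encodings."""
    z_a = encoder.predict(face_a[np.newaxis])[0]  # latent vector for face A
    z_b = encoder.predict(face_b[np.newaxis])[0]  # latent vector for face B
    morphs = []
    for t in np.linspace(0.0, 1.0, steps):
        z = (1.0 - t) * z_a + t * z_b  # linear interpolation in latent space
        morphs.append(decoder.predict(z[np.newaxis])[0])
    return np.stack(morphs)  # (steps, 128, 128, 3) array of morph frames
```

Interpolating between the encoder's mean outputs (rather than sampled vectors) gives deterministic morphs.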
The training data (approximately 2,600 128x128 JPEG images of Lego minifig faces) is contained in `dataset.zip`. These images were pulled from Bricklink. The scripts used to pull and process the data are contained in the `dataset_scripts/` directory. I manually removed images that were low quality or that did not contain a clear face.
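For reference, loading the unzipped images into a training array can be as simple as the sketch below. The flat `dataset/` directory layout is an assumption here; `ml/utils.py` contains the loading utilities the notebook actually uses.

```python
from pathlib import Path

import numpy as np
from PIL import Image

def load_faces(data_dir="dataset"):
    """Load every JPEG under data_dir as float32 pixels scaled to [0, 1]."""
    images = []
    for path in sorted(Path(data_dir).glob("*.jpg")):
        img = Image.open(path).convert("RGB").resize((128, 128))
        images.append(np.asarray(img, dtype=np.float32) / 255.0)
    return np.stack(images)  # shape: (num_images, 128, 128, 3)
```

With an array like this, training the model sketched above is just `vae.fit(faces, faces, epochs=200, batch_size=32)` (hyperparameters assumed).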