A Report on unsupervised discovery of interpretable feature directions in pre-trained GANs. Made by Adrish Dey using Weights & Biases

Since its inception in 2014, the Generative Adversarial Network (GAN) has been at the forefront of generative modelling research. GANs, with their millions of parameters, have found extensive usage in modeling probability densities of various complex data formats and generating ultra-realistic samples. These include photorealistic images, near-human speech synthesis, music generation, photorealistic video synthesis, etc. This versatility, along with their likelihood-free density estimation framework, has led GANs to be applied to various challenging problems. One such area of interest involves modeling densities with interpretable conditionals, in order to generate samples having certain desired features -- for example, controlling the intensity of features in images of human faces, like smile, color, orientation, gender, facial structure, etc.

Beyond the various inherent difficulties of training GANs, this problem poses an additional challenge: the lack of a proper metric for atomically labeling the intensities of sample features, which rules out the trivial solution of training the GAN under the supervision of a conditional. In simpler terms, it is virtually impossible to have a dataset with a measure of the amount of smile in a headshot, making it impossible to perform the training in a supervised setting.

This work studies a novel approach for discovering GAN controls in the Gaussian prior in an unsupervised way. More importantly, the method doesn't require any retraining of the GAN, which makes it convenient for generating controlled samples from pretrained state-of-the-art GAN architectures like BigGAN, StyleGAN, and StyleGAN2.

It is no surprise that real-world data arises from remarkably complex probability distributions. This complexity renders exact likelihood-based density estimation impractical. More formally, the likelihood integral $\int P(X | z) P(z) dz$ of the Bayesian framework for posterior estimation becomes intractable to evaluate.

One solution to this problem is to avoid exact evaluation of the integral and approximate the true posterior with a simpler distribution $Q(z)$, optimized through a variational lower bound on the likelihood. For example, VAEs by Kingma et al. use a Gaussian $\mathcal{N}(\mu, \Sigma)$ as the variational distribution. However, this approximation doesn't capture the intricacies of the likelihood distribution properly, resulting in poor-quality samples.

GANs, with their novel game-theoretic framework, sidestep the likelihood evaluation step by introducing a neural network between the prior and the posterior. First proposed by Goodfellow et al. in the seminal paper "Generative Adversarial Networks", GANs model the density estimation process as a zero-sum game between two competing players. One of the players, the generator $G$, is modelled as a parametric function $G_\theta$ mapping the prior $P(z)$ (commonly an isotropic Gaussian $\mathcal{N}(0, I)$) to the posterior space $P(X | z)$. The discriminator $D$, parameterized by $\phi$, tries to classify samples as real or synthetic. The players take turns improving their results at each optimization step, until a Nash equilibrium is attained. The formal objective is defined as follows.

$\underset{G}{\min}\; \underset{D}{\max}\; V(G, D; \theta, \phi)$
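Written out for the original formulation by Goodfellow et al., with generator parameters $\theta$ and discriminator parameters $\phi$, the value function expands to:

$V(G, D; \theta, \phi) = \mathbb{E}_{x \sim P_{data}}\left[\log D_\phi(x)\right] + \mathbb{E}_{z \sim P(z)}\left[\log\left(1 - D_\phi(G_\theta(z))\right)\right]$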

Among the various canonical forms of matrix representation, the eigenvector-eigenvalue form finds the most usage in the machine learning literature. This is evident from its prevalence in recommendation systems, especially in dimensionality reduction. Eigenvalues have also found extensive usage in invertible neural networks, where the singular values of the weight matrices play a crucial role in preserving the isometry of layers.

*Given a square matrix $\mathbf{M}$, considered as a linear map from some vector space $V$ to itself, i.e., $\mathbf{M}: V \rightarrow V$, an eigenvector $\mathbf{v}$ of $\mathbf{M}$ is defined as a vector whose linear transformation by $\mathbf{M}$ results only in a scale transformation by some constant $\lambda$, i.e., $\mathbf{M}\mathbf{v} = \lambda \mathbf{v}$, where $\lambda$ is known as the eigenvalue.
For the set of all $r$ eigenvectors and eigenvalues,
$\mathbf{M}\mathbf{Q} = \mathbf{Q} \Lambda$, where $r$ is the rank of the matrix $\mathbf{M}$, $\mathbf{Q}$ is the matrix of eigenvectors whose $i^{th}$ column is the $i^{th}$ eigenvector in $V$, and $\Lambda$ is a diagonal matrix whose $i^{th}$ entry $\lambda_i$ is the eigenvalue of the corresponding $i^{th}$ eigenvector.*

Therefore, $\mathbf{M} = \mathbf{Q}\Lambda\mathbf{Q}^{-1}$ is defined as the eigen decomposition of matrix $\mathbf{M}$.
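The definition above can be checked numerically in a few lines. A minimal NumPy sketch (matrix and seed are arbitrary choices, not from the paper):

```python
import numpy as np

# Build a random square matrix M and eigendecompose it: M = Q Λ Q^{-1}.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))

eigvals, Q = np.linalg.eig(M)   # columns of Q are eigenvectors
Lam = np.diag(eigvals)          # Λ as a diagonal matrix

# Verify M v = λ v for the first eigenpair, and the full reconstruction.
v, lam = Q[:, 0], eigvals[0]
assert np.allclose(M @ v, lam * v)
assert np.allclose(Q @ Lam @ np.linalg.inv(Q), M)
```

Note that a real matrix may have complex eigenpairs, which is why `np.linalg.eig` returns complex arrays here.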

Despite the intuitive nature and widespread popularity of eigendecomposition, in practice a generalized version, called singular value decomposition (SVD), is used. SVD removes the square-matrix limitation of eigendecomposition by considering two vector spaces, the row space ($R$) and the column space ($C$), instead of one. The decomposition provides orthonormal basis vectors (analogous to eigenvectors) of the row space and column space, along with the accompanying singular values (analogous to eigenvalues). This orthonormality reduces the matrix inverse to a transpose, making the decomposition computationally convenient.

*Let $\mathbf{M}$ be a linear map from a vector space $R$ to a vector space $C$, i.e., $\mathbf{M}: R \rightarrow C$. Let $\mathbf{V}$ be the matrix of singular vectors in $R$, and $\mathbf{U}$ be the matrix of singular vectors in $C$. Analogously to the previous definition, $\mathbf{MV} = \mathbf{U}\Sigma$, where $\Sigma$ is a diagonal matrix of singular values. Thus the singular value decomposition is defined as $\mathbf{M} = \mathbf{U}\Sigma\mathbf{V}^{-1}$. Since $\mathbf{V}$ is an orthogonal matrix, $\mathbf{V}^{-1} = \mathbf{V}^T$, and the final decomposition is: $\mathbf{M} = \mathbf{U}\Sigma\mathbf{V}^T$*
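As with eigendecomposition, the SVD identities are easy to verify numerically. A small sketch with an arbitrary rectangular matrix:

```python
import numpy as np

# SVD of a rectangular matrix M: R (dim 5) -> C (dim 3), so M is 3x5.
rng = np.random.default_rng(1)
M = rng.standard_normal((3, 5))

U, s, Vt = np.linalg.svd(M, full_matrices=False)  # thin SVD
Sigma = np.diag(s)

# U and V have orthonormal columns, so no explicit inverse is needed.
assert np.allclose(U.T @ U, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(3))
assert np.allclose(U @ Sigma @ Vt, M)
```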

The methods discussed above decompose a matrix into linearly independent vectors. Hence, for a full-rank matrix, the eigenvectors/singular vectors form a basis of the vector space(s). One important observation is that, since these vectors are normalized and form a basis, any vector in the space can be represented as a linear combination of them.

**Note**: The top-$k$ column-space singular vectors (i.e., the $k$ singular vectors associated with the $k$ largest singular values) are called the $k$ principal components. The process of approximating a matrix by keeping only its dominant singular vectors is known as *Principal Component Analysis (PCA)*.
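PCA, as used throughout the rest of this report, amounts to centering the data and projecting onto the top-$k$ right singular vectors. A minimal sketch on toy data (shapes and $k$ are arbitrary):

```python
import numpy as np

# Toy PCA: project mean-centered data onto its top-k principal components.
rng = np.random.default_rng(2)
Y = rng.standard_normal((200, 10))   # 200 samples, 10 features

mu = Y.mean(axis=0)
_, s, Vt = np.linalg.svd(Y - mu, full_matrices=False)

k = 3
V = Vt[:k].T                         # top-k principal directions (10 x k)
X = (Y - mu) @ V                     # PCA coordinates of each sample

# Reconstructing from only k components approximates the original data.
Y_approx = mu + X @ V.T
assert Y_approx.shape == Y.shape
```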

Interpretable GAN control discovery, aka latent space disentanglement in GANs, is a heavily studied problem. The most notable prior work, InfoGAN by Chen et al., maximized the mutual information between the posterior $Y \sim P(X|z, C)$ and a conditional vector $C$ passed along with the prior. However, this required retraining the GAN with an additional mutual-information term alongside the adversarial loss.

In this work, instead of conditional training, the authors explore a novel approach for finding the "direction" of each principal feature in the prior space of the GAN. This means that increasing or decreasing the magnitude of the prior $z$ along such a direction increases or decreases the intensity of the corresponding feature in the generated sample. More formally,

*Given a directional vector $\mathbb{v}_i$ for the $i^{th}$ feature, with intensity $x_i$, the new prior $z$ is re-defined as:*

*$z^\prime = z + \sum\limits_{i} x_i \cdot \mathbb{v}_i$*

*This can be rewritten as:*

*$z^\prime = z + \mathbf{V}\mathbf{x}$*

*where $\mathbf{V}$ is the matrix of directional vectors $v_i$, and $\mathbf{x}$ is a vector of intensities $x_i$ corresponding to each basis vector.*
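The edit $z^\prime = z + \mathbf{V}\mathbf{x}$ is just a matrix-vector product added to the latent. A minimal sketch, with a hypothetical orthonormal direction matrix `V` standing in for the discovered directions:

```python
import numpy as np

# Sketch of the latent edit z' = z + V x. V is a hypothetical matrix of
# unit-norm direction vectors (one per column); x holds the edit intensities.
rng = np.random.default_rng(3)
latent_dim, n_directions = 512, 4

z = rng.standard_normal(latent_dim)          # original prior sample
V = np.linalg.qr(rng.standard_normal((latent_dim, n_directions)))[0]
x = np.array([2.0, 0.0, -1.5, 0.0])          # intensity per direction

z_prime = z + V @ x                          # edited latent, same shape as z
```

Setting a single entry of `x` nonzero moves the latent along exactly one discovered direction, which is how the per-feature interpolations later in the report are produced.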

One of the primary observations presented in this work is the correlation of individual feature intensities with the principal components of the feature tensors in the early layers of the GAN. In other words, the authors observed that decomposing the feature tensors disentangles certain features along the column-space singular vectors, where dominant singular vectors encode dominant features.

This raises an obvious question: "How do principal components in the feature space assist in finding principal directions in the prior space $z$?" For this purpose, the authors study two kinds of architectures: one with an isotropic prior (e.g., BigGAN) and the other with learned feature vectors as the prior (e.g., StyleGAN, StyleGAN2).

This process is trivial in StyleGAN-based architectures, where the encoded feature tensor $w = M(z, c) |_{z\sim P(z)}$ is used as the prior of the GAN, with $M$ the feature encoder (mapping network) and $c$ the class ID of the sample to generate.

Next, the principal components $\mathbf{V}$, obtained by decomposing a batch of sampled $w$ vectors, are used for interpolating the feature intensities $\mathbf{x}$. Formally,

$w^\prime = w + \mathbf{V}\mathbf{x}$

$y^\prime = G(w^\prime; \theta)$
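The StyleGAN-side pipeline can be sketched end to end in a few lines. Here `mapping` is a hypothetical stand-in for StyleGAN's mapping network $M$ (the real network would come from a pretrained checkpoint), and the final image would be produced by passing `w_prime` through the synthesis network $G$:

```python
import numpy as np

# Sketch: PCA on sampled w vectors, then an edit w' = w + V x.
# `mapping` is a hypothetical placeholder for StyleGAN's mapping network M.
rng = np.random.default_rng(4)
latent_dim, n_samples, k = 128, 500, 10

def mapping(z):
    return np.tanh(z) * 2.0          # stand-in for w = M(z)

w_samples = mapping(rng.standard_normal((n_samples, latent_dim)))

mu = w_samples.mean(axis=0)
_, _, Vt = np.linalg.svd(w_samples - mu, full_matrices=False)
V = Vt[:k].T                         # top-k principal directions in w-space

# Edit a new sample along the first principal direction.
w = mapping(rng.standard_normal(latent_dim))
x = np.zeros(k)
x[0] = 3.0                           # intensity along direction 0
w_prime = w + V @ x                  # the image would then be G(w')
```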

The following visualization is created by interpolating the magnitude of $\mathbf{x}$ along a desired direction vector in $\mathbf{V}$.

For isotropic priors, discovering the principal directions is a bit more complicated than for learned feature-based priors. This is expected, since the isotropic prior $z$ is not a learned latent space and encodes no information about the features of the data samples.

To work around this challenge, the authors use a projection method to transfer the principal components at the $i^{th}$ layer back to the prior space. More formally, $N$ samples $z_{1:N}$ are drawn from the prior $P(z)$. The feature vectors $y_{1:N}$ at the $i^{th}$ layer are computed as $y_j = \hat{G}_i(z_j), \forall j \in [1:N]$, where $\hat{G}_i$ denotes the composition of the first $i$ layers of $G$, i.e., $\hat{G}_i(z) = G_i(\hat{G}_{i-1}(z))$. Computing the principal components of the tensor $y_{1:N}$ yields a low-rank basis matrix $\mathbf{V}$ and mean $\mu$, which are then used to obtain the PCA coordinates of each $y_j$ as $x_j = \mathbf{V}^T(y_j - \mu), \forall j \in [1:N]$, where $\mu$ is the mean of the feature vectors $y_{1:N}$.

The PCA coordinates $x_{1:N}$ obtained this way are then transferred to the prior space by linear regression. In other words, the matrix $\mathbf{U}$, whose $k^{th}$ column $u_k$ denotes the $k^{th}$ basis vector in $z$-space, is found by solving:

$\mathbf{U} = \underset{\mathbf{U}}{\mathrm{argmin}} \sum\limits_{j=1}^{N} \| \mathbf{U}x_j - z_j \|^2$

Now the interpolation on $z$ can be defined as $z^\prime = z + \mathbf{Ux}$, where $x_k$ denotes the intensity of the $k^{th}$ dominant feature.
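The whole isotropic-prior procedure can be sketched with NumPy. Here `early_layers` is a hypothetical stand-in for the first $i$ layers $\hat{G}_i$ of a real generator, and the regression is solved with ordinary least squares; all shapes and seeds are illustrative assumptions:

```python
import numpy as np

# Sketch: PCA on layer-i activations, then transfer of the basis to z-space
# via the regression U = argmin_U Σ_j ||U x_j - z_j||².
# `early_layers` is a hypothetical placeholder for the first i layers of G.
rng = np.random.default_rng(5)
latent_dim, feat_dim, N, k = 64, 256, 500, 8

W = rng.standard_normal((feat_dim, latent_dim))   # stand-in layer weights
def early_layers(z):
    return np.maximum(W @ z, 0.0)                 # placeholder y = Ĝ_i(z)

Z = rng.standard_normal((N, latent_dim))          # z_{1:N} ~ P(z)
Y = np.stack([early_layers(z) for z in Z])        # layer-i features y_{1:N}

mu = Y.mean(axis=0)
_, _, Vt = np.linalg.svd(Y - mu, full_matrices=False)
X = (Y - mu) @ Vt[:k].T                           # PCA coordinates x_j

# Least squares: solve X B = Z, so each z_j ≈ Bᵀ x_j; U = Bᵀ.
B, *_ = np.linalg.lstsq(X, Z, rcond=None)
U = B.T                                           # columns u_k live in z-space

# Edit: move a new z along the most dominant discovered direction.
z = rng.standard_normal(latent_dim)
x = np.zeros(k)
x[0] = 2.0
z_prime = z + U @ x
```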

The vector representing a feature $j$ (for example rotation, zoom, or background) is chosen by trial and error: interpolating the directional magnitude scalar $\mathbf{x}_j$ associated with the directional vector $\mathbf{v}_j$ in $\mathbf{V}$ (for StyleGAN2) or $\mathbf{u}_j$ in $\mathbf{U}$ (for BigGAN). The limits for the edits are chosen by the same trial-and-error method. Let $E(\mathbf{v}_j, \mathbf{R})$ denote an edit operation along direction $\mathbf{v}_j$ (or $\mathbf{u}_j$) by a factor $\mathbf{x}_j$, such that $\mathbf{x}_j$ falls in the range $\mathbf{R}$. The edits for each property for BigGAN and StyleGAN, as found by trial and error, are shown in the *bottom-left* of the figure below.

*Irish Setter Interpolation - BigGAN*

*Man Face Interpolation - StyleGAN*