

Introduction

In this tutorial, we will look at one of the amazing applications of GANs: generating unique architectural designs. You don't need any architectural skills or practice to do this, only a working knowledge of neural networks and how to train them. We will go through the code and workflow for generating unique architectures using GANs as described in this GitHub project.

Generative Adversarial Networks (GANs) were introduced in 2014 by Goodfellow et al. and were considered a breakthrough in the field of neural networks and generative models. Since then, the network has grown and adapted into various forms, with state-of-the-art applications in areas such as image and video generation.

Motivation

GANs are a comparatively new class of neural networks compared to those commonly used in Computer Vision tasks. Yann LeCun, one of the most prominent researchers in Deep Learning, described GANs as “the most interesting idea in the last ten years in Machine Learning”. Since the idea of GANs was first published, several independent researchers have developed their own versions of the network to perform extraordinary tasks, from editing photos to creating DeepFakes.

Goal

In this tutorial, we will walk through the process of using GANs to generate unique architectural designs, which could be used to inspire new forms of urban architecture. However, if you wish, you can use this tutorial to generate almost any image you want based on your preference.

Glossary

  • Adversarial:  Involving two people or two sides who oppose each other.
  • Computer Vision: An interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. 

Prerequisites

  1. Programming knowledge in Python.
  2. Basic knowledge of Deep Learning, Tensorflow, and CNNs (Convolutional Neural Networks).

How can GANs be used to generate unique architectural designs?

GANs consist of 2 types of networks, generative and adversarial.

1. Generative networks are responsible for generating new data. Hence, this network is also known as the generator.

2. Adversarial networks work against the generative network, i.e., they are responsible for classifying whether a generated image is real or fake. This network is also known as the discriminator.

As we just discussed, the GAN has two networks competing against each other: a generator and a discriminator. A generator aims to generate new instances of an object based on a random noise sent to it as input, while the discriminator aims to determine whether the generated instance is real or fake by comparing the real and generated images.


Understanding it better

To understand it better, let's suppose a customer is trying to use forged cash notes in a grocery store. It is up to the cashier to recognize whether the cash is real or fake. If the cashier can recognize the forged cash, the customer is caught and might even be jailed. However, if the customer can replicate the cash notes perfectly, there is less chance of being caught.


Here, consider the customer (generator) and the cashier (discriminator) to be competing against each other. In other words, the generator tries to mimic the real images so closely that the discriminator cannot differentiate between the real and the fake ones. Over time, the discriminator gets better at detecting fake images, while the generator learns from its mistakes and gets better at generating more realistic images.

Both the generator and discriminator networks use convolutional neural networks (CNNs) to produce and classify images. Based on the architecture of the neural networks used, GANs are classified into various categories. In this tutorial, we will be using a Deep Convolutional GAN (DCGAN), which uses a deep convolutional architecture in both of its networks.

Creating a DCGAN

DCGANs are essentially an improved version of the regular GAN. In this section, we will focus on the main elements of our model for generating unique architectures:

  • The generator (G) takes in a random noise vector (z) as input and generates an image.
  • The generated image is fed into the discriminator (D), which compares the training set (real images) with our generated image.
  • Based on its predictions, the discriminator outputs a number between 0 (fake) and 1 (real). Here, the generator has no idea of what the real image data looks like, and learns to adjust its output based on the feedback of the discriminator.

All the steps that we are going to discuss hereafter are available as a notebook here.

Step 1: Getting the data

For the training data, we are using data from wikiart.org. Download the dataset here and save it into a folder named “data”. However, you can choose to use any data you prefer. We will then resize all the images to 128×128 pixels for training.

# Imports assumed for this and the following snippets
# (helper refers to the helper.py utility module included in the project repository)
import os
from glob import glob
import cv2
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import helper

# Define the directory with real image data
data_dir = './data/'                 # original data
resized_data_dir = "./resized_data"  # folder for saving resized data

# Resize images into 128x128
preprocess = True  # set to False if no resizing is needed
if preprocess == True:
    # Create the resized folder if it does not exist
    if not os.path.exists(resized_data_dir):
        os.mkdir(resized_data_dir)
    for each in os.listdir(data_dir):
        # Read, resize, and save each image
        image = cv2.imread(os.path.join(data_dir, each))
        image = cv2.resize(image, (128, 128))
        cv2.imwrite(os.path.join(resized_data_dir, each), image)

# Explore a few of the images
show_images = 5
data_images = helper.get_batch(glob(os.path.join(resized_data_dir, '*.jpg'))[:show_images], 64, 64, 'RGB')
plt.imshow(helper.images_square_grid(data_images, 'RGB'))

Figure: Sample images from the dataset

Step 2: Inputs for the model

The first step is to create the input placeholders: real_inputs, i.e., the real image dataset for the discriminator, and z_inputs, the random noise vector for the generator.

def gan_model_inputs(real_dim, z_dim):
    """
    Creates the inputs for the model.

    Arguments:
    ----------
    :param real_dim: tuple containing width, height and channels
    :param z_dim: The dimension of Z (the noise vector)
    ----------
    Returns:
    Tuple of (tensor of real input images, tensor of z (noise) data,
              generator learning rate, discriminator learning rate)
    """
    real_inputs = tf.placeholder(tf.float32, (None, *real_dim), name='real_inputs')
    z_inputs = tf.placeholder(tf.float32, (None, z_dim), name="z_inputs")
    generator_learning_rate = tf.placeholder(tf.float32, name="generator_learning_rate")
    discriminator_learning_rate = tf.placeholder(tf.float32, name="discriminator_learning_rate")
    return real_inputs, z_inputs, generator_learning_rate, discriminator_learning_rate

Step 3: The model architecture – Generator

A generator takes in a random noise vector (z) as input and outputs a fake image. We are using a de-convolutional neural network, whose architecture is essentially the reverse of a conventional convolutional neural network. The idea is that at every layer of the network the number of filters is halved while the spatial size of the feature map is doubled, which finally results in a full-sized generated image.

Figure: DCGAN generator architecture

As shown in the figure, we take a random noise vector (z) of size 100 and pass it through a series of transposed convolutional layers that finally output a 128×128 image. For the Leaky ReLU activation functions, we have used 0.3 and 0.2 as the alpha values.
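
As a quick sanity check, the following minimal sketch (assuming TensorFlow 1.x, like the rest of this tutorial) shows that a stride-2 transposed convolution with "SAME" padding doubles the spatial size of a feature map, which is exactly how the generator grows an 8×8 tensor into a 128×128 image. The dummy placeholder here is only for illustration and is not part of the project code.

import tensorflow as tf

# Dummy 8x8x1024 feature map, mirroring the generator's first reshaped layer
dummy = tf.placeholder(tf.float32, [None, 8, 8, 1024])

# One stride-2 transposed convolution, the same pattern as trans_conv1 below
up = tf.layers.conv2d_transpose(dummy, filters=512, kernel_size=[5, 5],
                                strides=[2, 2], padding="SAME")

print(up.get_shape().as_list())  # [None, 16, 16, 512] -- spatial size doubled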

def generator(z, output_channel_dim, is_train=True):
    '''
    Builds the generator network.

    Arguments
    ---------
    z : Input noise tensor for the generator
    output_channel_dim : Number of channels in the generator output
    is_train : Whether the network is being trained (controls variable reuse
               and batch normalization)

    Note: alpha (the leak parameter for Leaky ReLU) is defined globally in the
    hyperparameter section below.

    Returns
    -------
    out : The generated image tensor (tanh output, values in [-1, 1])
    '''
    with tf.variable_scope("generator", reuse=not is_train):
        # First fully-connected layer --> 8x8x1024
        fc1 = tf.layers.dense(z, 8*8*1024)
        # Reshape the layer
        fc1 = tf.reshape(fc1, (-1, 8, 8, 1024))
        # Leaky ReLU activation
        fc1 = tf.nn.leaky_relu(fc1, alpha=alpha)

        # Transposed conv 1 --> BatchNorm --> LeakyReLU
        # 8x8x1024 --> 16x16x512
        trans_conv1 = tf.layers.conv2d_transpose(inputs=fc1,
                                                 filters=512,
                                                 kernel_size=[5, 5],
                                                 strides=[2, 2],
                                                 padding="SAME",
                                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                                 name="trans_conv1")
        batch_trans_conv1 = tf.layers.batch_normalization(inputs=trans_conv1, training=is_train, epsilon=1e-5, name="batch_trans_conv1")
        trans_conv1_out = tf.nn.leaky_relu(batch_trans_conv1, alpha=alpha, name="trans_conv1_out")

        # Transposed conv 2 --> BatchNorm --> LeakyReLU
        # 16x16x512 --> 32x32x256
        trans_conv2 = tf.layers.conv2d_transpose(inputs=trans_conv1_out,
                                                 filters=256,
                                                 kernel_size=[5, 5],
                                                 strides=[2, 2],
                                                 padding="SAME",
                                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                                 name="trans_conv2")
        batch_trans_conv2 = tf.layers.batch_normalization(inputs=trans_conv2, training=is_train, epsilon=1e-5, name="batch_trans_conv2")
        trans_conv2_out = tf.nn.leaky_relu(batch_trans_conv2, alpha=alpha, name="trans_conv2_out")

        # Transposed conv 3 --> BatchNorm --> LeakyReLU
        # 32x32x256 --> 64x64x128
        trans_conv3 = tf.layers.conv2d_transpose(inputs=trans_conv2_out,
                                                 filters=128,
                                                 kernel_size=[5, 5],
                                                 strides=[2, 2],
                                                 padding="SAME",
                                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                                 name="trans_conv3")
        batch_trans_conv3 = tf.layers.batch_normalization(inputs=trans_conv3, training=is_train, epsilon=1e-5, name="batch_trans_conv3")
        trans_conv3_out = tf.nn.leaky_relu(batch_trans_conv3, alpha=alpha, name="trans_conv3_out")

        # Transposed conv 4 --> BatchNorm --> LeakyReLU
        # 64x64x128 --> 128x128x64
        trans_conv4 = tf.layers.conv2d_transpose(inputs=trans_conv3_out,
                                                 filters=64,
                                                 kernel_size=[5, 5],
                                                 strides=[2, 2],
                                                 padding="SAME",
                                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                                 name="trans_conv4")
        batch_trans_conv4 = tf.layers.batch_normalization(inputs=trans_conv4, training=is_train, epsilon=1e-5, name="batch_trans_conv4")
        trans_conv4_out = tf.nn.leaky_relu(batch_trans_conv4, alpha=alpha, name="trans_conv4_out")

        # Transposed conv 5 --> tanh
        # 128x128x64 --> 128x128x3
        logits = tf.layers.conv2d_transpose(inputs=trans_conv4_out,
                                            filters=3,
                                            kernel_size=[5, 5],
                                            strides=[1, 1],
                                            padding="SAME",
                                            kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                            name="logits")
        out = tf.tanh(logits, name="out")
        return out

Step 4: The model architecture – Discriminator

A discriminator takes a real or generated image as input and outputs a score based on its prediction. The network is a CNN whose task is to classify whether an image comes from the training dataset (real) or from the generator (fake).

  • Inputs: Image with three color channels and 128×128 pixels in size.
  • Outputs: Binary classification, to predict if the image is real (1) or fake (0).

Figure: DCGAN discriminator architecture

As shown in the figure, the discriminator is a feed-forward convolutional network that takes an image as input and produces a sigmoid probability between 0 and 1, evaluating whether the given instance is real or generated.

def discriminator(x, is_reuse=False, alpha=0.2):
    '''
    Builds the discriminator network.

    Arguments
    ---------
    x : Input image tensor for the discriminator
    is_reuse : Reuse the variables with tf.variable_scope
    alpha : leak parameter for Leaky ReLU

    Returns
    -------
    out, logits : The sigmoid output and the raw logits
    '''
    with tf.variable_scope("discriminator", reuse=is_reuse):
        # Input layer 128x128x3 --> 64x64x64
        # Conv --> BatchNorm --> LeakyReLU
        conv1 = tf.layers.conv2d(inputs=x,
                                 filters=64,
                                 kernel_size=[5, 5],
                                 strides=[2, 2],
                                 padding="SAME",
                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                 name='conv1')
        batch_norm1 = tf.layers.batch_normalization(conv1,
                                                    training=True,
                                                    epsilon=1e-5,
                                                    name='batch_norm1')
        conv1_out = tf.nn.leaky_relu(batch_norm1, alpha=alpha, name="conv1_out")

        # 64x64x64 --> 32x32x128
        # Conv --> BatchNorm --> LeakyReLU
        conv2 = tf.layers.conv2d(inputs=conv1_out,
                                 filters=128,
                                 kernel_size=[5, 5],
                                 strides=[2, 2],
                                 padding="SAME",
                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                 name='conv2')
        batch_norm2 = tf.layers.batch_normalization(conv2,
                                                    training=True,
                                                    epsilon=1e-5,
                                                    name='batch_norm2')
        conv2_out = tf.nn.leaky_relu(batch_norm2, alpha=alpha, name="conv2_out")

        # 32x32x128 --> 16x16x256
        # Conv --> BatchNorm --> LeakyReLU
        conv3 = tf.layers.conv2d(inputs=conv2_out,
                                 filters=256,
                                 kernel_size=[5, 5],
                                 strides=[2, 2],
                                 padding="SAME",
                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                 name='conv3')
        batch_norm3 = tf.layers.batch_normalization(conv3,
                                                    training=True,
                                                    epsilon=1e-5,
                                                    name='batch_norm3')
        conv3_out = tf.nn.leaky_relu(batch_norm3, alpha=alpha, name="conv3_out")

        # 16x16x256 --> 16x16x512
        # Conv --> BatchNorm --> LeakyReLU
        conv4 = tf.layers.conv2d(inputs=conv3_out,
                                 filters=512,
                                 kernel_size=[5, 5],
                                 strides=[1, 1],
                                 padding="SAME",
                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                 name='conv4')
        batch_norm4 = tf.layers.batch_normalization(conv4,
                                                    training=True,
                                                    epsilon=1e-5,
                                                    name='batch_norm4')
        conv4_out = tf.nn.leaky_relu(batch_norm4, alpha=alpha, name="conv4_out")

        # 16x16x512 --> 8x8x1024
        # Conv --> BatchNorm --> LeakyReLU
        conv5 = tf.layers.conv2d(inputs=conv4_out,
                                 filters=1024,
                                 kernel_size=[5, 5],
                                 strides=[2, 2],
                                 padding="SAME",
                                 kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                                 name='conv5')
        batch_norm5 = tf.layers.batch_normalization(conv5,
                                                    training=True,
                                                    epsilon=1e-5,
                                                    name='batch_norm5')
        conv5_out = tf.nn.leaky_relu(batch_norm5, alpha=alpha, name="conv5_out")

        # Flatten the feature map
        flatten = tf.reshape(conv5_out, (-1, 8*8*1024))
        # Logits (single unit, no activation)
        logits = tf.layers.dense(inputs=flatten,
                                 units=1,
                                 activation=None)
        out = tf.sigmoid(logits)
        return out, logits

Note: For both the generator and discriminator networks, we are using tf.variable_scope so that each network's variables are created under its own scope and can be shared and reused on subsequent calls.
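
To make the variable sharing concrete, here is a minimal, standalone sketch (assuming TensorFlow 1.x). The toy_net function and its variables are hypothetical and only illustrate the reuse mechanism; they are not part of the tutorial code.

import tensorflow as tf

def toy_net(x, reuse=False):
    with tf.variable_scope("toy_net", reuse=reuse):
        # On the first call this creates the variable "toy_net/w";
        # with reuse=True it returns the variable created earlier.
        w = tf.get_variable("w", shape=[3, 1],
                            initializer=tf.truncated_normal_initializer(stddev=0.02))
        return tf.matmul(x, w)

x1 = tf.placeholder(tf.float32, [None, 3])
x2 = tf.placeholder(tf.float32, [None, 3])

out_first = toy_net(x1)               # creates the weights
out_second = toy_net(x2, reuse=True)  # shares the same weights, just like the
                                      # second discriminator call in gan_model_loss below

print(len(tf.trainable_variables()))  # 1 -- only a single shared "toy_net/w" exists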

Step 5: Calculating discriminator and generator losses

The loss function of the DCGAN model contains two parts: the discriminator loss J(D) and the generator loss J(G).

Figure: GAN loss function

Being an adversarial network, ideally the sum of these two loss functions should ultimately be zero, i.e., J(G) = -J(D).
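
For reference, the original minimax objective from Goodfellow et al. (2014), which the losses implemented below approximate, can be written as:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

In practice, as in the code below, the generator is trained to maximize \log D(G(z)) rather than minimize \log(1 - D(G(z))), which provides stronger gradients early in training.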

The discriminator loss in itself is the sum of the loss for real and fake images:

d_loss = d_loss_real + d_loss_fake

where

d_loss_real is the loss when the discriminator predicts an image is fake when in fact it is a real image, and

d_loss_fake is the loss when the discriminator predicts an image is real when in fact it is a fake image.

The fake logits from the discriminator (fake_d_logits) are also fed into the generator loss, since the generator wants to learn how to fool the discriminator.

def gan_model_loss(input_real, input_z, output_channel_dim, alpha):
    """
    Get the loss for the discriminator and generator.

    Arguments:
    ---------
    :param input_real: Images from the real dataset
    :param input_z: Z (noise) input
    :param output_channel_dim: The number of channels in the output image
    :param alpha: leak parameter for Leaky ReLU
    ---------
    Returns:
    A tuple of (discriminator loss, generator loss)
    """
    # Build the generator network (defined above)
    g_model_output = generator(input_z, output_channel_dim)

    # Build the discriminator network (defined above)
    # For real inputs
    real_d_model, real_d_logits = discriminator(input_real, alpha=alpha)
    # For fake inputs (generated output from the generator model)
    fake_d_model, fake_d_logits = discriminator(g_model_output, is_reuse=True, alpha=alpha)

    # Calculate losses for each network
    d_loss_real = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits=real_d_logits,
                                                labels=tf.ones_like(real_d_model)))
    d_loss_fake = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits=fake_d_logits,
                                                labels=tf.zeros_like(fake_d_model)))
    # Discriminator loss is the sum of the real and fake losses
    d_loss = d_loss_real + d_loss_fake

    # Generator loss (the generator wants the discriminator to output "real")
    g_loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits=fake_d_logits,
                                                labels=tf.ones_like(fake_d_model)))
    return d_loss, g_loss

Step 6: Optimizing the model

After calculating the losses, we need to update the generator and discriminator separately.

To do this, we get the variables for each part using tf.trainable_variables(), which returns a list of all the trainable variables we have defined in our graph.

def gan_model_optimizers(d_loss, g_loss, disc_lr, gen_lr, beta1):
    """
    Get optimization operations.

    Arguments:
    ----------
    :param d_loss: Discriminator loss Tensor
    :param g_loss: Generator loss Tensor
    :param disc_lr: Placeholder for the discriminator learning rate
    :param gen_lr: Placeholder for the generator learning rate
    :param beta1: The exponential decay rate for the 1st moment in the optimizer
    ----------
    Returns:
    A tuple of (discriminator training operation, generator training operation)
    """
    # Get the trainable variables and split them into G and D parts
    train_vars = tf.trainable_variables()
    gen_vars = [var for var in train_vars if var.name.startswith("generator")]
    disc_vars = [var for var in train_vars if var.name.startswith("discriminator")]

    # Batch normalization update ops for the generator
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    gen_updates = [op for op in update_ops if op.name.startswith('generator')]

    # Optimizers
    with tf.control_dependencies(gen_updates):
        disc_train_opt = tf.train.AdamOptimizer(learning_rate=disc_lr, beta1=beta1).minimize(d_loss, var_list=disc_vars)
        gen_train_opt = tf.train.AdamOptimizer(learning_rate=gen_lr, beta1=beta1).minimize(g_loss, var_list=gen_vars)
    return disc_train_opt, gen_train_opt

Step 7: Training the model

We will now train the model with hyperparameters such as the number of epochs, batch size, latent vector dimension, learning rates, and exponential decay rate (beta1).
Moreover, we save the model every five epochs and save a generated image periodically during training (every five batches in the code below). Along with this, we also calculate and display the g_loss and d_loss.

def train_gan_model(epoch, batch_size, z_dim, learning_rate_D, learning_rate_G, beta1,
                    get_batches, data_shape, data_image_mode, alpha, from_checkpoint=False):
    """
    Train the GAN model.

    Arguments:
    ----------
    :param epoch: Number of epochs
    :param batch_size: Batch size
    :param z_dim: Z (noise vector) dimension
    :param learning_rate_D: Learning rate for the discriminator
    :param learning_rate_G: Learning rate for the generator
    :param beta1: The exponential decay rate for the 1st moment in the optimizer
    :param get_batches: Function to get batches
    :param data_shape: Shape of the data
    :param data_image_mode: The image mode to use for images ("RGB" or "L")
    :param alpha: leak parameter for Leaky ReLU
    :param from_checkpoint: Generate from a saved checkpoint instead of training
                            (added here as a keyword argument; the original notebook
                            defines this flag separately)
    ----------
    """
    # Create our input placeholders
    input_images, input_z, lr_G, lr_D = gan_model_inputs(data_shape[1:], z_dim)
    # Get the discriminator and generator losses
    d_loss, g_loss = gan_model_loss(input_images, input_z, data_shape[3], alpha)
    # Optimizers
    d_opt, g_opt = gan_model_optimizers(d_loss, g_loss, lr_D, lr_G, beta1)

    i = 0
    version = "firstTrain"
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # For saving the model
        saver = tf.train.Saver()
        num_epoch = 0
        print("Starting the model training...")

        # If generating from a saved checkpoint
        if from_checkpoint == True:
            saver.restore(sess, "./models/model.ckpt")
            # Save the generator output
            image_path = "generated_images/generated_fromckpt.PNG"
            generator_output(sess, 4, input_z, data_shape[3], data_image_mode, image_path)
        else:
            for epoch_i in range(epoch):
                num_epoch += 1
                print("Training model for epoch_", epoch_i)
                if num_epoch % 5 == 0:
                    # Save the model every 5 epochs
                    save_path = saver.save(sess, "./models/model.ckpt")
                    print("Model has been saved.")
                for batch_images in get_batches(batch_size):
                    # Random noise
                    batch_z = np.random.uniform(-1, 1, size=(batch_size, z_dim))
                    i += 1
                    # Run the optimizers
                    _ = sess.run(d_opt, feed_dict={input_images: batch_images, input_z: batch_z, lr_D: learning_rate_D})
                    _ = sess.run(g_opt, feed_dict={input_images: batch_images, input_z: batch_z, lr_G: learning_rate_G})
                    # Every 5 batches
                    if i % 5 == 0:
                        # Calculate the training losses
                        train_loss_d = d_loss.eval({input_z: batch_z, input_images: batch_images})
                        train_loss_g = g_loss.eval({input_z: batch_z})
                        # Path to save the generated image
                        image_name = str(i) + "_epoch_" + str(epoch_i) + ".jpg"
                        img_save_path = "./generated_images/"
                        # Create the folder if it does not exist
                        if not os.path.exists(img_save_path):
                            os.makedirs(img_save_path)
                        image_path = img_save_path + "/" + image_name
                        # Print the epoch number and losses
                        print("Epoch {}/{}...".format(epoch_i+1, epoch),
                              "Discriminator Loss: {:.4f}...".format(train_loss_d),
                              "Generator Loss: {:.4f}".format(train_loss_g))
                        # Save the generator output
                        generator_output(sess, 4, input_z, data_shape[3], data_image_mode, image_path)

Step 8: Generating images

The generator is a feed-forward neural network that takes in random noise and gradually transforms it into images of a certain size during training.

In other words, it learns to map from a latent space to a particular data distribution of images by training, while the discriminator classifies the instances produced by the generator as real or fake.

def generator_output(sess, n_images, input_z, output_channel_dim, image_mode, image_path):
    """
    Save output from the generator.

    Arguments:
    ----------
    :param sess: TensorFlow session
    :param n_images: Number of images to generate
    :param input_z: Input Z tensor (noise vector)
    :param output_channel_dim: The number of channels in the output image
    :param image_mode: The mode to use for images ("RGB" or "L")
    :param image_path: Path to save the generated image
    ----------
    """
    cmap = None if image_mode == 'RGB' else 'gray'
    z_dimension = input_z.get_shape().as_list()[-1]
    example_z = np.random.uniform(-1, 1, size=[n_images, z_dimension])
    # Run the generator (defined above) in inference mode (is_train=False)
    samples = sess.run(
        generator(input_z, output_channel_dim, False),
        feed_dict={input_z: example_z})
    # Arrange the generated samples into a square grid and save them
    images_grid = helper.images_square_grid(samples, image_mode)
    images_grid.save(image_path, 'JPEG')

Step 9: Setting the hyperparameters and running the model

Hyperparameters are essential to the learning process of the model, since they control factors such as the training duration, the batch size, and the learning rates fed to the model, based on which the model learns and decreases its loss.

# Size of the latent (noise) vector fed to the generator
z_dim = 100
# Learning rates
learning_rate_D = .00005
learning_rate_G = 2e-4
# Batch size
batch_size = 4
# Number of epochs
num_epochs = 500
# Leak parameter (alpha) and exponential decay rate (beta1)
alpha = 0.2
beta1 = 0.5

# Load the training data
training_dataset = helper.Dataset(glob(os.path.join(resized_data_dir, '*.jpg')))

# Train the model
with tf.Graph().as_default():
    train_gan_model(num_epochs, batch_size, z_dim, learning_rate_D, learning_rate_G, beta1,
                    training_dataset.get_batches, training_dataset.shape, training_dataset.image_mode, alpha)

Step 10: Plotting the generated images

Finally, we have plotted the generated images.

show_images = 5
# Plot the images from last epoch
data_images = helper.get_batch(glob(os.path.join("./generated_images/epoch_" + str(num_epochs-1) +"/", '*.jpg'))[:show_images], 64, 64, 'RGB')
plt.imshow(helper.images_square_grid(data_images, 'RGB'))

Learning Tools and Strategies

  1. The key to learning about neural networks effectively is to learn and visualize the whole architecture of the system. By doing this, we can easily understand how the data is being processed step-by-step.
  2. Also, it is a good practice to print or log important messages and errors to help with debugging.
  3. Like most neural networks, DCGANs are quite sensitive to hyperparameters. Therefore, it’s very important to tune them precisely as they can largely affect the model’s performance.

Reflective Analysis

This project was challenging as well as exhausting. Finding a proper image dataset for training took a lot of time initially, because the model needs considerable time to generate satisfactory results. Additionally, I learned a lot more about GANs in general and about the various architectures that can easily be modified according to the needs of the problem, which makes them so versatile to use and create. Moreover, doing projects like these once in a while helps in discovering the underlying mechanics behind complex architectures.

Conclusions and Future Directions

In conclusion, the results generated from the model were quite promising and have, to some extent, opened up yet another opportunity for applying GANs. Although the generated images of unique architectures aren't of high quality, these results prove that GANs can be quite helpful as a tool in the creative field. The results below came from training the model on a standard CPU for several hours; when trained on a high-end GPU/TPU, the results can be expected to improve considerably.

Figure: Training results

Citations

Also, the code for this project on Generating unique architectures using GANs is available on GitHub.

Finally, you might also be interested in this project on How to build an INR value predictor against 1 USD using Brain.js.
