I’ve written before about BigGAN, an image-generating neural net that Google trained recently. It generates its best images for each of the 1,000 different categories in the standard ImageNet dataset, from goldfish to planetarium to toilet tissue. And the images it produces are both beautifully textured and deeply weird. Some of the categories - scabbard, rocking chair, stopwatch - are delightfully aesthetic.
[scabbard, rocking chair, stopwatch]
Google has made the trained BigGAN model available to the research/art community, which is nice, since people have estimated that today it would take around $60k in cloud computing time to train one’s own.
But there’s more lurking in the BigGAN model besides the 1,000 ImageNet categories. The model thinks of each category as a big set of numbers that describes exactly how to smoosh and stretch and color random noise. Following one set of numbers will transform noise into a flower, while following another set will turn that same noise into a dog instead. But a set of numbers is also a position in space: latitude and longitude, for example, or x,y,z coordinates - in math terms, we call the set of numbers a vector. And in machine learning, the set of all positions (granted, positions in an approximately 100-dimensional space) that a model’s vectors can point to is called vector space.
So one set of numbers - the flower vector - points you to some location in vector space, and another set of numbers - the dog vector - points you to a different location.
[daisy, saluki dog]
But here is where it gets fun. The vectors are just numbers, which means you could, in theory, average them. What happens when you average together “saluki dog” and “daisy”? There’s no ImageNet category there, so what’s lurking in that spot in vector space, halfway between the two? Delightfully, dogflowers.
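The averaging itself really is that simple - here’s a minimal sketch in numpy, with made-up stand-in vectors (BigGAN’s real class vectors live inside the trained model; the names `daisy_vec`, `saluki_vec`, and `mix` are mine, not part of any BigGAN API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the model's learned class vectors.
daisy_vec = rng.normal(size=128)
saluki_vec = rng.normal(size=128)

# Averaging is just element-wise: the result is another 128-number
# vector, pointing to the spot halfway between the two categories.
dogflower_vec = (daisy_vec + saluki_vec) / 2

# More generally, you can slide between the two with a mixing weight t:
def mix(a, b, t):
    """Linear interpolation: t=0 gives a, t=1 gives b, t=0.5 the average."""
    return (1 - t) * a + t * b
```

Feed `dogflower_vec` (or any `mix(..., t)` in between) to the generator instead of a real category vector, and you get images from a spot in vector space that no ImageNet category ever claimed.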
This, it turns out, is so cool. Joel Simon has put together an app called ganbreeder.app that lets you mix and match categories.