
This short essay was first published as a preface to the book Late Art : Visions from OpenAI DALL-E 2.

Something on DALL-E 2

I've worked with AI art generation tools since 2018. That's when Google released its BigGAN image generator (1). BigGAN's possibility space is limited, but it has enough settings to make its output feel personal. I generated a bunch of images and edited them into a video.

So I jumped on board pretty much when style transfer was becoming a thing: turn this image into a painting by Van Gogh, and so on. In a way not much has changed. Image generators learn how to reproduce images, so all they know is style imitation.

The introduction of CLIP (2) opened a new path, and with CLIP hooked into a generator, promptism emerged (3). Words offer a different way of exploring how a generator works. Artists can wax poetic and feel some ownership over the images they receive.

After an image generator is trained, all the possible images it can create already exist. This plane of existence is called the latent space. A prompt handled by CLIP works as a vector that points to a certain position within the latent space and thus to a certain image. In a way these images are found, not created. This is how DALL-E works too.
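The idea of finding rather than creating can be sketched in a few lines of Python. This is only a toy under loud assumptions, not DALL-E's actual pipeline: the "latent space" is a list of random vectors standing in for images, the prompt vector is made up by hand where a real one would come from CLIP, and every function name here is hypothetical.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length so only its direction matters."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]


def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def build_latent_space(n_points, dim, seed=0):
    """Toy stand-in for a trained generator: every 'image' already exists
    as a fixed point. Real latent spaces are continuous, not a list."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_points)]


def find_image(prompt_vector, latent_space):
    """Return the index of the point the prompt vector points at most directly."""
    return max(
        range(len(latent_space)),
        key=lambda i: cosine(prompt_vector, latent_space[i]),
    )


latent_space = build_latent_space(n_points=100, dim=8)

# Hypothetical prompt embedding: same direction as point 42, different length.
prompt_vector = [3.0 * x for x in latent_space[42]]

found = find_image(prompt_vector, latent_space)
print("prompt points at image", found)
```

Nothing is generated at lookup time. The same prompt vector always lands on the same point, which is the sense in which the image was already there.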

Artist and filmmaker David O'Reilly has called DALL-E a scam (4). Partly I agree, and it's good to be critical; that's what art is good at. DALL-E has been trained on data scraped from the internet without any authorial consent. Stolen, as much as digital theft is a thing. More than this electronic burglary, I'm concerned with how boring it makes DALL-E. Many cool and beautiful images were used in the creation of its latent space, so of course its output can be like that too.

DALL-E doesn't allow prompts that imply violence or nudity. The user has to come up with workarounds, and in my experience they don't yield the kinds of results I'm after. This prohibition makes DALL-E less credible as a tool. Just imagine the history of art without nudity or violence.

I'm not that impressed with DALL-E and I find little magic in the images it creates. Call it overexposure. What remains to be seen is how it (and similar future implementations) will fit as a tool in an artist's toolbox.

References

  1. https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/biggan_generation_with_tf_hub.ipynb
  2. https://openai.com/blog/clip/
  3. https://deeplearn.art/the-promptist-manifesto/
  4. https://www.instagram.com/p/CgSqRxhPF_X/ and continued in https://www.instagram.com/p/CgfDPBgPyFh/