Gurney Journey | category: Computer Graphics


Gurney Journey

This daily weblog by Dinotopia creator James Gurney is for illustrators, plein-air painters, sketchers, comic artists, animators, art students, and writers. You'll find practical studio tips, insights into the making of the Dinotopia books, and first-hand reports from art schools and museums.

Image-to-Image Style Transfer

Suppose you made this rough sketch and wanted to finish it in a photo-real or painterly style.


Twitter user @TomLikesRobots is a digital artist who did just that, using Stable Diffusion (#stablediffusion), a new open-source image-generation model with a powerful feature called image-to-image (#img2img).


He uploaded the sketch and told the system to render it as "A black and white photo of a young woman, studio lighting, realistic, Ilford HP5 400." The hair style is a little different, and the ear is weird, but the basic pose and lighting are pretty close to the sketch.

Tom says: "Overall, composition is controlled by the sketch and the details are controlled by the prompt."
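Tom's rule of thumb maps onto img2img's "strength" setting: the uploaded sketch is partly buried in noise and then denoised under the prompt's guidance, so a low strength preserves the composition while a high strength lets the prompt take over. Here's a toy Python sketch of that first, noising step (illustrative only; the real diffusion noise schedule is more involved):

```python
import numpy as np

def partial_noise(image, strength, rng):
    """Blend an image toward pure noise, as img2img does before denoising.

    strength=0.0 returns the image unchanged; strength=1.0 returns pure
    noise, discarding the sketch entirely. (Toy version, not the real
    noise schedule.)
    """
    noise = rng.standard_normal(image.shape)
    return (1 - strength) * image + strength * noise

rng = np.random.default_rng(0)
sketch = np.ones((8, 8))  # stand-in for a sketch's pixel values

low = partial_noise(sketch, 0.3, rng)   # keeps most of the composition
high = partial_noise(sketch, 0.9, rng)  # mostly noise; the prompt dominates

# The low-strength start stays much closer to the sketch.
print(np.abs(low - sketch).mean() < np.abs(high - sketch).mean())  # True
```

In the real tool, the denoiser then reconstructs an image from the noised version, and whatever the noise destroyed gets reinvented according to the prompt, which is why the pose survives while the hair and ear drift.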


This time he asked it to render the sketch as "A portrait of a young woman by Norman Rockwell."

It looks like a painting, but not much like Rockwell—more like a Victorian artist like Charles S. Lidderdale. On close scrutiny it doesn't hold up too well: the blue ribbon and the costume don't make sense, and her left eye has too many eyelashes.


Here's the sketch rendered in the style of Gustav Klimt. It's got a lot of drawing problems that Klimt never would have allowed, but it is somewhat reminiscent of his style.


And here's the sketch as painted by Vincent van Gogh. Again, we can pick it apart, but it's in the ballpark.

And finally, Alphonse Mucha, one of the hardest artists to emulate: his style combines soft internal transitions with clear outlines.

The system was also able to translate the sketch into the features of famous celebrities:

Selena Gomez

Scarlett Johansson

Nicole Kidman

Emma Stone

All of them have big problems with the hair, but they're recognizable, and let's admit the tech is in its infancy and will only get better.

If you're an artist, you might find all this a little scary, threatening, or astounding. I do too! I feel like we've been introduced to a magical genie who can bring whatever we wish into existence.

Some have responded by calling for laws to ban the technology from creating images in the styles of living, working artists. What if someone created a print for sale that was supposedly painted by you or me?

I think we have to be careful how we respond to this apparent threat. Let's keep in mind that the technology is transformative, meaning it doesn't copy and paste images; it creates something new. And artistic styles can't be—and shouldn't be—copyrighted. These tools are here to stay. They're only going to get more advanced, and they're open-source. The people developing them are artists, too.

I'm wary of laws or AI bots that restrict the growth of this new art form, or that drive the prompts underground. Instead of involving politicians and lawyers and AI bots, we should encourage a culture of mutual respect and fair play. Perhaps we should encourage generative artists to share their prompts when they use a living artist's name, and never to mislead their audience into thinking the work was actually painted by that artist. Basically, people should give credit where credit is due.

How do you think we should regulate or guide this new art industry?

A Good Explainer on AI Art

I'm honored that Vox media asked me to be part of this video about AI-generated artwork. 

If you don't want to watch the whole thing, I make a brief appearance at 10:25.

Producer Joss Fong and her team came up with a brilliant explanation of what actually happens inside a deep-learning model.

At 5:59 she explains multi-dimensionality with the example of yellow banana vs. red balloon. It's intriguing that we can't possibly know, in human terms, the criteria that the system is using to arrive at its results, or exactly what features it's extracting when a certain artist's name is used in the prompt.
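That banana-vs-balloon idea can be sketched as points in a made-up embedding space. In this toy Python example the three axes are interpretable (yellowness, roundness, edibility) and similarity is the cosine of the angle between vectors; real models use hundreds of dimensions whose meanings nobody can name, which is exactly the point:

```python
import numpy as np

def cosine(a, b):
    """Similarity of two concept vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-axis space: (yellowness, roundness, edibility).
yellow_banana = np.array([0.9, 0.2, 0.9])
red_balloon   = np.array([0.1, 0.9, 0.0])
lemon         = np.array([0.9, 0.7, 0.8])

# "Yellow banana" sits far nearer to "lemon" than to "red balloon".
print(round(cosine(yellow_banana, lemon), 2))        # 0.93
print(round(cosine(yellow_banana, red_balloon), 2))  # 0.23
```

An artist's name in a prompt presumably picks out a direction in a space like this, but one built from features no human chose or can inspect.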

There's a hidden bonus video that explores the reactions of various artists. 

Please add your comments:

• How do you feel this technology will affect the business and practice of art that you do?
• Do you want to use these tools?
• Will they change what you do or how you do it?  

I don't feel directly threatened by the tech, but I realize it will offer art buyers a cheap and fast method for generating editorial illustration, album cover art, and concept art. So it puts professional artists in those fields on notice and gives anyone the keys to becoming both an artist and an art director.

As an art watcher, I have a kind of morbid curiosity to see where the technology is headed next, and I dread the onslaught of cheap surrealism that is already flooding social media. Another thing I've noticed is that there's a shelf life to each new set of tools, just as there is for each new type of VFX technique. Each one eventually becomes old hat.

Robots with Flowers

How would you imagine a painting of a robot with flowers growing out of it?


"A happy robot with flowers growing out of his head, clouds in the background, digital art." 

It's a whimsical idea that might make a fun concept for a children's book.  


"A detailed painting of a rainbow colored robot with flowers growing out of its head."

Or it might be a promising pitch for an animated film.


"A Rene Magritte painting of a robot head with flowers growing out of the top with clouds in the background."

It could also be a theme for a group exhibition of surrealistic gallery art.

"A painting by Syd Mead of a bipedal robot with flowers growing out of the top of its head."

Designer Ben Barry used variations on this idea to generate over a thousand images in different styles. Mr. Barry is not an imagemaker in the usual sense. He is one of the lucky few who received beta access to the AI image-generation tool Dall•E 2.

"A woodblock print of a bipedal robot with flowers growing out of the top of its head."

Mr. Barry came up with the instigating phrases or prompts, and Dall•E 2 did the rest, creating hundreds or even thousands of novel images in a few hours. The prompts sometimes used the names of dead artists to catalyze the results, but more often than not the prompts were just descriptive. 

These are all high resolution images, adequate for magazine reproduction. 

"A painting by Caravaggio of a robot head with flowers growing out of the top."

Mr. Barry edited a digital book called 1000 Robots that you can check out for free. He chose the subject matter of flowers and robots because "I find the idea of an artificial intelligence painting robots to be simultaneously humorous and endearing."


"A painting by Norman Rockwell of a robot head with flowers growing out of the top with clouds and a rainbow in a background, digital art"

The technology seems adept at understanding the artistic logic of the prompt, both in terms of style and content. But there are a few incongruous elements, such as the weird red cable that arcs over to the rainbow.

Mr. Barry says: "While the model is capable of generating other types of images, I found paintings to be the area where it truly excelled aesthetically."


"A colorful painting by M.C. Escher of robot head with flowers growing out of the top"

The survey of styles resembles a Society of Illustrators exhibition or a professional illustrators' workbook. The foregoing two pages don't strike me as particularly reminiscent of Rockwell or Escher, but to me they score quite high on internal coherence and aesthetic appeal.  

Right now only a few people have access to this tool, but presumably it will soon be widely available essentially for free.

"a dramatically lit brightly colored detailed painting of a robot artist painting a picture"

The power of this artificial intelligence gives me a mixture of feelings: I'm surprised, delighted, intimidated, and a bit breathless at the speed of the progress. 

If you are an illustrator or gallery artist who paints surrealistic images in your particular style, it's a good time for soul-searching. 

You might consider:
1. How you would use these tools. 
2. How you will provide value for clients who have these tools.
3. How you will create artwork that these tools can't accomplish. 

This system of artificial intelligence won't eliminate traditional human artists (and by "traditional" I include digital artists along with those who use physical paint).

But it will send shock waves through the illustration world, and it will replace a lot of jobs. Soon, anyone and everyone will be able to create images easily, cheaply, and quickly with simple prompts of natural language.


Learn more about Ben Barry's book called 1000 Robots at

How Smart is Dall-E 2?

Prompt: “Polymer clay dragons eating pizza in a boat”
Computer-generated image (Dall-e 2 by OpenAI) 

For several years now, computers have been able to generate images based on a natural-language prompt.

The resulting images have suffered from problems of logic and global coherence.

For example, here's what you get if you give the computer the prompt “A rabbit detective sitting on a park bench and reading a newspaper in a Victorian setting.” (Latent Diffusion LAION-400M via @loretoparisi)

Where are his legs? His hands? Are those books or newspapers? Is that a coffee table in front of his bench? 

The image doesn't make sense, and we might conclude that the problem comes from the computer not having any experience of living in a body or dealing with the real world. No matter how big the data sets, or how many layers of processing you bring to the task, you can't get past that limitation. 

Or can you? 

OpenAI is one of the pioneers of generating realistic images and art from descriptions in natural language. They recently unveiled new software called Dall-E 2, which has pushed the boundaries of what's possible with this technology.

Here's what Dall-E 2 does with the same prompt: “A rabbit detective sitting on a park bench and reading a newspaper in a Victorian setting.” 

The overall logic is much better. Now he has legs and is really sitting on that bench, even casting a shadow. But the image is still not perfect. What's the black loop in his left hand? And why doesn't he seem to be holding the newspaper with his right hand? 

Here's one more example of how the technology is improving, using the prompt “teddy bears working on new AI research on the moon in the 1980s.”

The first version using older tech (laion400m) looks like a paste-up of unrelated elements.

Here's what Dall-e 2 came up with: a pretty believable image with consistent lighting. 

OpenAI released this YouTube video to introduce the software.

This technology scares some working artists and illustrators. @VividVoid says: "DALL-E is breaking my heart. AI art is about to lay utter waste to traditional visual art forms. This will be so much more destructive than what the Internet did to music. It will be a technological conquest of one of the great human avenues of spiritual transformation."

AI skeptic Gary Marcus doubts whether the technology will ever replace artists because it is just crunching big data sets. It's not learning from embodied experience, nor does it understand symbolic or semantic concepts the way a human does. Marcus says: "This whole thread is weaponized cherry-picked PR; the antithesis of science."

Soon after Dall-E 2 was released, OpenAI gave me beta access to try it out. On this YouTube video, I share my first experiments with it. (Link to YouTube)

Read more
Dall-e 2 at OpenAI
Podcast: Gary Marcus: Toward a Hybrid of Deep Learning and Symbolic AI 

Painting an Abandoned House -- in CGI

If you paint in traditional media you may not pay much attention to tutorials about 3D computer graphics.

I hope you'll make an exception for this demo by Andrew Price showing how to create an abandoned house with the computer-graphics software Blender.

Price does a great job of explaining not only the steps he takes but also the thinking behind them. As a traditional painter I'm fascinated by all the tools and tweaks.

Here's a 60 second version if you're pressed for time.

Google Cloud Vision

Google Cloud Vision is a free service that lets you harness the power of machine learning to analyze images. 

You can upload any picture. The algorithm will then compare the image to a vast database of labeled pictures and then make its best guess about what objects it sees. 
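The gist of that "best guess" can be sketched as a nearest-label lookup. In this toy Python version the features are made-up three-number vectors and the "database" holds only three labels; the real service extracts features with deep networks trained on enormous labeled photo collections:

```python
import numpy as np

# Toy stand-in for a labeled database: each label maps to a feature
# vector. (Hypothetical numbers; real systems learn these features.)
database = {
    "cat":    np.array([0.9, 0.1, 0.2]),
    "person": np.array([0.2, 0.9, 0.3]),
    "chair":  np.array([0.1, 0.2, 0.9]),
}

def best_guesses(features, database):
    """Rank labels by cosine similarity to the query image's features."""
    def score(vec):
        return float(np.dot(features, vec) /
                     (np.linalg.norm(features) * np.linalg.norm(vec)))
    return sorted(((score(v), k) for k, v in database.items()), reverse=True)

query = np.array([0.85, 0.3, 0.2])  # hypothetical features of an uploaded image
for score, label in best_guesses(query, database):
    print(f"{label}: {score:.2f}")
```

The descending scores are the "decreasing certainty" you see on the labels tab: the top label is a confident match, the rest trail off.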

In this Tom Lovell illustration, GCV is very certain that it sees a single cat, and it's relatively certain that it sees a person. No mention of the other cat, the knitting, the blue chair and the white sweater. 

What happens if you give it a fantasy image that doesn't exist in the real world, such as a renegade warrior astride a Styracosaurus, wearing a T. rex-tooth helmet and holding a saber-tooth cat skull on a staff? In this Dinotopia image it recognizes two generalized objects: "a person and an animal."

Clicking on the "labels" tab, you can see that it identifies general qualities of the image with decreasing certainty. It's wrong about hunting and it's wrong about a working animal, but it knows that it's an illustration of an extinct animal.

What happens if you input an image that has no analog in the real world because the image was itself generated by a machine-learning algorithm? Can it find something in the DNA of the image that could help it identify the word prompt that generated the image?


This picture was created by VQGAN+CLIP with the prompt "Constructionist Typography." The properties it finds are more general than that, but it's in the ballpark.

Try Google Cloud Vision yourself and let me know in the comments what you discover.

Mapping the Fruit-Fly Brain

Scientists have succeeded in mapping the neurons and connections of a fruit-fly brain.

"A population of neurons that is responsible for updating the fly’s internal compass."

According to the New York Times, "their speck-size brains are tremendously complex, containing some 100,000 neurons and tens of millions of connections, or synapses, between them....The work, which is continuing, is time-consuming and expensive, even with the help of state-of-the-art machine-learning algorithms. But the data they have released so far is stunning in its detail, composing an atlas of tens of thousands of gnarled neurons in many crucial areas of the fly brain."

Using "By James Gurney" as a Style Prompt

A couple of weeks ago I shared the results of some text-to-image experiments.

Code wizards have been using machine-learning tools such as VQGAN + CLIP and BigSleep to create novel images that grow spontaneously from word prompts. 

Erfurt Latrine Disaster (Twitter @ErfurtLatrine) Prompt: "Towers" #VQGAN+#CLIP

The prompts can be simple, such as "Towers."

jbusted @jbusted1 "Forbidden Lands 5"
Or the prompts can evoke a particular role-playing game, such as "Forbidden Lands."

The results develop an unusual style if you add a descriptor naming a studio, portfolio website, or rendering software, such as "from Studio Ghibli," "trending on ArtStation," or "rendered in Unreal Engine."

"The Grand Hall of the Sacred Library by James Gurney"

To my fascination and delight, some of them have gotten interesting results by including the phrase "by James Gurney." 

dzryk @dzryk
 "The tech bubble bursting by James Gurney"

Twitter user Ryan Moulton @moultano created a set of related images starting with the phrase 'The Hermit Alchemist’s Hut' and varying only the style cue:

'The Hermit Alchemist’s Hut by James Gurney'

'The Hermit Alchemist’s Hut rendered in Unreal Engine'.

'The Hermit Alchemist’s Hut by Van Gogh'


"A castle built on the skeleton of a dead god by James Gurney"

Ryan Moulton @moultano "In the Woods, Gouache Painting." 

Using the phrase "In the Woods + Gouache Painting" (without an artist's name) yields something that appears painted in water media, like a Mary Blair concept painting, but with something weird about the kids' faces. 

Ryan Moulton @moultano "In the Woods by James Gurney"

All of the results have issues of basic logic and perspective. They never make sense or seem fully coherent, at least not yet. 

But some of them do suggest a recognizable style. Does this look like my style to you? I'm not sure; it feels both familiar and alien. It almost looks like something from a long lost sketchbook. 

New Tools for Text-to-Image Generation

Generating an image from a line of text entirely by means of computer algorithms has been possible for the last few years. Newly invented tools are yielding results that keep getting more interesting.

The images can be hauntingly surrealistic, such as this one, which was generated by the phrase “when the wind blows.” 

Image courtesy The Big Sleep (source: @advadnoun on Twitter)

It's a little blurry and out of focus, with tendrils of downy fluff waving in dim light. It seems more like a photograph than a painting, but really it's a new category of image, made by computer software drawing from big data sets. 

Lately people's imaginations have been captured by tools such as VQ-GAN and CLIP.
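These tools work as a pair: VQ-GAN proposes an image, and CLIP judges how well it matches the text prompt, so the generator's latent code can be nudged again and again to raise that score. Here's a toy Python sketch of the feedback loop, with blind random search standing in for the gradient steps the real system takes, and plain vectors standing in for images and embeddings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for CLIP's embedding of the text prompt.
prompt_embedding = rng.standard_normal(16)

def clip_score(image_embedding):
    """Stand-in for CLIP: how well does this "image" match the prompt?"""
    return float(np.dot(image_embedding, prompt_embedding))

# Stand-in for the generator's latent code: start from random noise,
# then keep any small random nudge that raises the match score.
latent = rng.standard_normal(16)
start_score = clip_score(latent)

for _ in range(200):
    candidate = latent + 0.1 * rng.standard_normal(16)
    if clip_score(candidate) > clip_score(latent):
        latent = candidate

# After 200 rounds, the "image" matches the prompt far better.
print(clip_score(latent) > start_score)
```

The theme-and-variation look of the results may come partly from this loop: every region of the image is being pushed toward the same prompt at once.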


Prompt: “a face like an M.C. Escher drawing” from The Big Sleep (source: @advadnoun on Twitter)

Some of the results are compelling and intriguing, seemingly intelligent in a weird non-human way, as if you're looking into an alien's mind. Is that a face on its side, an eye, a nose, a mouth? Are those textures fingerprints? 

Prompt: “The Yellow Smoke That Rubs Its Muzzle On The Window-Panes” 
from VQ-GAN+CLIP (source: @RiversHaveWings on Twitter)

Each solution has a visual logic of theme and variation that's carried throughout the image. It's certainly not random. 

Prompt: “A Series Of Tubes” from VQ-GAN+CLIP (source: @RiversHaveWings on Twitter)

Many of the images from this system have a surrealistic patchwork appearance resembling Cubism, where extracted fragments are juxtaposed across the picture plane, but the 3D space doesn't make sense as a real scene.


(source: @ak92501 on Twitter)

Some of the creativity of this enterprise derives from the odd juxtapositions of the words in the prompts. The results are often effective with long prompts. The phrase for the image above is “a small hut in a blizzard near the top of a mountain with one light turn on at dusk trending on artstation | unreal engine”

In recent weeks, people writing prompts realized you can get the system to yield a more detailed style if you say "trending on artstation."  

Prompt: "matte painting of someone reading papers and burning the midnight oil | trending on artstation" 
by Twitter user @ak92501

I expect that with time the results will be accepted alongside human efforts, beginning perhaps with categories like motel art, Twitter avatars, and corporate clip art. They will take their place on Instagram alongside painters and photographers. Many of the innovators in this field write their own code and come up with remarkably creative prompts, so it makes sense to think of them as artists.

As a viewer, I'm not quite sure how to respond emotionally to something that looks like art, but which didn't pass through a human consciousness.

As an artist, I'm not worried about my job. Maybe it's a vain hope, but I feel like people will always want to see images made by a human hand and filtered through a human brain rather than ones made by an unfeeling machine. The question is whether eventually we'll be able to tell the difference.
Thanks, Chris!

Resources to learn more:
• UC Berkeley blog post, which is a good overview of techniques: Alien Dreams: An Emerging Art Scene
• Twitter account "Images.AI"  which plays with these natural language prompts and some of the same tools.

Ian Hubert's 'Dynamo Dream'

Ian Hubert spent about three years developing this short film called Dynamo Dream.

The first episode, called Salad Mug, is set in a lived-in science-fiction future.

Hubert is a 3D digital artist who creates his worlds mostly by himself, but the visual effects are as impressive as a Hollywood film's.

I was attracted to the relaxed tone and pacing, but I just wish it started immediately with stronger visuals and a clear, engaging story.


His minute-long "Lazy Tutorials" are popular with digital artists, but they might make sense to traditional painters as well.