I’m 80% complete with my three-volume novel, Robot Dawn. A couple of months ago I ran across an article online that piqued my curiosity about artificial intelligence software programs that generate images from text. It sounded like an interesting idea, and since a trial subscription was free, I took the plunge. The software is called Midjourney, and it’s a bot hosted on Discord.com. The article’s discussion was mostly about artists and whether using AI to generate the work was fair. Was it cheating? I was interested in how it might expand my visualization skills and inspire me to find the words to give the reader a better physical feel for scenes within my novel and some of its strange characters. Long story short, it blew my mind. Now I’m exploring how to integrate Midjourney with my writing. Here’s what I have learned so far.
First of all, since Midjourney provides images from words, and I had an alien creature that was ambiguous as to physical attributes but well-defined in function, I needed some way to better visualize this beast. Here are the words I used and Midjourney’s response:
“woodchipper that eats dead bodies as the grim reaper and excretes wood chips”
I was intrigued by all four responses. As you can tell, the character immediately came to life, and I had more than I had imagined to deal with. After a lot of consideration, I requested variations of the lower-left image. I got the following:
After a few more variations, I finally settled on an image to represent my character.
Not only did Midjourney turn vague images into specifics, it also provoked ideas about the character’s function and life history, and suggested a lifestyle that I could exploit in the narration. It opened up so many possibilities that my writing stalled. I couldn’t put into words the imaginative implications of my character. It had a story of its own. I could write a novel about him, it, her. Did it have a mate? Yes! Obviously. What was their life like? I was overwhelmed.
I also had a scene set in Yosemite near the end of the novel that I was more attuned to and thought I might see what Midjourney could do with it. Here are my words:
“12 Ahawhnechee warrior ghosts war dance around a camp fire at night in Yosemite below Half Dome”
And here’s the result after several variations and upscaling:
Midjourney had given me more than I expected. Yet it wasn’t perfect. The Native Americans weren’t actually ghosts. They were silhouettes, but close enough! Also, Half Dome isn’t perfectly portrayed. But again, the image is so beautiful that I just couldn’t resist it. That is the one thing about Midjourney that makes it so impressive. The quality of its output is almost always beyond expectations.
It can also provide surprising connections within the story. As an example, I was writing a scene where my character went to see his brother only to learn upon arrival that his house had burned to the ground. When he arrived, the house was still burning, and it was apparent that his brother’s family had perished inside the flames. I put some words in Midjourney to see what the scene of the burning home would trigger. After some manipulation, this is what I ended up with:
“burned out home with five human bodies in the ashes”
I had difficulty understanding why Midjourney would show the people walking around and standing inside the flames of a burning home. Then I considered the broader context. In the story, the entire city was encountering aliens that were walking about. They were golem-like creatures made of sand and other elements and solidified by microwaves into functional human-like form. These “people” walking about in the fire were not the brother’s family. They had already died. These were the golems come to see what was going on. Of course, Midjourney knew none of this. But this realization changed my perception of what was happening, added depth to the story, and helped tie the tragic event to other elements of the story. Midjourney had added immensely to my own imagination and allowed me to create a more consistent and interesting story. It all just worked better with Midjourney’s help.
After two months of this sort of activity, I have started wondering if I might be able to illustrate my novel using Midjourney. It would seem to work great in digital format, where the images could be displayed as they came out of Midjourney. I’m not sure how it would work in print, particularly in color. It would cost a fortune. But the problem is that you never get exactly what you’re looking for, and striving for connections doesn’t always work. I have considered using imprecise images as illustrations and telling the reader in the caption that the image is from “Midjourney, non-literal”, i.e., not an exact representation of the scene. Still, Midjourney’s results can be so captivating that they can add another dimension to the storytelling. You sort of have your version of the novel, and Midjourney has its own interpretation of what’s happening. At least it can be interpreted that way when looking for resonances with the storyline. Which is a little like your readers. No reader ever reads the story you wrote, or what you thought you wrote. That’s one of the things that makes reader discussion groups so interesting. You get to hear other people’s interpretations of the novel. Here, using Midjourney, you get AI’s interpretation of what’s happening, or at least it can be viewed that way for creativity’s sake. It’s always skewed but sort of interesting. At least to me.
But frequently Midjourney’s results are so far from what you want that the images aren’t usable. Also note that you don’t use the actual story text. You provide what you believe are the salient features to provoke Midjourney. Here’s an example of how it didn’t work for me.
“traditional Japanese home in Tomioka with a large treehouse in a ginkgo tree at night”
I did get a traditional Japanese home with a ginkgo, but not a treehouse. I didn’t expect it to be perfect, but this wasn’t even close, even though the images were interesting. I made another effort.
“traditional Japanese home in Tomioka, a large ginkgo tree with treehouse out back, at night”
Interesting, gorgeous images but too far from what I had written about in the text. I tried again.
“traditional Japanese home in Tomioka, a large ginkgo tree with a large treehouse out back, at night”
“outside a traditional Japanese home in Tomioka among ginkgo trees at sunset”
After this, I gave up. I might come back to it later, but this particular setting might be better served with text and no illustration.
Another thing that happens with Midjourney is serendipity. A while back, I was experimenting, just dumping things into Midjourney and seeing what popped out. You can use images to seed the AI bot, and on a whim, I fed the full cover image of Story Alchemy into Midjourney without text. Here’s the cover:
And here’s what Midjourney returned:
How it came up with that black and white image of a girl’s face was beyond me. But I was intrigued. I did some variations of the one image:
I immediately selected the bottom right image and upscaled it to max. This was the result:
At first I couldn’t understand why I was so intrigued by this one image. And then I realized. I had been struggling for seven years to get a good perception of my protagonist in Robot Dawn. She is a 12-year-old girl prodigy from the future, 2070. Looks a little older than her age. Into hacking and some other stuff she shouldn’t be doing. She has a Greek heritage. I could never get a handle on her hair. I finally realized that this image produced by Midjourney was my protagonist, Daisy Daniels. The image is a little goth, which fits the character perfectly. Couldn’t have imagined a better fit. Pure serendipity.
The image isn’t flawless. Although you might not be able to tell in the above image, in larger versions her right eyebrow seems to become separated from her head and float off a little. Plus, Daisy has a few freckles across the bridge of her nose, which aren’t shown here. She’s self-conscious about them. I could have my son, the illustrator, add them sometime in the future. How the image of Daisy could be so perfect is a mystery to me.
So, here I am only a couple of months into using Midjourney to enhance my writing and still working on my relationship with it. I find it too easy to get stalled out working with it because it is so easy. Plus, you get to sit back and watch it work. Constant surprises. After working with it a while, I find it difficult to drag myself away from Midjourney and get back to putting words on the page. I anticipate that when I get used to working with Midjourney and get it fully integrated into my method, it will be a tremendous asset. Right now, it’s both distracting and disruptive to my writing process. And I’m wondering what to do with the images. Surely, I can’t just abandon them after I complete the novel.
What does all this mean? I believe this is a watershed moment in storytelling. An author working with a visualization tool like Midjourney could take the entire endeavor to a new level. I was a long way into Robot Dawn before I started using AI. What if I had used Midjourney as a creative tool from the very beginning? Words and images. This could be the beginning of storytelling as a collaboration between the author and an AI bot that enables the author to produce irresistible stories using both the power of the written word and the fascination of supremely correlated (if a little skewed) images. I am intrigued by the idea that illustrations that are not a literal interpretation of the story can add another dimension to pull in the reader. Plus, the images are so easy to create that the narration can be littered with them. How many to use? That would be another choice left up to the author.
My trial subscription to Midjourney soon ran out, and I upgraded to a paid account with full privacy. Costs can be as low as $10/month, but I went for broke, $50/month for some of its more advanced features. I don’t foresee dumping it anytime in the near future. I have been seduced, and it looks like a long romance. I hope to provide updates to this continuing story of my conflict, tribulations, and amazing collaboration with Midjourney. Stay tuned.