Desk.

How I learned to stop worrying and love the robot...maybe?

A landscape size book

Enter style reference and edit

It’s finally here! Midjourney has gone from a parlor trick to actual software, with real (albeit specific) applications.

I've been checking in every few months to see the state of the tool, and this is the first time it has seemed worth paying for. In fact, it may be indispensable.

View prototype

The test:

In this prototype, I explored building a branded world. This is something I would use for blog headers, mood boards, storytelling, and growth ads: nothing that requires an extremely specific and unexpected situation, or an extremely specific style. I would 100% use this for generating a brand vibe.

Using 6 different reference images, I created 6 different takes on a set of assets:

  • One young lady looking at her phone, surrounded by candy. Close up.
  • A cliff in the foreground at sunset. Low-angle camera, landscape with sun sinking below ocean.
  • A bowl of fruit formed by geometric shapes
  • An urban landscape with small townhomes and lots of pedestrians.

To generate these different looks, I used the --sref parameter and pointed it to an image URL. I chose a wide range of images, from a '70s airbrush-style album cover to a 3D render and a treated advertising photograph. I purposefully chose images that did not share any content, so the result was about the style and not the specific artifact.
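For anyone who wants to reproduce the grid, here is a minimal sketch of how the prompt strings for the --sref approach described above could be assembled. It only builds text to paste into Midjourney's prompt box; it does not call any Midjourney service. The reference URLs are hypothetical placeholders (the actual reference images are not listed in this post), and the helper name `build_prompt` is my own.

```python
from itertools import product

# The four subject prompts from the list above.
SUBJECTS = [
    "One young lady looking at her phone, surrounded by candy. Close up.",
    "A cliff in the foreground at sunset. Low angle camera, "
    "landscape with sun sinking below ocean",
    "A bowl of fruit formed by geometric shapes",
    "An urban landscape with small townhomes and lots of pedestrians.",
]

# Hypothetical placeholder URLs standing in for the six reference images.
STYLE_REFS = [f"https://example.com/refs/style-{i}.jpg" for i in range(1, 7)]


def build_prompt(subject: str, style_url: str) -> str:
    """Append a --sref style reference to a plain text prompt."""
    return f"{subject} --sref {style_url}"


# One prompt per (style, subject) pair, grouped by style reference.
prompts = [build_prompt(s, url) for url, s in product(STYLE_REFS, SUBJECTS)]

for p in prompts:
    print(p)
```

Grouping by style reference keeps each "world" together, which makes it easier to compare the four subjects within one look before moving on to the next.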

Image

Indispensable software features:

  • Style reference - such a leap from text description, especially for visually oriented people.
  • Edit mode - remove objects or expand the frame without starting over from a net-new generation. Using an eraser tool, you can specify which part of the image to regenerate. You can also generate a number of elements that can be comped in with traditional Photoshop or animation techniques. Below, I generated a blank plate and added back the cloud and car to quickly create a simple animation.
Image

Nice to have features:

  • Midjourney appears to generate great compositions with good foreground, background and focal points
  • It generates imaginative human figures with lively, active poses

Distaste with the output:

While these elements could fill their own post (and may yet get one), here are a few negative reactions in brief:

  • That AI look: generic, big-lipped, freckled young women.
  • Weirdly porn-ish poses - I understand the figurative training data is apparently quite dependent on porn imagery, and it can show.
  • Difficulty generating diverse ages and appearances in people.
  • You can't look too closely at figures in the background.
  • Bizarre and seemingly encoded insistences: I could not get rid of picture frame elements on the wall of one bowl-of-fruit version until I finally removed them manually in Photoshop (Photoshop's own generation feature was also insistent). Ornate crown molding was a must?
Image


AI qualms

As a Photoshop native, I found nothing particularly surprising here. Many of the results reminded me of stuff I’d come up with by compositing stock photos and using pixel transfer modes to obtain specific lighting results. The reference images I used were in my longtime reference folder; some have been an inspiration for 3D scenarios I’ve put together with C4D. While it is a different workflow, it produces similar results. Using unrelated images does give results that are quite different from the original reference, thereby alleviating some feelings of guilt from referencing another’s work.

With the exception of the Wes Anderson set, none of the worlds felt attributable to a specific author, and even that one did not actually source from Wes Anderson's work. Instead, the reference was a still from a commercial that was already riffing on that popular style. It's not the tool, it's the intent. I could have easily pushed it to be more differentiated, but it was useful as a stylistic point of comparison.

One thing that might keep me up at night is wondering how the face generation works. While it's possible to swap faces really easily using the edit feature, is some existing photo being referenced, one showing a specific individual that I'm now featuring? With compositing elements in Photoshop, at least the origin was more certain. Even understanding the image averaging concept, it gives me pause. Below, output from me trying to evoke a slightly less geriatric version of a "40-something"...

Image

The guilt from the amount of (possibly fossil fuel) energy it takes to generate an image? That's a topic for another day.
