Down þe Id of þe Internet wiþ Stable Diffusion & Camera
What can Generative ImageBot models tell us about the state of humanity—or at least of that part of humanity compelled by mercenary, addiction, or egocentric reasons to post on the Internet?
Text-producing Generative ChatBots are, in a sense, somewhat tuned (by actual human feedback) Internet simulators: they show you what the Internet would be likely to say in response to the prompt that you have fed them, based on their assessment of what pages on the Internet are “close” to your prompt. (Much of the magic is in the definition and metric of “close” that the neural network constructs for itself, a metric that is largely inaccessible and largely incomprehensible to humans. But I digress.) That makes text-producing Generative ChatBots a reasonable way of taking the temperature of the conventional wisdom of humanity, or rather of that part of humanity that is compelled by mercenary, addiction, or egocentric reasons to write on the Internet.
But what are picture-producing Generative ImageBots doing? Are they too Internet simulators? Are they a way of taking the temperature of… not the conventional wisdom… rather the id of humanity, or at least of that portion of humanity compelled by mercenary, addiction, or egocentric reasons to put pictures on the Internet and write captions for them?
I thought: maybe. And I wanted a picture of the Fox News viewer’s id’s version of “chaos on America’s southern border”, as a representation of what Ayaan Hirsi Ali is telling them the callow wine-sipping Kamala Harris and the unmanly bicycle-riding Joe Biden are failing to protect them from. (As opposed to what Ayaan Hirsi Ali claims that Man’s Manly Man Donald Trump did when he was IN CHARGE.)
So I fed Stable Diffusion the prompt: “Immigration chaos on America’s southern border, as covered by Fox News TV babes”. It gave me sixteen choices. None of them were really what I wanted. But I wanted to publish, I wanted an image, and so I took the bundle of all sixteen and ran with it.
As you can see, the emphasis is definitely on the babes, with secondary emphasis on “southern border” and “immigration”. But “chaos”? Not so much. The vision of the southern border as in immediate crisis requiring that Joe Biden and Kamala Harris cancel all their other activities and spend all their time wringing their hands—that is not the dominant feature of the DreamTime of humanity’s id, at least not as captured by pictures that have been placed on the Internet and then captioned by those compelled to do so by mercenary, addiction, or egocentric reasons:

So, afterwards, I went back: could I, by changing the prompt, tickle Stable Diffusion to produce the kind of picture I had in my mind’s eye? Maybe the problem was that I had asked it for more than one “babe”?
Dropping the “s” produced pictures where the primary emphasis was indeed “babe” (with “TV”, “southern border”, and perhaps “immigration” as secondary emphases):

Nope. Not much that would make one think one had to vote for a neofascist Man’s Manly Man if one wanted to avoid being murdered in one’s bed…