The Evolution of AI-Generated Architecture

From Mystery to Mundanity

The progress of AI-generated images is undeniable: from DALL-E's debut in 2021 to Midjourney's latest model, the trajectory is clear. These realistic images may be a visual treat for viewers and a marvel for users, yet there's a growing sentiment that AI imagery is becoming more predictable and less astonishing. When did the allure fade? What happened to the enigma and charm of the early AI-generated visuals?

In recent years, AI developers have been diligently curbing AI's tendency to "hallucinate." For large language models like ChatGPT, this means ensuring the AI doesn't cite nonexistent scientific papers or court rulings. In image generation, it translates to steering clear of surreal and eerie visuals. As our reliance on these tools grows, especially for factual purposes like illustrating articles or generating stock imagery, it's understandable to want the AI to stay within certain boundaries. But when the objective is creativity and innovation, these tools seem to be losing their edge. Has the magic vanished alongside the hallucinations?

A Brief History of Hallucinations

The term "AI hallucination" first emerged in the year 2000 within the computer vision sector. It referred to the creation of new pixels in surveillance camera images to enhance their resolution, with the algorithm drawing from existing pixels to produce believable new ones.

Image: output by Google DeepDream. It Sucks.

Image caption (prompt): Prairie Pod, Le Corbusier, Modernism, geometric, Grass, Natural, Grass roof, Greens & earth, Natural with shadows, Prairie, Late morning, Grounde

Fast forward to 2015, and Google introduced DeepDream. Rooted in image-recognition technology, DeepDream flipped the script. Instead of identifying objects such as faces, it embedded them into any image, resulting in a blend of the bizarre and the psychedelic. Here, "hallucination" took on a dual meaning: the machine's creation of new pixels and the human experience of perceiving the nonexistent.

Initially, "hallucination" was perceived positively, a phenomenon to be harnessed. Yet, its connotation has since shifted. Today, Wikipedia describes it as “a confident response by an AI that does not seem to be justified by its training data.” So, are these mere glitches?

Decoding Midjourney's Imagery

To understand the evolution of AI imagery, I ran a series of tests with Midjourney, working through versions of the software from its earliest release to the latest, v5.2, using the same prompts from a previous project aimed at generating architectural visuals.
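For context, a minimal sketch of how such a comparison can be set up: the same prompt, pinned to different model versions with Midjourney's --v parameter. The prompt below is adapted from one of the image captions above, and the exact wording and flags are illustrative rather than a record of the tests themselves.

/imagine prompt: Prairie Pod, Le Corbusier, Modernism, geometric, grass roof, late morning --v 3
/imagine prompt: Prairie Pod, Le Corbusier, Modernism, geometric, grass roof, late morning --v 5.2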

Two distinct shifts emerged. The first, between Versions 3 and 4, marked a noticeable decline in the images' poetic essence. The once abstract and peculiar visuals were replaced by more literal and polished representations. The subsequent shift, from Version 4 to 5.2, saw a return of the oddities, but this time with a photorealistic touch. The resulting images, while technically impressive, felt eerily familiar.

This familiarity stems from Midjourney's dual commitment to visual and conceptual realism. In its quest to avoid the uncanny, the AI prioritizes consistency, leading to images that, while photorealistic, lack soul and imagination.

A Plea for Dreaming Machines

The question then arises: With their vast datasets and potent algorithms, why are AI image generators defaulting to photorealism? Tools like Midjourney and DALL-E 3 are not mere photographic devices; they're creators of novel visual content, potentially enriched by hallucinations.

For developers, controlling these hallucinations might feel like relinquishing power. But it's precisely this unpredictability that captivates many: images that transcend their creators' intentions. Hallucinations, akin to glitches, offer a rare glimpse into the machine's inner workings.

To recapture the lost mystique, we could venture beyond standard images using intricate prompts. Midjourney's "weirdness" parameter seems promising, producing unexpected yet photorealistic results. Alternatively, we could envision AI tools that defy the default photorealistic mold, allowing users to harness hallucinations more intuitively. Tools unafraid to challenge their own creation process and, in turn, challenge us. Tools that pioneer new pixels and pave the way for innovative visual languages.
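For instance, a prompt that leans on the weirdness parameter might look something like this (the values are illustrative, not a tested recipe, and the chaos setting, which varies the initial grid of results, is paired with it here purely for illustration):

/imagine prompt: Prairie Pod, Le Corbusier, Modernism, geometric, grass roof, late morning --v 5.2 --weird 1500 --chaos 50

Raising --weird nudges the model away from its default aesthetic, though, as noted above, the results still tend to stay photorealistic.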

Conclusion

In the vast expanse of the AI-generated visual universe, we stand at a crossroads. As the saying often attributed to Nietzsche goes, "And those who were seen dancing were thought to be insane by those who could not hear the music." We must decide whether to let our AI tools dance in the realm of the unknown or confine them to the predictable beats of the known. For in the dance of the unpredictable lies the true essence of creation.
