The standard critique of AI-generated architecture imagery is that it looks like anything and nothing. That's usually true. When you give the model a vague brief, it returns to its mean: swooping concrete, dramatic cantilevers, light arriving from everywhere at once. It has absorbed every award-winning project ever photographed, and it defaults to the aggregate of them.
That's not what happens when an architect is running it.
We spent three weeks asking six studios, ranging from a one-person practice in Glasgow to a sixteen-person firm in São Paulo, to document exactly how they were using AI tools in live project work. What came back was ninety renders, forty-odd workflow screenshots, and a clearer picture of the gap between AI-generated architecture and architecture generated with AI than we've seen written about anywhere.
The difference isn't the tool. It's the brief.
The problem with the generic prompt
When a marketing team uses Midjourney to generate "an inspiring office building," they get the moodboard. Sky-lit atria, glass curtain walls, the kind of space that exists in pitch decks rather than buildings. It isn't bad. It's averaged. The model is doing exactly what it was asked, returning a plausible version of an inspiring office building as understood by every inspirational office building ever rendered.
The prompt encodes no site. No structural logic. No programme. No client. It has no constraints, so the model fills in the gaps with everything it's learned, and what it's learned is the visual vocabulary of architectural photography: impressive, frameable, technically impossible in several places, and utterly unspecific to any actual piece of land or practice or client brief.
You can spot this work instantly. It doesn't look fake; it often looks beautiful. But it doesn't look like anything anyone would actually build. The light is wrong in a particular way: too even, too flattering, as if photographed at the golden hour of a day that doesn't exist on this site's orientation. The geometry is dramatic but not useful. The materiality is expressive rather than constructional.
Architects know how to read this. Most of us find it mildly embarrassing.
What architects bring to the brief
Architects think in specifics that non-specialists don't know to think in. A prompt like "warehouse conversion, east London, three-storey residential above a bakery, 1950s brick, thermal mass, south-facing courtyard" isn't just more detailed than "an inspiring building." It's a different kind of request. It encodes site, structure, material logic, programmatic pressure, and light, all before you've pressed return.
The model doesn't understand what those constraints mean architecturally. But it has been trained on enough documentation from buildings that had to solve those same problems that it responds differently when you name them. The thermal mass clause pulls it toward heavy-wall typologies. The south-facing courtyard changes the likely section. The 1950s brick pushes it toward load-bearing rather than frame logic.
None of this is explicit. The model isn't reasoning architecturally. But the output, when you architect the brief, is architecturally literate in a way the generic prompt never is.
We ran this test directly. The same massing in FLUX.1 Dev, five iterations with incrementally added specificity. The early outputs were technically correct but generic, a building that could be anywhere. By the fourth iteration, you could see the south wall thickening. The fifth was spatially particular in a way that's difficult to explain without seeing it: you could sense the building being on a specific site, rather than on no site at all.
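A minimal sketch of how such an incremental test could be set up, for readers who want to repeat it. The clauses are borrowed from the example brief earlier in this piece. The order in which they are added, the layer labels, and the helper name `incremental_prompts` are our assumptions for illustration, not a record of the exact prompts used. Sending each prompt to a model is left out.

```python
# Hypothetical sketch: layer labels, clause order, and the helper name are
# illustrative assumptions, not the documented test prompts.
BRIEF_LAYERS = [
    ("programme", "three-storey residential above a bakery"),
    ("site", "warehouse conversion, east London"),
    ("material", "1950s brick"),
    ("performance", "thermal mass"),
    ("light", "south-facing courtyard"),
]


def incremental_prompts(layers):
    """Return one prompt per iteration, each adding one more constraint."""
    prompts = []
    for count in range(1, len(layers) + 1):
        # Keep every clause added so far, in order, and append the next one.
        clauses = [text for _, text in layers[:count]]
        prompts.append(", ".join(clauses))
    return prompts


if __name__ == "__main__":
    for number, prompt in enumerate(incremental_prompts(BRIEF_LAYERS), start=1):
        print(f"iteration {number}: {prompt}")
```

Each prompt is a strict extension of the one before it, so any change between outputs can be attributed to the single clause that was added.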
"The model doesn't care about the building. It doesn't understand what a door is, or why a column needs to be there. Those remain the architect's jobs, and they're the jobs worth keeping."
From the field notes · Issue 042
Midjourney as a conceptual instrument
Midjourney's weakness in architectural work is documented: geometry failures, structural inventions, doors that open into other doors. We know this. We've all made the mistake of trusting it with technical things.
But its strength, which is rarely discussed with the same clarity, is atmosphere. Specifically, light. Midjourney is exceptional at light in the way that matters most at concept stage: it understands the qualitative difference between types of daylight, between diffused overcast and direct raking, between a high clerestory and a low window, in a way that generates spatially meaningful results when the brief asks for it directly.
Architects spend years learning to describe light conditions. North-facing, south-facing, the difference between a deep recess and a thin jamb, the way a masonry reveal changes the character of shadow. That trained eye makes us competent directors of Midjourney in a way that non-architects often aren't.
When we use Midjourney for early-stage work, we aren't asking it to draw buildings. We're asking it to describe light conditions. "A reading room. Thick walls. North light only. Concrete. Make the silence visible." The model returns something spatially coherent, not because it understands the programme, but because our brief asked it to solve a specific atmospheric problem that has its own architectural grammar.
Two of the studios we worked with use this as a disciplined first step: Midjourney to find the right register of light and material, never presented to a client, living in the pinboard as a reference the design has to earn its way back to. They're not using it to generate architecture. They're using it to describe what they're trying to make.
ComfyUI and the design of the pipeline
Later in the process, the tools change. Midjourney's looseness is an asset at concept stage and a liability once geometry is set. By the time you're producing presentation material, you've moved to ComfyUI, and this is where the gap between architect-directed and non-architect-directed AI work becomes most legible.
ComfyUI requires you to think about an image as a set of decisions. ControlNet for geometry. LoRA for material character. Denoise for the ratio of model invention to source fidelity. Upscale for resolution. These aren't just technical choices. They're design choices. Getting the denoise setting right on a complex exterior requires the same judgment as getting a material specification right: you're deciding how much of the building should be explicit and how much should remain suggested.
Architects make these judgments fluently because they are variations on the judgment we make throughout a project: how much to show, how much to leave unresolved, what drawing to present at which meeting and in what state of completion. ComfyUI externalises the render pipeline in a way that maps onto how we already think about design information.
Non-architects find ComfyUI notoriously difficult, not because the software is opaque, but because the decisions it requires map onto a spatial intelligence they haven't been trained in. You can't get the denoise right if you haven't developed an opinion about what the right level of resolution means for this building at this stage. The tool is neutral. The judgment is professional.
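The denoise trade-off described above can be put in rough numbers. This is a simplified model under a stated assumption: denoise is treated as the fraction of the sampling schedule that is re-run on the source image, which is a common image-to-image convention but not a description of ComfyUI's internals. The stage names and values in `STAGE_DENOISE` are hypothetical placeholders, not settings reported by any studio.

```python
# Simplified model (assumption): denoise = fraction of the schedule re-run
# on the source image. Not ComfyUI's actual implementation.

# Hypothetical per-stage values, for illustration only.
STAGE_DENOISE = {
    "concept": 0.75,       # let the model invent most of the image
    "developed": 0.5,      # balance invention against the source geometry
    "presentation": 0.25,  # stay close to the source, refine surfaces only
}


def steps_regenerated(total_steps: int, denoise: float) -> int:
    """Return how many of total_steps are re-run under the simplified model."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0.0 and 1.0")
    return min(int(total_steps * denoise), total_steps)


if __name__ == "__main__":
    for stage, denoise in STAGE_DENOISE.items():
        rerun = steps_regenerated(40, denoise)
        print(f"{stage}: denoise {denoise} re-runs {rerun} of 40 steps")
```

The numbers matter less than the habit: choosing a value per project stage forces the question of how much of the image should still be open to invention.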
FLUX and material intelligence
FLUX.1 Dev has changed how we handle material exploration in early design. Its understanding of material properties, the way light reads differently off polished concrete than off board-marked concrete, the thermal connotations of brick versus CLT, the difference between a factory-finished aluminium panel and one that has weathered for three seasons, is better than that of any tool we had used before.
We've found it most useful when we give it the same brief we'd give a materials consultant: not "make this look like wood" but "oak cross-laminated structural panel, factory edge trim, left to weather without treatment for three years, south-facing. How does it read at noon in September?" The model returns something technically wrong in the details but atmospherically accurate in a way that's useful for client conversation.
fal.ai has made FLUX fast enough to use interactively. You can run eight or ten material variations in the time it used to take to set up a single render. That speed changes how you use it. It stops being a tool for final images and starts behaving like a fast sketchpad, testing material decisions at a speed closer to a pencil than to a render farm. Several studios we spoke with now do their first materials review entirely in FLUX before touching a proper render pipeline.
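A batch of eight variations like the one described above can be laid out before anything is sent to a model. This is a sketch under assumptions: the material and condition phrases are lifted from this piece, while the fixed context string, the list names, and the helper `material_variations` are ours. No fal.ai or FLUX call is shown, since the studios' actual request code was not part of the field notes.

```python
from itertools import product

# Phrases taken from the text; grouping them this way is an assumption.
MATERIALS = [
    "polished concrete",
    "board-marked concrete",
    "oak cross-laminated structural panel",
    "aluminium panel",
]
CONDITIONS = ["factory-finished", "weathered for three seasons"]
CONTEXT = "south-facing, noon in September"


def material_variations(materials, conditions, context):
    """Return one prompt per material/condition pair, sharing one context."""
    return [
        f"{material}, {condition}, {context}"
        for material, condition in product(materials, conditions)
    ]


if __name__ == "__main__":
    for prompt in material_variations(MATERIALS, CONDITIONS, CONTEXT):
        print(prompt)
```

Holding orientation and time of day constant across the batch keeps the comparison about the material rather than the light.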
"The studios that have integrated AI well aren't the ones with the most powerful tools. They're the ones where the people using the tools were already good at the underlying judgment the tools externalise."
Observed across six practices · April 2026
What this means for the profession
Three years of watching AI enter architectural practice has produced a fairly clear pattern. The firms that have integrated it well are not the ones with the most powerful tools or the biggest technology budgets. They're the ones where the people using the tools were already good at the underlying judgments the tools externalise.
The architect who uses Midjourney to find the right light condition for a project is the same architect who could find it by studying precedents for a week. The one who gets the ComfyUI denoise right is the one whose hand drawings always had the right level of resolution for the stage of the project. The tool doesn't create judgment. It amplifies it, which means it also amplifies the absence of judgment, in the familiar way that a faster car is more dangerous in inexperienced hands.
The outputs that embarrass architecture, the Instagram-optimised renders with no logic, the forms that couldn't be built, the light that doesn't belong to any hemisphere, are real, and they're a problem. But they're almost exclusively produced by people who aren't trained architects, or who are using the tools as substitutes for design thought rather than extensions of it. The model doesn't know which you're doing. It returns what you ask for.
There's a related concern worth naming, because it shows up in studio conversations but rarely in published writing. The risk isn't that AI replaces architectural judgment. The risk is that it makes weak architectural judgment look better than it is. A poorly conceived design that is beautifully rendered is a problem we've always had; AI has just lowered the cost of the render. The answer is the same as it's always been: develop genuine judgment, then use the tools in service of it.
The essay's actual answer
When architects are in charge, AI builds the space between the idea and the object. It is not a shortcut to the finished building but the productive friction of working through what you're actually trying to make, faster than before, with a collaborator that has absorbed more visual knowledge than any human could hold.
What we found across six studios was consistent enough to be worth stating plainly: the outputs that were worth anything, the ones that advanced the design, informed the client conversation, changed the way someone thought about a building, were the ones where the architect was in the room in the way they're always in the room: making specific, constrained, opinionated demands on a system that is good at fulfilling them but has no opinion of its own about what they should be.
The model doesn't care about the building. It doesn't understand what a door is, or why a column needs to be in that position, or what a load path demands. Those remain the architect's jobs, and they're the jobs worth keeping.
What the model can do is hold a brief long enough to return something worth arguing with. In practice, that turns out to be exactly what you need in the first week of a project, when the best move is often to have something imperfect in front of you so you can say, with precision, what it's not.
Field notes compiled April 2026. Six studios across five countries. No affiliate relationships. No sponsored placements. Edited slowly.