Which is better for floor plans, Nano Banana 2 or GPT Image 2?

For client-facing renders that have to match the building you actually drew, Nano Banana 2 wins clearly. NB2 preserves layout, room count, and aspect ratio at near 1:1 fidelity to the source plan. GPT Image 2 reinterprets the plan into something more cinematic but structurally different, with extra rooms and changed proportions. For mood-boards, atmosphere studies, or pitch decks where a real building doesn't have to exist yet, GPT Image 2's stronger lighting and material rendering can be the better tool. The right answer depends on what stage of the project you're in.

Does GPT Image 2 preserve floor plan geometry?

In our testing, no. Even with a structured JSON prompt that explicitly instructed the model to match the source layout 1:1 and not add or remove rooms, GPT Image 2 produced renders with hallucinated rooms, changed wall proportions, and a squared-off aspect ratio that did not match the source plan's footprint. NB2 obeyed the same JSON spec much more faithfully on the same source.

How much does GPT Image 2 cost on fal.ai?

fal.ai's gpt-image-2/edit endpoint runs against a credit pool rather than a per-image price. Our two-variant test ran in 132 seconds per variant and consumed standard fal credits. Direct OpenAI access was blocked for us by a hard billing cap, which is why we used fal.ai as the fallback. NB2 by comparison ran at roughly $0.04 per variant in 37 seconds, almost four times faster and at lower per-image cost.

When should I use GPT Image 2 over NB2 for architecture renders?

Use GPT Image 2 when you need atmosphere, mood, or stylistic richness on a building that does not yet have a fixed model: early concept boards, competition narratives, marketing imagery for a project that is still in design. Stay with NB2 (or Veras 4.0, which runs NB2 underneath) when the render must depict the model you actually drew, room for room. The two tools answer different questions.

Does JSON prompt format produce better floor plan renders than prose?

JSON prompt formatting helps both NB2 and GPT Image 2 stay closer to the source spec, particularly on lists of rules like 'do not add rooms not in source' or 'preserve every wall thickness.' But it does not turn either model into a literal interpreter. NB2 uses the JSON cues to stay genuinely close to the plan; GPT Image 2 reads the structure but still privileges its own composition logic. JSON helps. It does not override the model's underlying biases.

What aspect ratio does GPT Image 2 output by default?

Both fal.ai variants of GPT Image 2 we ran came back with an aspect close to square (roughly 1:1) regardless of the source plan's wide horizontal footprint. NB2 in the same test preserved the source plan's wide aspect at 1567x688. The squared output of GPT Image 2 was one of the most visible differences in the side-by-side.

Which model is faster for floor plan renders?

NB2 is roughly 3.5x faster on the same prompt and same source. Our wall-clock timings were 37 seconds per NB2 variant against 132 seconds per GPT Image 2 variant via fal.ai. For batch iteration during a deadline, that gap matters. For a single hero image where you can wait two minutes, less so.

Nano Banana 2 vs GPT Image 2 for Floor Plans: We Ran Both. Here's What Held Up.

Two folders sit side by side in the project tree. iter-05 and iter-06. Same 2D source plan, same JSON spec, byte-identical down to the line breaks. The folder on the left was rendered through Nano Banana 2. The folder on the right was rendered through GPT Image 2 about an hour ago. There's a client meeting at 9am tomorrow and the renders need to go out tonight. Decision time. We open both.

What was supposed to be a sanity check turns into a quietly important experiment. Same building, two models, two different buildings.

★ The Verdict

NB2 holds your building. GPT Image 2 makes a different building, just prettier.

For client-facing renders that have to match the model you drew, Nano Banana 2 wins on every axis that matters: layout fidelity, room count, aspect ratio, speed, cost. For mood-board and atmosphere work where the building doesn't yet have to exist, GPT Image 2's stronger lighting and material rendering earn it a place in the stack. The newer model is not the better model for this job.

The setup, in five lines

The source is a second-floor plan from a current Vista Studios multi-unit residential project, anonymised for publication. Wide horizontal aspect, roughly 1,560 by 690 pixels, fully detailed: every apartment unit drawn through, every fixture in the bathrooms, the gym equipment laid out, the central courtyard and its fire pit at the building's heart. The plan that goes to the printer, not the marketing concept.

One JSON spec, frozen from iter-05. 2,467 characters of structured object, wrapped in a 2,781-character prose preamble. The spec encodes every fidelity rule we'd normally hold in our heads and quietly violate at iteration four: preserve every wall, do not add rooms not in source, do not change wall thickness, only add furniture and decor. The discipline is the spec. Without it, every iter drifts a little, and by iter-04 you're presenting a building that looks like the building you drew if you squint.

Two variants per model, four renders total. Same input image, same prompt payload, same edit endpoint. The only changes between iter-05 and iter-06 are the model and the prompt encoding (NB2 took the prose form, GPT Image 2 took the structured-JSON form). Everything else is locked. If the renders come back different, the model is what made them different.

The source plan

2D residential floor plan showing courtyard with central fire pit, gym, multiple bedrooms, and an empty side room, drawn in standard architectural notation. — Source 2D floor plan / Vista Studios residential, anonymised / The input passed to both models

A close-up view through textured ribbed glass showing golden morning light casting geometric shadows on concrete and wood interiors. — Generated · Gemini Absolute layout discipline looks surprisingly good in dense morning fog.

Nano Banana 2, iter-05

Two variants, 37 seconds each, $0.04 a piece. The output comes back at 1567 by 688, the source aspect ratio preserved nearly to the pixel. NB2 reads the JSON structure and produces, with very little drama, the building you drew. Roof off. Walls intact. The fire pit glowing in the centre of the courtyard where you put it. The gym is the gym, the apartments are the apartments, the garden running through the lower spine is unmistakably the garden running through the lower spine of the source plan. NB2 does not surprise you, which is the entire point.

Nano Banana 2 isometric render variant 1: 3D dollhouse view of the residential plan with intact walls, central courtyard with fire pit, geometric stone pavers, and warm interior lighting. — NB2 v1 / 37s / aspect preserved

Nano Banana 2 isometric render variant 2: same residential plan with slight variation in furniture density and lighting, layout still 1:1 with source. — NB2 v2 / 37s / second variant

GPT Image 2, iter-06

Same JSON spec. Same source plan. Different model, different endpoint, different result, in roughly that order of importance. Two variants at 132 seconds each, three and a half times slower than NB2. The output comes back compressed from the source's wide horizontal footprint into something close to a square, the way a phone photo of a long building always ends up wrong on the short side. The materials read more believable at the surface. The lighting is warmer, the shadows softer, the mood frankly cinematic. If we'd never seen the source plan, we'd have called it a beautiful render.

We have seen the source plan. It is a different building.

GPT Image 2 isometric render variant 1: a 3D residential rendering with strong lighting and material quality, but altered footprint and reinterpreted room layout compared to the source plan. — GPT-2 v1 / 132s / aspect compressed to ~1:1

GPT Image 2 isometric render variant 2: same prompt, similar atmospheric quality, with hallucinated rooms and structural changes from the source plan. — GPT-2 v2 / 132s / hallucinated rooms visible

Look at the courtyard. The fire pit is there. The geometric paver pattern is gone, replaced by something closer to terrazzo tile in a hotel lobby. The empty side room we wrote into the spec twice, in capital letters, ONE room not two, no internal partition, comes back split with a partition the source plan never had. Walls move. Apartment count drifts. The whole footprint has been compressed to fit the model's preferred aspect, which is the most damning thing about the result, because that compression is a decision the building never asked the model to make.

None of which makes GPT Image 2 a bad model. It makes it a different one. Asked to interpret a 2D plan as a starting point for a more cinematic version of the same idea, it does that, well. Asked to preserve the plan, it edits it. The mistake in interpreting this output isn't the model's. It's the architect's, the moment they treat the render as a faithful version of the building they drew.

Six axes, scored

Axis	NB2 (iter-05)	GPT-2 (iter-06)	Winner
Layout fidelity	High (1:1 with 2D)	Low (reinterprets)	NB2
Aspect preservation	Yes (1567×688)	No (squared off)	NB2
Wall / room count	Match	Hallucinated extras	NB2
Mood / cinematic feel	Good	Stronger	GPT-2
Wall-time per variant	~37s	~132s (3.5× slower)	NB2
Cost per variant	~$0.04	fal credit pool	NB2

Five axes to NB2, one to GPT-2. The single category GPT Image 2 wins is real, though, and worth saying clearly: GPT Image 2 makes a more seductive image. If your job that afternoon is pitch-deck atmosphere rather than client-presentation accuracy, the one win can quietly outweigh the other five. We've sent GPT-2 outputs to clients on purpose, when the brief was mood and not measurement. The trick is knowing which job you're on before you choose the model.

What's actually happening here, beyond the table, is that the AI rendering category just stopped having a single winner. The honest read on 2026 isn't that AI rendering got better. It got plural. There is no longer a default model an architect can grab without thinking about what stage of the project they're on. NB2, GPT Image 2, FLUX architectural fine-tunes, the next thing on fal.ai next month, they aren't converging on one answer. They're specialising into different jobs. The architects making this look easy are the ones who quietly built a folder structure that lets them swap models per iter, instead of arguing about which one is best.

The JSON spec, and why it matters

Both models received the same prompt object, expressed as a JSON structure inside a short prose wrapper. A flavor of what's in there:

// excerpted from the iter-05 / iter-06 spec, ~2,500 chars structured
{
  "task": "convert_2d_architectural_floor_plan_to_3d_isometric_render",
  "perspective": "isometric_dollhouse_cutaway_with_roof_removed",
  "render_style": "photorealistic",
  "fidelity_rules": {
    "match_source_layout": "1:1",
    "preserve_every": [
      "wall", "room_boundary", "door_opening",
      "window_position", "corridor", "staircase"
    ],
    "do_not_add_rooms_not_in_source": true,
    "do_not_remove_rooms_in_source": true,
    "do_not_change_wall_thickness_or_angles": true,
    "only_add": ["furniture", "doors", "windows", "decor"]
  },
  "rooms": [
    { "id": "courtyard_left_room", /* one room, no partition */ },
    { "id": "gym", /* equipment, mirrors, padding */ },
    { "id": "main_residence_bedrooms", /* count locked to source */ }
  ]
}
    

We've been telling these models in plain English to preserve our buildings for two years. They've been politely ignoring us for two years. JSON is what happens when you stop asking and start specifying. NB2 reads the structure and treats the fidelity rules as constraints rather than suggestions, and the gap between iter-04 (prose) and iter-05 (JSON) was bigger than the gap between any two prose iters that came before it. Worth restructuring your prompt library for.

JSON does not override a model's biases, though. GPT Image 2 read the same structured rules and produced renders with hallucinated rooms anyway. The model's underlying composition logic does what it was trained to do, which is produce a beautiful image. The plan is the input. It was never the law. The first model that treats a structured prompt the way an API treats an OpenAPI schema, as a contract rather than a vibe, will rewrite this comparison. Until then: structure your prompts, but don't expect the structure to fight the model's training for you.

A low-angle ground-level photo looking up at a modern building facade with its complete reflection mirrored in a standing rain puddle. — Generated · Gemini When atmospheric mood board vibes outweigh structural accuracy.

Picking the right model, per stage

The question that matters is not which model is better. It is which model is right for the stage of your project.

Concept and competition imagery. Building doesn't fully exist yet. You need atmosphere, mood, an image that sells a feeling. GPT Image 2 is fine here. It will produce a striking render even from a rough massing. Just don't tell yourself the result is "your" building.

Schematic-design coordination. Layout matters but tolerances are loose. Either model works, and the choice is workflow-driven (whichever endpoint your pipeline already calls). NB2 is faster and cheaper, so default there unless you specifically want GPT Image 2's lighting.

Client-facing renders, design-development through CD. The render must depict the building you actually drew, room for room, wall for wall. NB2, full stop. Or use Veras 4.0 (which runs NB2 underneath, see our NB2 explainer for what's actually in there) inside SketchUp or Revit and skip the prompt-engineering layer entirely.

The reproducible workflow

If you want to run this same comparison on your own plan, the steps are short:

Freeze your spec as a JSON object with explicit fidelity rules. Wrap it in a one-paragraph prose preamble.
Submit the same payload to both edit endpoints. NB2 via fal.ai or Chaos developer access; GPT Image 2 via the OpenAI Images API or fal.ai's fal-ai/gpt-image-2/edit route.
Run two variants per model. Save both with a manifest of timings, costs, and the input URLs.
Score the six axes above, 1 to 5 each. Layout fidelity, aspect preservation, room count, mood, time, cost.
Map the winner to your project stage. The same comparison flips conclusion depending on whether you're producing concept boards or client renders.

The whole sequence takes about 15 minutes once the test rig is wired. We keep the test rig in a project sub-folder named floor-plan-renders/test-run-NNN/iter-NN-MODEL/ with a small run.py per iter that pulls from the same source-plan and writes a manifest.json. Cheap to reproduce, cheap to compare.

The right rendering model isn't a quarterly subscription decision. It's a per-job pick, made against the spec you can defend at the next client meeting.

What's worth watching

OpenAI's direct billing came back online for us shortly after this run, which means future iterations can skip the fal.ai middle layer for GPT Image 2 work. Expect modest speed gains and possibly different output as direct OpenAI bypasses any fal-side defaults. We will publish the delta if the gap is meaningful.

Open-weights models continue to close the gap on geometry fidelity. FLUX and successors keep landing architectural-grade outputs in ComfyUI workflows. If a downloadable model lands that matches NB2 on layout preservation, the licensing economics of the proprietary stack shift. We are tracking quarterly. Nothing has flipped yet.

And the prompt-format frontier stays open. Structured JSON beat prose in our test for both models, but neither model has been fine-tuned to obey a JSON schema in the way an API does. The first model that treats a structured prompt like a contract rather than a suggestion will rewrite this comparison.

For now, the honest read is short. Newer model. Slower render. Wrong building. Quietly beautiful. The rendering category just stopped having a winner. Pick yours per project, not per press release, and keep both folders in the tree.

Tested by Vista Studios on a live residential project, 2026-05-04. Source plan anonymised. No affiliate relationships with OpenAI, Google, or fal.ai.

Nano Banana 2 vs GPT Image 2 for floor plans