Claude Opus 4.7 vs 4.6: Unity World Generation Benchmark

TL;DR

I ran a one-shot Unity world generation benchmark comparing Claude Opus 4.6 and 4.7 through the UnityMCP bridge. Same prompt, same project, no follow-ups. Opus 4.7 wins three out of four categories: better terrain displacement, a proper ocean surface instead of grid water, and pine trees that actually render. Both models still fail at generating rivers inside the terrain. If you are using MCP to build Unity scenes today, upgrade to 4.7.

Everyone is talking about the new Anthropic Opus 4.7 model. You can barely open X or a dev Discord without someone claiming it writes better code, reasons harder, one-shots bigger tasks. I wanted to see how that lands in Unity, not on a toy algorithm problem, so I set up a One-Shot World Generation Benchmark: one prompt, no follow-ups, and the model has to produce a full procedurally generated Unity scene from scratch.

The bridge between Claude and the Unity Editor is UnityMCP, the MCP server that lets a model create GameObjects, edit scripts, and place terrain without me touching the Editor. Same prompt, same project, same MCP setup. I ran it against Opus 4.6 and Opus 4.7 and let each model build its world in one shot. This is a follow-up to my earlier Opus 4.6 vs GPT-5.4 one-prompt Unity world generation test. In the next sections I will compare what each model actually shipped, side by side.

The Two Worlds, Side by Side

I am not scoring the overall look as a real category, because each terrain is generated from a random seed and how the world looks from a distance comes down largely to luck. From far away, Opus 4.6 did make the more beautiful world, but mostly because it rolled a better seed. See the images below and compare them yourself.

Unity world generated one-shot by Claude Opus 4.7, mountain and lake under scattered clouds

Opus 4.7 world, viewed from above.

Unity world generated one-shot by Claude Opus 4.6, dense forest with multiple lakes and snow-capped peaks

Opus 4.6 world, viewed from above.

Which Model Paints Better Textures and Terrain Detail?

One thing I noticed immediately with Opus 4.7 is how much more terrain detail it generates. It adds these little peaks, fine ridges that sit on top of the base shape and break up every slope. Opus 4.6, by comparison, looks smooth. Almost rolled flat. The shapes are there, but the surface stays clean and simple.

It is the kind of difference you catch in the first second of flying the camera over the scene. Zoom in on a slope in 4.6 and you see a soft curve. Zoom in on the same slope in 4.7 and you see a dozen small ridges Claude decided the terrain needed. Neither is wrong, but 4.7 feels like it layered in an extra pass of noise detail where 4.6 stopped at the first.

See the two screenshots below. 4.6 is the flatter one, 4.7 has the fine peaks layered in.

Opus 4.6 Unity terrain close-up showing smooth low-poly slopes with trees

Opus 4.6, zoomed in. Soft, rolling slopes. No micro-detail on the surface.

Opus 4.7 Unity terrain close-up showing jagged micro-peaks covering every slope

Opus 4.7, zoomed in. Small peaks and ridges layered across the terrain.
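
To make "an extra pass of noise detail" concrete, here is a minimal sketch of layered (fractal) noise. It is plain Python for illustration, not the code either model generated, and every name in it (`make_value_noise`, `height`, `roughness`, `base_freq`, `lacunarity`, `gain`) is hypothetical. The assumption is that the visual difference comes from how many noise layers, or octaves, get summed: one octave gives soft rolling hills, and each extra octave adds smaller, higher-frequency ridges on top.

```python
import math
import random


def make_value_noise(seed: int, size: int = 64):
    """Return a smooth 2D value-noise function built from a seeded lattice."""
    rng = random.Random(seed)
    lattice = [[rng.random() for _ in range(size)] for _ in range(size)]

    def smoothstep(t: float) -> float:
        return t * t * (3.0 - 2.0 * t)

    def noise(x: float, y: float) -> float:
        x0, y0 = math.floor(x), math.floor(y)
        tx, ty = smoothstep(x - x0), smoothstep(y - y0)
        # Wrap lattice indices so any coordinate is valid.
        i0, j0 = x0 % size, y0 % size
        i1, j1 = (x0 + 1) % size, (y0 + 1) % size
        top = lattice[j0][i0] + (lattice[j0][i1] - lattice[j0][i0]) * tx
        bottom = lattice[j1][i0] + (lattice[j1][i1] - lattice[j1][i0]) * tx
        return top + (bottom - top) * ty

    return noise


def height(noise, x, y, octaves, base_freq=0.05, lacunarity=2.0, gain=0.5):
    """Sum `octaves` noise layers; each layer is finer and weaker than the last."""
    total, amplitude, frequency = 0.0, 1.0, base_freq
    for _ in range(octaves):
        total += amplitude * noise(x * frequency, y * frequency)
        amplitude *= gain          # later layers contribute less height
        frequency *= lacunarity    # later layers vary over shorter distances
    return total


def roughness(noise, octaves, n=64):
    """Mean absolute height difference between horizontally adjacent samples."""
    diffs = []
    for y in range(n):
        row = [height(noise, x, y, octaves) for x in range(n)]
        diffs.extend(abs(a - b) for a, b in zip(row, row[1:]))
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    noise = make_value_noise(seed=42)
    print("1 octave :", roughness(noise, octaves=1))
    print("6 octaves:", roughness(noise, octaves=6))
```

Because the lattice is seeded, the same seed always gives the same terrain, which is also how you would take luck out of the far-view comparison: fix the seed and vary only the octave count.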

How Do the Models Handle Oceans and Rivers?

This one has a clear winner, but neither model actually nailed it. Both Opus 4.6 and 4.7 can generate a big ocean around the terrain. What neither one did is generate rivers or streams cutting through the land. That is still a gap. Both models skipped flowing water inside the terrain: the only inland water is standing lakes, and everything else sits around the edges.

What is funny is that in a previous benchmark where I used Caveman on Opus 4.6, the model did generate river water running inside the terrain. Multi-agent scaffolding bought the older model something the newer one cannot one-shot on its own. So it is not that 4.6 cannot do rivers. It is that neither model reaches for them when you run it one-shot through UnityMCP. If you want broader context on this model in Unity, see my write-up on how Opus 4.6 affects Unity game development.

4.7 still wins this section, because 4.6 did not even finish the ocean correctly. Look closely at the first screenshot and you can see the water plane is grid-based. A flat, tiled checkerboard reads as debug geometry, not water. It breaks immersion instantly. 4.7's ocean, in the second screenshot, is a smooth continuous surface that sits against the shoreline the way you expect water to.

Opus 4.6 ocean rendered as a flat grid-tiled plane next to the terrain, breaking immersion

Opus 4.6 ocean. The water is a flat grid plane, not a proper water surface.

Opus 4.7 ocean blending smoothly against the forested shoreline with natural water tone

Opus 4.7 ocean. Smooth, continuous water surface that reads as actual sea against the coastline.

Which Model Renders a Proper Pine Tree?

This one is not close. Opus 4.7 wins clearly.

If you look closely at the 4.6 forest, something is off with the trees themselves. They come out dark, almost black, and the shading never catches the light the way pine geometry should. My guess is the mesh faces are flipped. If the normals point inward, the lighting is computed as if every face were turned away from the light, and if the triangle winding is reversed as well, Unity's backface culling hides the outer faces and shows you the inside of each tree. Either way you get flat silhouette trees instead of lit pine trees. It is the kind of one-shot bug you would expect to hit and then spend half an hour debugging.

Opus 4.7 gets this right. The trees render with correct winding, lighting hits them from the right side, and you end up with the classic low-poly green pine look that actually reads as a forest. Same prompt, same scene setup, clean output. No flipped normals, no black shadow-trees, just a proper pine forest.

See the two screenshots below.

Opus 4.6 pine trees rendering dark and flat with broken shading, likely flipped mesh normals

Opus 4.6 pine trees. Dark, unlit, almost silhouette-only. Flipped normals are the most likely cause.

Opus 4.7 low-poly green pine trees with clean shading and proper light response

Opus 4.7 pine trees. Correct winding, clean shading, a proper low-poly forest.
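
Since flipped faces are only my guess, here is a minimal sketch of how that guess could be checked on a tree mesh. It is plain Python for illustration, not Unity code and not what either model wrote, and all function names are hypothetical. It assumes the mesh is roughly convex (a low-poly cone qualifies) and that the front face of a triangle `(a, b, c)` is the side that `cross(b - a, c - a)` points toward, which is my understanding of how Unity reads its index order. A face is flagged when that implied normal points toward the mesh centre instead of away from it.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[int, int, int]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def mean_point(points: Sequence[Vec3]) -> Vec3:
    n = len(points)
    return (sum(p[0] for p in points) / n,
            sum(p[1] for p in points) / n,
            sum(p[2] for p in points) / n)


def inward_faces(vertices: Sequence[Vec3], triangles: Sequence[Triangle]) -> List[int]:
    """Indices of triangles whose implied normal points toward the mesh centre.

    Assumption: the mesh is roughly convex, so "away from the centre" is a
    usable stand-in for "outward".
    """
    center = mean_point(vertices)
    flagged = []
    for index, (ia, ib, ic) in enumerate(triangles):
        a, b, c = vertices[ia], vertices[ib], vertices[ic]
        normal = cross(sub(b, a), sub(c, a))        # direction implied by index order
        outward = sub(mean_point([a, b, c]), center)
        if dot(normal, outward) < 0:
            flagged.append(index)
    return flagged


def repair_winding(vertices: Sequence[Vec3], triangles: Sequence[Triangle]) -> List[Triangle]:
    """Swap the last two indices of every flagged triangle to reverse its facing."""
    bad = set(inward_faces(vertices, triangles))
    return [(a, c, b) if i in bad else (a, b, c)
            for i, (a, b, c) in enumerate(triangles)]
```

In a real project the inputs would come from the mesh's `vertices` and `triangles` arrays, and after fixing the index order you would call `RecalculateNormals()` so the lighting matches the new facing.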

What the Numbers Actually Say

If you tally the categories, Opus 4.7 takes three out of four. The far-view beauty shot went to 4.6 because the seed rolled better, but we already said that one does not count. Texture and terrain detail went to 4.7 with its layered displacement. Oceans went to 4.7 because 4.6 shipped grid-water. Pine trees went to 4.7 because 4.6 shipped flipped-normal meshes.

But what does it look like under the hood? Here are the raw generation stats from both runs. Same prompt, same project, same UnityMCP setup.

Benchmark table comparing Opus 4.7 and 4.6 Unity world generation metrics side by side

A couple of things stand out. 4.6 placed twice as many trees as 4.7 and still lost the forest category, because quantity does not help when the meshes render black. 4.6 also split its water into twelve small lakes against 4.7's two big ones. Totally different design instinct, same prompt. And 4.7 used a couple more execute_code calls, meaning it went back to Unity a few more times to land its result. Whether that is "thinking more" or "over-engineering" depends on how you squint at it.

The takeaway is simple. Opus 4.7 one-shots a Unity world that looks like a game you would actually build. Opus 4.6 one-shots a world that looks like a first pass you would still be cleaning up. For raw MCP-driven generation at this stage, 4.7 is the clear win.

The benchmark itself is open source. The prompt, the Unity scene, and the full setup are all in the one-shot-prompt-world-generation-unity repo on GitHub. If you want to run the benchmark yourself, add a new model, or push it further, contributions are welcome.