TL;DR

A recent Google whitepaper argues that in AI-driven software development the model itself accounts for only about 10% of system behavior. Success depends on harness design and context engineering, which shifts attention from model choice to configuration and verification.

A new whitepaper from Google, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that the AI model accounts for only about 10% of the behavior in AI-driven systems. The report emphasizes that the real value comes from the harness—tools, prompts, rules, and context—that surround the model, shifting the focus of AI development and deployment.

The whitepaper, titled The New SDLC With Vibe Coding, argues that the dominant challenge in AI-assisted software engineering is not the model itself but how developers configure, verify, and guide its outputs. It cites experiments in which changing only the harness or context significantly improved agent performance, while swapping the model had minimal impact. For example, one team moved a coding agent from outside the Top 30 to the Top 5 on the Terminal Bench 2.0 benchmark by adjusting the harness alone, with the model unchanged.

Furthermore, the authors differentiate between ‘vibe coding’—quick prompts with minimal oversight—and ‘agentic engineering,’ which involves structured, verified, and monitored AI workflows. They stress that the costs and risks associated with unstructured, prompt-based AI use are high, including token waste, security vulnerabilities, and maintenance burdens. The report suggests that investing in harness and context engineering offers a more sustainable and cost-effective approach.
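To make the verification distinction concrete, here is a minimal Python sketch of a release gate that combines deterministic tests with scored evals. It is an illustration under stated assumptions, not the whitepaper's method: the `gate` function, the `has_summary_field` test, the keyword-based `covers_required_topics` eval (a stand-in for a real rubric or judge), and the 0.8 threshold are all hypothetical.

```python
import json
from typing import Callable, Dict, List

REQUIRED_TOPICS = ["login", "retry"]  # hypothetical rubric for this example


def has_summary_field(output: str) -> bool:
    """Deterministic test: output must be JSON containing a 'summary' key."""
    try:
        return "summary" in json.loads(output)
    except (json.JSONDecodeError, TypeError):
        return False


def covers_required_topics(output: str) -> float:
    """Toy eval: fraction of required topics mentioned (stand-in for a real judge)."""
    text = output.lower()
    hits = sum(topic in text for topic in REQUIRED_TOPICS)
    return hits / len(REQUIRED_TOPICS)


def gate(output: str,
         tests: List[Callable[[str], bool]],
         evals: List[Callable[[str], float]],
         min_eval_score: float = 0.8) -> Dict[str, object]:
    """Ship only if every test passes and the mean eval score clears the bar."""
    tests_passed = all(test(output) for test in tests)
    scores = [score(output) for score in evals]
    # With no evals configured the score is 0.0, so nothing ships unverified.
    eval_score = sum(scores) / len(scores) if scores else 0.0
    ship = tests_passed and eval_score >= min_eval_score
    return {"tests_passed": tests_passed, "eval_score": eval_score, "ship": ship}


checks = ([has_summary_field], [covers_required_topics])
good = gate('{"summary": "Fixes login timeout and adds retry logic"}', *checks)
partial = gate('{"summary": "Fixes login timeout"}', *checks)
bad = gate("Looks good to me", *checks)

for name, result in [("good", good), ("partial", partial), ("bad", bad)]:
    print(name, result)
```

In this sketch the `partial` output passes the deterministic test but fails the eval threshold, which is the case a "does it seem to work?" check would miss.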

At a glance
Report · When: published May 2026
The development: The Google whitepaper highlights that the core of AI-assisted development is not the model but the surrounding harness and context, fundamentally changing software engineering strategies.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability)
~90% is your surface area, not the provider’s.
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: low CapEx · high OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx · low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing (cheap models for the easy work).
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Impact of Harness and Context on AI Development Success

This shift in understanding has major implications for organizations adopting AI. It suggests that the quality, reliability, and cost-efficiency of AI systems will depend more on building robust harnesses and managing context than on access to the latest models. Companies that master configuration and verification can gain a durable competitive advantage, while those fixated on model improvements may see diminishing returns.
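The "Agent = Model + Harness" idea can be illustrated with a small Python sketch. Everything in it is hypothetical and chosen for illustration: the `Harness` class, the `toy_model` stand-in for a real LLM call, and the `parse_date()` project convention. It is not the whitepaper's implementation; it only shows how the same model can fail or pass depending on the context and verification the harness supplies.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for any LLM call: prompt text in, generated text out.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: rules, context, and a verification step."""
    rules: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    verify: Callable[[str], bool] = lambda output: True  # accepts anything by default
    max_attempts: int = 1

    def build_prompt(self, task: str) -> str:
        lines = ["RULES:", *self.rules, "CONTEXT:", *self.context, "TASK:", task]
        return "\n".join(lines)

    def run(self, model: Model, task: str) -> dict:
        prompt = self.build_prompt(task)
        output = ""
        for attempt in range(1, self.max_attempts + 1):
            output = model(prompt)
            if self.verify(output):
                return {"ok": True, "attempts": attempt, "output": output}
        return {"ok": False, "attempts": self.max_attempts, "output": output}


def toy_model(prompt: str) -> str:
    """Fake model: follows the project convention only if the prompt states it."""
    if "parse_date()" in prompt:
        return "return parse_date(raw)"
    return "return datetime.strptime(raw, '%Y-%m-%d')"


def uses_project_helper(output: str) -> bool:
    """Verification rule for this example: the project helper must be called."""
    return "parse_date(" in output


task = "Write the body of a function that parses the date string `raw`."
bare = Harness(verify=uses_project_helper).run(toy_model, task)
rich = Harness(
    context=["Project convention: call parse_date() for all date parsing."],
    verify=uses_project_helper,
).run(toy_model, task)

print(bare["ok"], rich["ok"])  # prints: False True
```

The model function is identical in both runs; only the harness changes. That is the sense in which a missing piece of context reads as a configuration failure rather than a model failure.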

Evolution of AI Coding Practices and Industry Insights

The whitepaper builds on an ongoing trend of integrating AI into software workflows. As of early 2026, reports indicate that 85% of developers use AI coding agents, with more than half doing so daily. The industry is moving from vibe coding (quick prompts, minimal oversight) toward more disciplined, structured approaches. Earlier attention centered on model capabilities, but recent experiments and benchmarks suggest that configuration, scaffolding, and context management are now the key differentiators.

This perspective aligns with broader industry shifts emphasizing verification, testing, and cost management over raw model performance, reflecting a maturation in AI integration practices.
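The cost-management argument, framed in the whitepaper as CapEx versus OpEx, reduces to simple break-even arithmetic. The Python sketch below assumes a linear cost model (upfront cost plus a fixed cost per feature). All dollar figures are invented for illustration and do not come from the whitepaper; the only constraint taken from it is that the per-feature cost ratio falls inside the quoted 3 to 10 times range.

```python
import math


def total_cost(capex: float, opex_per_feature: float, features: int) -> float:
    """Cumulative cost under a linear model: upfront spend plus per-feature spend."""
    return capex + opex_per_feature * features


def break_even_features(capex_agentic: float, opex_agentic: float,
                        capex_vibe: float, opex_vibe: float) -> int:
    """Smallest feature count at which the agentic approach costs no more."""
    if opex_vibe <= opex_agentic:
        raise ValueError("No break-even: vibe coding is not costlier per feature.")
    extra_upfront = capex_agentic - capex_vibe
    savings_per_feature = opex_vibe - opex_agentic
    return max(0, math.ceil(extra_upfront / savings_per_feature))


# Hypothetical numbers: vibe coding costs 5x more per feature (500 vs 100),
# while the agentic setup needs 20,000 upfront for specs, evals, and context.
VIBE = {"capex": 0.0, "opex": 500.0}
AGENTIC = {"capex": 20_000.0, "opex": 100.0}

n = break_even_features(AGENTIC["capex"], AGENTIC["opex"], VIBE["capex"], VIBE["opex"])
print("break-even at", n, "features")
print("vibe:", total_cost(VIBE["capex"], VIBE["opex"], n))
print("agentic:", total_cost(AGENTIC["capex"], AGENTIC["opex"], n))
```

Under these assumed numbers the upfront investment pays for itself at 50 features; below that count the low-CapEx approach is still cheaper, which is why it "looks free" early on.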

“The behavior you experience in AI tools is dominated by scaffolding you can build, own, and improve, not the model itself.”

— Addy Osmani

Unclear Aspects of Model-Harness Dynamics

It remains unclear how universally applicable these findings are across different AI tasks and industries. The specific impact of harness design versus model improvements in real-world, large-scale deployments needs further empirical validation. Additionally, the long-term effects of this paradigm shift on AI model development strategies are still emerging.

Future Directions in AI System Engineering

Organizations are likely to invest more in developing sophisticated harnesses, context management, and verification frameworks. Further research will explore best practices for scalable harness design and the integration of dynamic context loading. Monitoring how this shift influences AI model development and industry standards will be key in the coming months.

Key Questions

Why is the model only 10% of the system behavior?

The whitepaper argues that the surrounding harness (prompts, rules, tools, and context) influences an AI system’s output far more than the model itself, which it estimates at roughly 10% of overall behavior.

How does this change AI development strategies?

It shifts focus from chasing better models to designing better harnesses, managing context, and implementing verification to improve performance and reduce costs.

What are the risks of vibe coding versus agentic engineering?

Vibe coding, which relies on quick prompts and minimal oversight, can lead to high token costs, security vulnerabilities, and maintenance issues. Agentic engineering emphasizes structured, verified workflows, which are more cost-effective long-term.

Will this approach work for all AI tasks?

It is still uncertain how universally applicable this paradigm is across different domains. More research and real-world testing are needed to validate its effectiveness broadly.

Source: ThorstenMeyerAI.com
