The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent Google whitepaper argues that in AI-driven software development the model accounts for only about 10% of system behavior. The rest comes from the harness and context engineering, which drive both performance and cost-efficiency.

A new Google whitepaper titled “The New SDLC With Vibe Coding” argues that the AI model accounts for only about 10% of a system’s behavior, and that harness and context engineering are the key to effective AI development. This challenges the conventional wisdom that model improvements alone drive progress and highlights instead the importance of configuration, verification, and strategic context management.

The paper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, reports that 85% of professional developers use AI coding agents regularly, with over half using them daily. It states that 41% of new code is AI-generated, but the most significant insight is that the model’s influence is only about 10% of overall system behavior. The remaining 90% depends on the harness — prompts, tools, rules, and observability — which can be configured and optimized to improve results without changing the underlying model.

Experiments cited in the paper show teams moving the same model from lower to top-tier benchmark performance by modifying only the harness. In one case, a coding agent went from outside the Top 30 to the Top 5 on Terminal Bench 2.0 after changes to prompts and tools, not the model itself. The authors argue that failures in AI agents are more often due to misconfiguration or missing tools than to model deficiencies.

The paper advocates for a shift in strategy: investing in harness and context engineering, which can be owned and improved by organizations, rather than chasing the latest model upgrades. This approach also offers better economic efficiency, as ad-hoc prompting can lead to higher token costs and maintenance burdens over time.
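To make the cost argument concrete, here is a minimal sketch of a linear cost model that compares ad-hoc prompting (no upfront cost, high cost per feature) with an engineered harness (upfront cost, low cost per feature). The function names and every number below are invented for illustration; none of them come from the paper, and real costs are unlikely to be this linear.

```python
from typing import Optional


def cumulative_cost(upfront: int, per_feature: int, features: int) -> int:
    """Total spend after shipping `features` features under a linear cost model."""
    return upfront + per_feature * features


def break_even_features(upfront: int, per_feature_engineered: int,
                        per_feature_adhoc: int) -> Optional[int]:
    """Smallest feature count at which the engineered approach is no more
    expensive than ad-hoc prompting, or None if it never catches up."""
    saving = per_feature_adhoc - per_feature_engineered
    if saving <= 0:
        return None  # the upfront investment never pays for itself
    return -(-upfront // saving)  # ceiling division on integers


# Hypothetical figures (arbitrary currency units), chosen only to show the shape.
UPFRONT = 5_000          # specs, evals, context work paid once
ENGINEERED_PER = 50      # per-feature cost once the harness exists
ADHOC_PER = 300          # per-feature cost with fix-it loops and rework

n = break_even_features(UPFRONT, ENGINEERED_PER, ADHOC_PER)
adhoc_total = cumulative_cost(0, ADHOC_PER, n)
engineered_total = cumulative_cost(UPFRONT, ENGINEERED_PER, n)
```

With these made-up inputs the two approaches cost the same at 20 features, and the engineered approach is cheaper for every feature after that. The useful part is the structure: the break-even point depends only on the upfront investment and the per-feature saving.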

At a glance
Report
When: published May 2026
The development: The new SDLC framework holds that the core of effective AI development is not the model itself but the surrounding harness and context management.
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified

Vibe Coding: casual prompts; “does it seem to work?” checks; disposable code; high risk.
Structured AI-Assisted: detailed prompts plus constraints; manual testing; features in real codebases.
Agentic Engineering: formal specs; automated tests, evals, and CI gates; production scale; low risk.

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
The idea worth building your strategy around

Agent = Model + Harness

Model: ~10% of agent behavior.
Harness (prompts, tools, context, hooks, sandboxes, observability): ~90%. This is your surface area, not the provider’s.

One cited result: an agent moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.

“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
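One way to picture “Agent = Model + Harness” is as two separate pieces of configuration, where only the harness changes between iterations. The sketch below is a hypothetical data model, not a structure defined in the paper; the field names, the `missing_tools` check, and the model identifier are assumptions for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Harness:
    """Everything around the model that the team owns (field names are illustrative)."""
    system_prompt: str
    tools: frozenset = frozenset()
    rules: tuple = ()
    max_context_tokens: int = 8_000


@dataclass(frozen=True)
class Agent:
    model: str        # provider's model identifier, held constant below
    harness: Harness  # the part a team can configure and improve


def missing_tools(harness: Harness, required: set) -> set:
    """A simple configuration check: tools the task needs that the harness lacks."""
    return required - harness.tools


baseline = Agent(model="some-model", harness=Harness(system_prompt="Fix the bug."))

# Improve the agent by changing only the harness; the model field is untouched.
tuned = replace(
    baseline,
    harness=replace(
        baseline.harness,
        tools=frozenset({"run_tests", "read_file"}),
        rules=("Run the tests before declaring success.",),
    ),
)

gaps_before = missing_tools(baseline.harness, {"run_tests"})
gaps_after = missing_tools(tuned.harness, {"run_tests"})
```

Here the baseline agent fails the check because it has no way to run tests, which is the kind of configuration failure the quote describes. The tuned agent passes with exactly the same model.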
The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding: low CapEx, high OpEx. It looks free but hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), and security remediation. Costs eventually cross over to 3–10× more per feature.
Agentic Engineering: high CapEx, low OpEx. Pay upfront (specs, evals, context), then ship cheaply. The levers are context engineering for first-pass success and intelligent model routing, with cheap models for the easy work.
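Model routing can be as simple as a rule that sends small, mechanical tasks to a cheaper model. The sketch below is one possible heuristic under stated assumptions: the `Task` fields, the two tier names, and the two-file cutoff are hypothetical, and a real router would likely use richer signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    description: str
    files_touched: int
    needs_design_decision: bool


# Hypothetical tier names; substitute whatever models a team actually uses.
CHEAP_MODEL = "small-model"
STRONG_MODEL = "large-model"


def route(task: Task, max_easy_files: int = 2) -> str:
    """Send small, mechanical tasks to the cheap tier and everything else
    to the strong tier."""
    is_easy = task.files_touched <= max_easy_files and not task.needs_design_decision
    return CHEAP_MODEL if is_easy else STRONG_MODEL


rename = route(Task("rename a variable", files_touched=1, needs_design_decision=False))
redesign = route(Task("redesign auth flow", files_touched=1, needs_design_decision=True))
bulk_edit = route(Task("update imports everywhere", files_touched=5, needs_design_decision=False))
```

Keeping the routing rule in the harness rather than in individual prompts means the cutoff can be tuned against measured cost and eval results without touching the model.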
85% of developers use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR); verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

Impact of Harness and Context on AI Development

This insight changes how organizations should approach AI integration. Instead of focusing solely on acquiring or developing larger models, companies can achieve better results by optimizing their harness (prompts, tools, and verification processes). This shift can lead to significant cost savings and more reliable AI systems, since by the paper’s account the model has less influence on outcomes than the surrounding infrastructure.

Furthermore, this perspective empowers organizations to build durable competitive advantages through configuration, testing, and strategic context management, rather than relying solely on model innovation. It also highlights the importance of skills in context engineering, which will become a core competency in AI-driven software development.

Background on the Shift Toward Context Engineering

Historically, improvements in AI have been driven by larger models and better training data. However, recent developments show that the biggest gains now come from how models are integrated and managed within systems. The paper references experiments where tweaking prompts, tools, and rules significantly outperformed simply upgrading models.

As of early 2026, AI adoption in professional development is widespread, with 85% of developers using AI agents regularly. Despite rapid model improvements, the industry is recognizing that the surrounding infrastructure — the harness — is where the real value and differentiation lie. This is a notable departure from earlier focus areas and indicates a maturation in AI system engineering.

“The model accounts for only 10% of what determines behavior; the harness is 90%.”

— Addy Osmani

Uncertainties About Practical Implementation

While the paper provides compelling evidence and experiments, it remains to be seen how broadly these insights will be adopted across different industries and system types. It is also unclear how organizations will handle the transition from model-centric to harness-centric development, especially in legacy systems or less mature teams. Additionally, the long-term impact of this shift on AI model innovation and competition is still developing.

Next Steps for Organizations and Developers

Organizations should evaluate their current AI workflows, focusing on harness design, prompt engineering, and verification processes. Building expertise in context engineering will be critical, alongside developing tools and frameworks that support configurable harnesses. Industry-wide, expect a shift toward more disciplined AI system engineering practices, with increased emphasis on configuration, testing, and cost management. Further research and case studies are anticipated to validate and expand on these findings.

Key Questions

Why is the model only 10% of the system’s behavior?

Experiments and benchmarks show that the surrounding harness — prompts, tools, rules, and observability — has a much larger influence on AI performance than the model itself. Proper configuration and management can significantly improve results without changing the underlying model.

How does this change AI development strategies?

Organizations should prioritize building and optimizing their harness and context management systems rather than solely investing in larger or newer models. This approach offers better control, cost efficiency, and reliability.

What skills will be important in this new SDLC?

Skills in context engineering, prompt design, system configuration, and verification will become core competencies, enabling teams to effectively manage and improve AI systems.

Does this mean model improvements are no longer valuable?

Model improvements remain important, but their impact is now comparatively smaller. The focus shifts to how models are integrated and managed within the system, which can yield more immediate and cost-effective results.

What are the risks of focusing on harness and context?

Over-reliance on configuration and context management could lead to complexity and maintenance challenges. Ensuring robustness and security in these areas will be critical to avoid vulnerabilities and system failures.

Source: ThorstenMeyerAI.com
