How We Built Agent "Wiring" You Can Actually Read and Reuse

Your AI Agent's Secret Weakness Isn't the Model — It's the Harness

TL;DR — The scaffolding around your AI agent (how it breaks down tasks, manages memory, and decides when to stop) matters more than you think, but it's usually buried in messy code that's impossible to compare or improve systematically.

What It Is

When you build an AI agent that codes or controls a computer, you're not just writing prompts. You're building a "harness" — the control logic that decides how to break work into steps, what to remember, when to call tools, and when to stop trying. Right now, this harness logic is scattered across Python files, framework defaults, and runtime assumptions, making it nearly impossible to isolate what actually helps.

These researchers asked: what if we wrote the harness itself in plain English instead of code? They built Natural-Language Agent Harnesses (NLAHs) — structured text files that describe agent control flow — and a runtime that can execute them directly. Think of it like moving from hardcoded game rules to a config file, except the config file is readable instructions a human could follow.
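This summary doesn't reproduce an actual harness file, so here is a purely hypothetical sketch of what a structured plain-English harness might look like — the stage names, keywords, and syntax are invented for illustration, not the paper's format:

```text
HARNESS: fix-failing-test
STAGE plan:
  Read the failing test and summarize the expected behavior in one paragraph.
STAGE execute:
  Edit the smallest set of files needed to make the test pass.
  TOOLS: editor, shell
STAGE verify:
  Re-run the test suite.
  IF any test fails AND attempts < 3: GO TO execute
  IF attempts >= 3: STOP with status "needs-human"
STOP WHEN: all tests pass
```

The point is that control flow, tool access, retry limits, and stopping conditions all live in one readable file instead of being spread across orchestration code.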

They tested this on coding benchmarks and computer-use tasks, showing you can migrate working agent systems from code to text without losing performance, and cleanly swap components to see what actually matters.

Why It Matters

  • You can finally A/B test agent architecture. Want to know if adding reflection actually helps your coding agent? With text-based harnesses, you can swap that module in and out without rewriting your entire stack or wondering if you changed ten other things by accident.
  • Agent recipes become portable. That clever multi-step verification pattern you built? It's currently locked in your codebase. Text harnesses let you share, version, and reuse agent control patterns like you share prompts today — except these actually capture the full orchestration logic.
  • Debugging gets way easier. When your agent fails after 47 steps, you currently dig through stack traces and framework internals. A readable harness file shows you exactly which stage failed, what artifacts were missing, and what the stopping condition was supposed to be.
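As a toy illustration of the A/B-testing point (this is invented code, not the paper's runtime): if a harness is just an ordered list of named stages, adding or removing a reflection stage becomes a one-line config diff rather than a code change.

```python
# Toy illustration (not the paper's runtime): a harness as an ordered list of
# named stages, so A/B-testing a "reflect" stage is a one-line config change.

def run_harness(stages, handlers, state):
    """Run each named stage's handler in order, threading state through."""
    for stage in stages:
        state = handlers[stage](state)
    return state

# Hypothetical stage handlers; each takes a state dict and returns a new one.
handlers = {
    "plan":    lambda s: {**s, "trace": s["trace"] + ["plan"]},
    "execute": lambda s: {**s, "trace": s["trace"] + ["execute"]},
    "reflect": lambda s: {**s, "trace": s["trace"] + ["reflect"]},
    "verify":  lambda s: {**s, "trace": s["trace"] + ["verify"]},
}

baseline = ["plan", "execute", "verify"]
variant  = ["plan", "execute", "reflect", "verify"]  # the only diff

run_harness(baseline, handlers, {"trace": []})
```

Everything else — handlers, state, logging — stays fixed between the two runs, so any performance difference is attributable to the swapped stage.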

One Thing to Try

Next time you're debugging a multi-step agent workflow, write out the control flow in plain English as if explaining it to a colleague: "First, plan the approach. Then, execute each step and verify. If verification fails three times, escalate." If you can't write it clearly, your code probably can't execute it reliably either. Use that text as your design doc before you write any orchestration code.
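Written as code, that plain-English description is a short loop. The function names below are placeholder stand-ins, not an API from the paper — the value is that the loop structure maps one-to-one onto the sentences above:

```python
# Sketch of the described control flow: plan, execute each step with
# verification, escalate after three consecutive verification failures.
# `plan`, `execute_step`, `verify`, and `escalate` are hypothetical callables.

MAX_VERIFY_FAILURES = 3

def run_workflow(task, plan, execute_step, verify, escalate):
    steps = plan(task)
    for step in steps:
        failures = 0
        while True:
            result = execute_step(step)
            if verify(step, result):
                break  # step verified; move on to the next one
            failures += 1
            if failures >= MAX_VERIFY_FAILURES:
                return escalate(step, result)  # give up on this step
    return "done"
```

If the English version is ambiguous (does "three times" reset per step or count globally?), the code forces you to decide — which is exactly the design-doc value the advice is pointing at.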
