Distilled Weekly — Mar 02 - Mar 08, 2026
This week we're diving deep into making AI agents actually useful — which means teaching them to remember what they've learned, know their limits, and verify their own work. We've got fascinating papers on everything from giving agents memory systems that work like notebooks to training medical AI through real diagnostic reasoning instead of multiple-choice tests. Plus, Meta's work on chatbots that improve through conversation and a clever approach to generating harder math problems using code generation.
This Week's Papers
1. Teaching AI Agents to Explore Like They Have a Notebook: How Memory Unlocks Better Problem-Solving
A new training approach lets AI agents remember past attempts and learn from them at the same time, doubling performance on complex tasks and letting them tackle new problems without retraining.
2. Training Medical AI to Think Like a Doctor: How Reinforcement Learning Beats Multiple Choice
Researchers built a medical AI that explains its reasoning before answering questions about X-rays and CT scans, using a clever reward system that judges both correctness and clarity without needing millions of labeled examples.
3. How Meta Built a Chatbot That Gets Better Every Time You Talk to It
Meta improved its AI chatbot by repeatedly testing it with real users on Instagram, WhatsApp, and Messenger, then training new versions on whatever kept people engaged. After 15 iterations, conversation depth rose 19% and the AI's adherence to character instructions nearly doubled.
4. How AI Can Stop Fooling Itself by Actually Checking Its Math Answers
Letting AI models improve themselves by voting on their own answers sounds great, but they often agree on the wrong answer. Adding a simple code-execution step to verify answers before learning from them fixes this problem.
5. Teaching AI Agents When to Say "No" Before They Break Something
When AI agents can use tools and take actions, teaching them to refuse unsafe requests is just as important as teaching them to complete tasks. A new training method cuts harmful behavior by 50% while keeping helpful tasks running smoothly.
6. Can AI Agents Create Harder Math Problems By Writing Code?
Researchers built a system where AI coding agents take existing math problems and automatically generate harder versions that are still solvable, potentially easing the shortage of challenging problems needed to train advanced math AI.
That's a wrap for this week. Hit reply if any of these sparked an idea.
— Santthosh