Distilled Weekly — Mar 02 - Mar 08, 2026
This week we're diving deep into making AI agents actually useful — which means teaching them to remember what they've learned, know their limits, and verify their own work. We've got fascinating papers on everything from giving agents memory systems that work like notebooks to training medical AI through real diagnostic reasoning instead of multiple-choice tests. Plus, Meta's work on chatbots that improve through conversation and a clever approach to generating harder math problems using code generation.
This Week's Papers
1. Teaching AI Agents to Explore Like They Have a Notebook: How Memory Unlocks Better Problem-Solving
A new training approach lets AI agents remember past attempts and learn from them at the same time, doubling performance on complex tasks and letting them tackle new problems without retraining.
2. Training Medical AI to Think Like a Doctor: How Reinforcement Learning Beats Multiple Choice
Researchers built a medical AI that explains its reasoning before answering questions about X-rays and CT scans, using a clever reward system that judges both correctness and clarity without needing millions of labeled examples.
3. How Meta Built a Chatbot That Gets Better Every Time You Talk to It
Meta improved its AI chatbot by repeatedly testing it with real users on Instagram, WhatsApp, and Messenger, then training new versions on whatever kept people engaged. After 15 iterations, conversation depth rose 19% and the AI's adherence to character instructions nearly doubled.
4. How AI Can Stop Fooling Itself by Actually Checking Its Math Answers
Letting AI models improve themselves by voting on their own answers sounds great, but they often agree on the wrong answer. Adding a simple code-execution step to verify answers before learning from them fixes this problem.
5. Teaching AI Agents When to Say "No" Before They Break Something
When AI agents can use tools and take actions, teaching them to refuse unsafe requests is just as important as teaching them to complete tasks. A new training method cuts harmful behavior by 50% while keeping helpful tasks running smoothly.
6. Can AI Agents Create Harder Math Problems By Writing Code?
Researchers built a system where AI coding agents take existing math problems and automatically generate harder versions that are still solvable, potentially easing the shortage of challenging problems needed to train advanced math AI.
That's a wrap for this week. Hit reply if any of these sparked an idea.
— Santthosh