Max Agency

LangChain

Released: 2026-04-09

Free New

2 Episodes

Audio

Most Recent Episode

How Hex Builds AI Agents: Making Agents Reason Like Human Data Analysts | Izzy Miller, AI Engineer

Time: 1:08:20

Izzy Miller is an AI engineer at Hex, an AI analytics platform that was one of the first companies to ship data agents to real paying users. Today, Hex runs a multi-agent system with nearly 100K tokens of tools, and Izzy is building a 90-day simulation to evaluate whether those agents actually get smarter over time. In this conversation, he walks through the harness decisions that shaped their architecture, the failure modes Hex is seeing at scale, and what it takes to build an eval that no current model can pass.

We also discuss:

Why data agents are harder to verify than coding agents

Under the hood of Hex’s agents

How Hex is unifying separate agents

Why most eval sets are bad

The 90-day simulation for long-horizon evals

How Izzy went from marketing to AI engineer

References:

Andon Labs, Anthropic, Barry McCardel, ChatGPT, Claude Code, Claude Sonnet 4.6, DBT, GPT-3.5 Turbo, GPT-5.3 Codex Spark, GPT-5.4, Hex, LangChain, LangSmith, Looker, OpenAI, Opus 4.6, Satya Nadella, Snowflake, Vending Machine

Where to find Izzy:

LinkedIn, Twitter/X

Where to find Harrison:

LinkedIn, Twitter/X

Where to find LangChain:

Website, Docs

Send feedback or questions to maxagency@langchain.dev

Timestamps:

01:35 Where Hex's notebook agent started

03:46 The moment Hex knew it was time for agents

07:36 Why data agents are harder to verify than coding agents

09:30 How Hex is unifying separate agents

13:28 Under the hood of the notebook agent

15:41 The harness features that are now holding the agent back

17:41 Why Hex built their own orchestrator

18:59 Managing nearly 100K tokens of tools

20:49 Ephemeral queries and agent behavior trade-offs

24:46 The UX problem with showing agents' thinking

27:28 Why verification is harder than transparency for data agents

31:00 Memory, context conflicts, and collapse modes

34:38 How Hex built their internal eval system

39:29 Why most eval sets are bad

44:30 The 900% quota eval that every model fails

46:55 Model upgrades and the "in distribution" debate

51:34 How Izzy went from marketer to AI engineer

59:59 The 90-day simulation for long-horizon evals

Episode ID: 1000760489140

GUID: a9f9dfd9-c3c5-4ac1-9aec-d97cad9b0f7a

Release Date: 09/04/2026, 17:00:00

Description

Welcome to Max Agency, a podcast about how the best AI agents are actually being built. Hosted by Harrison Chase, CEO of LangChain, each episode goes deep with the builders designing, deploying, and learning from real agent systems in the wild. From architecture decisions to evals, tooling, and failure modes, Max Agency is for people who want to understand what it really takes to build useful agents.

Feed URL

https://anchor.fm/s/1112ba400/podcast/rss
