Security

Three Hops Deep and No Browser in Sight

2975 words · 14 mins
You’ve seen this screen. If you use Claude Code or Cursor with MCP servers, you’ve clicked through it dozens of times. “Google MCP Server wants to access your Google Account.” You review the scopes, click Allow, a token lands in your local config, and everything works. Now imagine the agent that needs your Google Calendar isn’t the one you’re talking to. It’s three agents deep in a multi-agent chain, running in a container with no browser and no way to show you a consent screen.

Red Teaming Agents, Not Models

Your agent passed every guardrail test. It never says anything harmful, never generates offensive content, politely declines every adversarial prompt you throw at it. And last Tuesday, it quietly deleted the wrong database because a Jira ticket it was reading contained a hidden instruction in the description field. The guardrails caught everything the agent said. They caught nothing about what it did.