Auth is the part of backend architecture that feels solved until you add a second service. Then it feels solved again until you add an agent that needs to act on behalf of a user across three services simultaneously. Then you rebuild it.

This is the story of our third implementation, and why it is the one that held.

The problem with JWTs at service boundaries

JWTs are good for user → service authentication. They are awkward for service → service calls because:

  1. You don't want a service to hold a user's JWT and forward it: that couples services to user session lifetime and exposes the token to every service it passes through
  2. You don't want services to share a single secret: a compromise in one service compromises all of them
  3. You need a way to express "service A is calling service B on behalf of user U with scope S"

Our first implementation used a single long-lived JWT secret shared across all services. It worked until we had a leaked credential incident in a dev environment and realized we couldn't rotate the secret for the affected service without rotating it for every service at once.

What we actually built

The current system has three token types:

User access tokens — short-lived JWTs (15 min), issued by the auth service, validated at the gateway. These never leave the gateway. Downstream services never see them.

Service-to-service tokens — asymmetrically signed JWTs issued per service pair. Service A has a private key; Service B trusts A's public key. Rotatable independently. Issued with a 5-minute TTL and cached in-process.
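The in-process caching is the part that is easy to get subtly wrong, so here is a minimal sketch of it, not our production code. It assumes a hypothetical mint callable that signs a fresh token for a target service, and a refresh margin (my assumption, not part of the design above) so a token is never sent seconds before it expires.

```python
import time
from typing import Callable, Dict, Tuple

TOKEN_TTL_SECONDS = 5 * 60    # the 5-minute TTL described above
REFRESH_MARGIN_SECONDS = 30   # assumption: re-mint slightly before expiry


class ServiceTokenCache:
    """In-process cache of service-to-service tokens, keyed by target service."""

    def __init__(self, mint: Callable[[str], str],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._mint = mint      # hypothetical: signs a fresh token for `target`
        self._clock = clock    # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, target: str) -> str:
        now = self._clock()
        entry = self._entries.get(target)
        # Reuse the cached token while it is comfortably inside its TTL.
        if entry is not None and now < entry[1] - REFRESH_MARGIN_SECONDS:
            return entry[0]
        token = self._mint(target)
        self._entries[target] = (token, now + TOKEN_TTL_SECONDS)
        return token
```

Passing the clock in keeps the expiry logic testable without sleeping.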

Delegation tokens — the interesting one. When the gateway validates a user token and routes to Service A, it issues a delegation token: {caller: "service-a", subject: user_id, scopes: [...], expires: now+5min}. Service A passes this downstream. Service B can verify the delegation chain without re-validating the user token.
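To make the shape concrete, here is a rough sketch of the claims and the check a downstream service might run. The field names follow the token above; the function names are hypothetical, and signature verification is deliberately left out, so read this as the claims logic only.

```python
import time
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

DELEGATION_TTL_SECONDS = 5 * 60  # expires: now+5min


@dataclass(frozen=True)
class DelegationToken:
    caller: str              # service the token was issued to, e.g. "service-a"
    subject: str             # user_id the call is made on behalf of
    scopes: FrozenSet[str]
    expires: float           # Unix timestamp


def issue_delegation(caller: str, subject: str, scopes: Iterable[str],
                     now: Optional[float] = None) -> DelegationToken:
    """Gateway side: runs after the user token has been validated."""
    issued_at = time.time() if now is None else now
    return DelegationToken(caller, subject, frozenset(scopes),
                           issued_at + DELEGATION_TTL_SECONDS)


def authorize(token: DelegationToken, required_scope: str,
              now: Optional[float] = None) -> bool:
    """Downstream side: claims check only (signature check assumed done)."""
    current = time.time() if now is None else now
    return current < token.expires and required_scope in token.scopes
```

Note that Service B only needs the token itself to make this decision; the user token is never involved.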

The delegation token pattern is what made MCP session auth tractable. An MCP session runs under a delegation token, not a user token, so we can expire MCP sessions independently of user sessions and audit the delegation chain in our logs.

The rebuild trigger

Our second implementation had a correct design but a broken assumption: we assumed token validation would always hit the auth service. We had no local validation.

Under load, this created a latency dependency: every request, at every service hop, waited on an auth service round-trip. With call chains 10 services deep, we were adding 80ms of auth overhead to every agent invocation.

The fix was obvious in retrospect: validate the signature locally (cached public keys, refreshed every 5 minutes), and only call the auth service for revocation checks on sensitive scopes. 99% of requests now validate in microseconds.
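A minimal sketch of that split, under stated assumptions: fetch_keys, signature_ok, and is_revoked are hypothetical callables standing in for the key endpoint, a real signature check, and the auth service's revocation lookup, and the sensitive scope name is made up.

```python
import time
from typing import Callable, Dict, FrozenSet, Iterable, Optional

KEY_REFRESH_SECONDS = 5 * 60  # refresh interval from the text
SENSITIVE_SCOPES: FrozenSet[str] = frozenset({"payments:write"})  # hypothetical


class PublicKeyCache:
    """Holds verification keys locally and refetches them every 5 minutes."""

    def __init__(self, fetch_keys: Callable[[], Dict[str, bytes]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_keys = fetch_keys
        self._clock = clock
        self._keys: Dict[str, bytes] = {}
        self._fetched_at: Optional[float] = None

    def get(self, key_id: str) -> Optional[bytes]:
        now = self._clock()
        stale = (self._fetched_at is None
                 or now - self._fetched_at >= KEY_REFRESH_SECONDS)
        if stale:
            self._keys = self._fetch_keys()
            self._fetched_at = now
        return self._keys.get(key_id)


def validate(token_id: str, key_id: str, scopes: Iterable[str],
             keys: PublicKeyCache,
             signature_ok: Callable[[bytes], bool],
             is_revoked: Callable[[str], bool]) -> bool:
    key = keys.get(key_id)
    if key is None or not signature_ok(key):
        return False                      # rejected without any network call
    if SENSITIVE_SCOPES.intersection(scopes):
        return not is_revoked(token_id)   # the only remote round-trip
    return True
```

The common path touches only local state; the revocation call happens only when a sensitive scope is present.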

Scopes for agent actions

The part that gets complicated with agentic systems: an agent acts on behalf of a user, but the scope of what it can do should be narrower than the user's full permissions.

We model this as scope intersection. The user token has a full scope set. The agent session is issued with an explicit scope list. The effective permission is user_scopes ∩ agent_scopes. The agent can never do more than the user; the user can explicitly limit what the agent can do.
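In code the intersection is close to a one-liner. A small sketch, with hypothetical function and scope names:

```python
from typing import FrozenSet, Iterable


def effective_scopes(user_scopes: Iterable[str],
                     agent_scopes: Iterable[str]) -> FrozenSet[str]:
    """user_scopes ∩ agent_scopes: the agent never exceeds the user."""
    return frozenset(user_scopes) & frozenset(agent_scopes)


def agent_may(action_scope: str, user_scopes: Iterable[str],
              agent_scopes: Iterable[str]) -> bool:
    return action_scope in effective_scopes(user_scopes, agent_scopes)


user = {"docs:read", "docs:write", "billing:read"}
agent = {"docs:read", "admin:all"}  # "admin:all" is not held by the user

print(sorted(effective_scopes(user, agent)))  # ['docs:read']
```

A scope the agent requests but the user lacks simply drops out, so an over-broad agent request cannot escalate privileges.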

This is the same model OAuth uses for third-party app authorization. It turns out to be the right model for agents too, and framing it that way made the implementation obvious.

What I'd tell past-me

  • Asymmetric signing from day one. The ability to rotate one service's trust independently is worth the setup overhead.
  • Local validation from day one. Centralizing validation feels clean; under load it is a bottleneck.
  • Log the delegation chain, not just the subject. When something goes wrong with an agent action, you want to know which service authorized what.
  • Keep token issuance in one place. The temptation to let services issue their own tokens is real and wrong.
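On the logging point, a sketch of what "log the delegation chain" can look like in practice. The record fields and logger name are assumptions for illustration, not our actual log schema.

```python
import json
import logging
from typing import Sequence

logger = logging.getLogger("auth.audit")  # hypothetical logger name


def audit_record(subject: str, chain: Sequence[str],
                 scope: str, action: str) -> str:
    """Serialize one audit entry; `chain` lists services in call order."""
    return json.dumps({
        "subject": subject,
        "delegation_chain": list(chain),
        "scope": scope,
        "action": action,
    }, sort_keys=True)


def log_agent_action(subject: str, chain: Sequence[str],
                     scope: str, action: str) -> None:
    logger.info(audit_record(subject, chain, scope, action))
```

With the chain in every record, "which service authorized what" becomes a log query instead of a reconstruction exercise.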