Layered Context: An Adaptive Memory Architecture for Agents

Once you're past isolated sessions, memory stops being a feature you bolt on and becomes the medium the agent runs on.

by Rishi Dean

May 20, 2026

The Core Idea

Andrej Karpathy named something most builders had been quietly doing for a year: the shift from “prompt engineering” to “context engineering” (tweet). Industrial-strength LLM apps don’t live or die on cleverer prompts. They live or die on what gets packed into the context window for the next step. It’s a sharp framing. It’s also incomplete.

Karpathy’s framing works in single-player mode. One user, one session, one task. Build agents that serve a thousand users, or one user across fifty projects over two years, and “what to pack into the window” stops being the right question. The question becomes: what’s the architecture that decides what gets packed, when, and at what density?

Once you’re past isolated sessions, memory isn’t a feature you bolt on with a vector store. It’s the medium the agent runs on. Every layer of an agent system is a form of memory, crystallized at different densities. What most people call “agent memory” is just the most volatile layer of a deeper structure. Treating it like the whole thing is what makes agents feel amnesiac no matter how many tokens you throw at them.

This is the post-corpus question. The Context Inversion argued for building a persistent, machine-readable corpus to replace runtime stitching. Once you have one, what’s inside it? How is it structured? How does it adapt? Four layers, each answering a different question: WHAT exists, WHY the agent behaves the way it does, HOW it operates right now, and WHO it’s serving.

The Four Layers

Layer 0: Ontology (WHAT)

The world model. Entities, relationships, valid state transitions, the tool graph. The physics of the agent’s world.

In a productivity agent: what a task is, what an initiative is, how they relate, what operations are legal. In a coding agent: repos, files, dependencies, test suites. In a sales agent: accounts, opportunities, contacts, pipeline stages.

The agent cannot reason without it. If you’ve written a YAML schema describing your data model, or a CLAUDE.md defining your project’s entity structure, that’s Layer 0.
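To make that concrete, here is a minimal sketch of a Layer 0 definition for the productivity-agent example. Everything in it is an assumption for illustration: the `EntityType` class, the task states, and the `contains` relationship are hypothetical names, not a schema from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityType:
    """One kind of thing in the agent's world, plus its legal state changes."""
    name: str
    states: tuple
    transitions: frozenset  # set of (from_state, to_state) pairs

    def can_transition(self, src: str, dst: str) -> bool:
        # A move is legal only if the ontology lists it explicitly.
        return (src, dst) in self.transitions


# Hypothetical task lifecycle for a productivity agent.
TASK = EntityType(
    name="task",
    states=("open", "in_progress", "done"),
    transitions=frozenset({
        ("open", "in_progress"),
        ("in_progress", "done"),
        ("in_progress", "open"),
    }),
)

# Hypothetical relationships: (parent type, relation, child type).
RELATIONSHIPS = frozenset({("initiative", "contains", "task")})


def is_legal_relationship(parent: str, relation: str, child: str) -> bool:
    return (parent, relation, child) in RELATIONSHIPS
```

The point of the sketch is that legality lives in data, not in prompts: the agent can check a proposed operation against the ontology before it acts.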

Layer 1: Character (WHY)

Identity, communication norms, behavioral guardrails, safety boundaries. Character is what makes the same ontology feel like a Chief of Staff in one deployment and a code reviewer in another. Same world model. Different personality, different rules.

If you’ve seen OpenClaw’s SOUL.md or written a PERSONA.md, that’s Layer 1.

Layer 2: Workflow (HOW)

The active process, legal actions, preferred patterns for this mode. Workflow constrains the action space: a planning workflow shouldn’t have access to destructive mutations; a triage workflow biases toward classification over execution.

Given an intent, workflow defines what’s legal and what’s preferred. When a user says “let’s plan” vs. “just do it,” they’re selecting a workflow. Custom skills that codify a process (a blog-writing routine, a spec-audit checklist, a triage protocol) are Layer 2.
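A rough sketch of how a workflow might constrain the action space: legal actions are a whitelist, and preferred actions are sorted to the front. The `Workflow` class and the action names (`draft_plan`, `delete_task`, and so on) are hypothetical, chosen only to mirror the planning and triage examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    allowed: frozenset      # actions that are legal in this mode
    preferred: tuple = ()   # legal actions to try first, in order

    def filter_actions(self, candidates):
        """Drop illegal actions, then rank preferred ones first."""
        legal = [a for a in candidates if a in self.allowed]
        rank = {a: i for i, a in enumerate(self.preferred)}
        # sorted() is stable, so non-preferred actions keep their order.
        return sorted(legal, key=lambda a: rank.get(a, len(rank)))


# Planning has no destructive mutations in its whitelist.
PLANNING = Workflow(
    name="planning",
    allowed=frozenset({"read_task", "draft_plan", "create_task"}),
    preferred=("draft_plan",),
)

# Triage biases toward classification over execution.
TRIAGE = Workflow(
    name="triage",
    allowed=frozenset({"read_task", "classify", "set_priority"}),
    preferred=("classify", "set_priority"),
)
```

Under these assumptions, `PLANNING.filter_actions(["delete_task", "read_task", "draft_plan"])` returns `["draft_plan", "read_task"]`: the destructive action never reaches the model as an option.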

Layer 3: Memory (WHO)

Preferences, history, patterns, in-flight context. Everything specific to this person in this moment.

This is what most people mean when they say “agent memory,” and it’s the layer where the gap between current tooling and the actual need is widest. The right version isn’t a user profile or a vector store of past conversations. It’s a living model of the user’s intent — what they’re trying to accomplish, what they’ve tried, how they think.

It should know that this user always triages P0 bugs before feature work. That they’re three weeks into a migration and context from week one still matters. That when they say “clean this up” they mean a specific thing informed by thirty prior interactions.

Claude’s memory feature, Mem0, various user-profile stores — they’re all reaching for this. Most stop at preferences when the real target is a compressed model of how this person works.

That’s the layer I’m building in D’Stil, a mobile-first agent whose entire job is maintaining this living model of you, your projects, and how you operate. Memory as the architecture, not as a feature bolted onto one.

State, Learning, and Coherence

The four layers are state — what the agent knows and how it’s configured right now, at different densities. Ontology is crystallized state, encoded in schema. Character is solidified state, encoded in configuration. Workflow is fluid state, encoded in process definitions. Memory is volatile state, encoded in retrieved information. They’re not static buckets. They’re density bands in a continuous medium. But they’re still just state.
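One way to picture the density bands is as an ordering for context assembly: the stable layers are packed first, and the volatile layer absorbs whatever budget is left. The sketch below is an illustration under stated assumptions, not a prescribed algorithm. The `Layer` class, the `assemble_context` function, and the character-count budget are all hypothetical; a real system would more likely budget in tokens and summarize rather than truncate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str      # "ontology", "character", "workflow", "memory"
    question: str  # WHAT, WHY, HOW, WHO
    density: str   # "crystallized", "solidified", "fluid", "volatile"
    content: str


def assemble_context(layers, budget_chars):
    """Pack layers in the order given, most stable first.

    Later (more volatile) layers are truncated or dropped once the
    budget runs out. Returns a list of (layer name, packed text).
    """
    packed = []
    remaining = budget_chars
    for layer in layers:
        if remaining <= 0:
            break
        text = layer.content[:remaining]
        packed.append((layer.name, text))
        remaining -= len(text)
    return packed
```

Passing the layers in the order ontology, character, workflow, memory means a tight budget trims memory before it touches the world model.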

Learning is what tunes those layers over time. It’s not a layer. It’s the intelligence that observes patterns in context and writes back into the structural layers, reshaping how the agent operates going forward. Having a bunch of markdown files is context. Evolving what they mean is learning. One is a filing cabinet. The other is what reorganizes the cabinet.

Promotion and demotion are both forms of learning.

Promotion crystallizes upward: episodic → repeated → routine → default. The system notices recurrence, starts acting on the pattern (“Set to p1 — your usual. Change?”), and eventually graduates stable routines into workflow or character. At that point the agent isn’t remembering a preference. It’s restructuring its own operating logic. Same crystallization curve I wrote about in Code Over Config, except now it’s the agent’s behavior doing the crystallizing, not just the features it builds.
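The promotion curve can be sketched as a counter with thresholds. This is a deliberately naive version: the stage names come from the text above, but the `PromotionTracker` class and the threshold values (1, 3, 6, 10) are assumptions picked for illustration. A real system would likely weigh recency and context as well as raw counts.

```python
from collections import Counter

STAGES = ("episodic", "repeated", "routine", "default")
# Hypothetical observation counts at which a pattern reaches each stage.
THRESHOLDS = (1, 3, 6, 10)


class PromotionTracker:
    def __init__(self):
        self.counts = Counter()

    def observe(self, pattern: str) -> str:
        """Record one occurrence and return the pattern's current stage."""
        self.counts[pattern] += 1
        return self.stage(pattern)

    def stage(self, pattern: str):
        """Highest stage whose threshold has been met, or None if unseen."""
        n = self.counts[pattern]
        current = None
        for name, threshold in zip(STAGES, THRESHOLDS):
            if n >= threshold:
                current = name
        return current
```

In this sketch, the "routine" stage is where the agent would start acting on the pattern and asking for confirmation, and "default" is where the pattern becomes a candidate for graduating into workflow or character.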

Demotion dissolves downward. Without it, agents accumulate stale defaults that degrade coherence and waste context budget. The triggers that matter: repeated override, explicit command, context shifts, periodic review. Demotion should never be silent deletion. The user should feel the system learning. They should never feel it silently forgetting.
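A minimal sketch of demotion that is never silent: every demotion steps a pattern down one stage and returns a notice the agent can surface to the user. The four triggers are the ones named above. The `Pattern` class, the `OVERRIDE_LIMIT` of 3, and the wording of the notice are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

STAGES = ("episodic", "repeated", "routine", "default")
OVERRIDE_LIMIT = 3  # hypothetical: overrides tolerated before demotion


class Trigger(Enum):
    REPEATED_OVERRIDE = "repeated override"
    EXPLICIT_COMMAND = "explicit command"
    CONTEXT_SHIFT = "context shift"
    PERIODIC_REVIEW = "periodic review"


@dataclass
class Pattern:
    name: str
    stage: str
    overrides: int = 0


def demote(pattern: Pattern, trigger: Trigger) -> str:
    """Step the pattern down one stage and return a user-visible notice.

    The pattern is never deleted; it bottoms out at "episodic".
    """
    old = pattern.stage
    idx = STAGES.index(old)
    pattern.stage = STAGES[max(idx - 1, 0)]
    pattern.overrides = 0
    return (f"Moved '{pattern.name}' from {old} to {pattern.stage} "
            f"({trigger.value}). Undo?")


def record_override(pattern: Pattern) -> Optional[str]:
    """Count a user override; demote once the limit is reached."""
    pattern.overrides += 1
    if pattern.overrides >= OVERRIDE_LIMIT:
        return demote(pattern, Trigger.REPEATED_OVERRIDE)
    return None
```

Returning the notice from `demote` rather than logging it is the design choice that matters here: the caller has to decide how to show it, so a demotion cannot happen without something to show.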

But promotion and demotion create a problem. Each layer adapts on its own clock — memory moves every request, character moves per deployment — and the layers drift apart. Memory shows what the user actually does. The other layers express what was intended. That gap is the most valuable signal in the architecture. Not a conflict to resolve. A diagnostic to interpret.

Is it friction (the user still wants the thing, something’s in the way) or obsolescence (the intent no longer applies)? Friction means redesign the path. Obsolescence means prune it.

This is coherence debt. Like tech debt: everything works today, but the layers are quietly getting misaligned. The most valuable thing a periodic review surfaces isn’t stale memories. It’s stale assumptions about how the layers relate. The system gets better not by adding more state, but by expanding the boundary of what friction it can solve, hardening the patterns that prove stable, and pruning the ones that don’t.

Summary

Four layers, each adapting on its own timescale:

Ontology (WHAT) → the world model
Character (WHY) → identity and norms
Workflow (HOW) → the active process
Memory (WHO) → the user, right now

Information flows between them through promotion and demotion. Coherence depends on reading the gap between intent and reality as a diagnostic.

Agents aren’t programs that happen to have memory. They’re memory architectures that can execute.
