What is agent memory?

Agent memory is the state an autonomous agent carries between turns. It usually splits into three layers: working memory (what sits in the context window now), episodic memory (a persisted log of past sessions, often summarized into a vector store), and semantic memory (durable facts like user preferences). Treating these as one feature is where most agents break.

Agent memory is a layered state problem, not a single feature. Most teams treat it as chat history stuffed into the context window, then hit the failure mode one engineer on Hacker News named directly: agents lose track of what they already did, re-implement things, or contradict decisions from 20 minutes ago. The context window is working memory, not durable memory.

The three layers worth naming

Working memory. What sits in the LLM's context: prior turns, retrieved chunks, current tool outputs. Bounded by the model's token limit. Volatile.
Episodic memory. A persisted record of past sessions, decisions, and outcomes. Usually summarized or chunked into a vector store so it can be retrieved later by similarity, not by stuffing it all back into context.
Semantic memory. Durable facts the agent treats as ground truth: user preferences, account settings, project conventions. Lives in a structured store (key-value, SQL, profile doc), not embeddings.
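
To make the split concrete, here is a minimal sketch of the three layers as separate stores. It is an illustration under stated assumptions, not how any particular framework does it: the AgentMemory class and its method names are hypothetical, the working-memory bound counts turns rather than tokens, and a toy bag-of-words cosine similarity stands in for a real embedding model and vector store.

```python
import math
from collections import Counter, deque


def _bow(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding."""
    return Counter(text.lower().split())


def _cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class AgentMemory:
    """Hypothetical container that keeps the three layers apart."""

    def __init__(self, max_working_turns=4):
        # Working memory: bounded and volatile; old turns fall off the front.
        self.working = deque(maxlen=max_working_turns)
        # Episodic memory: persisted summaries, retrieved by similarity.
        self.episodic = []
        # Semantic memory: structured facts, looked up by key, not embedded.
        self.semantic = {}

    def add_turn(self, text):
        self.working.append(text)

    def log_episode(self, summary):
        self.episodic.append(summary)

    def set_fact(self, key, value):
        self.semantic[key] = value

    def recall(self, query, k=1):
        """Return the k stored episodes most similar to the query."""
        q = _bow(query)
        ranked = sorted(
            self.episodic, key=lambda s: _cosine(q, _bow(s)), reverse=True
        )
        return ranked[:k]

    def build_context(self, query):
        """Assemble a prompt: all facts, a few recalled episodes, recent turns."""
        facts = "\n".join(f"{k}: {v}" for k, v in self.semantic.items())
        episodes = "\n".join(self.recall(query))
        turns = "\n".join(self.working)
        return (
            f"[facts]\n{facts}\n[recalled]\n{episodes}\n[recent turns]\n{turns}"
        )
```

The point of the design is that each layer has its own access pattern: facts are always included, episodes are included only when they match the current query, and recent turns are included until they age out.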

Why this matters operationally

Conflating the three produces two failure modes. Either context windows balloon with stale history until the model hallucinates, or memory is reduced to appending to a chat log. LangChain memory modules, Letta (formerly MemGPT), and mem0 each implement a variation of this split, with summarization triggers, vector recall, and external state that survives compaction.
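
A summarization trigger can be sketched in a few lines. This is a rough illustration, not the mechanism used by LangChain, Letta, or mem0: the function names are hypothetical, token counts are approximated as whitespace-separated words, and summarize is a placeholder where a real agent would call an LLM.

```python
def estimate_tokens(text):
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def summarize(turns):
    # Placeholder: a real agent would ask an LLM for a summary here.
    return f"Summary of {len(turns)} earlier turns: " + " | ".join(
        t[:40] for t in turns
    )


def compact(working, episodic, budget, keep_recent=2):
    """Trim working memory once it exceeds the budget.

    Older turns are summarized into the episodic log (external state that
    survives compaction); only the most recent turns stay in context.
    Returns the new working-memory list.
    """
    total = sum(estimate_tokens(t) for t in working)
    if total <= budget or len(working) <= keep_recent:
        return working  # under budget, or nothing old enough to move
    cut = len(working) - keep_recent
    old, recent = working[:cut], working[cut:]
    episodic.append(summarize(old))
    return recent
```

Calling compact after every turn keeps the context bounded, while the decisions made in older turns remain retrievable from the episodic log instead of being silently dropped.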

For non-technical builders

Think of memory as three notebooks: a scratchpad the agent uses right now, a journal of past sessions it can search through, and a profile sheet of facts about you it always keeps handy.

Last updated: May 20, 2026