Memory for Production Agents
Working memory. Episodic memory. Semantic retrieval. Three memory types wired together so your agents remember what matters without hallucinating what they don't.
Three memory tiers, one unified API
Diaflow manages all three memory types behind a single SDK interface. Write to working memory in one call; query semantic vector memory in the next. No vector DB client to manage separately.
- ■Working memory — key/value store scoped to a single run. Shared across steps. Cleared on run end.
- ■Episodic memory — append-only log of past runs. Query by time range, tags, or user ID. Used for continuity across sessions.
- ■Semantic memory — vector store for knowledge retrieval. Connect Pinecone, Weaviate, Qdrant, or pgvector. Query by embedding similarity.
Memory you can trust in production
Context isolation
Working memory is scoped per run — no bleed between concurrent agent executions. Isolation enforced at the storage layer, not the SDK.
Hybrid search
Combine vector similarity with metadata filters (user ID, date range, document type) in a single query. Precision retrieval, not just top-K.
TTL and eviction
Set TTLs on working memory keys. Configure episodic memory retention policies. Auto-archive old runs to keep storage costs predictable.
One API for all three memory types
from diaflow import Agent
from diaflow.memory import PineconeBackend
agent = Agent(
llm="claude-3-5-sonnet",
memory=PineconeBackend(index="my-index")
)
with agent.run(run_id="run-42") as ctx:
ctx.working.set("user_id", "u123") # working mem
docs = ctx.semantic.query( # semantic mem
embedding=query_vec, top_k=5
)
history = ctx.episodic.get_last( # episodic mem
user_id="u123", n=10
)