What it is
Memory is how an agent keeps useful context across turns. In Phero, memory is a pluggable component: the agent can retrieve prior messages before each turn and save the new messages after each turn.
Different implementations provide different behavior:
- Short-term: keep the last N messages (fast, simple)
- Summarized: compress older history into a structured summary
- Semantic: retrieve by similarity search (RAG-backed)
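The retrieve-then-save cycle described above can be sketched as follows. This is a minimal illustration, not Phero's actual API: the Memory interface, sliceMemory, runTurn, and countingModel are hypothetical names, and the "model" is a stub that only reports how many messages it received.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for llm.Message.
type Message struct {
	Role    string
	Content string
}

// Memory is a hypothetical interface: read history before a turn,
// append new messages after it.
type Memory interface {
	Retrieve() []Message
	Save(msgs ...Message)
}

// sliceMemory is the simplest possible implementation: an unbounded slice.
type sliceMemory struct{ msgs []Message }

// Retrieve returns a copy so callers cannot mutate the stored history.
func (m *sliceMemory) Retrieve() []Message { return append([]Message(nil), m.msgs...) }

// Save appends the given messages to the history.
func (m *sliceMemory) Save(msgs ...Message) { m.msgs = append(m.msgs, msgs...) }

// runTurn retrieves prior messages, calls the model with history plus the
// new user input, then saves both the input and the reply.
func runTurn(mem Memory, input string, model func([]Message) string) string {
	history := mem.Retrieve()
	user := Message{Role: "user", Content: input}
	reply := model(append(history, user))
	mem.Save(user, Message{Role: "assistant", Content: reply})
	return reply
}

// countingModel is a stub model that reports how much context it was given.
func countingModel(prompt []Message) string {
	return fmt.Sprintf("seen %d messages", len(prompt))
}

func main() {
	mem := &sliceMemory{}
	fmt.Println(runTurn(mem, "hello", countingModel))
	fmt.Println(runTurn(mem, "and again", countingModel))
}
```

Because the turn loop only depends on the two-method interface, swapping in a bounded, summarizing, or similarity-based implementation would not change the loop itself.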
Simple FIFO memory (in-process)
memory/simple stores recent llm.Message values in a fixed-size FIFO ring buffer.
Once the buffer is full, each new message evicts the oldest one.
import memory "github.com/henomis/phero/memory/simple"
conversationMemory := memory.New(20)
a.SetMemory(conversationMemory)
This is the default choice for REPL-style conversational agents.
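To make the roll-off behavior concrete, here is a small self-contained sketch of a fixed-size FIFO ring buffer. It is an illustration of the technique only and is not taken from memory/simple: Message, RingMemory, NewRingMemory, Save, and Messages are hypothetical names.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for llm.Message.
type Message struct {
	Role    string
	Content string
}

// RingMemory holds at most len(buf) messages; when full, the oldest
// message is overwritten by the newest.
type RingMemory struct {
	buf   []Message
	start int // index of the oldest stored message
	size  int // number of messages currently stored
}

// NewRingMemory allocates a buffer with the given capacity.
func NewRingMemory(capacity int) *RingMemory {
	if capacity < 0 {
		capacity = 0
	}
	return &RingMemory{buf: make([]Message, capacity)}
}

// Save stores msg, evicting the oldest message if the buffer is full.
func (m *RingMemory) Save(msg Message) {
	if len(m.buf) == 0 {
		return // zero capacity: nothing can be stored
	}
	if m.size < len(m.buf) {
		m.buf[(m.start+m.size)%len(m.buf)] = msg
		m.size++
		return
	}
	m.buf[m.start] = msg
	m.start = (m.start + 1) % len(m.buf)
}

// Messages returns the stored messages, oldest first.
func (m *RingMemory) Messages() []Message {
	out := make([]Message, 0, m.size)
	for i := 0; i < m.size; i++ {
		out = append(out, m.buf[(m.start+i)%len(m.buf)])
	}
	return out
}

func main() {
	mem := NewRingMemory(3)
	for i := 1; i <= 5; i++ {
		mem.Save(Message{Role: "user", Content: fmt.Sprintf("message %d", i)})
	}
	// With capacity 3 and 5 saves, only messages 3, 4, and 5 remain.
	for _, msg := range mem.Messages() {
		fmt.Println(msg.Content)
	}
}
```

A ring buffer keeps both saving and eviction constant-time, which is why it suits a "last N messages" policy.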
Optional summarization
The simple memory can optionally summarize history when the number of stored messages reaches a threshold. The idea is to keep a compact “state snapshot” plus the most recent turns.
// From examples/conversational-agent (edited for brevity)
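conversationMemory := memory.New(
    20, // maximum number of stored messages (-max-messages in the example)
    // 8 and 15 match the example's -summary-threshold and -summary-size flags
    memory.WithSummarization(llmClient, 8, 15),
)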
a.SetMemory(conversationMemory)
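The "state snapshot plus recent turns" idea can be sketched in isolation. The code below is an assumption-laden illustration, not how memory/simple implements it: SummarizingMemory, Summarizer, joinSummarizer, and the threshold/keepRecent parameters are hypothetical, and the summarizer is a stub that concatenates text where a real one would call an LLM.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical stand-in for llm.Message.
type Message struct {
	Role    string
	Content string
}

// Summarizer folds older messages into the previous summary.
// A real implementation would call an LLM here.
type Summarizer func(previous string, msgs []Message) string

// SummarizingMemory keeps a running summary plus the most recent messages.
type SummarizingMemory struct {
	summary    string
	recent     []Message
	threshold  int // summarize once this many messages are stored
	keepRecent int // messages left verbatim after summarizing
	summarize  Summarizer
}

// NewSummarizingMemory clamps its arguments so keepRecent < threshold.
func NewSummarizingMemory(threshold, keepRecent int, s Summarizer) *SummarizingMemory {
	if threshold < 1 {
		threshold = 1
	}
	if keepRecent < 0 {
		keepRecent = 0
	}
	if keepRecent >= threshold {
		keepRecent = threshold - 1
	}
	return &SummarizingMemory{threshold: threshold, keepRecent: keepRecent, summarize: s}
}

// Save appends msg and, at the threshold, compresses all but the newest
// keepRecent messages into the summary.
func (m *SummarizingMemory) Save(msg Message) {
	m.recent = append(m.recent, msg)
	if len(m.recent) < m.threshold {
		return
	}
	cut := len(m.recent) - m.keepRecent
	m.summary = m.summarize(m.summary, m.recent[:cut])
	m.recent = append([]Message(nil), m.recent[cut:]...)
}

// Context returns the summary (if any) followed by the recent messages.
func (m *SummarizingMemory) Context() []Message {
	out := make([]Message, 0, len(m.recent)+1)
	if m.summary != "" {
		out = append(out, Message{Role: "system", Content: "Summary so far: " + m.summary})
	}
	return append(out, m.recent...)
}

// joinSummarizer is a stub that simply concatenates message contents.
func joinSummarizer(previous string, msgs []Message) string {
	parts := make([]string, 0, len(msgs)+1)
	if previous != "" {
		parts = append(parts, previous)
	}
	for _, msg := range msgs {
		parts = append(parts, msg.Content)
	}
	return strings.Join(parts, "; ")
}

func main() {
	mem := NewSummarizingMemory(4, 2, joinSummarizer)
	for i := 1; i <= 6; i++ {
		mem.Save(Message{Role: "user", Content: fmt.Sprintf("m%d", i)})
	}
	for _, msg := range mem.Context() {
		fmt.Printf("%s: %s\n", msg.Role, msg.Content)
	}
}
```

Passing the previous summary back into the summarizer lets older context survive repeated compression instead of being dropped each time the threshold is hit.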
Semantic memory (RAG-backed)
For long-term recall, you can back memory with a RAG store: messages are embedded and stored in a vector store, then retrieved by similarity search at query time.
The adapter lives at memory/rag and wraps a rag.RAG instance.
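import (
    "log"

    ragmemory "github.com/henomis/phero/memory/rag"
    "github.com/henomis/phero/rag"
)
// store and embedder are assumed to be an already initialized vector store and embedder.
ragEngine, err := rag.New(store, embedder)
if err != nil {
    log.Fatalf("create RAG engine: %v", err)
}
conversationMemory := ragmemory.New(ragEngine)
a.SetMemory(conversationMemory)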
The Long-Term Memory example (RAG + Qdrant) uses this adapter.
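Retrieval by similarity can be illustrated without any external services. The sketch below is a toy, not the memory/rag adapter: SemanticMemory, Embedder, toyEmbed, and the fixed vocabulary are hypothetical, and the "embedding" is a simple word count where a real setup would use an embedding model and a vector store such as Qdrant.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// Embedder turns text into a vector. A real one would call an embedding model.
type Embedder func(text string) []float64

type entry struct {
	text string
	vec  []float64
}

// SemanticMemory stores texts with their vectors in a plain slice.
type SemanticMemory struct {
	embed   Embedder
	entries []entry
}

// Save embeds the text and stores it.
func (m *SemanticMemory) Save(text string) {
	m.entries = append(m.entries, entry{text: text, vec: m.embed(text)})
}

// Retrieve returns up to k stored texts, most similar to the query first.
func (m *SemanticMemory) Retrieve(query string, k int) []string {
	q := m.embed(query)
	type scored struct {
		text  string
		score float64
	}
	results := make([]scored, 0, len(m.entries))
	for _, e := range m.entries {
		results = append(results, scored{text: e.text, score: cosine(q, e.vec)})
	}
	// Stable sort keeps insertion order among equally scored entries.
	sort.SliceStable(results, func(i, j int) bool { return results[i].score > results[j].score })
	if k < 0 {
		k = 0
	}
	if k > len(results) {
		k = len(results)
	}
	out := make([]string, 0, k)
	for _, r := range results[:k] {
		out = append(out, r.text)
	}
	return out
}

// cosine returns the cosine similarity, or 0 for mismatched or zero vectors.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// vocabulary defines the dimensions of the toy embedding.
var vocabulary = []string{"coffee", "tea", "berlin", "paris", "flight"}

// toyEmbed counts occurrences of each vocabulary word in the text.
func toyEmbed(text string) []float64 {
	vec := make([]float64, len(vocabulary))
	for _, word := range strings.Fields(strings.ToLower(text)) {
		word = strings.Trim(word, ".,!?")
		for i, v := range vocabulary {
			if word == v {
				vec[i]++
			}
		}
	}
	return vec
}

func main() {
	mem := &SemanticMemory{embed: toyEmbed}
	mem.Save("I prefer tea over coffee.")
	mem.Save("My flight to Berlin leaves on Friday.")
	mem.Save("Paris was lovely last spring.")
	fmt.Println(mem.Retrieve("When is the Berlin flight?", 1))
}
```

Unlike FIFO memory, nothing here depends on recency: an old message is returned whenever it is the closest match to the current query.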
Run an example
The conversational agent example uses simple FIFO memory, with an option to enable summarization.
# from repo root
go run ./examples/conversational-agent
# with summarization enabled
go run ./examples/conversational-agent -summarize -summary-threshold 8 -summary-size 15 -max-messages 20
For semantic recall, try the long-term memory example (requires Qdrant):
go run ./examples/long-term-memory -qdrant-host localhost -qdrant-collection long_term_memory
Picking the right memory
- Fast prototyping: use memory/simple with a reasonable max messages value
- Long conversations: enable summarization to keep context compact
- Long-term recall: use semantic memory via memory/rag
Related packages
- agent: uses memory to seed/snapshot sessions
- rag: semantic retrieval engine used by RAG memory
- embedding: generates vectors for semantic memory
- vectorstore: stores vectors for semantic memory