Memory

Give agents context across turns: recent, summarized, or semantic.

What it is

Memory is how an agent keeps useful context across turns. In Phero, memory is a pluggable component: the agent can retrieve prior messages before each turn and save the new messages after each turn.

Phero ships several implementations, each described in the sections below.

Simple FIFO memory (in-process)

memory/simple stores recent llm.Message values in a fixed-size FIFO ring buffer. Once the buffer is full, each new message evicts the oldest one.

import memory "github.com/henomis/phero/memory/simple"

// Keep at most the 20 most recent messages.
conversationMemory := memory.New(20)

a.SetMemory(conversationMemory)

This is the default choice for REPL-style conversational agents.
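To make the roll-off behavior concrete, here is a small, self-contained ring buffer sketch. It illustrates the general technique only; the ringBuffer type and its methods are hypothetical and are not taken from memory/simple.

```go
package main

import "fmt"

// ringBuffer is a hypothetical fixed-size FIFO buffer of message texts.
type ringBuffer struct {
	items []string
	start int // index of the oldest element
	size  int // number of stored elements
}

func newRingBuffer(capacity int) *ringBuffer {
	return &ringBuffer{items: make([]string, capacity)}
}

// Push appends s; once the buffer is full it overwrites the oldest element.
func (r *ringBuffer) Push(s string) {
	if len(r.items) == 0 {
		return // zero capacity: nothing can be stored
	}
	if r.size < len(r.items) {
		r.items[(r.start+r.size)%len(r.items)] = s
		r.size++
		return
	}
	r.items[r.start] = s
	r.start = (r.start + 1) % len(r.items)
}

// Snapshot returns the stored elements from oldest to newest.
func (r *ringBuffer) Snapshot() []string {
	out := make([]string, 0, r.size)
	for i := 0; i < r.size; i++ {
		out = append(out, r.items[(r.start+i)%len(r.items)])
	}
	return out
}

func main() {
	buf := newRingBuffer(3)
	for _, m := range []string{"m1", "m2", "m3", "m4", "m5"} {
		buf.Push(m)
	}
	fmt.Println(buf.Snapshot()) // [m3 m4 m5]
}
```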

JSON file memory (persistent local state)

memory/jsonfile stores the full conversation in a JSON file on disk. The file path acts as the session identifier, so restarting the process and pointing at the same file resumes the same history.

import memory "github.com/henomis/phero/memory/jsonfile"

conversationMemory, err := memory.New("memory.json")
if err != nil {
    panic(err)
}

a.SetMemory(conversationMemory)

This is a good fit for local assistants or skill workflows that need persistence without introducing a database.

Optional summarization

The in-process and persistent memories can optionally summarize history when the number of stored messages reaches a threshold. The idea is to keep a compact “state snapshot” plus the most recent turns.

// From examples/conversational-agent (edited for brevity)

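conversationMemory := memory.New(
    20, // maximum number of stored messages (-max-messages in the example)
    // 8 and 15 correspond to the example's -summary-threshold and -summary-size flags.
    memory.WithSummarization(llmClient, 8, 15),
)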

a.SetMemory(conversationMemory)

Semantic memory (RAG-backed)

For long-term recall, you can back memory with a RAG store: messages are embedded and stored in a vector store, then retrieved by similarity search at query time.

The adapter lives at memory/rag and wraps a rag.RAG instance.

import (
    ragmemory "github.com/henomis/phero/memory/rag"
    "github.com/henomis/phero/rag"
)

// store and embedder are assumed to be configured elsewhere.
ragEngine, err := rag.New(store, embedder)
if err != nil {
    // handle error
}

conversationMemory := ragmemory.New(ragEngine)

a.SetMemory(conversationMemory)

This is used in the Long-Term Memory example (RAG + Qdrant).
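Retrieval by similarity search can be illustrated without a vector database. The sketch below ranks stored texts by cosine similarity against a query vector; the record type, the topK helper, and the hand-written two-dimensional vectors are assumptions for illustration. A real setup would obtain vectors from an embedding model and delegate the search to the vector store.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// record pairs a stored message text with its (toy) embedding vector.
type record struct {
	Text   string
	Vector []float64
}

// cosine returns the cosine similarity of a and b, or 0 if it is undefined.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns the texts of the k records most similar to query, best first.
func topK(records []record, query []float64, k int) []string {
	ranked := make([]record, len(records))
	copy(ranked, records)
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(ranked[i].Vector, query) > cosine(ranked[j].Vector, query)
	})
	if k > len(ranked) {
		k = len(ranked)
	}
	out := make([]string, 0, k)
	for _, r := range ranked[:k] {
		out = append(out, r.Text)
	}
	return out
}

func main() {
	// Hand-written vectors: the first axis loosely means "about Go",
	// the second "about location".
	stored := []record{
		{Text: "likes Go", Vector: []float64{0.9, 0.1}},
		{Text: "lives in Rome", Vector: []float64{0.1, 0.9}},
		{Text: "writes Go tests", Vector: []float64{0.8, 0.3}},
	}
	query := []float64{1, 0} // a question "about Go"
	fmt.Printf("%q\n", topK(stored, query, 2)) // ["likes Go" "writes Go tests"]
}
```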

NATS JetStream KV memory (persistent, network-accessible)

memory/nats stores conversation history in a NATS JetStream Key-Value bucket. Each session is a single KV key; the value is a JSON-encoded []llm.Message. Because JetStream persists bucket data to disk, memory survives process restarts automatically.

Multiple sessions can coexist in the same bucket under different session IDs, which is useful for running independent conversations side by side without separate databases.

import (
    "os"

    "github.com/nats-io/nats.go"

    natsmemory "github.com/henomis/phero/memory/nats"
)

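nc, err := nats.Connect(os.Getenv("NATS_URL"))
if err != nil {
    // handle error
}

js, err := nc.JetStream()
if err != nil {
    // handle error
}

kv, err := js.CreateKeyValue(&nats.KeyValueConfig{Bucket: "phero_memory"})
if err != nil {
    // handle error
}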

conversationMemory, err := natsmemory.New(kv, "session-123")
if err != nil {
    // handle error
}

a.SetMemory(conversationMemory)

Start a local NATS server with JetStream enabled:

docker run --rm -p 4222:4222 nats -js

The full walkthrough is in the NATS Memory example, which demonstrates session resumption across two process runs.
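The one-key-per-session layout can be sketched with an in-memory map standing in for the KV bucket, so the example runs without a NATS server. The bucket type and the appendSession and loadSession helpers are hypothetical; they only illustrate how JSON-encoded histories for different session IDs stay independent within a single bucket.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical stand-in for llm.Message.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// bucket is an in-memory stand-in for a key-value bucket:
// session ID -> JSON-encoded []Message.
type bucket map[string][]byte

// loadSession decodes the history stored under sessionID; an absent key is an empty session.
func loadSession(b bucket, sessionID string) ([]Message, error) {
	data, ok := b[sessionID]
	if !ok {
		return nil, nil
	}
	var msgs []Message
	if err := json.Unmarshal(data, &msgs); err != nil {
		return nil, err
	}
	return msgs, nil
}

// appendSession re-encodes the session's history with msgs appended.
func appendSession(b bucket, sessionID string, msgs ...Message) error {
	history, err := loadSession(b, sessionID)
	if err != nil {
		return err
	}
	data, err := json.Marshal(append(history, msgs...))
	if err != nil {
		return err
	}
	b[sessionID] = data
	return nil
}

func main() {
	b := bucket{}
	must := func(err error) {
		if err != nil {
			panic(err)
		}
	}
	must(appendSession(b, "session-123", Message{Role: "user", Content: "hi"}))
	must(appendSession(b, "session-456", Message{Role: "user", Content: "bonjour"}))
	must(appendSession(b, "session-123", Message{Role: "assistant", Content: "hello"}))

	first, err := loadSession(b, "session-123")
	must(err)
	second, err := loadSession(b, "session-456")
	must(err)
	fmt.Println(len(first), len(second)) // 2 1
}
```

With a real bucket shared over the network, concurrent writers to the same session would also need to be handled; the map-based sketch ignores that.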

PostgreSQL conversation memory (persistent)

If you want conversation history to survive process restarts (or be shared across replicas), use memory/psql. It's a session-scoped store backed by PostgreSQL (JSONB rows), with optional automatic summarization similar to memory/simple.

import (
    "database/sql"
    "os"

    _ "github.com/jackc/pgx/v5/stdlib"

    "github.com/henomis/phero/memory/psql"
)

db, err := sql.Open("pgx", os.Getenv("DATABASE_URL"))
if err != nil {
    // handle error
}

conversationMemory, err := psql.New(
    db,
    "session-123",
    psql.WithSummarization(llmClient, 50, 25),
)
if err != nil {
    // handle error
}

a.SetMemory(conversationMemory)

By default, the store creates its table and index automatically. Pass psql.WithEnsureSchema(false) to disable this.

Run an example

The conversational agent example uses simple FIFO memory, with an option to enable summarization.

# from repo root

go run ./examples/conversational-agent

# with summarization enabled

go run ./examples/conversational-agent -summarize -summary-threshold 8 -summary-size 15 -max-messages 20

The skills example uses local JSON-backed memory:

cd ./examples/skills
go run .

For NATS-backed persistent memory (start NATS first):

docker run --rm -p 4222:4222 nats -js

go run ./examples/nats-memory -session my-session

For semantic recall, try the long-term memory example (requires Qdrant):

go run ./examples/long-term-memory -qdrant-host localhost -qdrant-collection long_term_memory

Picking the right memory

Related packages