Phero • Memory

What it is

Memory is how an agent keeps useful context across turns. In Phero, memory is a pluggable component: the agent can retrieve prior messages before each turn and save the new messages after each turn.

Different implementations provide different behavior:

Short-term: keep the last N messages (fast, simple)
File-backed: persist messages locally as JSON across runs
PostgreSQL-backed: persist session history in a shared database
NATS JetStream KV-backed: persist session history in NATS; survives restarts, supports named sessions
Summarized: compress older history into a structured summary
Semantic: retrieve by similarity search (RAG-backed)
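
The retrieve-before, save-after flow described above can be sketched in a few lines. This is an illustration only: the Memory interface, the Message struct, sliceMemory, and runTurn are hypothetical names invented for this sketch and are not Phero's actual types or method signatures.

```go
package main

import "fmt"

// Message is a simplified stand-in for llm.Message (hypothetical shape).
type Message struct {
	Role    string
	Content string
}

// Memory is a hypothetical interface for this sketch, not Phero's API.
type Memory interface {
	Retrieve() []Message
	Save(msgs ...Message)
}

// sliceMemory is the simplest possible implementation: it keeps everything.
type sliceMemory struct{ msgs []Message }

// Retrieve returns a copy so callers cannot mutate the stored history.
func (m *sliceMemory) Retrieve() []Message { return append([]Message(nil), m.msgs...) }

// Save appends the new messages to the stored history.
func (m *sliceMemory) Save(msgs ...Message) { m.msgs = append(m.msgs, msgs...) }

// runTurn retrieves prior messages before the model call and saves the
// new user/assistant pair afterwards. reply is a placeholder for the LLM call.
func runTurn(mem Memory, userInput string, reply func(history []Message) string) string {
	user := Message{Role: "user", Content: userInput}
	history := append(mem.Retrieve(), user)
	answer := reply(history)
	mem.Save(user, Message{Role: "assistant", Content: answer})
	return answer
}

func main() {
	mem := &sliceMemory{}
	echo := func(history []Message) string {
		return fmt.Sprintf("seen %d message(s)", len(history))
	}
	fmt.Println(runTurn(mem, "hello", echo))          // seen 1 message(s)
	fmt.Println(runTurn(mem, "are you there?", echo)) // seen 3 message(s)
}
```

Each implementation listed above changes only what is kept and where it is stored; the turn loop itself stays the same.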

Simple FIFO memory (in-process)

memory/simple stores recent llm.Message values in a fixed-size FIFO ring buffer. Once the buffer is full, each new message evicts the oldest one.

import memory "github.com/henomis/phero/memory/simple"

// Keep at most the 20 most recent messages.
conversationMemory := memory.New(20)

// a is the agent that should use this memory.
a.SetMemory(conversationMemory)

This is the default choice for REPL-style conversational agents.
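
To make the roll-off behavior concrete, here is a minimal sketch of a fixed-capacity FIFO ring buffer. It is not the memory/simple implementation; the ring type, its methods, and the simplified Message struct are assumptions made for illustration.

```go
package main

import "fmt"

// Message is a simplified stand-in for llm.Message (hypothetical shape).
type Message struct{ Role, Content string }

// ring is a fixed-capacity FIFO buffer: once full, each push
// overwrites the oldest entry.
type ring struct {
	buf   []Message
	start int // index of the oldest message
	size  int // number of messages currently stored
}

func newRing(capacity int) *ring { return &ring{buf: make([]Message, capacity)} }

// Push stores m, evicting the oldest message when the buffer is full.
func (r *ring) Push(m Message) {
	if len(r.buf) == 0 {
		return // zero capacity: nothing can be stored
	}
	if r.size < len(r.buf) {
		r.buf[(r.start+r.size)%len(r.buf)] = m
		r.size++
		return
	}
	r.buf[r.start] = m
	r.start = (r.start + 1) % len(r.buf)
}

// Messages returns the stored messages from oldest to newest.
func (r *ring) Messages() []Message {
	out := make([]Message, 0, r.size)
	for i := 0; i < r.size; i++ {
		out = append(out, r.buf[(r.start+i)%len(r.buf)])
	}
	return out
}

func main() {
	r := newRing(3)
	for i := 1; i <= 5; i++ {
		r.Push(Message{Role: "user", Content: fmt.Sprintf("m%d", i)})
	}
	for _, m := range r.Messages() {
		fmt.Println(m.Content) // prints m3, m4, m5 on separate lines
	}
}
```

With a capacity of 3 and five pushes, the two oldest messages are gone and the remaining three come back in arrival order.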

JSON file memory (persistent local state)

memory/jsonfile stores the full conversation in a JSON file on disk. The file path acts as the session identifier, so restarting the process and pointing at the same file resumes the same history.

import memory "github.com/henomis/phero/memory/jsonfile"

conversationMemory, err := memory.New("memory.json")
if err != nil {
    panic(err)
}

a.SetMemory(conversationMemory)

This is a good fit for local assistants or skill workflows that need persistence without introducing a database.
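
The idea that the file path acts as the session identifier can be sketched with the standard library alone. This is an illustration under assumptions, not the memory/jsonfile implementation: the load and save helpers, the Message struct, and the on-disk layout (a single JSON array) are all hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// Message is a simplified stand-in for llm.Message (hypothetical shape).
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// load returns the history stored at path, or an empty history
// when the file does not exist yet.
func load(path string) ([]Message, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	var msgs []Message
	if err := json.Unmarshal(data, &msgs); err != nil {
		return nil, err
	}
	return msgs, nil
}

// save overwrites the file at path with the full history as a JSON array.
func save(path string, msgs []Message) error {
	data, err := json.MarshalIndent(msgs, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o600)
}

func main() {
	path := filepath.Join(os.TempDir(), "memory-sketch.json")
	defer os.Remove(path)

	// First "run": load whatever exists, add a message, persist.
	msgs, err := load(path)
	if err != nil {
		panic(err)
	}
	msgs = append(msgs, Message{Role: "user", Content: "remember me"})
	if err := save(path, msgs); err != nil {
		panic(err)
	}

	// Second "run": pointing at the same path resumes the same history.
	resumed, err := load(path)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(resumed), resumed[len(resumed)-1].Content)
}
```

Rewriting the whole file on every save keeps the sketch simple; it says nothing about how the real package writes to disk.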

Optional summarization

The in-process and persistent memories can optionally summarize history when the number of stored messages reaches a threshold. The idea is to keep a compact “state snapshot” plus the most recent turns.

// From examples/conversational-agent (edited for brevity)

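conversationMemory := memory.New(
    20, // maximum number of stored messages (-max-messages)
    // 8 and 15 correspond to -summary-threshold and -summary-size
    memory.WithSummarization(llmClient, 8, 15),
)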

a.SetMemory(conversationMemory)
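
The "summary plus most recent turns" idea can be sketched as a small compaction function. This is not Phero's summarization algorithm: compact, its threshold and keepRecent parameters, and the Summarizer type are hypothetical names for this sketch, and their semantics may differ from the arguments of memory.WithSummarization.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a simplified stand-in for llm.Message (hypothetical shape).
type Message struct{ Role, Content string }

// Summarizer stands in for an LLM call that condenses messages into text.
type Summarizer func(msgs []Message) string

// compact leaves history untouched while it is shorter than threshold.
// Otherwise it replaces everything except the last keepRecent messages
// with a single summary message placed at the front.
func compact(history []Message, threshold, keepRecent int, summarize Summarizer) []Message {
	if len(history) < threshold || keepRecent < 0 || keepRecent >= len(history) {
		return history
	}
	cut := len(history) - keepRecent
	out := make([]Message, 0, keepRecent+1)
	out = append(out, Message{Role: "system", Content: summarize(history[:cut])})
	return append(out, history[cut:]...)
}

func main() {
	var history []Message
	for i := 1; i <= 8; i++ {
		history = append(history, Message{Role: "user", Content: fmt.Sprintf("message %d", i)})
	}

	// A fake summarizer that joins contents; a real one would call the LLM.
	fake := func(msgs []Message) string {
		parts := make([]string, len(msgs))
		for i, m := range msgs {
			parts[i] = m.Content
		}
		return "summary of: " + strings.Join(parts, ", ")
	}

	for _, m := range compact(history, 8, 3, fake) {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

Here eight messages collapse into one summary of the first five followed by the last three verbatim, so the context stays bounded while recent turns remain exact.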

Semantic memory (RAG-backed)

For long-term recall, you can back memory with a RAG store: messages are embedded and stored in a vector store, then retrieved by similarity search at query time.

The adapter lives at memory/rag and wraps a rag.RAG instance.

import (
    ragmemory "github.com/henomis/phero/memory/rag"
    "github.com/henomis/phero/rag"
)

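// store and embedder are an existing vector store and embedder.
ragEngine, err := rag.New(store, embedder)
if err != nil {
    // handle error
}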

conversationMemory := ragmemory.New(ragEngine)

a.SetMemory(conversationMemory)

This is used in the Long-Term Memory example (RAG + Qdrant).
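
Retrieval by similarity search can be sketched without any external service. The example below ranks stored entries by cosine similarity to a query vector and returns the top k. It is a toy under stated assumptions: the entry type, cosine, and topK are hypothetical, the vectors are hand-written rather than produced by an embedder, and a real vector store would use an index instead of a linear scan.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// entry pairs a stored message text with its embedding vector.
type entry struct {
	Text   string
	Vector []float64
}

// cosine returns the cosine similarity of a and b, or 0 when the
// vectors differ in length or either one has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns up to k entries ordered by descending similarity to query.
// The input slice is copied, so the caller's order is preserved.
func topK(store []entry, query []float64, k int) []entry {
	ranked := append([]entry(nil), store...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(ranked[i].Vector, query) > cosine(ranked[j].Vector, query)
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}

func main() {
	store := []entry{
		{Text: "I have a dog named Rex", Vector: []float64{0.9, 0.1, 0.0}},
		{Text: "My favorite language is Go", Vector: []float64{0.0, 0.2, 0.9}},
		{Text: "I walk my dog every morning", Vector: []float64{0.8, 0.3, 0.1}},
	}
	query := []float64{1, 0, 0} // pretend embedding of a question about pets

	for _, e := range topK(store, query, 2) {
		fmt.Println(e.Text) // the two dog-related entries, Rex first
	}
}
```

Unlike the FIFO buffer, nothing here depends on recency: an old message can be recalled as long as its vector is close to the query.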

NATS JetStream KV memory (persistent, network-accessible)

memory/nats stores conversation history in a NATS JetStream Key-Value bucket. Each session is a single KV key; the value is a JSON-encoded []llm.Message. Because JetStream persists bucket data to disk, memory survives process restarts automatically.

Multiple sessions can coexist in the same bucket under different session IDs, which is useful for running independent conversations side by side without separate databases.

import (
    "os"

    "github.com/nats-io/nats.go"

    natsmemory "github.com/henomis/phero/memory/nats"
)

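nc, err := nats.Connect(os.Getenv("NATS_URL"))
if err != nil {
    // handle error
}

js, err := nc.JetStream()
if err != nil {
    // handle error
}

kv, err := js.CreateKeyValue(&nats.KeyValueConfig{Bucket: "phero_memory"})
if err != nil {
    // handle error
}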

conversationMemory, err := natsmemory.New(kv, "session-123")
if err != nil {
    // handle error
}

a.SetMemory(conversationMemory)

Start a local NATS server with JetStream enabled:

docker run --rm -p 4222:4222 nats -js

The full walkthrough is in the NATS Memory example, which demonstrates session resumption across two process runs.
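
The storage layout described in this section (one key per session, value is a JSON-encoded message list) can be sketched with an in-memory map standing in for the KV bucket. This is not the memory/nats implementation: mapKV, history, appendMessages, and the Message struct are hypothetical, and the sketch ignores concurrent writers, which a real KV-backed store would need to consider.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a simplified stand-in for llm.Message (hypothetical shape).
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// mapKV is an in-memory stand-in for a key-value bucket.
type mapKV map[string][]byte

// history decodes the JSON array stored under sessionID.
// A missing key means the session has no history yet.
func history(kv mapKV, sessionID string) ([]Message, error) {
	raw, ok := kv[sessionID]
	if !ok {
		return nil, nil
	}
	var msgs []Message
	if err := json.Unmarshal(raw, &msgs); err != nil {
		return nil, err
	}
	return msgs, nil
}

// appendMessages reads the session value, appends msgs, and writes the
// whole list back under the same key (a plain read-modify-write).
func appendMessages(kv mapKV, sessionID string, msgs ...Message) error {
	existing, err := history(kv, sessionID)
	if err != nil {
		return err
	}
	raw, err := json.Marshal(append(existing, msgs...))
	if err != nil {
		return err
	}
	kv[sessionID] = raw
	return nil
}

func main() {
	kv := mapKV{}
	steps := []struct {
		session string
		msg     Message
	}{
		{"session-123", Message{Role: "user", Content: "hello"}},
		{"session-456", Message{Role: "user", Content: "unrelated chat"}},
		{"session-123", Message{Role: "assistant", Content: "hi there"}},
	}
	for _, s := range steps {
		if err := appendMessages(kv, s.session, s.msg); err != nil {
			panic(err)
		}
	}
	for _, id := range []string{"session-123", "session-456"} {
		msgs, err := history(kv, id)
		if err != nil {
			panic(err)
		}
		fmt.Println(id, len(msgs)) // session-123 2, then session-456 1
	}
}
```

Because each session lives under its own key, the two conversations never see each other's messages even though they share one bucket.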

PostgreSQL conversation memory (persistent)

If you want conversation history to survive process restarts (or be shared across replicas), use memory/psql. It's a session-scoped store backed by PostgreSQL (JSONB rows), with optional automatic summarization similar to memory/simple.

import (
    "database/sql"
    "os"

    _ "github.com/jackc/pgx/v5/stdlib"

    "github.com/henomis/phero/memory/psql"
)

db, err := sql.Open("pgx", os.Getenv("DATABASE_URL"))
if err != nil {
    // handle error
}

conversationMemory, err := psql.New(
    db,
    "session-123",
    psql.WithSummarization(llmClient, 50, 25),
)
if err != nil {
    // handle error
}

a.SetMemory(conversationMemory)

By default, the store creates its table and index automatically; pass psql.WithEnsureSchema(false) to disable this.

Run an example

The conversational agent example uses simple FIFO memory, with an option to enable summarization.

# from repo root

go run ./examples/conversational-agent

# with summarization enabled

go run ./examples/conversational-agent -summarize -summary-threshold 8 -summary-size 15 -max-messages 20

The skills example uses local JSON-backed memory:

cd ./examples/skills
go run .

For NATS-backed persistent memory (start NATS first):

docker run --rm -p 4222:4222 nats -js

go run ./examples/nats-memory -session my-session

For semantic recall, try the long-term memory example (requires Qdrant):

go run ./examples/long-term-memory -qdrant-host localhost -qdrant-collection long_term_memory

Picking the right memory

Fast prototyping: use memory/simple with a reasonable maximum message count
Local persistence: use memory/jsonfile when you want history to survive restarts on one machine
Shared persistence: use memory/psql when sessions should live in PostgreSQL
Lightweight network persistence: use memory/nats when you want NATS JetStream KV as the store (easy Docker setup, named sessions)
Long conversations: enable summarization to keep context compact
Long-term recall: use semantic memory via memory/rag

Related packages

agent: uses memory to seed/snapshot sessions
rag: semantic retrieval engine used by RAG memory
embedding: generates vectors for semantic memory
vectorstore: stores vectors for semantic memory