Architecture Overview
Version: 0.6.x
Statewave is a Memory OS — a trusted context runtime for AI agents and applications.
Core loop
RECORD → COMPILE → CONTEXT → GOVERN- Record — immutable episodes capture raw interaction truth
- Compile — pluggable compilers (heuristic or LLM) derive typed memories with provenance, embeddings, and conflict resolution
- Context — assembly service builds ranked, token-bounded, deterministic context bundles using temporal reasoning and semantic similarity
- Govern — provenance inspection, delete-by-subject, authentication, rate limiting, webhook notifications
Component architecture
┌──────────────────────────────────────────────────────────────┐
│ FastAPI Server │
│ │
│ ┌─ Middleware Stack (execution order) ─────────────────────┐ │
│ │ CORS → RequestID → Auth → RateLimit → Tenant │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │
│ │ Episodes │ │ Memories │ │ Context │ │ Subjects │ │
│ │ Route │ │ Route │ │ Route │ │ Route │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └─────┬──────┘ │
│ │ │ │ │ │
│ ┌────┴──────────────┴──────────────┴──────────────┴───────┐ │
│ │ Service Layer │ │
│ │ Compilers (heuristic | LLM) · Embeddings (stub|LiteLLM)│ │
│ │ ContextAssembler (ranked, semantic, temporal) │ │
│ │ ConflictResolver · Webhooks │ │
│ └────┬──────────────┬──────────────┬──────────────────────┘ │
│ │ │ │ │
│ ┌────┴──────────────┴──────────────┴──────────────────────┐ │
│ │ Repository / DB Layer │ │
│ │ episodes · memories · semantic search (pgvector) │ │
│ └──────────────────────┬──────────────────────────────────┘ │
└──────────────────────────┼───────────────────────────────────┘
│
┌──────┴──────┐
│ PostgreSQL │
│ + pgvector │
└─────────────┘Key design decisions
- Thin routes, strong services — API handlers delegate to service functions
- Raw truth first — episodes are append-only, never mutated
- Compiled memory second — memories are derived, with provenance to source episodes
- Provenance everywhere — every memory links back to source episodes via
source_episode_ids - Idempotent compilation — recompiling is safe; only uncompiled episodes are processed
- Token-bounded context — context bundles respect configurable token budgets
- Ranked retrieval — composite scoring: kind priority × recency × relevance × temporal validity
- Semantic search — pgvector cosine similarity with graceful fallback to text search
- Structured errors — consistent
{error: {code, message, details, request_id}}everywhere - Local-first —
docker compose up+pip installgets you running - Framework-neutral — no AI framework coupling in core
Middleware stack
Execution order (outermost to innermost):
- CORS — cross-origin headers
- RequestID — generate/propagate
X-Request-ID, bind to structlog - Auth — validate
X-API-Key(skipped when no key configured) - RateLimit — per-IP sliding window (skipped when RPM = 0)
- Tenant — extract
X-Tenant-ID, scope all queries to tenant (app-layer isolation)
Compilation pipeline
Uncompiled Episodes → Compiler → Raw Memories → Embedding → Conflict Resolution → Commit- Compilers:
HeuristicCompiler(regex/pattern, no external deps) andLLMCompiler(any provider via LiteLLM, runs in thread pool to avoid blocking) - Embeddings:
StubEmbeddingProvider(deterministic hash vectors for dev/test) andOpenAIEmbeddingProvider(real semantic vectors via LiteLLM — supports OpenAI, Azure, Cohere, Bedrock, etc.) - Conflict resolution: Jaccard similarity within same (subject, kind) groups; older memory superseded with
valid_toset - All steps execute in a single database transaction
Context scoring model
| Signal | Range | Source |
|---|---|---|
| Kind priority | 3–10 | profile_fact=10, procedure=8, episode_summary=5, raw_episode=3 |
| Recency | 0–5 | Linear scale: most recent = max |
| Task relevance | 0–5 (text) or 0–8 (semantic) | Word overlap or cosine similarity |
| Temporal validity | -4 to +3 | Currently valid = +3, expired = -4 |
Data model
| Entity | Description | Key fields |
|---|---|---|
| Episode | Immutable raw event | id, subject_id, source, type, payload, metadata, provenance, created_at, last_compiled_at |
| Memory | Derived typed memory | id, subject_id, kind, content, summary, confidence, valid_from, valid_to, source_episode_ids, status, embedding |
| ContextBundle | Runtime output | subject_id, task, facts, episodes, procedures, provenance, assembled_context, token_estimate |
Version history
v0.5 — Reliability & Trust
- True multi-tenant isolation (tenant_id on all tables, query-scoped)
- Distributed rate limiting (Postgres-backed)
- Backup/restore tooling (subject-level export/import)
- Durable async compilation (Postgres-backed job queue)
- Admin introspection endpoints (jobs, webhooks, tenant audit)
v0.6 — Support-Agent Superiority
- Session-aware context assembly (active session boosted, resolved deprioritized)
- Resolution tracking (open/resolved/unresolved per session)
- Handoff context packs (structured escalation briefs with health + SLA)
- Customer health scoring (0–100, explainable factors)
- Repeat-issue detection (prior resolution surfacing)
- Proactive health alerts (webhooks on state transitions)
- SLA tracking (response time, resolution time, breach flags)
v0.3.5 stabilization
- Fixed middleware execution order (auth before rate limit)
- Compile + conflict resolution in single transaction
- Request validation (string lengths, bounded limits)
- LLM compiler runs in ThreadPoolExecutor (non-blocking)
- SDKs support auth, tenant headers, and semantic search
v0.3 additions
- LLM-backed memory compiler (any provider via LiteLLM, thread-pooled)
- Embedding generation (LiteLLM + stub providers)
- Semantic search via pgvector cosine similarity with fallback
- Temporal reasoning in context assembly
- Memory conflict resolution (Jaccard similarity, auto-supersede)
- Webhooks (episode.created, memories.compiled, subject.deleted)
- API key authentication, rate limiting (per-IP)
- Multi-tenant header extraction
v0.2 additions
- Pluggable
BaseCompilerprotocol - Ranked retrieval with composite scoring
- Request ID middleware, structured errors, health endpoints
- SDKs with typed exceptions (Python + TypeScript)