memory layer
Infra/AI/Meta · Persistent cross-session memory for ValoSwiss agents with core/recall/archival tiers, inspired by mem0ai/mem0 (48k★, universal memory, 21 frameworks), Letta, formerly MemGPT (21.7k★, OS-tiered memory), and getzep/graphiti (temporal KG state changes). ADD-only single-pass mem0-style pattern with semantic dedup, Postgres tsvector FTS + …
# valoswiss-memory-layer
**Macro-categoria**: 🧠 INFRA/AI/META
**Scope**: Persistent cross-session memory for ValoSwiss agents, organized in three tiers: core (in-context, hot), recall (retrievable from the DB, warm), archival (cold storage, vector-indexed). ADD-only single-pass mem0-style pattern with semantic dedup; embeddings via Voyage AI or text-embedding-3-large.
**Born**: 2026-05-03 (Wave 5 INFRA/AI/META), extends AgentMemoryService P30
**Owner downstream**: ADVISOR (client memory profile), SUPERVISOR/ADMIN (eval + tier override), agent runtime (cascade orchestrator + advisor-copilot + curator)
**Last aligned**: 2026-05-03 V20
---
## §0 · Pre-flight check (agent entry ritual)
Before any work on the memory-layer, verify in this order:
1. **Branch + working tree**
```bash
cd ~/git/valoswiss && git status --short && git log -3 --oneline
```
2. **Postgres extensions installed: pgvector plus the FTS helpers** (memory archival tier)
```bash
psql "$DATABASE_URL" -c "SELECT extname, extversion FROM pg_extension WHERE extname IN ('vector','pg_trgm','unaccent');"
```
Expected: `vector >= 0.7.0`, plus `pg_trgm` and `unaccent` (accent-insensitive matching for Italian FTS). If one is missing → `CREATE EXTENSION` (requires SUPERUSER) or document the FTS-only fallback.
3. **Embedding provider health**
```bash
curl -s -X POST https://api.voyageai.com/v1/embeddings \
  -H "Authorization: Bearer $VOYAGE_API_KEY" \
  -H "content-type: application/json" \
  -d '{"input":["ping"],"model":"voyage-3-large"}' | jq '.data[0].embedding | length'
```
Expected: `1024` (voyage-3-large); the text-embedding-3-large fallback returns `3072`. On a 401 → key rotation needed.
4. **AgentMemoryService P30 base attivo**
```bash
ls apps/api/src/modules/agent-memory/agent-memory.service.ts 2>/dev/null
curl -s http://127.0.0.1:4010/api/agent-memory/health -H "Cookie: valo_token=<dev>" | jq .
```
If the P30 base is missing → memory-layer is a pre-extension; document the dependency and stop.
5. **Prisma schema sync**: verify that the 4 models `MemoryEvent` / `MemorySnapshot` / `MemoryTierAssignment` / `MemoryEmbedding` + 1 enum `MemoryTier` have been migrated
```bash
cd apps/api && npx prisma migrate status | grep -i memory
```
6. **Tenant configs**: `tenants/ws.json` and `tenants/az.json` must have `"memoryLayer": true` right after `agentMemory`.
7. **Persona pack**: `apps/api/src/common/persona-packs/persona-packs.constants.ts` must list `'memoryLayer'` in `defaultModules` for `ADVISOR`, `SUPERVISOR`, `ADMIN`. NOT for CLIENT/PROSPECT (memory contains cross-client PII).
8. **Module registry**: `apps/web/src/lib/module-registry.ts` must expose a `memoryLayer` entry with `sidebarSection: 'AGENTI & AI'`, `requiredRole: 'ADVISOR'`, icon `🧠`, `personaHint: 'meta-infrastructure'`.
9. **Pre-commit triage**: verify that HEAD on CRITICAL files contains no legacy patterns (see §7 Wave 1.6 anti-recurrence).
If any of the 9 points fails, **stop and record the deviation** before proceeding. The 3-Point Registration V16 is a non-negotiable invariant (see `feedback_new_module_registration.md`).
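
The extension check in point 2 compares version numbers, which is easy to get wrong with plain string comparison (`'0.10.0' < '0.7.0'` as strings). A minimal sketch of that check follows; `versionAtLeast`, `failingExtensions` and the `ExtensionRow` shape are hypothetical names introduced here, and the rows are assumed to come from the `pg_extension` query shown in point 2.

```typescript
// Hypothetical helper: numeric comparison of dotted versions such as "0.7.0".
function versionAtLeast(actual: string, minimum: string): boolean {
  const a = actual.split('.').map(Number);
  const m = minimum.split('.').map(Number);
  const len = Math.max(a.length, m.length);
  for (let i = 0; i < len; i++) {
    const ai = a[i] ?? 0; // missing segments count as 0
    const mi = m[i] ?? 0;
    if (Number.isNaN(ai) || Number.isNaN(mi)) {
      throw new Error(`non-numeric version segment in "${actual}" or "${minimum}"`);
    }
    if (ai !== mi) return ai > mi;
  }
  return true; // all segments equal
}

// Assumed row shape of: SELECT extname, extversion FROM pg_extension ...
interface ExtensionRow {
  extname: string;
  extversion: string;
}

// Requirements taken from point 2 of the checklist.
const REQUIRED: Array<{ name: string; minVersion?: string }> = [
  { name: 'vector', minVersion: '0.7.0' },
  { name: 'pg_trgm' },
  { name: 'unaccent' },
];

// Returns the names of extensions that are missing or too old.
function failingExtensions(rows: ExtensionRow[]): string[] {
  const installed = new Map(rows.map((r) => [r.extname, r.extversion] as const));
  return REQUIRED.filter(({ name, minVersion }) => {
    const version = installed.get(name);
    if (version === undefined) return true;
    return minVersion !== undefined && !versionAtLeast(version, minVersion);
  }).map((r) => r.name);
}
```

An empty result means check 2 passes; any returned name is a deviation to record before proceeding.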
---
## §1 · Areas of competence
### 1.1 Tier model (Letta ex-MemGPT inspired)
| Tier | Storage | Hot/Warm/Cold | Retrieval pattern | TTL | Use case |
|---|---|---|---|---|---|
| `core` | in-context system prompt (≤2KB) | hot | direct prompt-injection | sticky while the session is live | client profile, preferences, immutable rules |
| `recall` | Postgres `MemoryEvent` + tsvector FTS | warm | FTS query top-K (default 10) | 90 days | conversation history, decisions, advisor notes |
| `archival` | Postgres `MemoryEmbedding` + pgvector IVFFlat/HNSW | cold | semantic similarity top-K + rerank | unlimited | long-form knowledge base, full transcripts, document chunks |
**Tier promotion**: `recall → core` if `accessCount >= 5` within 7 days AND `recencyDecayWeight > 0.7`.
**Tier demotion**: `core → recall` if not accessed for 14 days. `recall → archival` if not accessed for 90 days.
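
The promotion and demotion rules can be sketched as a pure function. This is an illustration only, not the service's implementation: the `TierState` shape, the `accessesLast7d` counter and the `nextTier` name are assumptions introduced here, and "not accessed for N days" is read as `now - lastAccessedAt >= N days`.

```typescript
type Tier = 'core' | 'recall' | 'archival';

// Hypothetical view of the fields the tier rules depend on.
interface TierState {
  tier: Tier;
  accessesLast7d: number; // accesses counted inside the 7-day window
  recencyDecayWeight: number; // in [0, 1]
  lastAccessedAt: Date;
}

const DAY_MS = 86_400_000;

// Returns the tier the memory should be in after applying the rules once.
function nextTier(m: TierState, now: Date): Tier {
  const idleDays = (now.getTime() - m.lastAccessedAt.getTime()) / DAY_MS;
  if (m.tier === 'recall') {
    // Promotion: hot access pattern and still recent.
    if (m.accessesLast7d >= 5 && m.recencyDecayWeight > 0.7) return 'core';
    // Demotion: idle for the whole recall TTL.
    if (idleDays >= 90) return 'archival';
  }
  if (m.tier === 'core' && idleDays >= 14) return 'recall';
  return m.tier; // archival has no outgoing transition in the rules above
}
```

Checking promotion before demotion is a design choice of this sketch; in practice the two conditions on a `recall` memory cannot both hold, since five accesses in the last 7 days rule out 90 idle days.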
### 1.2 Pipeline ADD-only (mem0-style single-pass)
```
INGEST event (utterance, fact, decision, system_observation)
↓
EXTRACT atomic facts via LLM (single-pass, no multi-step)
↓
DEDUP (semantic): check pgvector cosine sim >= 0.92 against existing entries in tier=archival
↓ (if duplicate)
UPDATE existing.accessCount++, refresh recencyDecayWeight
↓ (if new)
INSERT MemoryEvent (tier=recall by default) + embed async + INSERT MemoryEmbedding
↓
PROMOTE tier if a trigger is reached (see 1.1)
```
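
The DEDUP step of the pipeline can be illustrated without a database. The sketch below keeps facts in an in-memory array and applies the 0.92 cosine threshold; `StoredFact`, `cosineSimilarity` and `ingest` are hypothetical names, and the linear scan stands in for the pgvector nearest-neighbour query.

```typescript
// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory stand-in for MemoryEvent + MemoryEmbedding.
interface StoredFact {
  id: string;
  embedding: number[];
  accessCount: number;
}

// Single-pass ingest: bump the nearest fact if it is a duplicate, else append.
function ingest(
  store: StoredFact[],
  id: string,
  embedding: number[],
  threshold = 0.92,
): { eventId: string; dedup: boolean } {
  let best: StoredFact | undefined;
  let bestSim = -Infinity;
  for (const fact of store) {
    const sim = cosineSimilarity(fact.embedding, embedding);
    if (sim > bestSim) {
      best = fact;
      bestSim = sim;
    }
  }
  if (best !== undefined && bestSim >= threshold) {
    best.accessCount += 1; // duplicate: usage stats change, content does not
    return { eventId: best.id, dedup: true };
  }
  store.push({ id, embedding, accessCount: 0 });
  return { eventId: id, dedup: false };
}
```

Only the single nearest neighbour is compared against the threshold, which mirrors the `LIMIT 1` lookup used by the service in §2.1.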
**ADD-only invariant**: NEVER run `UPDATE content` on an existing MemoryEvent. Create a new event and mark the old one `supersededBy`. The audit trail stays intact.
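
A minimal sketch of how a correction respects the ADD-only invariant: the old record keeps its content and only gains a `supersededBy` pointer. The `MemoryRecord` shape and the `supersede` function are assumptions for illustration; only the `supersededBy` field name comes from the text above.

```typescript
// Hypothetical minimal record; the real MemoryEvent has more fields.
interface MemoryRecord {
  id: string;
  content: string;
  supersededBy: string | null;
}

// Returns a new log in which `oldId` points at a freshly appended record.
// The content of the old record is never modified.
function supersede(
  log: readonly MemoryRecord[],
  oldId: string,
  newId: string,
  newContent: string,
): MemoryRecord[] {
  const old = log.find((r) => r.id === oldId);
  if (!old) throw new Error(`unknown event ${oldId}`);
  if (old.supersededBy !== null) {
    throw new Error(`event ${oldId} is already superseded by ${old.supersededBy}`);
  }
  return [
    ...log.map((r) => (r.id === oldId ? { ...r, supersededBy: newId } : r)),
    { id: newId, content: newContent, supersededBy: null },
  ];
}

// Current view: every record that has not been superseded.
function currentFacts(log: readonly MemoryRecord[]): MemoryRecord[] {
  return log.filter((r) => r.supersededBy === null);
}
```

Reads that want the latest state filter on `supersededBy === null`, while audits can walk the pointer chain from any historical record.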
### 1.3 Retrieval hybrid (vector + FTS + recency)
Final ranking score (top-K=10):
```
score = 0.55 * cosine_similarity(query_emb, mem_emb)
+ 0.25 * ts_rank_cd(mem_tsvector, query_tsquery)
+ 0.15 * recencyDecayWeight # exp(-(now - createdAt) / tau), tau = 14d (e-folding time; half-life ≈ 9.7d)
+ 0.05 * accessCountNormalized # log(1+accessCount)/log(1+maxAccess)
```
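
The same weighting, written as a small function for experimentation. The `ScoreInputs` shape and the `hybridScore` name are assumptions; the weights and the decay constant are those of the formula above. The sketch assumes `cosineSim` and `ftsRank` are already on comparable scales, which is worth checking because `ts_rank_cd` is not bounded to [0, 1] unless a normalization option is passed.

```typescript
// Hypothetical per-candidate inputs to the ranking formula.
interface ScoreInputs {
  cosineSim: number; // cosine similarity between query and memory embeddings
  ftsRank: number; // full-text rank, assumed pre-scaled to a comparable range
  ageDays: number; // (now - createdAt) in days
  accessCount: number;
  maxAccess: number; // highest accessCount among the tenant's memories
}

const TAU_DAYS = 14;

function hybridScore(s: ScoreInputs): number {
  const recency = Math.exp(-s.ageDays / TAU_DAYS);
  // log(1 + accessCount) / log(1 + maxAccess), defined as 0 when nothing was accessed
  const accessNorm =
    s.maxAccess > 0 ? Math.log(1 + s.accessCount) / Math.log(1 + s.maxAccess) : 0;
  return 0.55 * s.cosineSim + 0.25 * s.ftsRank + 0.15 * recency + 0.05 * accessNorm;
}

// Ranks candidates by descending score and keeps the top k.
function topK<T extends ScoreInputs>(candidates: T[], k = 10): T[] {
  return [...candidates].sort((a, b) => hybridScore(b) - hybridScore(a)).slice(0, k);
}
```

With these weights the semantic term dominates: a memory with perfect recency and access statistics but no semantic or lexical match can score at most 0.20.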
Postgres indexes:
- `MemoryEmbedding USING ivfflat (embedding vector_cosine_ops) WITH (lists=100)`: fast approximate
- `MemoryEmbedding USING hnsw (embedding vector_cosine_ops) WITH (m=16, ef_construction=64)`: high-recall
- `MemoryEvent USING gin (tsvector_content)`: FTS, Italian config (column name as used by the §2.1 query)
- `MemoryEvent (tenant_id, tier, createdAt DESC)`: recency cursor
### 1.4 Persona visibility
- **ADVISOR**: sees the memory of their own clients plus their own session memory (filter `userId` + `clientId` scope)
- **SUPERVISOR/ADMIN**: cross-tenant + tier override + memory eval
- **CLIENT/PROSPECT**: read-only on the `core` tier memory of their own profile (preferences, language, communication cadence). NO cross-client recall/archival.
- **NEXTGEN_HEIR**: read-only on the `core` memory of their own persona, NO parent memory (generational privacy)
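
The visibility rules above can be sketched as a single predicate. The `Viewer` and `MemoryRef` shapes, including the `ownerUserId` and `subjectId` fields, are assumptions made for this illustration; only the role names and tiers come from the list.

```typescript
type Role = 'ADVISOR' | 'SUPERVISOR' | 'ADMIN' | 'CLIENT' | 'PROSPECT' | 'NEXTGEN_HEIR';
type Tier = 'core' | 'recall' | 'archival';

// Hypothetical viewer: clientIds lists the clients an advisor manages.
interface Viewer {
  role: Role;
  userId: string;
  clientIds: string[];
}

// Hypothetical memory reference.
interface MemoryRef {
  tier: Tier;
  ownerUserId: string; // user whose session produced the memory
  subjectId: string; // person the memory is about
}

function canRead(viewer: Viewer, mem: MemoryRef): boolean {
  switch (viewer.role) {
    case 'SUPERVISOR':
    case 'ADMIN':
      return true; // full visibility, including tier override and eval
    case 'ADVISOR':
      // own session memory, or memory about one of their own clients
      return mem.ownerUserId === viewer.userId || viewer.clientIds.includes(mem.subjectId);
    case 'CLIENT':
    case 'PROSPECT':
    case 'NEXTGEN_HEIR':
      // core tier only, and only about themselves (no parent or cross-client memory)
      return mem.tier === 'core' && mem.subjectId === viewer.userId;
  }
}
```

Write permissions are out of scope here: the read-only constraint for CLIENT, PROSPECT and NEXTGEN_HEIR would be enforced separately.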
### 1.5 Embedding provider chain
| Provider | Model | Dim | Cost | Use case |
|---|---|---|---|---|
| Voyage AI | `voyage-3-large` | 1024 | $0.18/M tok | default ws+az |
| Voyage AI | `voyage-3-lite` | 512 | $0.02/M tok | dev/staging |
| OpenAI | `text-embedding-3-large` | 3072 | $0.13/M tok | fallback if Voyage returns 5xx |
| OpenAI | `text-embedding-3-small` | 1536 | $0.02/M tok | dev fallback |
Provider failover chain: `voyage-3-large → text-embedding-3-large → text-embedding-3-small`, with a circuit breaker (3-fail/30s).
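
The circuit breaker behind the failover chain can be sketched as a small state machine. This is one possible reading of "3-fail/30s", namely that three consecutive failures open the circuit for 30 seconds; the `CircuitBreaker` class and `pickProvider` function are hypothetical names, and time is passed in explicitly so the logic stays deterministic.

```typescript
// Assumed semantics: maxFailures consecutive failures open the circuit for cooldownMs.
class CircuitBreaker {
  private failures = 0;
  private openUntil = 0;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;

  constructor(maxFailures = 3, cooldownMs = 30_000) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  isOpen(nowMs: number): boolean {
    return nowMs < this.openUntil;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openUntil = 0;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) {
      this.openUntil = nowMs + this.cooldownMs;
      this.failures = 0; // start counting afresh once the cooldown ends
    }
  }
}

// Chain order taken from the text above.
const CHAIN = ['voyage-3-large', 'text-embedding-3-large', 'text-embedding-3-small'] as const;

// Returns the first provider whose circuit is closed, or null if all are open.
function pickProvider(breakers: Map<string, CircuitBreaker>, nowMs: number): string | null {
  for (const name of CHAIN) {
    const breaker = breakers.get(name);
    if (!breaker || !breaker.isOpen(nowMs)) return name;
  }
  return null;
}
```

A caller would try `pickProvider`, call that provider, and report the outcome with `recordSuccess` or `recordFailure`. Because the providers in the table return different dimensions, a vector produced by a fallback model cannot be compared directly with `voyage-3-large` vectors, so the provider that produced each embedding is worth recording.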
---
## §2 · Code patterns
### 2.1 Service NestJS — IngestEvent + RetrieveTopK
```typescript
// apps/api/src/modules/memory-layer/memory-layer.service.ts
import { Injectable, Logger, Optional } from '@nestjs/common';
import { TenantPrismaService } from '../../common/prisma/tenant-prisma.service';
import { EmbeddingProvider } from './embedding.provider';
import { MemoryTier } from '@prisma/client';

interface IngestInput {
  sessionId: string;
  userId: string;
  source: 'CHAT' | 'TOOL_OUT' | 'OBSERVATION' | 'DECISION';
  content: string;
  metadata?: Record<string, unknown>;
}

@Injectable()
export class MemoryLayerService {
  private readonly logger = new Logger(MemoryLayerService.name);
  private readonly DEDUP_THRESHOLD = 0.92;
  private readonly TOP_K_DEFAULT = 10;

  constructor(
    @Optional() private readonly prisma: TenantPrismaService,
    private readonly embeddings: EmbeddingProvider,
  ) {}

  /**
   * ADD-only single-pass mem0-style ingestion.
   * - extract atomic fact
   * - check dedup via pgvector cosine sim
   * - insert MemoryEvent + MemoryEmbedding
   */
  async ingestEvent(tenantId: string, input: IngestInput) {
    if (!this.prisma) {
      this.logger.warn('TenantPrismaService not available, ingest skipped');
      return null;
    }
    // explicit getter Wave 1.6: NO cast as-any on prisma client
    const memoryEventModel = this.prisma.memoryEvent;
    const memoryEmbeddingModel = this.prisma.memoryEmbedding;

    const embedding = await this.embeddings.embed(input.content);
    const dup = await this.findDuplicate(tenantId, embedding);
    if (dup) {
      // Duplicate: only usage stats change, content is never rewritten (ADD-only).
      await memoryEventModel.update({
        where: { id: dup.id },
        data: {
          accessCount: { increment: 1 },
          lastAccessedAt: new Date(),
          recencyDecayWeight: 1.0, // refresh, as in the §1.2 pipeline
        },
      });
      this.logger.debug(`dedup hit on ${dup.id} sim=${dup.similarity.toFixed(3)}`);
      return { eventId: dup.id, dedup: true };
    }

    const event = await memoryEventModel.create({
      data: {
        tenant_id: tenantId,
        userId: input.userId,
        sessionId: input.sessionId,
        source: input.source,
        content: input.content,
        tier: MemoryTier.RECALL,
        accessCount: 0,
        recencyDecayWeight: 1.0,
        metadata: input.metadata ?? {},
      },
    });
    // Assumes the Prisma model exposes `embedding` as a writable field; a column
    // declared as Unsupported("vector") would need a raw INSERT with a ::vector cast.
    await memoryEmbeddingModel.create({
      data: {
        tenant_id: tenantId,
        eventId: event.id,
        provider: 'voyage-3-large',
        dim: embedding.length,
        embedding,
      },
    });
    return { eventId: event.id, dedup: false };
  }

  async retrieveTopK(
    tenantId: string,
    query: string,
    opts: { k?: number; tier?: MemoryTier; userId?: string } = {},
  ) {
    if (!this.prisma) return [];
    const k = opts.k ?? this.TOP_K_DEFAULT;
    // Absent filters are sent as SQL NULL so the "IS NULL OR ..." guards apply.
    const tier = opts.tier ?? null;
    const userId = opts.userId ?? null;
    const queryVec = this.toVectorLiteral(await this.embeddings.embed(query));

    // hybrid: pgvector cosine + ts_rank_cd + recencyDecay + accessCount
    // COALESCE keeps the score non-NULL when no memory has been accessed yet.
    const rows = await this.prisma.$queryRaw<
      Array<{ id: string; content: string; score: number; tier: MemoryTier }>
    >`
      SELECT
        e.id,
        e.content,
        e.tier,
        (
          0.55 * (1 - (mb.embedding <=> ${queryVec}::vector))
          + 0.25 * ts_rank_cd(e.tsvector_content, plainto_tsquery('italian', ${query}))
          + 0.15 * exp(-EXTRACT(EPOCH FROM (now() - e."createdAt")) / (14*86400))
          + 0.05 * COALESCE(
              LN(1 + e."accessCount")
              / NULLIF(LN(1 + (SELECT MAX("accessCount") FROM "MemoryEvent" WHERE tenant_id = ${tenantId})), 0),
              0)
        ) AS score
      FROM "MemoryEvent" e
      INNER JOIN "MemoryEmbedding" mb ON mb."eventId" = e.id
      WHERE e.tenant_id = ${tenantId}
        AND (${tier}::"MemoryTier" IS NULL OR e.tier = ${tier}::"MemoryTier")
        AND (${userId}::text IS NULL OR e."userId" = ${userId})
      ORDER BY score DESC
      LIMIT ${k};
    `;
    // promote tier core if access pattern hot
    await this.maybePromoteTier(tenantId, rows.map(r => r.id));
    return rows;
  }

  // pgvector accepts the text form '[v1,v2,...]', which the queries cast with ::vector.
  private toVectorLiteral(embedding: number[]): string {
    return `[${embedding.join(',')}]`;
  }

  private async findDuplicate(tenantId: string, embedding: number[]) {
    const vec = this.toVectorLiteral(embedding);
    // Nearest neighbour by cosine distance; similarity = 1 - distance.
    const result = await this.prisma!.$queryRaw<
      Array<{ id: string; similarity: number }>
    >`
      SELECT mb."eventId" AS id,
        (1 - (mb.embedding <=> ${vec}::vector)) AS similarity
      FROM "MemoryEmbedding" mb
      WHERE mb.tenant_id = ${tenantId}
      ORDER BY mb.embedding <=> ${vec}::vector ASC
      LIMIT 1;
    `;
    if (result.length > 0 && result[0].similarity >= this.DEDUP_THRESHOLD) {
      return result[0];
    }
    return null;
  }

  private async maybePromoteTier(tenantId: string, eventIds: string[]) {
    if (!eventIds.length) return;
    // promotion logic recall → core (see §1.1): accessCount >= 5 in the last 7d
    // AND recencyDecayWeight > 0.7
    await this.prisma!.$executeRaw`
      UPDATE "MemoryEvent"
      SET tier = 'CORE'::"MemoryTier"
      WHERE tenant_id = ${tenantId}
        AND id = ANY(${eventIds})
        AND tier = 'RECALL'::"MemoryTier"
        AND "accessCount" >= 5
        AND "recencyDecayWeight" > 0.7
        AND "lastAccessedAt" >= now() - interval '7 days';
    `;
  }
}
```
### 2.2 Embedding provider con failover
```typescript
// apps/api/src/modules/memory-layer/embedding.provider.ts
import { Injectable, Logger } from '@nestjs/common';

interface EmbedProvider {
  name: string;
  embed(text: string): Promise<number[]>;
}

@Injectable()
export class EmbeddingProvider {
  private readonly logger = new Logger(EmbeddingProvider.name);
  private readonly providers: EmbedProvider[];
  private readonly failures = new Map<string, { count: number; openUntil: number }>();

  constructor() {
    this.providers = [
      this.makeVoyage('voyage-3-large'),
      this.makeOpenAI('text-embedding-3-large'),
      this.makeOpenAI('text-
…[truncated: open the MD file for the full text]