MGBench

MGBench is the public benchmark surface for memory governance in agentic long-term memory.

It evaluates whether a memory system preserves useful long-term context while blocking stale, invalidated, cross-scope, contradictory, or failed historical memory from becoming agent-usable context.

Public Release

Item	Value
Repository	ostinatocc/MGBench
Current release	v0.1.1
DOI	10.5281/zenodo.20793097
Scenarios	608 frozen deterministic scenarios
Suites	8 governance suites
LLM judge dependency	0

What It Tests

MGBench is built around admission quality, not semantic recall alone.

Suite	What it probes
Credibility governance	Whether source trust and evidence affect admission.
Controlled forgetting	Whether suppression, archival, and restoration are respected.
Scope isolation	Whether memory stays inside the correct project, tenant, or workspace.
Execution-tree effect	Whether execution-state memory survives compression and handoff.
Ordinary-memory governance	Whether preferences, facts, and general memory avoid unsafe promotion.
High-trust conflict governance	Whether newer evidence can challenge older trusted memory.
Lifecycle inference	Whether the current, stale, failed, contested, and rehydrate lifecycle states are each handled correctly.
Execution-tree stress	Whether branch state remains safe under noisy execution histories.
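
To make the difference between admission quality and semantic recall concrete, here is a minimal Python sketch. It is not MGBench code and does not reflect the MGBench scenario format: the `Memory` class, the `recall_only` and `admit` functions, the field names, and the state labels are all hypothetical, chosen only to mirror the scope isolation and lifecycle ideas in the table above.

```python
from dataclasses import dataclass

# Hypothetical lifecycle labels that should never reach agent context.
BLOCKED_STATES = {"stale", "failed", "contested"}


@dataclass(frozen=True)
class Memory:
    memory_id: str
    scope: str        # assumed project, tenant, or workspace identifier
    state: str        # assumed lifecycle label, e.g. "current" or "stale"
    relevance: float  # assumed semantic similarity to the query, 0.0 to 1.0


def recall_only(candidates, k=3):
    """Rank purely by semantic relevance, ignoring governance signals."""
    ranked = sorted(candidates, key=lambda m: m.relevance, reverse=True)
    return [m.memory_id for m in ranked[:k]]


def admit(candidates, active_scope, k=3):
    """Filter by scope and lifecycle state first, then rank by relevance."""
    allowed = [
        m for m in candidates
        if m.scope == active_scope and m.state not in BLOCKED_STATES
    ]
    ranked = sorted(allowed, key=lambda m: m.relevance, reverse=True)
    return [m.memory_id for m in ranked[:k]]


candidates = [
    Memory("m1", "project-a", "current", 0.70),
    Memory("m2", "project-a", "stale", 0.95),
    Memory("m3", "project-b", "current", 0.90),
    Memory("m4", "project-a", "failed", 0.85),
]

recalled = recall_only(candidates)
admitted = admit(candidates, "project-a")
print(recalled)  # ['m2', 'm3', 'm4']
print(admitted)  # ['m1']
```

In this toy case the three most semantically relevant memories are exactly the ones a governed system should block: one is stale, one belongs to another project, and one records a failed attempt. A recall-only ranking surfaces all three, while the gated path admits only the lower-scoring but valid memory.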

Why It Is In The Docs

Aionis is built around governed admission:


candidate memory -> admission decision -> agent context -> feedback -> measure

MGBench gives that product claim a public, frozen test surface. It is useful for Aionis itself and for other memory systems that want to report memory governance behavior without relying on an LLM judge.
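
One way a benchmark can report governance behavior without an LLM judge is to score each scenario by exact set comparison against frozen expectations. The sketch below illustrates that general idea only. The `score_scenario` function and the `must_admit` and `must_block` fields are assumptions made for this example, not MGBench's actual schema or scoring rules.

```python
def score_scenario(admitted_ids, must_admit, must_block):
    """Score one scenario deterministically with set comparisons.

    All three arguments are hypothetical: lists of memory IDs that the
    system admitted, that it was expected to admit, and that it was
    expected to keep out of agent context.
    """
    admitted = set(admitted_ids)
    missed = set(must_admit) - admitted   # useful memory that was dropped
    leaked = set(must_block) & admitted   # unsafe memory that got through
    return {
        "passed": not missed and not leaked,
        "missed": sorted(missed),
        "leaked": sorted(leaked),
    }


result = score_scenario(
    admitted_ids=["m1", "m2"],
    must_admit=["m1"],
    must_block=["m2", "m3"],
)
print(result)  # {'passed': False, 'missed': [], 'leaked': ['m2']}
```

Because the outcome depends only on the admitted IDs and the frozen expectations, the same system output always yields the same score, which is the property that makes a judge-free report reproducible.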
