
MGBench

MGBench is the public benchmark surface for memory governance in agentic long-term memory.

It evaluates whether a memory system preserves useful long-term context while blocking stale, invalidated, cross-scope, contradictory, or failed historical memory from becoming agent-usable context.
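
As a rough illustration of what "blocking" means here, the following is a minimal sketch of an admission gate. It is not MGBench's or Aionis's actual API: the `Candidate` type, the `admit` function, the state labels, and the sample memories are all hypothetical and exist only to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical lifecycle labels, loosely following the failure classes named above.
BLOCKED_STATES = {"stale", "invalidated", "contradicted", "failed"}


@dataclass(frozen=True)
class Candidate:
    memory_id: str
    scope: str   # e.g. a project, tenant, or workspace identifier
    state: str   # "current" or one of BLOCKED_STATES
    text: str


def admit(candidates, active_scope):
    """Return only the candidates allowed to become agent-usable context."""
    admitted = []
    for c in candidates:
        if c.scope != active_scope:
            continue  # cross-scope memory is never admitted
        if c.state in BLOCKED_STATES:
            continue  # stale, invalidated, contradicted, or failed memory is blocked
        admitted.append(c)
    return admitted


# Illustrative data only; not taken from the benchmark.
candidates = [
    Candidate("m1", "project-a", "current", "Deploy target is staging-2."),
    Candidate("m2", "project-a", "stale", "Deploy target is staging-1."),
    Candidate("m3", "project-b", "current", "Use the legacy API key."),
    Candidate("m4", "project-a", "failed", "Forcing the retry fixed the build."),
]

admitted_ids = [c.memory_id for c in admit(candidates, "project-a")]
print(admitted_ids)
```

In this toy case only `m1` survives: `m2` is stale, `m3` belongs to another scope, and `m4` records a failed attempt. A real memory system would derive these states from evidence rather than read them from a label.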

Public Release

| Item | Value |
| --- | --- |
| Repository | ostinatocc/MGBench |
| Current release | v0.1.1 |
| DOI | 10.5281/zenodo.20793097 |
| Scenarios | 608 frozen deterministic scenarios |
| Suites | 8 governance suites |
| LLM judge dependency | 0 |

What It Tests

MGBench is built around admission quality, not semantic recall alone.

| Suite | What it probes |
| --- | --- |
| Credibility governance | Whether source trust and evidence affect admission. |
| Controlled forgetting | Whether suppression, archival, and restoration are respected. |
| Scope isolation | Whether memory stays inside the correct project, tenant, or workspace. |
| Execution-tree effect | Whether execution-state memory survives compression and handoff. |
| Ordinary-memory governance | Whether preferences, facts, and general memory avoid unsafe promotion. |
| High-trust conflict governance | Whether newer evidence can challenge older trusted memory. |
| Lifecycle inference | Whether current, stale, failed, contested, and rehydrate states are handled. |
| Execution-tree stress | Whether branch state remains safe under noisy execution histories. |

Why It Is In The Docs

Aionis is built around governed admission:

candidate memory -> admission decision -> agent context -> feedback -> measure

MGBench gives that product claim a public, frozen test surface. It is useful for Aionis itself and for other memory systems that want to report memory governance behavior without relying on an LLM judge.
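
To show why frozen scenarios can be scored without an LLM judge, here is a minimal sketch of deterministic scoring. The scenario fields (`expected_admitted`, `observed_admitted`) and the helper names are assumptions made for illustration; they are not the MGBench scenario schema or its scoring code.

```python
def scenario_passes(expected_admitted, observed_admitted):
    """A scenario passes only if the admitted set matches the expectation exactly."""
    return set(expected_admitted) == set(observed_admitted)


def pass_rate(scenarios):
    """Fraction of scenarios whose observed admissions match the frozen expectation."""
    if not scenarios:
        return 0.0
    passed = sum(
        scenario_passes(s["expected_admitted"], s["observed_admitted"])
        for s in scenarios
    )
    return passed / len(scenarios)


# Hypothetical results for two scenarios; identifiers are illustrative only.
results = [
    {"id": "scope-001", "expected_admitted": ["m1"], "observed_admitted": ["m1"]},
    {"id": "forget-002", "expected_admitted": ["m7"], "observed_admitted": ["m7", "m8"]},
]

rate = pass_rate(results)
print(rate)
```

Because the comparison is plain set equality against a fixed expectation, the same inputs always yield the same score. The second scenario fails here because an extra memory (`m8`) was admitted, which is the kind of error that a recall-only metric would not catch.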