Proof by evidence

This page exists for a narrow question:

Does Aionis actually improve execution over time, or does it only describe that idea well?

The answer here is based on six live Lite runs through the public SDK on 2026-04-18, not on hypothetical product language.

What this page is for

If you want the shortest external proof path for the self-evolving claim, start here. Each section shows what changed, what route family proved it, and how to rerun it yourself.

Observed runs · Lite runtime · Public SDK · Reproducible

How this relates to the release story

This page proves the strongest runtime claims. If you want the broader product view of what is available today, read What Ships Today.

task start · policy memory · governance loop · provenance · forgetting

These proofs were produced from real Lite runs, not hand-written example output.

What the six proofs show

Proof 1

Startup improves

The second run stops looking like a generic tool pick and starts looking like learned task-start guidance grounded in prior execution.

Proof 2

Execution becomes policy

Repeated positive feedback becomes persisted policy memory instead of staying as a vague runtime hint.

Proof 3

Policy can be governed

The resulting policy memory can be retired and reactivated through explicit runtime governance instead of drifting silently.

Proof 4

Promotion keeps provenance

Continuity carriers still expose where stable workflow guidance came from after candidate promotion and replay-side normalization.

Proof 5

Session state alone can promote workflows

Repeated session continuity writes now count as distinct observations and can promote stable workflow guidance without needing an append-only event path.

Proof 6

Forgetting cools memory without deleting it

Archived workflow memory now surfaces semantic forgetting, archive relocation, and differential rehydration instead of silently disappearing.

Proof 1: The second task start gets better

This is the simplest continuity claim Aionis should be able to defend:

the next similar task should start better because the previous one happened

Before

Cold start

The runtime returned a generic first move: `source_kind = "tool_selection"`, no file path, and a generic bash step.

After

Warm start

After two successful writes for the same task family, the runtime returned `source_kind = "experience_intelligence"` with a learned file path and next action.

Proof

Observed signal

`src/services/billing.ts` was surfaced as the target file, and the next action became `Patch src/services/billing.ts and rerun validation`.
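The cold and warm responses described above differ in a way a host can check mechanically. The following is a minimal sketch of that check, not the SDK's response type: only `source_kind` and its two values come from this page, while the `TaskStartGuidance` interface, the `file_path` and `next_action` field names, and the `isLearnedStart` helper are assumptions made for illustration.

```typescript
// Hypothetical shape. Only `source_kind` and its two values appear on this page;
// `file_path` and `next_action` are illustrative field names.
interface TaskStartGuidance {
  source_kind: "tool_selection" | "experience_intelligence";
  file_path: string | null;
  next_action: string;
}

// Treat a start as "learned" when it is sourced from prior execution
// and names a concrete file to work on.
function isLearnedStart(g: TaskStartGuidance): boolean {
  return g.source_kind === "experience_intelligence" && g.file_path !== null;
}

const coldStart: TaskStartGuidance = {
  source_kind: "tool_selection",
  file_path: null,
  next_action: "generic bash step", // placeholder wording, not real output
};

const warmStart: TaskStartGuidance = {
  source_kind: "experience_intelligence",
  file_path: "src/services/billing.ts",
  next_action: "Patch src/services/billing.ts and rerun validation",
};

console.log(isLearnedStart(coldStart), isLearnedStart(warmStart));
```

A check like this is only useful as an assertion in a host-side test; the authoritative output is whatever the proof script below prints.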

Run it yourself:

bash

npm run example:sdk:task-start-proof

Why this matters:

it proves the runtime improved startup behavior
it proves the improvement is grounded in prior execution memory
it shows the difference between "long task support" and "better next task start"

Proof 2: Stable feedback becomes persisted policy memory

The second claim is stronger:

successful execution should become reusable policy, not only replayable history

Before

Early positive feedback

The first and second positive tool-feedback runs did not yet materialize policy memory. The learning signal existed, but two runs were not yet enough for the runtime to persist it.

After

Stable policy memory

By the third positive run, the runtime produced persisted policy memory with a policy contract, state, and inspectable governance context.

Proof

Observed signal

`materialization_state = "persisted"`, `selected_policy_memory_state = "active"`, and the same state was visible from both evolution review and agent inspect.
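The observed timeline (no policy memory after runs one and two, persisted policy memory after run three) behaves like a simple stability threshold. The sketch below models only that observation. The threshold of 3 is inferred from this one run rather than taken from documented runtime configuration, and the `"pending"` label and `materializationAfter` function are hypothetical names.

```typescript
// "persisted" comes from the observed signal; "pending" is a hypothetical
// label for the state before policy memory materializes.
type MaterializationState = "pending" | "persisted";

// Assumption: inferred from the observed run, not a documented constant.
const ASSUMED_THRESHOLD = 3;

// Map a count of positive feedback runs to the materialization state
// that the observed timeline would imply.
function materializationAfter(positiveRuns: number): MaterializationState {
  return positiveRuns >= ASSUMED_THRESHOLD ? "persisted" : "pending";
}

const timeline = [1, 2, 3].map(materializationAfter);
console.log(timeline.join(", "));
```

The real runtime may weigh more than a raw count, so treat this as a reading aid for the before/after contrast rather than a description of the materialization rule.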

Run it yourself:

bash

npm run example:sdk:policy-memory

Why this matters:

it proves Aionis accumulates usable execution policy
it proves stable execution can become persistent execution policy
it makes the self-evolving claim inspectable through the public SDK

Proof 3: Policy memory can be retired and reactivated

The third claim is what separates a learning substrate from a silent accumulator:

execution policy must remain governable

Before

Materialized policy is active

The runtime produced a persisted policy memory in active state after repeated positive feedback.

After

Governance moved it twice

The public governance route retired that policy memory, then reactivated it with fresh live evidence.

Proof

Observed signal

The state transitions ran as `active` → `retired` → `active`, and the live policy still resolved `selected_tool = "bash"` after reactivation.
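The `active` → `retired` → `active` sequence can be read as a two-state machine driven by explicit governance actions. The sketch below is a local model under that assumption, not the governance route itself: the `PolicyMemory` shape, the `history` field, and the `applyGovernance` function are hypothetical, and only the state names and `selected_tool = "bash"` come from the observed run.

```typescript
type PolicyState = "active" | "retired";
type GovernanceAction = "retire" | "reactivate";

// Hypothetical local model of a policy memory record.
interface PolicyMemory {
  state: PolicyState;
  selected_tool: string;
  history: PolicyState[]; // illustrative audit trail of states
}

// Apply one explicit governance action and record the resulting state.
// Redundant actions are rejected so that every transition is deliberate.
function applyGovernance(p: PolicyMemory, action: GovernanceAction): PolicyMemory {
  const next: PolicyState = action === "retire" ? "retired" : "active";
  if (next === p.state) {
    throw new Error(`policy is already ${p.state}`);
  }
  return { ...p, state: next, history: [...p.history, next] };
}

let policy: PolicyMemory = { state: "active", selected_tool: "bash", history: ["active"] };
policy = applyGovernance(policy, "retire");
policy = applyGovernance(policy, "reactivate");

console.log(policy.history.join(" -> "), policy.selected_tool);
```

Rejecting a redundant action is a design choice of this sketch; the page does not say how the runtime handles retiring an already retired policy.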

Run it yourself:

bash

npm run example:sdk:policy-governance

Why this matters:

it proves policy memory is reversible and reviewable
it proves governance is a runtime action, not just a metadata idea
it shows how self-evolving behavior stays inspectable and governable

Proof 4: Continuity provenance survives promotion

The fourth claim is narrower but important:

a self-evolving runtime should preserve where learned workflow guidance came from even after promotion

Before

Carrier provenance could disappear

`handoff`, `session_event`, and `session` could be stored and even projected, but the stable workflow path could stop showing the original continuity signal clearly enough for a host to trust it.

After

Promotion preserves origin

After two `handoff` writes and two `session_event` writes for the same task family, the runtime promoted stable workflow guidance while preserving `distillation_origin` all the way into planner and introspection surfaces.

Proof

Observed signal

The demo produced stable workflow lines containing `distillation=handoff_continuity_carrier` and `distillation=session_event_continuity_carrier`, and introspection reported carrier counts of 2 for each corresponding flow.
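A host that wants to confirm provenance on its own could scan workflow lines for the `distillation=` marker. The sketch below assumes only that the marker appears as `distillation=<name>` somewhere in a line, as quoted above. The surrounding text in `sampleLines` is invented for illustration, and `distillationOrigin` is a hypothetical helper, not an SDK function.

```typescript
// Extract the distillation origin from a workflow line, or null if the
// line carries no provenance marker.
function distillationOrigin(line: string): string | null {
  const match = /distillation=([a-z_]+)/.exec(line);
  return match?.[1] ?? null;
}

// Invented sample lines: only the `distillation=...` tokens are taken from
// the observed signal; the rest of each line is placeholder text.
const sampleLines = [
  "stable workflow (placeholder text); distillation=handoff_continuity_carrier",
  "stable workflow (placeholder text); distillation=session_event_continuity_carrier",
  "stable workflow (placeholder text) with no provenance marker",
];

const origins = sampleLines.map(distillationOrigin);
console.log(origins);
```

A `null` result is the case Proof 4 is meant to rule out for promoted workflows: guidance that is stable but no longer says where it came from.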

Run it yourself:

bash

WORKFLOW_GOVERNANCE_STATIC_PROMOTE_MEMORY_PROVIDER_ENABLED=true npm run lite:start
npm run example:sdk:continuity-provenance

Why this matters:

it proves continuity learning stays explainable after workflow promotion
it proves `handoff` and `session_event` are not only stored, but promoted with preserved lineage
it makes task-start and review surfaces easier to trust because the learning source survives normalization

Proof 5: Session continuity carriers promote stable workflows

The fifth claim tightens the continuity story:

session state itself should be able to produce durable workflow guidance

Before

Session continuity was weaker than session events

`memory.sessions.create(...)` could store execution state, but repeated updates to the same session topic did not reliably count as independent workflow observations.

After

Session carriers now promote

Repeated session continuity writes for the same task family now move from candidate workflow guidance to stable workflow guidance through `session_continuity_carrier`.

Proof

Observed signal

The demo produced a first candidate workflow with `distillation=session_continuity_carrier`, then promoted it to a stable workflow with `observed_count = 2` while keeping the same provenance and support counts visible in both planning and introspection surfaces.
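The candidate-to-stable move described above can be modeled as counting distinct observations against a promotion count. This sketch assumes a promotion count of 2, which matches the observed `observed_count = 2` but is not a documented setting. The `WorkflowGuidance` shape, the `stage` field, and the `observe` function are hypothetical.

```typescript
// Hypothetical local model; `distillation` and `observed_count` mirror names
// quoted on this page, `stage` is an illustrative field.
interface WorkflowGuidance {
  distillation: string;
  observed_count: number;
  stage: "candidate" | "stable";
}

// Assumption: matches the observed run, not a documented constant.
const ASSUMED_PROMOTION_COUNT = 2;

// Record one more observation for a task family and derive the stage.
// Provenance (`distillation`) is carried through unchanged.
function observe(prev: WorkflowGuidance | null, distillation: string): WorkflowGuidance {
  const observed_count = (prev?.observed_count ?? 0) + 1;
  return {
    distillation,
    observed_count,
    stage: observed_count >= ASSUMED_PROMOTION_COUNT ? "stable" : "candidate",
  };
}

const first = observe(null, "session_continuity_carrier");
const second = observe(first, "session_continuity_carrier");

console.log(first.stage, second.stage, second.observed_count);
```

The point the proof makes is the increment itself: a repeated write to the same session topic counts as a new observation instead of overwriting the previous one.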

Run it yourself:

bash

WORKFLOW_GOVERNANCE_STATIC_PROMOTE_MEMORY_PROVIDER_ENABLED=true npm run lite:start
npm run example:sdk:session-continuity

Why this matters:

it proves session continuity is a first-class learning input, not only supporting metadata
it proves updated session state can count as distinct workflow observations
it makes the continuity model broader than event-only carrier streams

Proof 6: Semantic forgetting archives and rehydrates execution memory

The sixth claim is about memory quality rather than raw accumulation:

a self-evolving runtime should cool down execution memory instead of either keeping everything hot or deleting it

Before

Cold memory was harder to explain

Archived or colder workflow guidance existed, but it was harder to prove why it should stay cold, where its payload should live, and how much should be rehydrated when a task needed it again.

After

Forgetting becomes a visible runtime surface

The runtime now exposes semantic forgetting, archive relocation, and differential rehydration through direct node state, planning summaries, and execution introspection summaries.

Proof

Observed signal

The demo produced `semantic_forgetting.action = "archive"`, `archive_relocation.relocation_state = "cold_archive"`, `execution_archive_count = 1`, and a differential payload restore that selected only the archived payload node the task needed.
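Differential rehydration, as described above, means restoring only the archived payload a task needs and leaving the rest cold. The sketch below illustrates that selection under stated assumptions: the `MemoryNode` shape, the node ids, and the `rehydrateFor` function are all hypothetical, and only the `cold_archive` label comes from the observed signal.

```typescript
// Hypothetical local model of a memory node.
interface MemoryNode {
  id: string; // invented identifiers, for illustration only
  tier: "hot" | "cold_archive";
  payload: string;
}

// Return hot copies of only those archived nodes the task asked for.
// The input array is not mutated, so unrequested nodes stay archived.
function rehydrateFor(neededIds: string[], nodes: MemoryNode[]): MemoryNode[] {
  return nodes
    .filter((n) => n.tier === "cold_archive" && neededIds.indexOf(n.id) !== -1)
    .map((n) => ({ ...n, tier: "hot" as const }));
}

const nodes: MemoryNode[] = [
  { id: "wf-billing", tier: "cold_archive", payload: "billing workflow payload" },
  { id: "wf-reports", tier: "cold_archive", payload: "reports workflow payload" },
  { id: "wf-auth", tier: "hot", payload: "auth workflow payload" },
];

const restored = rehydrateFor(["wf-billing"], nodes);
console.log(restored.map((n) => n.id));
```

Nothing is deleted in this model: the unrequested archived node remains available for a later restore, which is the lifecycle behavior the proof claims.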

Run it yourself:

bash

npm run example:sdk:semantic-forgetting

Why this matters:

it proves forgetting is lifecycle control, not deletion
it proves the runtime can explain colder-memory decisions in public summary surfaces
it makes selective rehydration part of the product story

What these six proofs mean together

| Claim | What the evidence shows |
| --- | --- |
| Aionis improves startup | The second task start became more specific and file-aware |
| Aionis learns execution policy | Stable feedback became persisted policy memory |
| Aionis governs its learned policy | Policy memory moved through retire/reactivate cleanly |
| Aionis preserves learned provenance | Stable workflow guidance still shows whether it came from handoff or session-event continuity |
| Aionis learns directly from session state | Stable workflow guidance can now be promoted from repeated session continuity writes |
| Aionis manages colder execution memory | Archived workflow memory can be cooled down, relocated, and selectively rehydrated without deletion |

That combination is the real point of the product:

not just memory
not just long tasks
not just replay

It is a continuity runtime that can improve startup, materialize execution policy, govern what it learned, preserve the provenance of how that learning happened, lift repeated session state into stable workflow guidance, and cool down execution memory without losing the ability to restore it selectively.

Next steps

If you want the raw runnable commands, go to:

Self-Evolving Demos

If you want the underlying route families, go to:
