Design Book 设计文档

AgenticKernel

A Minimal Kernel for Agentic Systems 面向智能体系统的极简内核

4 axioms · 2 invariants · 2 interfaces · zero theme 4 公理 · 2 不变量 · 2 接口 · 零主题
AgenticKernel is the bedrock of an agentic system. It defines what an agent is, what a task is, where things execute, and what can be loaded — and nothing else. No graph. No scheduler. No vocabulary. No persistence. This book is a complete account of Layer 1.
AgenticKernel 是智能体系统的基石。它定义了什么是 Agent、什么是 Task、在哪里执行、可加载什么——仅此而已。没有图(graph),没有调度器,没有词汇表,没有持久化。本书是 Layer 1 的完整阐述。

Read it linearly if you are new. Jump to Part II for the concept chapters (four axioms + Execution). Jump to Appendix C for the design decisions behind every choice.

初读者建议线性通读。想直接看概念章节(四条公理 + Execution),请翻到第二部分。想了解每个选择背后的"为什么",请翻到附录 C。

Foreword 前言

A Self-Contained Layer 一个自洽的层

What this book is, and what it is not. 本书是什么,不是什么。

This book defines Layer 1 only. Everything else — orchestration, persistence implementations, role vocabularies, UI conventions — lives above. We do not refer to those layers by name in this book. Where we must point upward, we say "the layer above the kernel".

The reader we have in mind is a developer building on top of AgenticKernel, an architect designing a sibling product on the same kernel, or a reviewer asking "did they get the fundamentals right?". For the first reader, the concept chapters in Part II are the contract. For the second, Part III's invariants and Appendix C's design decisions matter most. For the third, this whole book is a self-justification.

We use a mixed voice throughout. Definitions and invariants are stated declaratively — "Capability is a name", "Task is the only Aggregate Root" — because they are commitments, not opinions. Rationale is stated as "we" — "we kept it that small because" — because it is choice, and choice has authors.

The single takeaway from this foreword: AgenticKernel is a self-contained, theme-neutral kernel with four axioms, two invariants, and two interfaces (Runtime + Query). Everything in this book justifies that sentence.

本书只定义 Layer 1。所有其他内容——编排、持久化实现、角色词汇、UI 约定——都在内核之上。书中不点名引用那些上层。当必须向上指代时,我们说"内核之上的层"。

本书的目标读者有三类:在 AgenticKernel 之上构建产品的开发者;基于同一内核设计兄弟产品的架构师;以及审视"基础是否扎实"的评审者。第一类读者,第二部分的概念章节就是契约;第二类读者,第三部分的不变量与附录 C 的设计决策最为关键;第三类读者,整本书都是一次自我论证。

全书使用混合语气。定义与不变量采用陈述式——"Capability 就是一个 name"、"Task 是唯一的 Aggregate Root"——因为它们是承诺,而非观点。论证与权衡采用"我们"——"我们之所以把它压缩到这么小是因为……"——因为这是选择,而选择有作者。

本前言的唯一要点:AgenticKernel 是一个自洽、主题中立的内核,包含 4 条公理、2 条不变量、以及 2 个接口(Runtime + Query)。本书余下的全部内容,都是在论证这一句话。

Chapter 1 · Part I 第 1 章 · 第一部分

The Core Idea 核心思想

A microkernel for agentic systems — and why "micro" really means micro. 面向智能体系统的微内核——以及为什么"微"真的就是"微"。

An agentic system needs a place to start. Before you choose how agents collaborate, before you choose where state lives, before you pick UI conventions or task-graph shapes — you need a small, stable answer to four questions:

  1. What is an agent?
  2. What is a task?
  3. Where does work execute?
  4. What can be loaded into the system?

AgenticKernel is that small, stable answer. It is the Layer 1 of the architecture: a microkernel in the literal sense — the smallest set of concepts that any agentic product can reuse, regardless of what kind of work it does.

Every higher layer (orchestration, role definitions, persistence implementations, UIs) is built on top of these four concepts. The kernel itself does not know they exist.

Two sub-contexts

Inside the kernel, the four concepts split into two natural groupings:

  • Catalog — what can exist: Capability and Agent. These are Loadable: they describe possibility.
  • Execution — what is happening: Task and Runtime. These are Active: they describe a particular run.

That's the entire mental picture. The diagram below is the only diagram you strictly need to remember.

智能体系统需要一个起点。在你决定 agent 如何协作之前,在你决定状态存在哪里之前,在你选择 UI 约定或任务图形态之前——你需要对四个问题给出一个稳定的小答案:

  1. 什么是 Agent?
  2. 什么是 Task?
  3. 工作在哪里执行?
  4. 什么可以被加载进系统?

AgenticKernel 就是这个稳定的小答案。它是整个架构的 Layer 1:字面意义上的微内核——所有智能体产品都能复用的最小概念集合,无论这些产品具体做什么样的工作。

所有更高的层(编排、角色定义、持久化实现、各种 UI)都建立在这四个概念之上。内核自己并不知道它们的存在。

两个子上下文

在内核内部,这四个概念自然地分成两组:

  • Catalog(目录)——可以存在什么:Capability 和 Agent。它们是可加载的,描述"可能性"。
  • Execution(执行)——正在发生什么:Task 和 Runtime。它们是激活的,描述"某一次具体的运行"。

这就是整张心智地图。下图是你唯一需要严格记住的图。

graph TB
  subgraph Kernel["AgenticKernel · Layer 1"]
    direction TB
    subgraph Catalog["Catalog · Loadable"]
      C["Capability<br/>name only — opaque"]
      A["Agent<br/>name, capabilities list"]
      C -.listed in.-> A
    end
    subgraph Execution["Execution · Active"]
      T["Task<br/>instructions to result<br/>+ runtime_kind, status, agent_name"]
      R["Runtime<br/>interface · stateless"]
      T -.dispatched via.-> R
    end
    A -.referenced by agent_name.-> T
  end
  classDef cat fill:#ccfbf1,stroke:#0f766e,color:#115e59
  classDef exe fill:#f3e8ff,stroke:#7c3aed,color:#6b21a8
  class C,A cat
  class T,R exe
graph TB
  subgraph Kernel["AgenticKernel · Layer 1"]
    direction TB
    subgraph Catalog["Catalog · 可加载"]
      C["Capability<br/>仅 name — 不透明"]
      A["Agent<br/>name, capabilities 列表"]
      C -.被列入.-> A
    end
    subgraph Execution["Execution · 激活"]
      T["Task<br/>instructions to result<br/>+ runtime_kind, status, agent_name"]
      R["Runtime<br/>接口 · 无状态"]
      T -.通过 dispatch.-> R
    end
    A -.通过 agent_name 被引用.-> T
  end
  classDef cat fill:#ccfbf1,stroke:#0f766e,color:#115e59
  classDef exe fill:#f3e8ff,stroke:#7c3aed,color:#6b21a8
  class C,A cat
  class T,R exe
Read the diagram top-down: a Capability is listed in an Agent; a Task references that Agent by agent_name and is dispatched through a Runtime implementation chosen by Task's runtime_kind. That arrow chain is the entire kernel.
自上而下阅读:Capability 被列入 Agent;Task 通过 agent_name 引用该 Agent,并通过 Task 的 runtime_kind 所选定的 Runtime 实现来派发。这条箭头链就是整个内核。

Chapter 2 · Part I 第 2 章 · 第一部分

The Two Sub-Contexts 两个子上下文

Catalog and Execution — the only partition the kernel knows. Catalog 与 Execution——内核知道的唯一分区。

The Layer 1 bounded context is "the kernel itself". Inside it we recognise two sub-contexts. They are not separate services or modules; they are conceptual groupings that make the four axioms easier to reason about.

Catalog — what can exist

Catalog answers the question "what is possible to run?". It is the register of building blocks. Two concepts live here:

  • Capability — a named instruction. Pure data.
  • Agent — a named bundle of capabilities. Pure data. (The runtime substrate is not bound here — see runtime_kind on Task.)

Both are Loadable. The kernel itself is empty until something on the layer above loads them in. Loading is not part of the kernel — only the shapes are.

Execution — what is happening

Execution answers the question "what work is in flight?". It is the life of the system. Two concepts live here:

  • Task — a unit of work, optionally bound to an Agent. The only Aggregate Root, with its own state machine and its own behaviour.
  • Runtime — the contract under which a Task's work happens. A Domain Service Interface; not 1:1 with any Task and not a data entity.

Tasks are created by the layer above the kernel, dispatched onto a Runtime impl chosen by runtime_kind, and observed via their own status, result, and other fields.

Why these two, and only these two

We tried collapsing them into one ("just have agents do things") and we tried splitting them further (separate "scheduler", "planner", "memory" sub-contexts). Both failed.

  • Collapsing loses the Loadable / Active distinction — you cannot describe an agent without inventing it on the spot.
  • Splitting further introduces orchestration concerns into the kernel, which immediately violates Layer 1's neutrality.

Catalog + Execution is the smallest split that makes the kernel both useful and theme-neutral.

Layer 1 的限界上下文就是"内核本身"。在它内部我们识别出两个子上下文。它们不是独立的服务或模块,而是概念上的分组,目的是让四条公理更容易推理。

Catalog——可以存在什么

Catalog 回答"什么是可以被运行的?"。它是构件的登记簿。两个概念住在这里:

  • Capability——具名的 instruction。纯数据。
  • Agent——具名的 capabilities bundle。纯数据。(运行时基底不绑定在这里——见 Task 上的 runtime_kind。)

两者都是可加载的。内核自身是空的,直到内核之上的层把它们加载进来。"加载"本身不属于内核——只有"形状"属于内核。

Execution——正在发生什么

Execution 回答"哪些工作正在进行?"。它是系统的生命。两个概念住在这里:

  • Task——一个工作单元,可选地绑定到某个 Agent。唯一的聚合根,有自己的状态机,也有自己的行为。
  • Runtime——Task 工作发生的契约。一个领域服务接口;与任何 Task 都不是 1:1,也不是数据实体。

Task 由内核之上的层创建,按 runtime_kind 派发到某个 Runtime impl,并通过它自身的 status、result 与其它字段来观察。

为什么是这两个,且只有这两个

我们尝试过把它们合并("让 agent 直接做事就行了"),也尝试过进一步拆分(独立的 scheduler、planner、memory 子上下文)。两种都失败了。

  • 合并会丢失"可加载 / 激活"的区分——你无法在不当场发明的情况下描述一个 agent。
  • 进一步拆分会把编排关注点带入内核,这立刻违反了 Layer 1 的中立性。

Catalog + Execution 是让内核既有用又主题中立的最小拆分。

Chapter 3 · Part I 第 3 章 · 第一部分

Boundaries 边界

Everything the kernel deliberately does not own. 内核刻意不拥有的所有东西。

The hardest part of Layer 1 is not what is in it — it is what is kept out. A kernel earns its name by refusing concerns that would couple it to a particular product, methodology, or runtime implementation.

What is kept out, and where it goes

Concern | Not in kernel because… | Lives on…
Orchestration / task graph | Different products want different shapes (linear, DAG, swarm). Picking one rules out others. | The layer above the kernel
Role vocabulary (e.g. coordinator, executor, reviewer, researcher) | Roles are methodology choices, not kernel facts. Two products on the same kernel may use different vocabularies. | The layer above the kernel
Concrete persistence (DB, files, in-memory) | The kernel does not impose where state lives. Persistence is an implementation choice. | The layer above the kernel
Concrete runtime implementation (CLI, container, cloud, browser) | The kernel only defines the interface. The shape of the actual process is up to the implementer. | Implementations of the Runtime interface
Tooling, prompt templates, model selection | These are concerns of capability authors, not of the kernel. | Inside Capability instructions / on the layer above
UI, CLI shape, IPC protocols | The kernel is invisible to humans. Anything human-facing is product-level. | The layer above the kernel
Domain events, message queues, observability backends | How tasks are made observable is a workflow concern, not a kernel concern. | The layer above the kernel

The "is it kernel?" rule

When in doubt, ask:

"Could a different agentic product, with a totally different theme, still need this exact concept?"

If yes, it might belong in the kernel. If no, it belongs above. The four axioms pass this test. Everything else fails it.

The kernel ships no implementations

One consequence of this discipline deserves to be stated outright: the AgenticKernel package itself contains only the four axioms, the two invariants, and the Runtime/Query interfaces. It ships no concrete Runtime impl, no persistence, no notification transport, no UI, no vendor SDK. It does not even maintain a registry of runtime_kind values.

Concrete implementations live in sibling extension libraries, named by the convention AgenticKernel.<substrate>. For example:

  • AgenticKernel.Copilot — a Runtime impl backed by the Copilot CLI.
  • AgenticKernel.OpenAI — a Runtime impl backed by the OpenAI Responses / Assistants API.
  • AgenticKernel.Claude — a Runtime impl backed by the Anthropic Messages API.
  • AgenticKernel.Local — a Runtime impl backed by a local llama.cpp / Ollama process.

Each sibling depends on AgenticKernel; none depends on another; AgenticKernel never depends on (or even names) any of them. The product layer picks which siblings to take as dependencies and registers them under its chosen runtime_kind values. The same discipline applies to every other port a higher layer may define (e.g. an AgenticWorkflow.Sqlite for a Persistence impl).
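
A minimal sketch of what "registers them under its chosen runtime_kind values" could look like at the product layer, written in Python purely for illustration (the book prescribes no language). The names Runtime, EchoRuntime, and RuntimeRegistry are hypothetical; the registry lives above the kernel, since the kernel itself keeps no table of runtime_kind values.

```python
from typing import Dict, Protocol


class Runtime(Protocol):
    """Hypothetical stand-in for the kernel's Runtime interface (one verb shown)."""

    def dispatch(self, task: dict) -> None: ...


class EchoRuntime:
    """Illustrative impl, playing the role of a sibling library."""

    def __init__(self) -> None:
        self.seen: list = []

    def dispatch(self, task: dict) -> None:
        self.seen.append(task["id"])


class RuntimeRegistry:
    """Product-layer registry: runtime_kind -> Runtime impl. Not part of the kernel."""

    def __init__(self) -> None:
        self._impls: Dict[str, Runtime] = {}

    def register(self, runtime_kind: str, impl: Runtime) -> None:
        if runtime_kind in self._impls:
            raise ValueError(f"runtime_kind already registered: {runtime_kind}")
        self._impls[runtime_kind] = impl

    def resolve(self, runtime_kind: str) -> Runtime:
        try:
            return self._impls[runtime_kind]
        except KeyError:
            raise LookupError(f"no Runtime registered for {runtime_kind!r}") from None


registry = RuntimeRegistry()
echo = EchoRuntime()
registry.register("echo", echo)
registry.resolve("echo").dispatch({"id": "t-1", "runtime_kind": "echo"})
```

The product chooses the string "echo"; nothing in the kernel package would need to know it.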

The rule is uncompromising: if a piece of code names a concrete substrate, it does not live in the kernel package — it lives in a sibling library or above.

Layer 1 最难的部分不是它包含什么,而是它挡在外面的是什么。内核之所以配得上"内核"这个名字,是因为它拒绝那些会让它绑死到具体产品、方法论或运行时实现的关注点。

被挡在外面的东西,以及它们的归宿

关注点 | 不在内核中,是因为…… | 它住在……
编排 / 任务图 | 不同产品想要不同的形态(线性、DAG、群集)。选一个就排除了其他。 | 内核之上的层
角色词汇(例如 coordinator、executor、reviewer、researcher) | 角色是方法论选择,不是内核事实。同一个内核上的两个产品可以用完全不同的词汇。 | 内核之上的层
具体持久化(DB、文件、内存) | 内核不规定状态存在哪里。持久化是实现选择。 | 内核之上的层
具体 Runtime 实现(CLI、容器、云、浏览器) | 内核只定义接口。实际进程的形态由实现者决定。 | Runtime 接口的具体实现
工具、prompt 模板、模型选择 | 这些是 capability 作者的关注点,不是内核的关注点。 | Capability 的 instructions 内部 / 内核之上的层
UI、CLI 形态、IPC 协议 | 内核对人类是不可见的。任何面向人的东西都是产品层的。 | 内核之上的层
Domain event、消息队列、可观测后端 | "如何让 task 可观测"是 workflow 的关注点,不是内核的关注点。 | 内核之上的层

"它属于内核吗?"判定法则

拿不准时,问自己:

"另一个完全不同主题的智能体产品,会不会仍然需要这个完全相同的概念?"

是 → 它可能属于内核。否 → 它属于上面。四条公理都通过了这一测试,其他东西都通不过。

内核不附带任何实现

这条纪律有一个直接推论值得点明:AgenticKernel 包本身只包含四条公理、两条不变量与 Runtime/Query 接口。它不附带任何具体 Runtime impl、不附带持久化、不附带通知传输、不附带 UI、不附带任何 vendor SDK。它甚至不维护一个 runtime_kind 值的注册表。

具体实现走同级扩展库,命名约定是 AgenticKernel.<substrate>。例如:

  • AgenticKernel.Copilot——基于 Copilot CLI 的 Runtime impl。
  • AgenticKernel.OpenAI——基于 OpenAI Responses / Assistants API 的 Runtime impl。
  • AgenticKernel.Claude——基于 Anthropic Messages API 的 Runtime impl。
  • AgenticKernel.Local——基于本地 llama.cpp / Ollama 进程的 Runtime impl。

每个 sibling 都依赖 AgenticKernel;它们之间互不依赖;AgenticKernel 永远不会依赖(甚至命名)任何一个 sibling。产品层挑选所需 sibling 作为依赖,并把它们注册到自己选定的 runtime_kind 值之下。同样的纪律也适用于更高层可能定义的任何 port(如 Persistence impl 走 AgenticWorkflow.Sqlite)。

规则不容讨价还价:任何点名具体 substrate 的代码都不能住在 kernel 包里——只能住在某个 sibling 库或更上层。

Chapter 4 · Part I 第 4 章 · 第一部分

Conventions 约定

Notation, voice, and how to read the rest of the book. 符号、语气,以及如何阅读本书余下的内容。

Type tags

Each axiom is tagged with its DDD type and its sub-context:

Catalog · Value Object Execution · Aggregate Root

Teal indicates Catalog, purple indicates Execution. The DDD vocabulary (Aggregate Root, Value Object, Domain Service Interface) is precise — see Appendix B if you need a refresher.

Field notation

Field names are written in snake_case and quoted as code: agent_name, runtime_kind, instructions, result. The kernel does not prescribe a serialization format — these names are conceptual. An implementation may use camelCase or PascalCase as long as the meaning is preserved.

Voice

Definitions and invariants are declarative. They state what is, in unconditional terms.

Rationale uses "we". Where the book defends a choice or compares it against alternatives, it does so in the first-person plural.

Callouts

Boxes labelled Note highlight a clarification that aids reading but is not part of the contract.
Boxes labelled Why not… answer a tempting alternative we explicitly rejected.

Dependencies

Read Part I in order. Part II's chapters can be read in any order, though Capability → Agent → Task → Runtime → Execution is the order in which one concept naturally needs the previous. Part III and the appendices are reference material.

类型标签

每条公理都带有 DDD 类型和所属子上下文的标签:

Catalog · Value Object Execution · Aggregate Root

青绿色表示 Catalog,紫色表示 Execution。DDD 词汇(聚合根、值对象、领域服务接口)是精确的——若需复习,请翻到附录 B。

字段记法

字段名采用 snake_case 并以代码格式标注:agent_name、runtime_kind、instructions、result。内核不规定序列化格式——这些名字是概念性的。具体实现可以用 camelCase 或 PascalCase,只要语义被保留。

语气

定义与不变量是陈述式的。它们以无条件的方式陈述"是什么"。

论证使用"我们"。当本书在为某个选择辩护或与备选方案做对比时,使用第一人称复数。

边栏标注

标记为 提示 的方框给出帮助阅读的澄清,但不是契约的一部分。
标记为 为什么不…… 的方框回答一个我们明确拒绝的诱人备选方案。

阅读依赖

第一部分请按顺序阅读。第二部分各章可任意顺序阅读,不过 Capability → Agent → Task → Runtime → Execution 是一种概念自然依赖前一项的顺序。第三部分和附录是参考材料。

Chapter 5 · Part II 第 5 章 · 第二部分

Capability Capability(能力) Catalog · Value Object Catalog · 值对象

A scoped name. Kernel-opaque. That's the whole concept. 一个 scoped 名字。Kernel 完全不透明。这就是它的全部。

Definition

A Capability is a loadable, reusable primitive that extends an Agent's repertoire. On the kernel side it is structurally fully opaque.

Capability = name

That is the entire kernel-side structure. The name is shaped like <scope>/<name> (Docker / npm style). The kernel only treats Capability as a reference; it never inspects the inside.
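
As a sketch of how small this is in practice, the following Python models Capability as a frozen value whose only field is its scoped name. The regular expression is our assumption: the book fixes the <scope>/<name> shape but not the permitted character set, and whether the kernel or the loader validates that shape is left open here.

```python
import re
from dataclasses import dataclass

# Assumed shape check for "<scope>/<name>": exactly one slash, no whitespace.
_SCOPED_NAME = re.compile(r"^[^/\s]+/[^/\s]+$")


@dataclass(frozen=True)
class Capability:
    """Kernel-side Capability: a scoped name and nothing else."""

    name: str

    def __post_init__(self) -> None:
        if not _SCOPED_NAME.match(self.name):
            raise ValueError(f"expected <scope>/<name>, got {self.name!r}")


cap = Capability("acme/summarise")  # "acme/summarise" is an illustrative name
```

Because the type is frozen and compares by value, two references to the same name are interchangeable, which matches the Value Object tag on this chapter.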

Why Capability is an axiom even though it is structurally trivial

  • Semantic anchor — every product on AgenticKernel agrees that the loadable units of an Agent are called Capabilities; the kernel is the registrar of that agreement.
  • Evolution-safe — future kernel interfaces (e.g. deployAgent / deployCapability) need a typed handle for capability-shaped arguments. Naming the concept Capability lets us add optional fields later without breaking consumers.
  • Documentation anchor — gives "what loaders must be able to resolve" a single home.
  • Cross-product API — interface signatures use Capability instead of bare string, making contracts self-describing.

Out of kernel scope (all delegated to product / loader layer)

  • kind (Skill / Tool / Memory / …) — the kernel does not dispatch by kind; that is the loader's job.
  • payload / actual content — the loader fetches by name and deploys to the Runtime.
  • Dependencies between capabilities.
  • Capability nesting.
  • Expansion from a "root capability" to a flat list.
  • Manifest fields beyond name: version / description / type / dependencies all live above the kernel.
The kernel only ever sees the resolved, flat capabilities list on an Agent (a list of names). How that list is assembled — manual enumeration, dependency-graph traversal, single-root expansion — is the loader's responsibility, defined per product. What each name resolves to is also the loader's concern.
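
To make one of those assembly strategies concrete, here is a sketch of single-root expansion by depth-first traversal. Everything in it is hypothetical and product-level: the MANIFEST table, the capability names, and the expand function are ours, not kernel concepts. The kernel would only ever receive the returned list.

```python
from typing import Dict, List

# Hypothetical loader-side manifest: capability name -> names it depends on.
MANIFEST: Dict[str, List[str]] = {
    "acme/reviewer": ["acme/read_code", "acme/comment"],
    "acme/read_code": ["acme/fs"],
    "acme/comment": ["acme/fs"],
    "acme/fs": [],
}


def expand(root: str, manifest: Dict[str, List[str]]) -> List[str]:
    """Depth-first expansion of a root capability into a flat, de-duplicated list."""
    flat: List[str] = []
    visiting: set = set()

    def visit(name: str) -> None:
        if name in flat:
            return  # already resolved via another path
        if name in visiting:
            raise ValueError(f"dependency cycle at {name}")
        visiting.add(name)
        for dep in manifest[name]:
            visit(dep)
        visiting.discard(name)
        flat.append(name)  # dependencies are listed before their dependents

    visit(root)
    return flat


capabilities = expand("acme/reviewer", MANIFEST)
```

A different product could enumerate the list by hand and hand the kernel exactly the same shape.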

Why not let Capability hold its own instructions, tools, or model in the kernel?

Because then every product on top of the kernel would inherit an opinion. A product that uses no LLM at all would still have a model field staring at it. The kernel keeps it down to the bare reference; everything else is the loader's job.

定义

Capability 是一个可加载、可复用的能力原语,扩展 Agent 的能力。kernel 侧结构上完全不透明。

Capability = name

这就是 kernel 端的全部结构。name 形如 <scope>/<name>(Docker / npm 风格)。Kernel 只把 Capability 看作引用,从不检查内部结构。

为什么 Capability 即使结构平凡也要作为公理

  • 语义锚点——所有基于 AgenticKernel 的产品都同意"Agent 的可加载单元叫 Capability";kernel 是这一约定的注册中心。
  • Evolution-safe——未来的 kernel 接口(如 deployAgent / deployCapability)需要 capability-shaped 参数的类型化句柄;以 Capability 作为命名概念,可以在不破坏 consumer 的前提下追加可选字段。
  • 文档锚点——给"loader 必须能解析什么"一个单一去处。
  • 跨产品 API——接口签名用 Capability 而非裸 string,使契约自描述。

Kernel 范围之外(全部下放到产品 / loader 层)

  • kind(Skill / Tool / Memory / …)—— kernel 不按 kind 分派,那是 loader 的事。
  • payload / 实际内容—— loader 按 name 取,部署到 Runtime。
  • Capability 之间的依赖。
  • Capability 嵌套。
  • 从某个"根 capability"展开成扁平列表。
  • name 之外的 manifest 字段—— version / description / type / dependencies 等都活在更上层。
Kernel 只看到 Agent 上已解析的扁平 capabilities 列表(一组 name)。这个列表如何被组装——手动列举、依赖图遍历、单根展开——是 loader 的职责,由产品定义。每个 name 解析成什么,也是 loader 的职责。

为什么不让 Capability 在 kernel 里自带 instructions、tools 或 model?

因为那样的话,内核之上的每一个产品都会继承一个观点。一个完全不用 LLM 的产品仍然会被一个 model 字段盯着。Kernel 把它压到只剩纯粹的引用;其余一切都是 loader 的事。

Chapter 6 · Part II 第 6 章 · 第二部分

Agent Agent(智能体) Catalog · Value Object Catalog · 值对象

A static configuration: a named bundle of capabilities. Not a running thing. 一份静态配置:一组 capabilities 的具名 bundle。不是运行中的东西。

Definition

An Agent is a static configuration: a named bundle of capabilities. It is not a running thing. It is serialisable, and reusable across many Task executions.

Agent = (
  name:         <scope>/<name>,        // human-authored, scoped, persistent
  capabilities: [capability_name],     // flat, already-resolved list
  metadata?:    map<string, any>,      // optional; kernel does not interpret
)

There is no id — an Agent is a definition, not an instance. The name is the unique handle.
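
The schema above maps naturally onto an immutable record. The sketch below is one possible Python rendering, not a prescribed one; the field names follow the schema, while the example values (acme/reviewer, the metadata keys) are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Tuple


@dataclass(frozen=True)
class Agent:
    name: str                      # "<scope>/<name>"; the unique handle, there is no id
    capabilities: Tuple[str, ...]  # flat, already-resolved capability names
    metadata: Mapping[str, Any] = field(default_factory=dict)  # never interpreted by the kernel


reviewer = Agent(
    name="acme/reviewer",
    capabilities=("acme/fs", "acme/read_code", "acme/comment"),
    metadata={"acme/owner_team": "platform", "acme/deprecated": False},
)
```

Freezing the record reflects that an Agent is a definition: changing it means publishing a new definition, not mutating a running thing.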

metadata examples (kernel never interprets)

  • <product>/owner_team — who owns this Agent.
  • <product>/version_tag — semantic version.
  • <product>/deprecated — deprecation flag.

What Agent does not do

  • Does not manage process lifecycle (that is the Runtime impl).
  • Does not hold execution state (that is Task).
  • Does not provide isolation (that is the Runtime impl).
  • Does not prescribe how the capabilities list is assembled (that is the loader, living in the product / methodology layer above the kernel).

What Agent does do

  • Declares which capabilities to load.
  • Defines identity through its capability bundle (e.g. an "identity"-typed Skill acting as a persona prefix).
  • Is addressable, replayable, shareable.
Agent does not carry runtime_kind. The same Agent can be executed on different runtime kinds; the choice is made per-Task on Task's runtime_kind field (Chapter 7). This keeps Agent purely a Catalog-side definition, not a dispatch decision.

Why no separate id on Agent?

An Agent is a definition, not an instance. A separate id would invent an instance-vs-definition distinction we do not need. The scoped name already carries identity, ownership, and uniqueness in one field.

定义

Agent 是一份静态配置:一组 capabilities 的具名 bundle。不是运行中的东西。可序列化,可跨多次 Task 执行复用。

Agent = (
  name:         <scope>/<name>,        // 人类编写、scoped、持久
  capabilities: [capability_name],     // 扁平、已解析的列表
  metadata?:    map<string, any>,      // 可选;kernel 不解释
)

没有 id—— Agent 是定义而非实例。name 就是唯一句柄。

metadata 示例(kernel 都不解释)

  • <product>/owner_team——谁拥有这个 Agent。
  • <product>/version_tag——语义版本。
  • <product>/deprecated——弃用标记。

Agent 不做

  • 不管理进程生命周期(这是 Runtime impl)。
  • 不持有执行状态(这是 Task)。
  • 不提供隔离(这是 Runtime impl)。
  • 不规定 capabilities 列表如何被组装(这是 loader 的事,活在 Kernel 之上的产品 / 方法论层)。

Agent 做

  • 声明加载哪些 capabilities。
  • 定义身份(通过 capability bundle ——例如某个 "identity" 类型的 Skill 充当 persona 前缀)。
  • 可寻址 / 可复盘 / 可分享。
Agent 不携带 runtime_kind。同一个 Agent 可以在不同的 runtime kind 上执行;这一选择按 Task 上的 runtime_kind 字段(第 7 章)逐 Task 决定。这让 Agent 保持为纯粹的 Catalog 侧定义,而不是派发决策。

为什么 Agent 上没有独立的 id?

Agent 是定义,不是实例。多加一个 id 等于发明了一组我们不需要的"实例 vs 定义"区分。Scoped 的 name 一个字段就同时承载了身份、归属和唯一性。

Chapter 7 · Part II 第 7 章 · 第二部分

Task Task(任务) Execution · Aggregate Root Execution · 聚合根

The only Aggregate Root. Instructions in, result out — state transitions driven by the pure Apply function. 唯一的聚合根。Instructions 进,result 出——状态转换由纯函数 Apply 驱动。

External contract

From the layer above the kernel, a Task is a single black-box function:

Task : instructions → result

That is the entire contract. Everything else on Task exists to make this contract observable, controllable, auditable, and re-executable via clone-and-redispatch.

Internal schema

Task = (
  // identity
  id:           string,                     // kernel-issued, unique

  // input (set on dispatch, immutable)
  summary:      string,                     // short title for catalog / UI
  instructions: string,                     // free-form input — the contract
  runtime_kind: string,                     // chooses Runtime impl
  agent_name?:  string,                     // optional binding to Agent in catalog
  metadata?:    map<string, any>,           // open key/value bag; kernel never interprets

  // execution facts (kernel-managed, atomic per Concurrency Contract)
  status:       not_started | running | paused | success | failure | cancelled,
  result?:      any,                        // present iff status == success
  failure?:     Failure,                    // present iff status == failure (see "Failure type" below)
  supplements?: [string],                   // append-only audit trail; one entry per resume(extra)
  created_at:   timestamp,
  started_at?:  timestamp,                  // present iff status has reached running at least once
  ended_at?:    timestamp,                  // present iff status ∈ {success, failure, cancelled}
)
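
For readers who prefer a typed rendering, the schema can be sketched as a Python dataclass. Two simplifications are ours: the id is generated with uuid4 (the book only says "kernel-issued, unique"), and optional fields use None for "absent", which cannot distinguish a missing result from a result whose value is None. Failure is kept as a plain dict here.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict, List, Optional


class Status(str, Enum):
    NOT_STARTED = "not_started"
    RUNNING = "running"
    PAUSED = "paused"
    SUCCESS = "success"
    FAILURE = "failure"
    CANCELLED = "cancelled"


@dataclass
class Task:
    # input (set once on dispatch, treated as immutable afterwards)
    summary: str
    instructions: str
    runtime_kind: str
    agent_name: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    # identity (uuid4 is an assumption of this sketch)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # execution facts
    status: Status = Status.NOT_STARTED
    result: Optional[Any] = None
    failure: Optional[Dict[str, Any]] = None
    supplements: List[str] = field(default_factory=list)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None


task = Task(summary="Summarise report", instructions="Summarise Q3.", runtime_kind="echo")
```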

State machine — six states

Task evolves through a six-state machine. Three are active, meaning non-terminal (not_started, running, paused); three are terminal (success, failure, cancelled).

not_started ──dispatch──▶ running ──complete──▶ success     (terminal)
                            │   ▲          ──fail──────▶ failure     (terminal)
                            │   │          ──kill──────▶ cancelled   (terminal)
                          pause resume(extra?)
                            ▼   │
                          paused ──kill──▶ cancelled   (terminal)
  • not_started — Task has been constructed but not yet dispatched.
  • running — work is in flight on a Runtime impl.
  • paused — work is suspended; the impl will not advance until resume is called.
  • success — terminal; result is set.
  • failure — terminal; failure is set.
  • cancelled — terminal; reached via kill from any non-terminal state.

Terminal states are forever frozen. There is no reset, no retry, no transition that leaves a terminal state. "Re-executing" a Task is not a kernel operation — see the Re-execution doctrine below.
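
The six-state machine can be written down as a lookup table plus a pure function. The sketch below is our reading of the diagram and the bullets, with two assumptions flagged: kill from not_started is included because cancelled is described as reachable from any non-terminal state, and a verb arriving at a terminal Task is treated as a no-op, while an illegal verb on a non-terminal Task raises. The book does not fix that last behaviour.

```python
from typing import Dict, Tuple

TERMINAL = {"success", "failure", "cancelled"}

# (current status, verb) -> next status
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("not_started", "dispatch"): "running",
    ("running", "pause"): "paused",
    ("paused", "resume"): "running",
    ("running", "complete"): "success",
    ("running", "fail"): "failure",
    ("not_started", "kill"): "cancelled",
    ("running", "kill"): "cancelled",
    ("paused", "kill"): "cancelled",
}


def apply(status: str, verb: str) -> str:
    """Pure transition function: returns the next status and mutates nothing."""
    if status in TERMINAL:
        return status  # terminal states are frozen; late verbs change nothing
    try:
        return TRANSITIONS[(status, verb)]
    except KeyError:
        raise ValueError(f"illegal transition: {verb} from {status}") from None


status = "not_started"
for verb in ("dispatch", "pause", "resume", "complete"):
    status = apply(status, verb)
```

No entry in the table has a terminal state on its left-hand side, which is the "forever frozen" rule expressed as data.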

Field roles

  • Identity: id is the only handle the kernel uses to address the Task after dispatch.
  • Input: summary, instructions, runtime_kind, optional agent_name, and optional metadata are set once on dispatch, then immutable.
  • Execution facts: status, result?, failure?, supplements?, and the timestamps are written by the aggregate's own transitions, atomically per the Concurrency Contract (Chapter 10).

Field invariants

  • status follows the six-state machine above. Terminal states never transition out.
  • result is present iff status == success.
  • failure is present iff status == failure. (Cancellation produces no Failure; partial output, if any, lands under an impl-prefixed key in metadata.)
  • supplements grows only as a side effect of resume(extra), atomically with the paused → running transition.
  • started_at is present iff status has reached running at least once.
  • ended_at is present iff status ∈ {success, failure, cancelled}.
  • Input fields are immutable post-dispatch.
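
These invariants are mechanical enough to check in a few lines. The sketch below treats a Task snapshot as a plain dict in which a key's presence means the field is present; that representation and the function name violations are our assumptions. For a cancelled Task the started_at check is skipped, since kill may arrive before running was ever reached.

```python
from typing import Any, Dict, List

TERMINAL = {"success", "failure", "cancelled"}


def violations(task: Dict[str, Any]) -> List[str]:
    """Return the field invariants a Task snapshot breaks (empty list if none)."""
    status = task["status"]
    problems: List[str] = []
    if ("result" in task) != (status == "success"):
        problems.append("result must be present iff status == success")
    if ("failure" in task) != (status == "failure"):
        problems.append("failure must be present iff status == failure")
    if ("ended_at" in task) != (status in TERMINAL):
        problems.append("ended_at must be present iff status is terminal")
    if status in {"running", "paused", "success", "failure"} and "started_at" not in task:
        problems.append("started_at must be present once running has been reached")
    if status == "not_started" and "started_at" in task:
        problems.append("started_at must be absent before dispatch")
    return problems


ok = {"status": "success", "result": 42, "started_at": 1, "ended_at": 2}
bad = {"status": "cancelled", "failure": {"code": "x/y", "message": "m"}, "ended_at": 2}
```

Here violations(ok) is empty, while violations(bad) reports the stray failure on a cancelled Task.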

Aggregate methods

Task is a pure data type whose state transitions are driven by the Apply function. Six transitions cover the full lifecycle:

  • dispatch(task) — construct and start: not_started → running.
  • pause(task_id) — request a pause: running → paused at the next safe checkpoint.
  • resume(task_id, extra?) — request resumption: paused → running; if extra is provided, append it to supplements atomically as part of the same transition.
  • kill(task_id) — request termination: any non-terminal state → cancelled.
  • complete(task_id, result) — natural success: running → success; sets result and ended_at in one atomic step.
  • fail(task_id, failure) — natural failure: running → failure; sets failure and ended_at in one atomic step.

All six are addressed through the Runtime interface (dispatch takes the whole Task; the others take only task_id — see Chapter 8). "Who has the right to call which verb" (e.g. only the agent calls complete; only the orchestrator calls kill) is a policy/auth concern at the layer above, not an interface-level split.

Best-effort control verbs. pause, resume, and kill are requests, not commands. Different Runtime substrates have different control over their work — an LLM-chat impl can pause between turns; a one-shot HTTP-call impl physically cannot. Every impl must accept all six verbs; the effectiveness varies by substrate. Callers see the truth via Task.status. pause may lose to natural completion; kill may race to success/failure if completion was already imminent. Idempotency on already-terminal tasks is guaranteed.
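
To illustrate "every impl must accept all six verbs; the effectiveness varies", here is a sketch of a one-shot impl in Python. The Runtime protocol below is inferred only from this chapter's description (dispatch takes the whole Task, the rest take task_id); the authoritative signatures belong to Chapter 8. OneShotRuntime and the injected store dict are hypothetical, and the store stands in for wherever Tasks actually live so that the impl itself holds no Task state.

```python
from typing import Any, Dict, Optional, Protocol

TERMINAL = {"success", "failure", "cancelled"}


class Runtime(Protocol):
    """Six verbs, as described in this chapter (signatures are assumptions)."""

    def dispatch(self, task: Dict[str, Any]) -> None: ...
    def pause(self, task_id: str) -> None: ...
    def resume(self, task_id: str, extra: Optional[str] = None) -> None: ...
    def kill(self, task_id: str) -> None: ...
    def complete(self, task_id: str, result: Any) -> None: ...
    def fail(self, task_id: str, failure: Dict[str, Any]) -> None: ...


class OneShotRuntime:
    """Hypothetical impl whose work finishes inside dispatch."""

    def __init__(self, store: Dict[str, Dict[str, Any]]) -> None:
        self.store = store  # injected; stands in for external Task storage

    def dispatch(self, task: Dict[str, Any]) -> None:
        self.store[task["id"]] = task
        task["status"] = "running"
        self.complete(task["id"], task["instructions"].upper())

    def pause(self, task_id: str) -> None:
        pass  # accepted, but a one-shot call has no checkpoint to pause at

    def resume(self, task_id: str, extra: Optional[str] = None) -> None:
        pass  # never paused, so there is nothing to resume

    def kill(self, task_id: str) -> None:
        task = self.store[task_id]
        if task["status"] not in TERMINAL:
            task["status"] = "cancelled"  # otherwise an idempotent no-op

    def complete(self, task_id: str, result: Any) -> None:
        task = self.store[task_id]
        if task["status"] == "running":
            task.update(status="success", result=result)

    def fail(self, task_id: str, failure: Dict[str, Any]) -> None:
        task = self.store[task_id]
        if task["status"] == "running":
            task.update(status="failure", failure=failure)


rt: Runtime = OneShotRuntime(store={})
task = {"id": "t-1", "instructions": "hello", "status": "not_started"}
rt.dispatch(task)
rt.kill("t-1")  # the request loses: the Task had already succeeded
```

The caller learns the outcome the only way the kernel promises: by reading the Task's status afterwards.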

Failure type

When a Task ends in failure, the failure field carries a structured value:

Failure = (
  code:      string,                        // impl-defined, suggested namespace <impl>/<code>
  message:   string,                        // human-readable; both code and message are required
  metadata?: map<string, any>,              // optional details: stack, raw_response, partial_output, cause, retry_after_ms, ...
)
  • code separates fact from words. It is an open string (e.g. cli_runtime/timeout, openai/rate_limit) that the layer above can pattern-match on without parsing prose.
  • message is for humans and logs.
  • metadata carries impl-specific detail; the kernel never interprets it.

The kernel deliberately does not include a retriable: bool field. Whether a given code is retriable depends on the workflow's tolerance, budget, and context — not on the failure itself. A 429 rate_limit may be retriable for one workflow (with backoff budget) and not for another (cost-capped). Retry policy lives at the layer above, encoded as code → action; the Failure stays opinion-free.
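
A sketch of "retry policy encoded as code → action" at the layer above. The two policy tables, the action names, and the decide function are all hypothetical; the point is only that the same Failure yields different decisions under different workflows, which is why the kernel carries no retriable flag.

```python
from typing import Dict

# Hypothetical workflow-level policies: Failure.code -> action. Not part of the kernel.
DEFAULT_ACTION = "give_up"

BACKOFF_WORKFLOW: Dict[str, str] = {
    "openai/rate_limit": "retry_with_backoff",
    "cli_runtime/timeout": "retry_once",
}
COST_CAPPED_WORKFLOW: Dict[str, str] = {
    "cli_runtime/timeout": "retry_once",
}


def decide(policy: Dict[str, str], failure: Dict[str, str]) -> str:
    """Match on the code only; the message is for humans and logs."""
    return policy.get(failure["code"], DEFAULT_ACTION)


failure = {"code": "openai/rate_limit", "message": "429 Too Many Requests"}
with_budget = decide(BACKOFF_WORKFLOW, failure)
cost_capped = decide(COST_CAPPED_WORKFLOW, failure)
```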

Cancellation produces no Failure. A cancelled Task has status == cancelled and no failure field. Partial output observed before cancellation, if any, is preserved under an impl-prefixed key inside metadata (e.g. cli_runtime/partial_output).

Re-execution doctrine

Terminal Tasks are forever frozen. The kernel exposes no operation that takes a terminal Task back to a non-terminal state. To "re-execute" a piece of work, the caller constructs a new Task with a fresh id, copying whatever fields are still desired (instructions, runtime_kind, …), and dispatches it.

Lineage between the original and the new Task is recorded in metadata, by convention under a scoped key:

new_task.metadata["<scope>/lineage_of"] = original_task.id

This keeps Task immutable, keeps the state machine simple, and keeps audit clean: every attempt is its own Task, with its own timestamps, its own outcome, and its own audit trail. The kernel does not police the lineage convention; the layer above adopts whatever scope it likes.
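
Clone-and-redispatch can be sketched as a small helper at the layer above. The function name, the dict representation of a Task, and the use of uuid4 for the fresh id are assumptions of this sketch; copying the original metadata forward is a choice a product might or might not make.

```python
import uuid
from typing import Any, Dict

INPUT_FIELDS = ("summary", "instructions", "runtime_kind", "agent_name")


def clone_for_redispatch(original: Dict[str, Any], scope: str) -> Dict[str, Any]:
    """Build a fresh not_started Task that reuses the inputs of an earlier one."""
    new_task: Dict[str, Any] = {k: original[k] for k in INPUT_FIELDS if k in original}
    new_task["id"] = str(uuid.uuid4())  # assumption: any unique id scheme would do
    new_task["status"] = "not_started"
    # Copy, never share, the metadata bag so the original stays untouched.
    new_task["metadata"] = dict(original.get("metadata", {}))
    new_task["metadata"][f"{scope}/lineage_of"] = original["id"]
    return new_task


original = {
    "id": "t-1",
    "status": "failure",
    "summary": "Summarise report",
    "instructions": "Summarise Q3.",
    "runtime_kind": "echo",
    "metadata": {"acme/priority": "low"},
}
retry = clone_for_redispatch(original, scope="acme")
```

The original Task is left exactly as it was, so its timestamps and outcome remain a faithful record of the first attempt.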

Observability — what the kernel guarantees, and what it does not

Task is the kernel's only observation surface. By the Concurrency Contract (Chapter 10), every read returns a per-Task atomic snapshot. Five rules pin down the kernel-level observability floor:

  1. Pull is the universal lower bound. Reading any field of a Task is always possible and always returns a consistent snapshot. status is the canonical channel; the field invariants above make every other field's meaning deterministic given status.
  2. State transitions are well-defined moments, but the kernel mandates no notification mechanism. Polling status, subscribing to push events, or registering callbacks are all legal observation strategies — the choice belongs to the layer above.
  3. Push is not a kernel axiom. Subscription, callback, and event-bus mechanisms are engineering choices, not axiom-level conclusions.
  4. Liveness is not a kernel concern. "Task shows running, but is the substrate actually making progress?" — the kernel cannot know. Impls may write heartbeat hints into metadata (e.g. <impl>/last_heartbeat_at); observers interpret these by their own conventions.
  5. Mid-execution events are not first-class. Token streams, tool invocations, internal reasoning steps, partial outputs in flight — all impl-internal. Valuable for tracing, debugging, billing; never axiom-level. Impls surface them through metadata updates or their own trace channels, not through kernel concepts.

The summary: pull works; everything else is optional. This minimum is enough to build any pull-driven workflow on. Push, notification, liveness, and tracing are layered on top — at the layer above, or by individual impls — without changing any axiom.
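
Since pull is the guaranteed floor, the simplest observer is a polling loop over Task snapshots. In the sketch below, read_task is a hypothetical callable supplied by the layer above (the kernel's actual read API is not described in this chapter), and the interval and timeout values are arbitrary.

```python
import time
from typing import Any, Callable, Dict

TERMINAL = {"success", "failure", "cancelled"}


def wait_until_terminal(
    read_task: Callable[[], Dict[str, Any]],
    interval_s: float = 0.5,
    timeout_s: float = 30.0,
) -> Dict[str, Any]:
    """Poll a Task snapshot until status is terminal or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        snapshot = read_task()
        if snapshot["status"] in TERMINAL:
            return snapshot
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {snapshot['status']} after {timeout_s}s")
        time.sleep(interval_s)


# Stand-in for a real read: reports running twice, then success.
_snapshots = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "success", "result": "done"},
])
final = wait_until_terminal(lambda: next(_snapshots), interval_s=0.01)
```

A push-based observer would replace the loop, not the Task: the fields read at the end are the same either way.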

Why is agent_name optional?

Because not every Task originates from a catalog Agent. The layer above can dispatch ad-hoc tasks (e.g. from a one-off prompt) by providing only instructions and runtime_kind. The kernel does not require an Agent to be in the loop.

对外契约

从内核之上的层来看,Task 就是一个黑箱函数:

Task : instructions → result

这就是全部契约。Task 上其它字段的存在,都是为了让这一契约可观察、可控、可审计,并且能够通过 clone-and-redispatch 重跑。

内部 schema

Task = (
  // 身份
  id:           string,                     // kernel 颁发,唯一

  // 输入(dispatch 时一次性设定,之后不可变)
  summary:      string,                     // catalog / UI 用的短标题
  instructions: string,                     // 自由文本输入——即契约
  runtime_kind: string,                     // 选择 Runtime impl
  agent_name?:  string,                     // 可选,绑定 catalog 中的某个 Agent
  metadata?:    map<string, any>,           // 开放 key/value 包;kernel 从不解释

  // 执行事实(kernel 管理,按并发契约原子写入)
  status:       not_started | running | paused | success | failure | cancelled,
  result?:      any,                        // 仅当 status == success 时存在
  failure?:     Failure,                    // 仅当 status == failure 时存在(见下文 "Failure 类型")
  supplements?: [string],                   // 仅追加的审计轨迹;resume(extra) 一次产生一条
  created_at:   timestamp,
  started_at?:  timestamp,                  // 仅当 status 至少进入过 running 时存在
  ended_at?:    timestamp,                  // 仅当 status ∈ {success, failure, cancelled} 时存在
)

状态机——六个状态

Task 在六态状态机内演化。三个活跃态,即非终态(not_started、running、paused);三个终态(success、failure、cancelled)。

not_started ──dispatch──▶ running ──complete──▶ success     (终态)
                            │   ▲          ──fail──────▶ failure     (终态)
                            │   │          ──kill──────▶ cancelled   (终态)
                          pause resume(extra?)
                            ▼   │
                          paused ──kill──▶ cancelled   (终态)
  • not_started——Task 已构造但尚未 dispatch。
  • running——工作正在某个 Runtime impl 上进行。
  • paused——工作已挂起;在 resume 被调用之前 impl 不会前进。
  • success——终态;result 已写入。
  • failure——终态;failure 已写入。
  • cancelled——终态;从任意非终态经由 kill 到达。

终态永久冻结。没有 reset,没有 retry,没有任何离开终态的转换。"重跑"一个 Task 不是 kernel 操作——见下文 Re-execution doctrine。

字段角色

  • 身份—— id:dispatch 之后 kernel 寻址 Task 的唯一句柄。
  • 输入—— summary、instructions、runtime_kind、可选 agent_name、可选 metadata:dispatch 时一次性设定,此后不可变。
  • 执行事实—— status、result?、failure?、supplements?、时间戳:由聚合自身的转换写入,按并发契约(第 10 章)保持原子。

字段不变量

  • status 遵循上述六态状态机。终态永不转出。
  • result 仅当 status == success 时存在。
  • failure 仅当 status == failure 时存在。(取消不产生 Failure;如果有 partial output,写入 metadata 中带 impl 前缀的 key。)
  • supplements 仅作为 resume(extra) 的副作用增长,与 paused → running 转换原子绑定。
  • started_at 仅当 status 至少进入过 running 时存在。
  • ended_at 仅当 status ∈ {success, failure, cancelled} 时存在。
  • 输入字段 dispatch 后不可变。

聚合方法

Task 是纯数据类型,所有状态转换都通过 Apply 函数驱动。六个转换覆盖完整生命周期:

  • dispatch(task)——构造并启动:not_started → running。
  • pause(task_id)——请求暂停:running → paused,在下一个安全 checkpoint 生效。
  • resume(task_id, extra?)——请求恢复:paused → running;若提供 extra,则在同一原子转换中追加到 supplements。
  • kill(task_id)——请求终止:任意非终态 → cancelled。
  • complete(task_id, result)——自然成功:running → success;在一个原子步骤中写入 result 与 ended_at。
  • fail(task_id, failure)——自然失败:running → failure;在一个原子步骤中写入 failure 与 ended_at。

六个动词全部通过 Runtime 接口寻址(dispatch 收整个 Task;其余只收 task_id——见第 8 章)。"谁有权调哪个动词"(如只有 agent 调 complete,只有编排层调 kill)是上层的策略/权限问题,不是接口层面的划分。

控制类动词是 best-effort。 pause、resume、kill 是请求,不是命令。不同 Runtime substrate 对工作的控制力不同——LLM-chat impl 能在轮次之间 pause;一次性 HTTP 调用的 impl 物理上做不到。每一个 impl 都必须接受六个动词;有效率因 substrate 而异。Caller 通过 Task.status 看到真相。pause 可能输给自然完成;kill 可能与即将到达的 success/failure 撞车。对已经处于终态的 task 保证幂等。

Failure 类型

当 Task 以 failure 结束时,failure 字段携带一份结构化数据:

Failure = (
  code:      string,                        // impl 自定,建议命名空间 <impl>/<code>
  message:   string,                        // 人读;code 与 message 都必填
  metadata?: map<string, any>,              // 可选细节:stack、raw_response、partial_output、cause、retry_after_ms…
)
  • code 把"事实"和"措辞"分开。开放字符串(如 cli_runtime/timeout、openai/rate_limit),上层可以基于它做模式匹配,而不是去 parse 散文。
  • message 给人和日志看。
  • metadata 携带 impl 特定的细节;kernel 永远不解释。

Kernel 刻意不包含 retriable: bool 字段。同一个 code 是否该重试,取决于 workflow 的容忍度、预算与上下文——不取决于失败本身。429 rate_limit 对一个有 backoff 预算的 workflow 可重试,对一个 cost-cap 严格的 workflow 不可重试。Retry policy 活在内核之上的层,编码为 code → action。Failure 不带任何意见。

取消不产生 Failure。Cancelled Task 的 status == cancelled 且没有 failure 字段。取消之前观察到的 partial output(如果有)按 impl 前缀写入 metadata(如 cli_runtime/partial_output)。

Re-execution doctrine

终态 Task 永久冻结。Kernel 不暴露任何能让终态 Task 回到非终态的操作。要"重跑"一段工作,调用方需新建一个新的 Task(新 id),按需拷贝原字段(instructions、runtime_kind…),然后 dispatch。

原 Task 与新 Task 之间的世系信息记在 metadata 里,按约定的 scoped key:

new_task.metadata["<scope>/lineage_of"] = original_task.id

这样 Task 保持不可变,状态机保持简单,审计保持干净:每一次尝试都是它自己的 Task,有自己的时间戳、自己的结果、自己的审计轨迹。Kernel 不强制世系约定;上层用什么 scope 都行。

可观测性——内核保证什么、不保证什么

Task 是内核唯一的观察面。按并发契约(第 10 章),每次 read 返回 per-Task 原子 snapshot。五条规则钉死内核级 observability 地板:

  1. Pull 是 universal lower bound。读 Task 任意字段任时可行,永远返回一致 snapshot。status 是标准信道;上述字段不变量让其它字段在给定 status 时语义确定。
  2. 状态转换是 well-defined moment,但内核不规定通知机制。Polling status、订阅 push 事件、注册 callback 都是合法策略——选择属于内核之上的层。
  3. Push 不是 kernel 公理。订阅、callback、event-bus 是工程选择,不是公理结论。
  4. Liveness 不是 kernel 责任。"Task 显示 running,但 substrate 是不是真在干活" ——内核无法知道。Impl 可往 metadata 写心跳(如 <impl>/last_heartbeat_at);observer 自行约定解释。
  5. 中间执行事件不是一等概念。Token 流、tool 调用、内部 reasoning 步骤、in-flight partial output——都是 impl-internal。对 trace / debug / 计费有价值;不属于公理层。Impl 通过 metadata 更新或自己的 trace channel surface,不通过 kernel 概念。

一句话总结:pull works;其它一切 optional。这个最小集足以构建任何 pull-driven workflow。Push、notification、liveness、tracing 都在它之上叠(在内核之上的层,或个别 impl)——不需要改任何公理。

为什么 agent_name 是可选的?

因为不是每个 Task 都来源于 catalog 中的 Agent。内核之上的层可以派发 ad-hoc 任务(例如一次性 prompt),只提供 instructions 与 runtime_kind 即可。Kernel 并不强制必须有 Agent 介入。

Chapter 8 · Part II 第 8 章 · 第二部分

Runtime Runtime(运行时) Execution · Domain Service Interface Execution · 领域服务接口

A command-side contract, not a data entity. Dispatched per Task by runtime_kind; together with the read-side Query interface it forms the kernel's two interfaces. 命令侧契约——不是数据实体。按 Task 上的 runtime_kind 派发。与 Query(读侧)接口共同构成内核的两个接口。

What Runtime is

An interface for executing a Task. Carries no data, no fields, no lifecycle of its own. It is a Domain Service Interface (DDD), nothing more.

// Conceptual signature — kernel does not provide implementations
interface Runtime {
  dispatch(task)                  → ()    // hand the whole Task to the runtime
  pause(task_id)                  → ()    // request a halt; effective at the next safe checkpoint
  resume(task_id, extra?)         → ()    // request resumption; if extra given, append to supplements atomically
  kill(task_id)                   → ()    // request termination; non-terminal → cancelled
  complete(task_id, result)       → ()    // report successful completion
  fail(task_id, failure)          → ()    // report failure
}

Six verbs covering the full Task lifecycle. The signatures are deliberately asymmetric: dispatch(task) takes the whole Task (the materialisation moment); the other five take only task_id (the Task is already materialised). "Who has the right to call Complete vs Kill" is a policy/auth concern at the layer above, not an interface-level split.

Asymmetric signatures — why

  • dispatch(task) takes the whole Task object. This is the materialisation moment: the Runtime impl needs every input field (instructions, agent_name, metadata, …) to actually start work, and the Task may not yet exist anywhere the Runtime can look it up by id.
  • pause / resume / kill / complete / fail take only task_id (with resume taking an optional extra payload, complete taking a result, and fail taking a failure). After dispatch, the Task is materialised; the Runtime impl already has everything it needs, and the kernel-side caller only needs to address it.

Symmetry would have been pretty, but it would have forced either an extra "register the task with the runtime" call before dispatch or a requirement that all three control ops re-pass the immutable input. Neither buys anything.
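One way to read the asymmetric signature is as a structural type. The sketch below transcribes the conceptual interface into a Python Protocol and pairs it with a hypothetical RecordingRuntime that only records calls. The Python types, the dict-shaped Task, and the class name are assumptions for illustration, not part of the kernel.

```python
from typing import Any, Dict, List, Optional, Protocol, Tuple, runtime_checkable


@runtime_checkable
class Runtime(Protocol):
    """Python transcription of the conceptual signature (types are assumptions)."""

    def dispatch(self, task: Any) -> None: ...  # takes the whole Task
    def pause(self, task_id: str) -> None: ...  # the rest address by id only
    def resume(self, task_id: str, extra: Optional[str] = None) -> None: ...
    def kill(self, task_id: str) -> None: ...
    def complete(self, task_id: str, result: Any) -> None: ...
    def fail(self, task_id: str, failure: Any) -> None: ...


class RecordingRuntime:
    """Hypothetical impl: keeps dispatched Tasks and records every call."""

    def __init__(self) -> None:
        self.known: Dict[str, Any] = {}
        self.calls: List[Tuple[str, Any]] = []

    def dispatch(self, task: Any) -> None:
        # Materialisation moment: the impl receives every input field once.
        self.known[task["id"]] = task
        self.calls.append(("dispatch", task["id"]))

    def pause(self, task_id: str) -> None:
        self.calls.append(("pause", task_id))

    def resume(self, task_id: str, extra: Optional[str] = None) -> None:
        self.calls.append(("resume", (task_id, extra)))

    def kill(self, task_id: str) -> None:
        self.calls.append(("kill", task_id))

    def complete(self, task_id: str, result: Any) -> None:
        self.calls.append(("complete", (task_id, result)))

    def fail(self, task_id: str, failure: Any) -> None:
        self.calls.append(("fail", (task_id, failure)))


runtime = RecordingRuntime()
runtime.dispatch({"id": "t1", "instructions": "summarise the changelog",
                  "runtime_kind": "recording"})
runtime.pause("t1")
runtime.resume("t1", extra="keep it under 100 words")
```

After dispatch the impl already holds the immutable inputs under the Task's id, which is why the later calls carry nothing but that id and their own payload.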

Best-effort control verbs

Different Runtime substrates have different control over their work. An LLM-chat impl can pause between turns; a one-shot HTTP-call impl physically cannot. Every impl must accept all six verbs; how effective pause / resume are varies by substrate, and that variation is fine because the truth is always in Task.status.

Concretely:

  • dispatch is synchronous to the point of state entry: it returns once the Task has entered running, or raises if the Runtime cannot start (Task stays not_started).
  • pause is a request: the Task transitions to paused at the next safe checkpoint, or completes naturally before the checkpoint is reached (in which case the request had no effect). Idempotent on already-paused / terminal Tasks.
  • resume(extra?) is atomic: if extra is provided, append to supplements first, then transition paused → running. No-op (and extra dropped) on non-paused Tasks.
  • kill is a request: the Task reaches a terminal state (typically cancelled, but may race to success/failure if completion was already imminent) in finite time. Idempotent on already-terminal Tasks.
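The four bullets above can be sketched as a toy, single-threaded runtime. This is an illustration under strong assumptions: InMemoryRuntime is a hypothetical name, every call is treated as a safe checkpoint, the Task object is mutated directly, and the concurrency rules of Chapter 10 are ignored entirely.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

TERMINAL = {"success", "failure", "cancelled"}


@dataclass
class Task:
    id: str
    instructions: str
    status: str = "not_started"
    supplements: List[str] = field(default_factory=list)
    result: Any = None


class InMemoryRuntime:
    """Hypothetical toy impl; not thread-safe."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Task] = {}

    def dispatch(self, task: Task) -> None:
        if not task.instructions:
            # Cannot start: raise, and the Task stays not_started.
            raise ValueError("nothing to run")
        self._tasks[task.id] = task
        task.status = "running"  # returns once the Task has entered running

    def pause(self, task_id: str) -> None:
        task = self._tasks[task_id]
        if task.status == "running":
            task.status = "paused"
        # Already paused or terminal: idempotent no-op.

    def resume(self, task_id: str, extra: Optional[str] = None) -> None:
        task = self._tasks[task_id]
        if task.status != "paused":
            return  # no-op on non-paused Tasks; extra is dropped
        if extra is not None:
            task.supplements.append(extra)  # append first ...
        task.status = "running"             # ... then paused -> running

    def kill(self, task_id: str) -> None:
        task = self._tasks[task_id]
        if task.status not in TERMINAL:
            task.status = "cancelled"
        # Already terminal: idempotent no-op.

    def complete(self, task_id: str, result: Any) -> None:
        task = self._tasks[task_id]
        if task.status == "running":
            task.result = result
            task.status = "success"


rt = InMemoryRuntime()
task = Task(id="t1", instructions="draft release notes")
rt.dispatch(task)
rt.resume("t1", extra="ignored")               # not paused: extra dropped
rt.pause("t1")
rt.resume("t1", extra="use British spelling")  # recorded in supplements
rt.complete("t1", result="done")
rt.kill("t1")                                  # already terminal: no effect
```

The caller never learns from a return value whether a request took effect; it reads task.status, which here ends as success despite the late kill.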

runtime_kind on Task — not on Agent

Task carries the runtime_kind field (Chapter 7). The kernel's caller picks an implementation per Task. The kernel itself does not enumerate the kinds; the layer above maintains a registry mapping runtime_kind → impl. Putting runtime_kind on Task (not Agent) means the same Agent definition can be executed on different runtime kinds depending on context — useful for migration, A/B, fallback, and local-vs-remote choice.
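A registry of this kind lives in the layer above the kernel. As a hedged sketch, it could be as small as a dictionary with two methods. RuntimeRegistry, StubRuntime, and the example/... names are hypothetical, and Tasks are reduced to plain dicts for brevity.

```python
from typing import Dict, List


class StubRuntime:
    """Hypothetical Runtime impl that only remembers which Tasks it was given."""

    def __init__(self) -> None:
        self.dispatched: List[str] = []

    def dispatch(self, task: dict) -> None:
        self.dispatched.append(task["id"])


class RuntimeRegistry:
    """Maps runtime_kind to an impl; owned by the layer above the kernel."""

    def __init__(self) -> None:
        self._impls: Dict[str, StubRuntime] = {}

    def register(self, runtime_kind: str, impl: StubRuntime) -> None:
        if runtime_kind in self._impls:
            raise ValueError(f"runtime_kind {runtime_kind!r} is already registered")
        self._impls[runtime_kind] = impl

    def resolve(self, runtime_kind: str) -> StubRuntime:
        if runtime_kind not in self._impls:
            raise LookupError(f"no Runtime impl registered for {runtime_kind!r}")
        return self._impls[runtime_kind]


local, remote = StubRuntime(), StubRuntime()
registry = RuntimeRegistry()
registry.register("example/local", local)
registry.register("example/remote", remote)

# The same Agent name, executed on two different runtime kinds.
tasks = [
    {"id": "t1", "agent_name": "example/reviewer", "runtime_kind": "example/local"},
    {"id": "t2", "agent_name": "example/reviewer", "runtime_kind": "example/remote"},
]
for task in tasks:
    registry.resolve(task["runtime_kind"]).dispatch(task)
```

Because the lookup key comes from the Task, switching a workload from local to remote means changing one field on new Tasks, not redefining the Agent.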

Why an interface, not a 1:1 entity

An earlier draft framed Runtime as an entity with 1:1 lifetime to a Task and no-arg methods. We rejected it because:

  • State duplication. A Runtime entity would need to mirror Task fields (id, status) for any external observer to read them — but Task already carries those, denormalised by design.
  • Implementation lock-in. 1:1 forces every impl to allocate per-Task state structures, ruling out stateless or pooled implementations even when they are the natural fit.
  • The kernel asks too much. The kernel only needs to name the operations; how an impl maps those onto processes / containers / sandboxes / threads / coroutines is the impl's freedom.
Concurrency, ordering, and crash semantics are not defined by the Runtime interface. They are governed by the Concurrency Contract (Chapter 10, Invariant 1), which is a property of the kernel as a whole.

Runtime 是什么

是一个执行 Task 的接口。本身不携带数据、不带字段、不带自己的生命周期。一个领域服务接口(DDD),仅此而已。

// 概念签名——内核不提供任何实现
interface Runtime {
  dispatch(task)                  → ()    // 把整个 Task 对象交给 runtime
  pause(task_id)                  → ()    // 请求挂起;在下一个安全 checkpoint 生效
  resume(task_id, extra?)         → ()    // 请求恢复;若提供 extra,原子追加到 supplements
  kill(task_id)                   → ()    // 请求终止;非终态 → cancelled
  complete(task_id, result)       → ()    // 报告成功完成
  fail(task_id, failure)          → ()    // 报告失败
}

六个动词,覆盖完整的 Task 生命周期。签名刻意不对称:dispatch(task) 接收整个 Task(物化时刻);其余五个只接收 task_id。"谁有权调 Complete vs Kill"是上层的策略/权限问题,不是接口层面的划分。

为什么签名不对称

  • dispatch(task) 接收整个 Task 对象。这是"物化"的时刻:Runtime impl 需要 Task 的全部输入字段(instructions、agent_name、metadata…)来真正启动工作,而此时 Task 可能还不存在于 Runtime 能用 id 查到的地方。
  • pause / resume / kill / complete / fail 只接收 task_id(其中 resume 多收一个可选 extra,complete 收一个 result,fail 收一个 failure)。dispatch 之后 Task 已经被物化;Runtime impl 早就拿到了所有需要的东西,kernel 侧的调用方只需要寻址即可。

对称当然好看,但那要么强制在 dispatch 前多一次"向 runtime 注册 task"调用,要么强制其余三个控制操作重复传递不可变输入。两者都没有任何收益。

控制类动词是 best-effort

不同 Runtime substrate 对工作的控制力不同。LLM-chat impl 能在轮次之间 pause;一次性 HTTP 调用的 impl 物理上做不到。每一个 impl 都必须接受六个动词;pause / resume 的有效率因 substrate 而异,这种差异是 OK 的——真相永远在 Task.status 中。

具体而言:

  • dispatch 同步到状态入口:在 Task 进入 running 后返回,或在 Runtime 无法启动时抛错(Task 保持 not_started)。
  • pause 是请求:Task 在下一个安全 checkpoint 转为 paused,或在 checkpoint 之前自然完成(这时请求无效)。对已 paused / 终态 Task 幂等。
  • resume(extra?) 原子:若提供 extra,先追加到 supplements,再做 paused → running 转换。对非 paused Task 是 no-op(extra 被丢弃)。
  • kill 是请求:Task 在有限时间内到达终态(通常 cancelled,但若完成已迫在眉睫,可能撞到 success/failure)。对已终态 Task 幂等。

runtime_kind 在 Task 上——不在 Agent 上

Task 携带 runtime_kind 字段(第 7 章)。Kernel 调用方按 Task 选择实现。Kernel 本身不枚举具体种类;内核之上的层维护一个 runtime_kind → impl 的注册表。把 runtime_kind 放在 Task 上(而不是 Agent 上)意味着:同一个 Agent 定义可以根据上下文在不同 runtime kind 上执行——便于迁移、A/B、降级、本地 vs 远程之类的选择。

为什么是接口,而不是 1:1 的实体

早先的草稿曾把 Runtime 框架成一个"与 Task 1:1、方法不带参数"的实体。我们拒绝了它,因为:

  • 状态重复。Runtime 实体不得不镜像 Task 字段(id、status)才能让外部观察者读到它们——但 Task 已经按设计反范式化地携带了这些。
  • 实现锁定。1:1 强迫每一个 impl 都为每个 Task 分配状态结构,把"无状态"或"池化"实现自然适合的场景都堵死。
  • Kernel 要求过度。Kernel 只需要命名操作;至于 impl 把这些操作映射成进程 / 容器 / 沙箱 / 线程 / 协程,那是 impl 的自由。
并发、顺序与崩溃语义不由 Runtime 接口定义。它们由"并发契约"(第 10 章,不变量 1)统辖,而那是内核作为整体的属性。

Chapter 9 · Part II 第 9 章 · 第二部分

Execution Execution(执行) Named Concept · Not an Entity 命名概念 · 非实体

The act of running a Task. A name we give to a behaviour — there is no separate object. "运行一个 Task"这件事本身。我们给这一行为起的名字——没有独立对象。

Definition

An Execution is the act of running a Task. It is named so the rest of the kernel — and the documentation — has a noun for "the thing happening between dispatch and a terminal status". It is not a separate entity, table, record, or object.

Execution ≡ "this Task is running, on some Runtime"
  ↳ no id of its own         (the Task's id IS the execution's id)
  ↳ no separate fields       (status, started_at, ended_at all live on Task)
  ↳ 1:1 with the Task        (start/end == Task's running ↔ terminal transitions)

Why name it at all

  • To group Task + Runtime under "Execution" as the second sub-context (alongside Catalog).
  • To give documentation, hooks, and observability a noun: "an execution failed", "execution ended", "execution duration".
  • To avoid false promotion: the alternative is to model Execution as a record with its own id and fields, which would split execution facts across two places (Task + Execution) and break the "Task is the only Aggregate Root" invariant.

What Execution is not

  • Not an entity with its own id.
  • Not a table or record alongside Task.
  • Not a wrapper around Task — Task already is the wrapper.
  • Not a place to store retry counts, attempt numbers, or step indices (those are workflow / scheduler concerns above the kernel).
All execution facts (status, started_at, ended_at, result, failure) live on the Task, denormalised. This is what lets a single Task object answer every question an external observer might ask, without joins.
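Because every execution fact lives on the Task, a derived notion such as "execution duration" needs no separate Execution record. The sketch below computes it from one snapshot. TaskSnapshot and execution_duration are hypothetical names, and the sketch assumes the two timestamps are timezone-aware datetimes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class TaskSnapshot:
    """Only the fields this example needs; names follow the Task schema."""
    id: str
    status: str
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None


def execution_duration(task: TaskSnapshot) -> Optional[timedelta]:
    """Wall-clock time between started_at and ended_at, if both exist.

    Any time spent paused is included, since the Task records no
    per-pause timestamps.
    """
    if task.started_at is None or task.ended_at is None:
        return None
    return task.ended_at - task.started_at


start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
done = TaskSnapshot(id="t1", status="success",
                    started_at=start, ended_at=start + timedelta(seconds=90))
running = TaskSnapshot(id="t2", status="running", started_at=start)
```

A single read answers the question; there is nothing to join against.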

定义

Execution 就是"运行一个 Task"这一行为。我们给它命名,是为了让内核的其它部分——以及文档——有一个名词指代"在 dispatch 与终态之间正在发生的那件事"。它不是一个独立实体、表、记录或对象。

Execution ≡ "这个 Task 正在某个 Runtime 上运行"
  ↳ 没有自己的 id           (Task 的 id 就是 execution 的 id)
  ↳ 没有独立字段           (status、started_at、ended_at 都在 Task 上)
  ↳ 与 Task 1:1            (起止 == Task 的 running ↔ 终态跳转)

为什么要命名

  • 为了把 Task + Runtime 一起归到 "Execution" 这个第二子上下文(与 Catalog 并列)。
  • 为了给文档、hook、可观测性一个名词:"an execution failed"、"execution ended"、"execution duration"。
  • 为了避免错误地把它"提升"成实体:替代方案是把 Execution 建模为带自身 id 与字段的记录,那样就会把执行事实拆分到两个地方(Task + Execution),并破坏"Task 是唯一聚合根"的不变量。

Execution 不是什么

  • 不是带自身 id 的实体。
  • 不是与 Task 并列的表或记录。
  • 不是 Task 的 wrapper—— Task 自己就是 wrapper。
  • 不是存放重试次数、尝试编号、step 序号的地方(那些是 kernel 之上的 workflow / scheduler 关注点)。
所有执行事实(status、started_at、ended_at、result、failure)都在 Task 上反范式化存放。这正是让"一个 Task 对象就能回答外部观察者的所有问题、无需 join"成为可能的原因。

Chapter 10 · Part III 第 10 章 · 第三部分

Invariants 不变量

The two properties that hold across the whole kernel. 贯穿整个内核的两条性质。

The four axioms tell you what the kernel has. The two invariants tell you what the kernel guarantees. They are not concepts; they are commitments that any implementation of the kernel must honour.

Invariant 1 — The Concurrency Contract

The Concurrency Contract has two clauses. Both must hold.

Clause 1 — Inter-Task parallelism. Different Tasks may execute concurrently, in any order, on any combination of Runtimes. The state transitions of one Task are not affected by the existence or absence of other Tasks.
Clause 2 — Intra-Task serialisation. Mutating operations on the same Task must be serialised. Task is the consistency boundary; concurrent mutations to one Task are not permitted.

Together, the two clauses say: between Tasks, anything goes; within a Task, one operation at a time. This is why Task is the only Aggregate Root: the boundary of consistency is the boundary of the aggregate.

Precise semantics — seven rules

The two clauses are short. The contract they imply is more precise. The kernel guarantees seven rules; impls choose the mechanism.

  1. The mutator set is closed and named. The mutators are exactly: dispatch, pause, resume, kill, complete, fail. The atomic append to supplements happens inside resume(extra) — it is not a separate mutator. All other operations on Task (reading any field, snapshotting the Task, computing derived views) are readers.
  2. Same-Task mutators serialise. At any moment, at most one mutator is in flight per Task. Impl chooses mechanism (per-task lock, single-consumer actor, MQ partition by task_id, optimistic locking with version field, …). The kernel does not prescribe.
  3. Mutations are atomic from observer perspective. A reader sees either the pre-state or the complete post-state — never an intermediate / torn state. Readers can never observe status == success with result unset, or status == paused with supplements partially appended.
  4. Readers do not serialise — neither with each other, nor with mutators (MVCC-style semantics). Multiple readers may run concurrently. Readers may run concurrently with a mutator; the reader sees the latest committed snapshot. Readers do not block mutators; mutators do not block readers.
  5. Cross-Task batch reads have no global snapshot guarantee. A workflow asking "give me the status of these N Tasks" receives N independently per-Task-atomic snapshots; their relative timing is not guaranteed (no SERIALIZABLE-level isolation). Agentic systems rarely need cross-Task atomicity, and providing it has prohibitive cost.
  6. Read-your-writes within the same caller context. When a caller's mutator returns successfully, that same caller's subsequent read on the same Task observes the post-state. Across distinct callers, propagation is "as soon as possible" but not strictly synchronous (impl-specific).
  7. Same-Task mutation order preserves happens-before. State-machine causality is an axiom guarantee: dispatch happens-before complete/fail; pause happens-before its corresponding resume; kill happens-before the resulting cancelled snapshot. Observers never see the post-state of a later mutation without also seeing the post-state of all earlier same-Task mutations.
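Rules 2 to 4 can be sketched with one of the mechanisms Rule 2 lists, a per-Task lock, combined with immutable snapshots that readers fetch without locking. This is one possible mechanism, not the kernel's. TaskStore and its methods are hypothetical names, Tasks are added before any concurrent use, and the lock-free read relies on CPython's atomic dictionary assignment.

```python
import threading
from dataclasses import dataclass, replace
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class TaskSnapshot:
    id: str
    status: str = "not_started"
    result: Any = None
    supplements: Tuple[str, ...] = ()


Transition = Callable[[TaskSnapshot], TaskSnapshot]


class TaskStore:
    def __init__(self) -> None:
        self._snapshots: Dict[str, TaskSnapshot] = {}
        self._locks: Dict[str, threading.Lock] = {}

    def add(self, task_id: str) -> None:
        # Simplification: call before the Task is used concurrently.
        self._snapshots[task_id] = TaskSnapshot(id=task_id)
        self._locks[task_id] = threading.Lock()

    def read(self, task_id: str) -> TaskSnapshot:
        # Rule 4: readers take no lock and get the latest committed snapshot.
        return self._snapshots[task_id]

    def mutate(self, task_id: str, transition: Transition) -> TaskSnapshot:
        with self._locks[task_id]:  # Rule 2: one mutator per Task at a time
            committed = transition(self._snapshots[task_id])
            # Rule 3: the whole post-state becomes visible in one swap.
            self._snapshots[task_id] = committed
            return committed


def dispatch(s: TaskSnapshot) -> TaskSnapshot:
    return replace(s, status="running") if s.status == "not_started" else s


def complete(result: Any) -> Transition:
    def transition(s: TaskSnapshot) -> TaskSnapshot:
        if s.status != "running":
            return s  # no-op mutator: holds the slot, changes nothing
        return replace(s, status="success", result=result)
    return transition


store = TaskStore()
ids = [f"t{i}" for i in range(8)]
for task_id in ids:
    store.add(task_id)


def run(task_id: str) -> None:
    store.mutate(task_id, dispatch)
    store.mutate(task_id, complete(task_id.upper()))


# Clause 1: different Tasks progress in parallel, in any order.
threads = [threading.Thread(target=run, args=(task_id,)) for task_id in ids]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Since status and result are replaced together in one frozen object, a reader can never see success without its result, and locks on different Tasks never contend with each other.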

Three edge cases follow directly:

  • A failed mutator (e.g. dispatch raising before any state change) still occupies the same-Task serialisation slot for its duration; on return (success or error) the slot is released. State unchanged ⇒ observers see the pre-state throughout.
  • A no-op mutator (e.g. kill on an already-terminal Task) is API-classified as a mutator — it occupies the slot, returns harmlessly. State unchanged ⇒ observers see the unchanged terminal state.
  • Transition-related timestamps (started_at, ended_at) are part of the atomic mutation envelope (Rule 3). Observers see them appear together with the transition that sets them, not later.

What this implies, practically:

  • The kernel does not need (and does not have) a global lock or single dispatcher.
  • Each Task implementation needs some per-Task ordering mechanism — but the choice is the impl's, not the kernel's.
  • Cross-task coordination — domain events, message queues, shared workspaces — is a workflow concern. The kernel exposes no primitive for it.
  • The kernel-level observability floor (Chapter 7, Observability) is a direct consequence of these rules: pull always works precisely because readers do not serialise and snapshots are atomic.

Why not promise stronger guarantees, like exactly-once or causal ordering across Tasks?

Both are valuable and both are expensive. Different products need different tradeoffs. Encoding either into the kernel forces every product to pay for it. We push these guarantees to the layer above, which can choose what it needs.

Invariant 2 — Architectural Equivalence

This invariant is what makes the kernel a kernel, not a particular product's foundation:

Architectural Equivalence — Two completely different agentic products, with different methodologies, vocabularies, runtime substrates, and persistence backends, can be built on top of the same AgenticKernel. The kernel does not encode anything that distinguishes one product from another.

This invariant is what we mean by "theme-neutral". It is also what every chapter of Part I (Boundaries, Conventions, the "is it kernel?" rule) was preparing you to enforce.

An implementation of AgenticKernel that ships with built-in roles, built-in runtime kinds, or built-in persistence is, by this invariant, not an implementation of AgenticKernel — it is a product on top of one.

Why exactly two invariants

Every additional invariant in a kernel is a constraint that all consumers must accept. Two felt like the floor: any fewer and the kernel would not be useful (concurrency would be undefined; portability would be unstated). Any more and the kernel would start dictating product choices.

If you find yourself wanting to add a third invariant, the first question to ask is: "is this a property of the kernel, or a property of a particular product on top of it?" In our experience, every candidate has been the latter.

四条公理告诉你内核有什么。两条不变量告诉你内核保证什么。它们不是概念,而是承诺——任何对内核的实现都必须兑现。

不变量 1 — 并发契约

并发契约有两条子句。两者都必须成立。

子句 1 —— Task 之间的并行。不同的 Task 可以并发执行,以任意顺序,跑在任意 Runtime 组合上。一个 Task 的状态转换不受其它 Task 存在与否的影响。
子句 2 —— Task 内部的串行。同一个 Task 的变更类操作必须串行化。Task 是一致性边界;不允许对同一个 Task 做并发变更。

合起来:Task 之间,怎么并发都行;Task 内部,一次只能一个操作。这正是 Task 为什么是唯一聚合根的原因:一致性的边界就是聚合的边界。

精确语义——七条规则

两条子句很短,它们蕴含的契约更精确。Kernel 承诺七条规则;机制由 impl 自定。

  1. Mutator 集合明确闭合。Mutators 只有:dispatch、pause、resume、kill、complete、fail。supplements 的原子追加发生在 resume(extra) 内部,不是独立 mutator。Task 上的其它操作(读字段、snapshot、计算派生视图)都是 readers。
  2. 同 Task mutators 串行。任一时刻每个 Task 上至多一个 mutator 在执行中。Impl 自由选机制(per-task lock、单消费者 actor、按 task_id 分区的 MQ、optimistic locking with version field…)。Kernel 不规定。
  3. Mutation 对 observer 是原子的。Reader 只能看到 pre-state 或完整 post-state——永远看不到中间撕裂状态。Reader 永远不会观察到 status == success 而 result 未填,或 status == paused 而 supplements 半截被写入。
  4. Readers 不串行——不与彼此串行,也不与 mutators 串行(MVCC 语义)。多个 reader 可同时跑。Reader 与 mutator 可并发;reader 看到最新已 commit snapshot。Reader 不阻塞 mutator;mutator 也不阻塞 reader。
  5. 跨 Task 批量读无全局 snapshot 保证。Workflow 询问"给我 N 个 Task 的 status",收到 N 个各自原子的 per-Task snapshot;它们之间相对时序不保证(不要求 SERIALIZABLE 级别)。Agentic 系统几乎用不到跨 Task 原子性,而提供它的代价过高。
  6. Read-your-writes(同 caller 上下文内)。同一 caller 调用 mutator 成功返回后,该 caller 立即 read 同一 Task 看到的是 mutation 后的状态。跨 caller 的传播是"尽快"但不严格同步(impl 决定)。
  7. 同 Task mutation 序保留 happens-before。状态机因果性是公理保证:dispatch happens-before complete/fail;pause happens-before 配对的 resume;kill happens-before 其产生的 cancelled snapshot。Observer 永远不会看到后一次 mutation 的 post-state 而看不到同 Task 所有更早 mutation 的 post-state。

由此直接得出三条边界:

  • 失败的 mutator(如 dispatch 在状态变更前抛错):仍占同 Task 的串行槽位直到返回(成功或失败);状态未变 ⇒ observer 全程看到 pre-state。
  • No-op mutator(如 kill 在已终态 Task 上):API 上仍归 mutator(占槽,无害返回);状态不变 ⇒ observer 看到不变的终态。
  • 转换相关时间戳(started_at、ended_at)是原子 mutation envelope 的一部分(Rule 3);observer 看到它们与设置它们的转换一起出现,不会延后。

实际含义:

  • 内核不需要(也没有)全局锁或单一派发器。
  • 每一个 Task 实现都需要某种按 Task 的顺序化机制,但具体选择是 impl 的事,不是 kernel 的事。
  • 跨 task 的协调—— domain event、消息队列、共享 workspace ——属于 workflow 关注点。内核不为此提供任何原语。
  • 内核级 observability 地板(第 7 章 · 可观测性)正是这些规则的直接推论:pull 永远 work,恰恰因为 readers 不串行、snapshot 是原子的。

为什么不承诺更强的保证,比如跨 Task 的 exactly-once 或因果顺序?

这两者都有价值,也都有代价。不同产品需要不同的取舍。把任何一种编码进内核,都会强迫所有产品为此买单。我们把这些保证推到内核之上的层,让它根据需要选择。

不变量 2 — 架构等价性

这一条是让内核成为内核而不是某个具体产品根基的关键:

架构等价性——两个完全不同的智能体产品,使用不同的方法论、不同的词汇、不同的 runtime 基底、不同的持久化后端,都可以建立在同一个 AgenticKernel 之上。内核不编码任何能区分一个产品与另一个产品的东西。

这就是我们说的"主题中立"。也是第一部分每一章(边界、约定、"它属于内核吗?"法则)所做铺垫的目的——让你能去执行这一条。

一个出厂自带角色、自带 runtime 种类或自带持久化的 AgenticKernel 实现,按这一不变量来说,并不是 AgenticKernel 的实现——而是建在它之上的某个产品。

为什么恰好是两条不变量

内核中的每一条额外不变量都是所有消费者必须接受的约束。两条像是底线:再少,内核就不够有用(并发未定义、可移植性未声明);再多,内核就开始替产品做选择了。

当你发现自己想加第三条不变量时,第一个该问的问题是:"这是内核的性质,还是建在它之上的某个产品的性质?"按我们的经验,每一个候选都属于后者。

Appendix A 附录 A

Glossary 术语表

Every term used by the kernel, with a one-line definition. 内核所使用的每一个术语,附一行定义。

AgenticKernel
The Layer 1 microkernel defined by this book: 4 axioms, 2 invariants, 1 interface.
Layer 1
The kernel itself. The lowest layer in the architecture, on which all higher layers depend.
Catalog (sub-context)
The conceptual grouping of Loadable concepts: Capability and Agent. "What can exist."
Execution (sub-context)
The conceptual grouping of Active concepts: Task and Runtime. "What is happening."
Capability
A scoped name (<scope>/<name>). Value Object. Kernel-opaque — only the name; no instructions, model, or other fields at the kernel level.
Agent
A static, named bundle of capabilities (name, capabilities[], optional metadata). Value Object. No id; no runtime_kind (that lives on Task).
Task
The whole-system contract: instructions → result. Rich Aggregate Root with a six-state machine (not_started · running · paused · success · failure · cancelled). Carries id, summary, instructions, runtime_kind, status, optional result/failure/supplements, timestamps, optional agent_name, optional metadata. The only Aggregate Root.
Runtime
A contract — Domain Service Interface for executing a Task. Carries no data of its own. Six verbs, asymmetric: dispatch(task) takes the whole Task; pause / resume(extra?) / kill / complete(result) / fail(failure) take only task_id.
Failure
The structured value carried in Task.failure when a Task ends in failure. Shape: { code: string, message: string, metadata?: map }. code is impl-defined (suggested namespace <impl>/<code>); both code and message are required. The kernel does not encode a retriable flag — that is policy, not fact.
supplements
Append-only audit list on Task. Grows only as a side effect of resume(extra), atomically with the paused → running transition. Audit-only at the kernel level; cross-impl semantic interpretation of historical entries is explicitly not a kernel goal.
Re-execution doctrine
Terminal Tasks are forever frozen. To re-execute, construct a new Task with a fresh id (copying desired fields) and dispatch it. Lineage is recorded by convention in metadata as <scope>/lineage_of: <original_task_id>. The kernel exposes no reset or retry verb.
Sibling extension library
A library that depends on AgenticKernel and provides a concrete impl of one of its interfaces (typically Runtime). Naming convention: AgenticKernel.<substrate> (e.g. AgenticKernel.Copilot, AgenticKernel.OpenAI). The kernel package itself ships zero impls and never depends on any sibling.
Execution
Named concept · not an entity. The act of running a Task. 1:1 with Task; all execution facts are denormalised onto the Task itself.
runtime_kind
A field on Task (not Agent) that names the runtime implementation to use. The kernel does not enumerate the kinds; the layer above does.
Loadable
A concept whose instances are populated by the layer above the kernel, not built into the kernel itself. Capability and Agent are loadable.
Aggregate Root (DDD)
An entity through which all changes to a cluster of objects must pass. In the kernel, only Task is an Aggregate Root.
Value Object (DDD)
An object defined entirely by its attributes, with no independent identity or lifecycle. Capability and Agent are value objects.
Domain Service Interface (DDD)
An interface for behavior that does not naturally belong to any single entity. Runtime is a domain service interface.
Concurrency Contract
Invariant 1. Two clauses: (1) different Tasks may run concurrently, in any order; (2) mutating operations on the same Task must be serialised (Task is the consistency boundary). Unfolds into seven precise rules — see Chapter 10.
Observability floor (G1)
The kernel-level minimum, defined in Chapter 7: pull always works · transitions are deterministic moments but no notification mechanism is mandated · push is not an axiom · liveness is not the kernel's concern · mid-execution events are not first-class.
Architectural Equivalence
Invariant 2. The same kernel can underpin completely different agentic products without modification.
Theme-neutral
Containing no concept, vocabulary, or default that ties the kernel to a specific product or methodology.
The layer above the kernel
How this book refers to anything outside Layer 1. The book deliberately avoids naming higher layers.
AgenticKernel
本书所定义的 Layer 1 微内核:4 公理、2 不变量、1 接口。
Layer 1
内核本身。架构中最底层,所有更高层都依赖于它。
Catalog(子上下文)
"可加载"概念的概念性分组:Capability 与 Agent。回答"可以存在什么"。
Execution(子上下文)
"激活"概念的概念性分组:Task 与 Runtime。回答"正在发生什么"。
Capability
一个 scoped 名字(<scope>/<name>)。值对象。Kernel 完全不透明——只有 name;kernel 层面不带 instructions、model 或其它字段。
Agent
一份静态、具名的 capabilities bundle(name、capabilities[]、可选 metadata)。值对象。没有 id;没有 runtime_kind(那活在 Task 上)。
Task
整套系统的契约:instructions → result。充血聚合根,六态状态机(not_started · running · paused · success · failure · cancelled)。携带 id、summary、instructions、runtime_kind、status、可选 result/failure/supplements、时间戳、可选 agent_name、可选 metadata。唯一的聚合根。
Runtime
一份契约——执行 Task 的领域服务接口。本身不携带任何数据。六个动词,签名不对称:dispatch(task) 接收整个 Task;pause / resume(extra?) / kill / complete(result) / fail(failure) 只接收 task_id
Failure
Task 以 failure 结束时 Task.failure 携带的结构化值。形状:{ code: string, message: string, metadata?: map }。code 是 impl 自定(建议命名空间 <impl>/<code>);code 与 message 都必填。Kernel 不编码 retriable 标志——那是 policy,不是 fact。
supplements
Task 上的仅追加审计列表。仅作为 resume(extra) 的副作用增长,与 paused → running 转换原子绑定。在 kernel 层面只用于审计;跨 impl 对历史条目的语义解读明确不是 kernel 目标。
Re-execution doctrine
终态 Task 永久冻结。要重跑,构造一个新的 Task(新 id,按需拷贝字段)并 dispatch。世系按约定记在 metadata,形如 <scope>/lineage_of: <original_task_id>。Kernel 不暴露 reset 或 retry 动词。
同级扩展库(Sibling extension library)
依赖 AgenticKernel 并提供某个接口(通常是 Runtime)具体实现的库。命名约定:AgenticKernel.<substrate>(如 AgenticKernel.Copilot、AgenticKernel.OpenAI)。Kernel 包本身不附带任何 impl,也永远不依赖任何 sibling。
Execution
命名概念 · 非实体。"运行一个 Task"这件事本身。与 Task 1:1;所有执行事实都反范式化在 Task 自身。
runtime_kind
Task(不是 Agent)上的字段,指定要使用的 runtime 实现。内核不枚举具体种类,由内核之上的层来决定。
Loadable(可加载)
实例由内核之上的层填充进来的概念,而不是内置于内核本身。Capability 和 Agent 都是可加载的。
聚合根(DDD)
所有对一组对象的变更必须经过的实体。在内核中,只有 Task 是聚合根。
值对象(DDD)
完全由其属性所定义、不具独立身份与生命周期的对象。Capability 与 Agent 都是值对象。
领域服务接口(DDD)
用来承载"不自然归属于任何单一实体"的行为的接口。Runtime 是一个领域服务接口。
并发契约
不变量 1。两条子句:(1) 不同 Task 可以以任意顺序并发运行;(2) 对同一个 Task 的变更类操作必须串行化(Task 是一致性边界)。展开为七条精确规则——见第 10 章。
可观测性地板(G1)
Kernel 级最小集,定义于第 7 章:pull 永远 work · 转换是确定时刻但不规定通知机制 · push 不是公理 · liveness 不归 kernel · 中间执行事件不是一等概念。
架构等价性
不变量 2。同一个内核可以无修改地支撑完全不同的智能体产品。
主题中立
不包含任何把内核绑定到具体产品或方法论的概念、词汇或默认值。
内核之上的层
本书对任何 Layer 1 之外的事物的统一称呼。本书刻意不点名任何更高层。

Appendix B 附录 B

The DDD Lens DDD 视角

Why every kernel concept maps to exactly one DDD pattern. 为什么内核中的每个概念都恰好映射到一个 DDD 模式。

The kernel is small enough that every one of its concepts has a single, unambiguous DDD type. This is not a coincidence — it is a consequence of using DDD as the design lens from the start.

Concept    | Sub-context | DDD type                      | Has lifecycle?              | Has identity?
Capability | Catalog     | Value Object                  | No                          | By name only
Agent      | Catalog     | Value Object                  | No                          | By name only (no id)
Task       | Execution   | Aggregate Root (rich)         | Yes (6 states + timestamps) | Yes (kernel-issued id)
Runtime    | Execution   | Domain Service Interface      | None of its own             | None; it's a contract
Execution  | Execution   | Named concept · not an entity | Same as its Task            | Borrows Task's id

Ubiquitous language

The terms Capability, Agent, Task, Runtime are the entire ubiquitous language of Layer 1. We use them everywhere in this book, and we expect any conforming implementation to use them in code, in documentation, and in operator-facing tooling.

Higher layers are free to introduce additional vocabulary (for roles, workflows, methodologies). Those vocabularies must not replace the kernel terms; they layer above them.

Bounded context

"AgenticKernel" is the bounded context for Layer 1. The two sub-contexts (Catalog, Execution) are inside that single context, not outside it. This is why an implementation may freely place all four concepts in one module — the boundaries are conceptual, not deployment.

What is deliberately absent from the DDD picture

The kernel has no Repositories, no Domain Events, no Application Services, no Factories at the kernel level. Every one of these patterns is a fine choice above the kernel, but each would impose a particular implementation style on every consumer. We left them out for the same reason we left out tools, persistence, and orchestration: optionality.

内核已经小到每个概念都有一个单一、明确的 DDD 类型。这不是巧合——而是从一开始就用 DDD 作为设计透镜的结果。

概念 | 子上下文 | DDD 类型 | 有生命周期? | 有身份?
Capability | Catalog | 值对象 | 无 | 仅按 name
Agent | Catalog | 值对象 | 无 | 仅按 name(无 id)
Task | Execution | 聚合根(充血) | 有(6 态 + 时间戳) | 有(kernel 颁发的 id)
Runtime | Execution | 领域服务接口 | 自身没有 | 没有——它是契约
Execution | Execution | 命名概念 · 非实体 | 跟随它的 Task | 借用 Task 的 id

统一语言

Capability、Agent、Task、Runtime 这四个术语就是 Layer 1 的全部统一语言。本书中处处使用它们,并期望任何符合规范的实现也在代码、文档和面向操作者的工具中使用它们。

更高的层可以自由引入额外词汇(用于角色、workflow、方法论)。这些词汇不得替换内核术语,而是在其之上叠加。

限界上下文

"AgenticKernel" 是 Layer 1 的限界上下文。两个子上下文(Catalog、Execution)位于这个单一上下文之内,而不是之外。这就是为什么具体实现可以把四个概念都放进同一个模块——边界是概念性的,不是部署性的。

DDD 图景中刻意缺席的东西

内核里没有 Repository,没有 Domain Event,没有 Application Service,内核层面也没有 Factory。这些模式中的每一个,在内核之上都是不错的选择,但任何一个都会把某种特定的实现风格强加给所有消费者。我们把它们排除在外,原因和把工具、持久化、编排排除在外是一样的:可选性。

Appendix C 附录 C

Design Decisions 设计决策

Eighteen decisions that shaped the kernel — each with the alternative we rejected. 塑造了内核的十八个决策——每条都附上我们拒绝的备选方案。

C.1 — Why "kernel" rather than "framework"

Decision: The Layer 1 component is a kernel — small, neutral, providing only the smallest stable surface.

Rejected alternative: A framework that ships orchestration, role conventions, and persistence in the box. We rejected it because it would couple every product to one set of choices.

C.2 — Why exactly four axioms

Decision: Capability, Agent, Task, Runtime — and nothing else.

Rejected alternative: Adding a separate "Tool", "Memory", or "Plan" axiom. We rejected each because it failed the "is it kernel?" test from Chapter 3.

C.3 — Why two sub-contexts (Catalog, Execution)

Decision: Inside Layer 1, group the four concepts into Catalog and Execution.

Rejected alternative: Either no sub-contexts (a flat list) or more of them (separate Scheduler, Planner, Memory). Neither option kept the kernel both useful and theme-neutral.

C.4 — Why Capability has only name

Decision: One field. Capability is fully opaque to the kernel; name is its only shape at the kernel level (formatted as <scope>/<name>).

Rejected alternative: Adding instructions, tools, model, version, or tags inside the kernel. Each would force a particular execution style (LLM, prompt-shape, tooling) on every product. Those concerns belong to the loader and to the layer above the kernel.
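A minimal sketch of what "one field" could look like in code, assuming Python and a frozen dataclass. The regular expression is a hypothetical reading of <scope>/<name>; the book fixes the shape of the name, not the allowed characters.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: exactly one "/" separating a non-empty scope and name.
# The allowed character set is an assumption made for this sketch.
_SCOPED_NAME = re.compile(r"^[^/\s]+/[^/\s]+$")


@dataclass(frozen=True)
class Capability:
    """Value object: the scoped name is the whole kernel-level shape."""

    name: str

    def __post_init__(self) -> None:
        if not _SCOPED_NAME.match(self.name):
            raise ValueError(f"expected <scope>/<name>, got {self.name!r}")


# Two capabilities with the same name are the same value.
a = Capability("acme/summarise")
b = Capability("acme/summarise")
```

Anything else a product needs to know about a capability (instructions, tools, model) would be resolved from the name by a loader above the kernel, not stored on this type.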

C.5 — Why Agent is name + capabilities[] + metadata? — no id, no runtime_kind

Decision: Agent is a static, scoped-named bundle of capability names. Its scoped name is the unique handle (no separate id). runtime_kind lives on Task — not on Agent.

Rejected alternative A (id on Agent): would invent an instance-vs-definition distinction we do not need; the scoped name already carries identity.

Rejected alternative B (runtime_kind on Agent): would lock each Agent to one runtime kind. Putting it on Task lets the same Agent run on different kinds across migrations, A/B, fallback, and local-vs-remote.
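To make the placement of runtime_kind concrete, here is a small sketch under stated assumptions: the Agent fields follow the decision above, while TaskRequest and its agent and input fields are hypothetical stand-ins for the part of Task that matters here.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional, Tuple


@dataclass(frozen=True)
class Agent:
    name: str                      # scoped name; doubles as the unique handle
    capabilities: Tuple[str, ...]  # capability names, not capability objects
    metadata: Optional[Mapping[str, Any]] = None


@dataclass(frozen=True)
class TaskRequest:
    """Hypothetical slice of Task: only the fields this example needs."""

    agent: str         # refers to the Agent by scoped name
    runtime_kind: str  # chosen per Task, not per Agent
    input: str


reviewer = Agent("acme/reviewer", ("acme/read_diff", "acme/comment"))

# The same Agent definition runs on two different runtime kinds.
local = TaskRequest(reviewer.name, "local", "review the change")
remote = TaskRequest(reviewer.name, "remote", "review the change")
```

Because the Agent carries neither an id nor a runtime_kind, switching a workload from one kind to another touches only the Tasks being created.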

C.6 — Why Task is the only Aggregate Root

Decision: Only Task has state evolution; therefore only Task is an Aggregate Root.

Rejected alternative: Promoting Agent or Runtime to Aggregate Root. We rejected this because neither has any independent state — promoting them would invent a lifecycle we do not need.

C.7 — Why Task has six states (not four)

Decision: not_started, running, paused, success, failure, cancelled.

Rejected alternative A — four states (no paused, no cancelled): would have forced cancellation to be encoded as a special failure code, conflating "the work failed" with "we stopped the work". And it would have left "supplement mid-execution" without a clean home — every impl would invent its own ad-hoc protocol. Adding paused + cancelled made the supplement story trivial (pause + resume(extra)) and the cancellation story honest (terminal, but not failure).

Rejected alternative B — adding even more states (blocked, queued): Each was either redundant (a blocked Task is a paused Task whose pause reason is in metadata) or above-kernel (queued belongs to a scheduler at the layer above).
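The six states can be sketched as an enum with a transition check. The state names and the three terminal states come from the text; the transition table itself is an assumption made for illustration, since this appendix does not spell it out.

```python
from enum import Enum


class TaskState(Enum):
    NOT_STARTED = "not_started"
    RUNNING = "running"
    PAUSED = "paused"
    SUCCESS = "success"
    FAILURE = "failure"
    CANCELLED = "cancelled"


# Terminal states: cancelled is terminal, but distinct from failure.
TERMINAL = frozenset(
    {TaskState.SUCCESS, TaskState.FAILURE, TaskState.CANCELLED}
)

# Assumed transition table (not quoted from the book).
ALLOWED = {
    TaskState.NOT_STARTED: {TaskState.RUNNING, TaskState.CANCELLED},
    TaskState.RUNNING: {
        TaskState.PAUSED,
        TaskState.SUCCESS,
        TaskState.FAILURE,
        TaskState.CANCELLED,
    },
    TaskState.PAUSED: {TaskState.RUNNING, TaskState.CANCELLED},
}


def can_transition(src: TaskState, dst: TaskState) -> bool:
    """Terminal states have no entry in ALLOWED, so nothing leaves them."""
    return dst in ALLOWED.get(src, frozenset())
```

A scheduler above the kernel that wants a "queued" notion would keep it in its own bookkeeping rather than adding a seventh member here.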

C.8 — Why Task carries no parent / child references

Decision: Tasks are flat at the kernel level.

Rejected alternative: Encoding hierarchy or DAG edges in the Task aggregate. We rejected it because task-graph shape is precisely what differs between products. Encoding any one shape rules out others.

C.9 — Why Runtime is an interface, not a concrete type

Decision: Runtime is a Domain Service Interface; the kernel ships no implementation.

Rejected alternative: Picking a default (e.g. local CLI) and shipping it with the kernel. We rejected this because the default would dictate the deployment surface for every product.

C.10 — Why Runtime is an interface, not a 1:1 entity

Decision: Runtime is a contract. The kernel does not model Runtime as an entity with its own id, fields, or 1:1 lifetime alongside a Task.

Rejected alternative: Modelling Runtime as an entity 1:1 with Task. We rejected this because (a) it duplicates execution facts already denormalised on Task, (b) it forces every impl to allocate per-Task state structures (ruling out stateless / pooled impls), and (c) it makes the kernel ask more than it needs — the kernel only needs to name the operations.

C.11 — Why Runtime has six verbs and asymmetric signatures

Decision: The Runtime interface has six verbs: dispatch(task), pause(task_id), resume(task_id, extra?), kill(task_id), complete(task_id, result), fail(task_id, failure). dispatch takes the whole Task; the other five take only task_id.

Rejected alternative A — symmetric, all take whole task: would force the three control ops to redundantly re-pass the immutable input on every call.

Rejected alternative B — symmetric, all take only task_id: would force a separate "register the task with the runtime" call before dispatch — adding a step with no semantic gain. Asymmetry is a feature: it reflects the materialisation moment.

Rejected alternative C — keeping a separate supplement verb (the older design): created two mutation paths to supplements and a race window between "appended" and "resumed". Folding the append into resume(extra) gives a single atomic mutation per supplement. The supplement story is now pause + resume(extra), with no axiom-level supplement verb.
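The asymmetric signatures can be sketched as a Python Protocol plus a toy in-memory implementation. The verb names and parameters follow the decision above; the return types, the Task fields, and InMemoryRuntime are assumptions for illustration, and state-transition guards are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Protocol


@dataclass
class Task:
    """Hypothetical minimal Task; field names are assumptions."""

    id: str
    input: str
    state: str = "not_started"
    supplements: List[str] = field(default_factory=list)
    result: Any = None
    failure: Any = None


class Runtime(Protocol):
    # dispatch is the materialisation moment: it alone takes the whole Task.
    def dispatch(self, task: Task) -> None: ...
    def pause(self, task_id: str) -> None: ...
    def resume(self, task_id: str, extra: Optional[str] = None) -> None: ...
    def kill(self, task_id: str) -> None: ...
    def complete(self, task_id: str, result: Any) -> None: ...
    def fail(self, task_id: str, failure: Any) -> None: ...


class InMemoryRuntime:
    """Toy implementation; no transition guards, no concurrency control."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Task] = {}

    def dispatch(self, task: Task) -> None:
        self._tasks[task.id] = task
        task.state = "running"

    def pause(self, task_id: str) -> None:
        self._tasks[task_id].state = "paused"

    def resume(self, task_id: str, extra: Optional[str] = None) -> None:
        task = self._tasks[task_id]
        if extra is not None:
            task.supplements.append(extra)  # append and resume in one call
        task.state = "running"

    def kill(self, task_id: str) -> None:
        self._tasks[task_id].state = "cancelled"

    def complete(self, task_id: str, result: Any) -> None:
        task = self._tasks[task_id]
        task.result, task.state = result, "success"

    def fail(self, task_id: str, failure: Any) -> None:
        task = self._tasks[task_id]
        task.failure, task.state = failure, "failure"


rt: Runtime = InMemoryRuntime()
t = Task(id="t-1", input="draft the release notes")
rt.dispatch(t)
rt.pause("t-1")
rt.resume("t-1", extra="also mention the migration")
rt.complete("t-1", "notes drafted")
```

The supplement path is visible in the usage lines: there is no separate supplement call, only pause followed by resume with extra.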

C.12 — Why concurrency is a contract, not a mechanism

Decision: The Concurrency Contract describes what holds, not how it is implemented.

Rejected alternative: Specifying locks, queues, or specific schedulers in the kernel. We rejected this because mechanism choice is precisely what implementations should be free to vary.

C.13 — Why "the layer above the kernel" appears throughout this book

Decision: When the book must point upward, it uses a generic phrase.

Rejected alternative: Naming a specific higher-layer architecture (e.g. a particular workflow, a particular product). We rejected this because the kernel must be describable without referring to anything above it. Layer 1 is self-contained in how it is written, not just in how it is structured.

C.14 — Why Failure is { code, message, metadata? } with no retriable

Decision: Failure = { code: string, message: string, metadata?: map }. code is impl-defined (suggested namespace <impl>/<code>); both code and message are required. The kernel does not carry a retriable: bool.

Rejected alternative A — a single reason string: forced upstream code to parse prose to make decisions. code separates fact from words.

Rejected alternative B — adding retriable: bool: conflates fact (what failed) with policy (should we retry). Whether 429 rate_limit is retriable depends on the workflow's tolerance, budget, and context — not on the failure itself. Retry policy lives at the layer above as code → action; the Failure stays opinion-free.
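A short sketch of the fact-versus-policy split, assuming Python. The Failure shape follows the decision above; the demo/... codes, the two policy tables, and decide are hypothetical and would live in the layer above the kernel.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class Failure:
    code: str     # impl-defined, suggested form <impl>/<code>
    message: str  # human-readable; never parsed for decisions
    metadata: Optional[Mapping[str, Any]] = None

    def __post_init__(self) -> None:
        if not self.code or not self.message:
            raise ValueError("code and message are both required")


# Above the kernel: two hypothetical workflows map the same code
# to different actions. The Failure itself carries no opinion.
BATCH_POLICY: Dict[str, str] = {
    "demo/rate_limit": "retry",
    "demo/invalid_request": "abort",
}
INTERACTIVE_POLICY: Dict[str, str] = {
    "demo/rate_limit": "abort",
}


def decide(policy: Mapping[str, str], failure: Failure, default: str = "abort") -> str:
    """Look up the action for a failure code; unknown codes get the default."""
    return policy.get(failure.code, default)


f = Failure("demo/rate_limit", "upstream returned 429")
```

The same Failure yields "retry" under one policy and "abort" under the other, which is the reason a retriable flag on the Failure would have to lie to one of them.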

C.15 — Why "re-execution" is not a kernel verb

Decision: Terminal Tasks are forever frozen. To re-execute, the caller constructs a new Task (fresh id) copying desired fields and dispatches it. Lineage is recorded by convention in metadata as <scope>/lineage_of: <original_task_id>.

Rejected alternative A — a reset verb that returns a terminal Task to not_started: would break the Aggregate Root's monotonic state machine, complicate every observer, and force impls to define what "reset" means for partial result, partial supplements, partial metadata. Cloning is unambiguous.

Rejected alternative B — an attempts: [Attempt] array on Task: turns the Aggregate Root into a meta-aggregate of attempts, doubles the schema, and still needs lineage to relate attempts. The clone-and-redispatch approach achieves the same semantics with a single immutable Task per attempt.
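Clone-and-redispatch can be sketched in a few lines. The lineage key follows the <scope>/lineage_of convention above; the Task fields, the acme scope, and the use of uuid4 as a stand-in for id issuance are assumptions for this example.

```python
import uuid
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass(frozen=True)
class Task:
    """Hypothetical minimal Task; field names are assumptions."""

    id: str
    agent: str
    runtime_kind: str
    input: str
    state: str = "not_started"
    metadata: Dict[str, str] = field(default_factory=dict)


def clone_for_rerun(original: Task, scope: str = "acme") -> Task:
    """Build a fresh Task from a terminal one; the original is not touched."""
    lineage_key = f"{scope}/lineage_of"
    return replace(
        original,
        id=str(uuid.uuid4()),  # stand-in for however ids are really issued
        state="not_started",
        metadata={**original.metadata, lineage_key: original.id},
    )


failed = Task(
    id="t-1",
    agent="acme/reviewer",
    runtime_kind="local",
    input="review the change",
    state="failure",
)
retry = clone_for_rerun(failed)
```

Each attempt is its own Task with its own id, and an observer can follow the lineage key backwards without the kernel knowing that attempts exist.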

C.16 — Why the Concurrency Contract is two clauses unfolded into seven rules

Decision: The two top-level clauses (inter-Task parallelism, intra-Task serialisation) remain the headline; precise semantics are captured as seven rules (closed mutator set, same-Task serialise, atomic mutations, MVCC reads, no cross-Task global snapshot, read-your-writes per caller, happens-before per Task) plus three edge cases (failed mutator, no-op kill, atomic timestamps). See Chapter 10.

Rejected alternative A — leaving the contract at two clauses only: left every impl to invent its own answer to "do readers serialise with mutators?", "is there a global snapshot?", "what does kill on a terminal Task do?". The seven rules are the tightest formulation that admits all reasonable impl strategies (per-Task lock, actor, MQ partition, MVCC store) without admitting incompatible observable behaviour.

Rejected alternative B — going further to dictate a mechanism: would couple the kernel to a particular concurrency primitive. Mechanism choice is exactly what impls should be free to vary.
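As one illustration of a mechanism that could satisfy the two headline clauses, here is a per-Task lock sketch. It is only one of the strategies the text lists, and PerTaskSerialiser, mutate, and the counter demo are hypothetical names; it does not attempt the read-side rules (MVCC reads, read-your-writes).

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class PerTaskSerialiser:
    """Mutations on one Task run one at a time; different Tasks do not block each other."""

    def __init__(self) -> None:
        self._guard = threading.Lock()  # protects the lock table only
        self._locks: Dict[str, threading.Lock] = {}

    def _lock_for(self, task_id: str) -> threading.Lock:
        with self._guard:  # held briefly, never while a mutation runs
            if task_id not in self._locks:
                self._locks[task_id] = threading.Lock()
            return self._locks[task_id]

    def mutate(self, task_id: str, fn: Callable[[], T]) -> T:
        with self._lock_for(task_id):
            return fn()


# Demo: two writers per Task, two Tasks, unsynchronised counters underneath.
counters = {"t-1": 0, "t-2": 0}
serialiser = PerTaskSerialiser()


def bump(task_id: str) -> None:
    def step() -> None:
        counters[task_id] += 1

    for _ in range(1000):
        serialiser.mutate(task_id, step)


threads = [
    threading.Thread(target=bump, args=(tid,))
    for tid in ("t-1", "t-1", "t-2", "t-2")
]
for th in threads:
    th.start()
for th in threads:
    th.join()
```

An actor per Task or a partitioned queue would give the same observable behaviour, which is why the contract stops at the rules and leaves the primitive open.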

C.17 — Why concrete impls live in sibling extension libraries

Decision: The AgenticKernel package ships only axioms, invariants, and the Runtime/Query interfaces. Concrete impls live in sibling libraries named AgenticKernel.<substrate> (e.g. AgenticKernel.Copilot, AgenticKernel.OpenAI, AgenticKernel.Claude, AgenticKernel.Local). Each sibling depends on the kernel; the kernel depends on no sibling.

Rejected alternative A — bundling default impls in the kernel package: would couple every consumer to the bundled vendor SDKs and force a versioning treadmill. Sibling packages let consumers depend only on what they use.

Rejected alternative B — putting impls inside a single product package (e.g. SWAT): works for one product but blocks reuse. Sibling libraries make impls available to any consumer of the kernel — a research product, a different SWAT, a benchmark harness.

C.18 — Why the kernel-level observability floor is "pull works"

Decision: The kernel commits only to a pull floor (Chapter 7, Observability): pull always works · transitions are deterministic moments but no notification mechanism is mandated · push is not an axiom · liveness is not the kernel's concern · mid-execution events are not first-class.

Rejected alternative A — committing to push notifications in the kernel: different impls natively support different push transports (websocket, SSE, in-process callback, Erlang process). Mandating one would pollute the kernel surface; mandating an abstract one would push every impl into adapter overhead. Push belongs above.

Rejected alternative B — kernel-level liveness / heartbeat: "is the substrate making progress?" is impl-substrate-specific. The kernel cannot meaningfully define it. Impls write heartbeat hints to metadata by convention; observers interpret as they wish.
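The pull floor is enough to build a simple waiter above the kernel. In this sketch, read_state is a hypothetical callable standing in for whatever the Query interface exposes; the interval and timeout parameters are likewise assumptions.

```python
import time
from typing import Callable

TERMINAL = {"success", "failure", "cancelled"}


def wait_until_terminal(
    read_state: Callable[[str], str],
    task_id: str,
    interval_s: float = 0.05,
    timeout_s: float = 5.0,
) -> str:
    """Poll a Task's state until it is terminal or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = read_state(task_id)
        if state in TERMINAL:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{task_id} still {state!r} after {timeout_s}s")
        time.sleep(interval_s)


# Fake reader for the demo: the Task finishes on the third read.
_states = iter(["running", "running", "success"])
final = wait_until_terminal(lambda _task_id: next(_states), "t-1", interval_s=0.0)
```

A product that wants push would wrap its transport around the same transitions; the waiter keeps working either way because it depends only on pull.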

C.1 — 为什么是"内核"而不是"框架"

决策:Layer 1 的组件是内核——小、中立、只提供最小且稳定的表面。

被拒绝的备选:一个出厂自带编排、角色约定和持久化的框架。我们拒绝它,因为那会把每个产品都绑定到同一套选择上。

C.2 — 为什么恰好是四条公理

决策:Capability、Agent、Task、Runtime——别无他物。

被拒绝的备选:另加一条独立的 "Tool"、"Memory" 或 "Plan" 公理。每一条都没通过第 3 章的"它属于内核吗?"测试。

C.3 — 为什么是两个子上下文(Catalog、Execution)

决策:在 Layer 1 内部,把四个概念分组为 Catalog 与 Execution。

被拒绝的备选:要么不分子上下文(一个扁平列表),要么进一步细分(独立的 Scheduler、Planner、Memory)。两者都不能同时保住"有用"和"主题中立"。

C.4 — 为什么 Capability 只有 name

决策:一个字段。Capability 在结构上对内核完全不透明;name 是它在 kernel 层面的唯一形态(形如 <scope>/<name>)。

被拒绝的备选:在内核里加入 instructions、tools、model、version 或 tags。每一项都会把某种特定的执行风格(LLM、prompt 形态、工具栈)强加给所有产品。这些关注点属于 loader 与内核之上的层。

C.5 — 为什么 Agent 是 name + capabilities[] + metadata?—— 无 id、无 runtime_kind

决策:Agent 是一份静态、scoped 命名的 capability 名字 bundle。Scoped 的 name 就是唯一句柄(不再单独设 id)。runtime_kind 活在 Task 上——不在 Agent 上。

被拒绝的备选 A(在 Agent 上加 id):会发明一组我们不需要的"实例 vs 定义"区分;scoped name 已经承载了身份。

被拒绝的备选 B(runtime_kind 放在 Agent 上):会把每个 Agent 锁死到单一 runtime kind。放在 Task 上则允许同一个 Agent 在迁移、A/B、降级、本地 vs 远程之间跨 kind 执行。

C.6 — 为什么 Task 是唯一的聚合根

决策:只有 Task 有状态演化,因此只有 Task 是聚合根。

被拒绝的备选:把 Agent 或 Runtime 提升为聚合根。我们拒绝这一点,因为两者都没有任何独立状态——把它们提升上来等于发明了一个我们不需要的生命周期。

C.7 — 为什么 Task 有六个状态(不是四个)

决策:not_started、running、paused、success、failure、cancelled。

被拒绝的备选 A —— 四态(无 paused、无 cancelled):会强迫把"取消"编码为某个特殊的 failure code,把"工作失败"和"我们停掉了它"混在一起。也会让"执行中追加补充"无家可归——每个 impl 都得发明自己的临时协议。加上 paused + cancelled 让 supplement 故事变得简单(pause + resume(extra)),让 cancellation 故事变得诚实(终态,但不是 failure)。

被拒绝的备选 B(加更多状态,如 blocked、queued):每一个要么是冗余的(blocked 就是 paused,pause 原因写在 metadata),要么属于内核之上(queued 属于上层 scheduler)。

C.8 — 为什么 Task 不携带 parent / child 引用

决策:Task 在内核层面是扁平的。

被拒绝的备选:在 Task 聚合中编码层级或 DAG 边。我们拒绝它,因为任务图形态恰恰是不同产品差异最大的地方。编码任何一种形态就排除了其他。

C.9 — 为什么 Runtime 是接口而不是具体类型

决策:Runtime 是一个领域服务接口;内核不附带任何实现。

被拒绝的备选:挑一个默认(例如本地 CLI)并随内核出厂。我们拒绝它,因为那个默认会替每个产品决定部署面貌。

C.10 — 为什么 Runtime 是接口,而不是 1:1 的实体

决策:Runtime 是一份契约。内核不把 Runtime 建模成"带自身 id、字段、与 Task 1:1 生命周期"的实体。

被拒绝的备选:把 Runtime 建模为与 Task 1:1 的实体。我们拒绝这一点,因为:(a) 这会重复 Task 上已经反范式化的执行事实;(b) 强迫每一个 impl 都为每个 Task 分配状态结构(把"无状态" / "池化"实现都堵死);(c) 让 kernel 要求过度:内核只需要命名操作。

C.11 — 为什么 Runtime 有六个动词且签名不对称

决策:Runtime 接口有六个动词:dispatch(task)、pause(task_id)、resume(task_id, extra?)、kill(task_id)、complete(task_id, result)、fail(task_id, failure)。dispatch 收整个 Task;其余五个只收 task_id。

被拒绝的备选 A(全对称,都收整个 task):强迫三个控制操作每次重复传递不可变输入。

被拒绝的备选 B(全对称,都收 task_id):强迫在 dispatch 之前多一次"向 runtime 注册 task"调用——多一步,无语义收益。不对称是一种特性:它正好反映了"物化时刻"。

被拒绝的备选 C(保留独立的 supplement 动词,旧设计):会让 supplements 有两条 mutation 路径,并在"已追加"和"已 resume"之间留下竞态窗口。把追加折进 resume(extra) 后,每条 supplement 一次原子 mutation。Supplement 故事现在就是 pause + resume(extra),没有公理级 supplement 动词。

C.12 — 为什么并发是契约而不是机制

决策:并发契约描述"什么成立",而不是"如何实现"。

被拒绝的备选:在内核里指定锁、队列或具体调度器。我们拒绝它,因为机制选择恰恰是各种实现应该可以自由变化的部分。

C.13 — 为什么本书通篇使用"内核之上的层"

决策:当本书必须向上指代时,使用通用短语。

被拒绝的备选:点名某个具体的更高层架构(例如某个 workflow、某个产品)。我们拒绝它,因为内核必须能够在不引用任何上层的前提下被完整描述。Layer 1 的自洽不仅是结构上的,更是书写上的。

C.14 — 为什么 Failure 是 { code, message, metadata? } 且不带 retriable

决策:Failure = { code: string, message: string, metadata?: map }。code 由 impl 自定(建议命名空间 <impl>/<code>);code 和 message 都必填。Kernel 不携带 retriable: bool。

被拒绝的备选 A —— 单一 reason 字符串:逼着上层 parse 散文来做决策。code 把"事实"和"措辞"分开。

被拒绝的备选 B(加 retriable: bool):把"事实"(出了什么错)和"策略"(要不要重试)混在一起。同一个 429 rate_limit 是否该重试,取决于 workflow 的容忍度、预算与上下文,而不取决于失败本身。Retry policy 活在内核之上的层,编码为 code → action;Failure 不带任何意见。

C.15 — 为什么"重跑"不是 kernel 动词

决策:终态 Task 永久冻结。要重跑,调用方构造一个新 Task(新 id,按需拷贝字段)并 dispatch。世系按约定记在 metadata,形如 <scope>/lineage_of: <original_task_id>。

被拒绝的备选 A(提供 reset 动词把终态 Task 拉回 not_started):会破坏聚合根的单调状态机,让每个 observer 都更复杂,并强迫 impl 定义 "reset" 对 partial result / partial supplements / partial metadata 的语义。Clone 没有歧义。

被拒绝的备选 B(在 Task 上加 attempts: [Attempt] 数组):把聚合根变成"尝试的元聚合",schema 翻倍,且仍然需要 lineage 关联各次尝试。Clone-and-redispatch 每次尝试用一个 immutable Task,达到同样的语义。

C.16 — 为什么并发契约是"两条子句展开为七条规则"

决策:顶层两条子句(Task 间并行、Task 内串行)保持为标题;精确语义用七条规则(mutator 集闭合、同 Task mutators 串行、mutation 原子、MVCC 读、无跨 Task 全局 snapshot、同 caller read-your-writes、同 Task 保留 happens-before)+ 三条边界(失败 mutator、no-op kill、原子时间戳)描述。详见第 10 章。

被拒绝的备选 A —— 只保留两条子句:会让每个 impl 自行回答"reader 与 mutator 串行吗?"、"是否有全局 snapshot?"、"kill 在终态 Task 上做什么?"。这七条是允许所有合理 impl 策略(per-Task lock、actor、MQ 分区、MVCC 存储)但又不允许互不兼容的可观察行为的最紧表述。

被拒绝的备选 B —— 进一步规定机制:会把内核绑定到某种并发原语。机制选择恰恰是 impl 应自由变化的部分。

C.17 — 为什么具体实现走同级扩展库

决策:AgenticKernel 包只附带公理、不变量与 Runtime/Query 接口。具体实现走 sibling 库,命名约定 AgenticKernel.<substrate>(如 AgenticKernel.Copilot、AgenticKernel.OpenAI、AgenticKernel.Claude、AgenticKernel.Local)。每个 sibling 依赖 kernel;kernel 不依赖任何 sibling。

被拒绝的备选 A —— 在 kernel 包里捎带默认 impl:把每个消费者都耦合到捎带的 vendor SDK 上,并强加版本升级压力。Sibling 包让消费者只依赖自己用的那个。

被拒绝的备选 B —— 把 impl 放进单一产品包(如 SWAT):对一个产品 OK,但阻断复用。Sibling 库让 impl 可被任何 kernel 消费者使用——研究产品、另一个 SWAT、benchmark harness。

C.18 — 为什么内核级可观测性地板是"pull works"

决策:Kernel 只承诺一个 pull 地板(第 7 章 · 可观测性):pull 永远 work · 转换是确定时刻但不规定通知机制 · push 不是公理 · liveness 不归 kernel · 中间执行事件不是一等概念。

被拒绝的备选 A —— 在 kernel 里承诺 push 通知:不同 impl 原生支持不同 push transport(websocket、SSE、in-process callback、Erlang process)。强加一种会污染 kernel 表面;强加一个抽象的会让每个 impl 背上 adapter 开销。Push 属于上层。

被拒绝的备选 B —— Kernel 级 liveness / 心跳:"substrate 是不是真在前进"是 impl-substrate 特定的。Kernel 无法有意义地定义。Impl 按约定往 metadata 写心跳;observer 自行解释。