
Volume 05

The AI Agent Fabric Catalog

Volume 05 of the Agentic AI Series

20 patterns draft-v0.1 2026-05 Infrastructure

A Catalog of Substrate, Identity, and Sandbox

Draft v0.1

May 2026


About This Catalog

This is the fifth and final volume in a catalog of the working vocabulary of agentic AI. The four prior volumes work from the inside out: how agent runs are structured in time (Patterns), what instructions guide the model (Skills), what primitives the agent invokes (Tools), what makes agents run in the first place (Events and Triggers). This fifth volume completes the picture by looking underneath all of them --- the infrastructure layer on which the entire stack sits.

“Fabric” is the term of art that emerged in 2025–2026 for this layer. The metaphor borrows from textile manufacturing: a fabric is what gets woven from many threads, holding the structure together. An agent fabric weaves five sub-concerns into a single substrate: compute (where agents run), identity (who they are), network (how they talk), registry (how they’re found), and sandbox (how they’re contained). Without a fabric, every agent system invents its own answer to these questions; with one, the answers become reusable across many agents and many products.

The category is young enough that the canonical reference implementations don’t yet have famous names. The AWS Sample Agentic Fabric (“Arbiter”), released in 2026, is the most ambitious reference architecture available as open source: a constitutional substrate with deterministic governance, dynamically fabricated worker agents, and an explicit authority model. Microsoft’s Agent Framework targets the same problem from a different angle. Strata’s Identity Orchestration product treats the fabric as primarily an identity problem (its estimate: 80× more agents than human users within two years). E2B and Modal address the sandbox concern with microVMs; GitHub built a trusted-VM-runner substrate specifically for agentic workflows. The catalog covers all of these, along with the personal-fabric stacks built by individuals (Daniel Miessler’s PAI and Fabric Prompts Framework being the best known).

Scope

Coverage:

  • Cloud-vendor reference architectures for agent fabrics: AWS Sample Agentic Fabric, AWS Sample Agentic Platform, Microsoft Agent Framework, Azure Agentic Fabric App Sample.

  • Sandbox runtimes that contain agent code: E2B (Firecracker microVMs), Modal sandboxes, GitHub Agentic Workflows substrate.

  • Agent identity and federation: Strata Agent Fabric, SPIFFE/SPIRE workload identity, OAuth-for-agents patterns (PKCE, dynamic client registration, RFC 8693 token exchange).

  • Networking layers: service mesh patterns for agents, Pilot Protocol for encrypted agent tunnels.

  • Registries and discovery: the MCP Registry, Smithery, Composio connectors, agent fabric registries.

  • Kubernetes-shaped fabrics: KEDA event-driven autoscaling, AWS Bedrock AgentCore.

  • Personal and modular fabrics: Daniel Miessler’s Personal AI Infrastructure, the Fabric Prompts Framework.

  • Curation hubs: mfornos/awesome-agentic-ai, kyrolabs/awesome-agents.

Out of scope:

  • General cloud-native infrastructure (Kubernetes itself, Terraform, etc.) when used without an agent-specific framing. The agent angle is required.

  • Per-vendor connector libraries treated as one-off integrations. The fabric is the layer that hosts connectors; specific connectors live in the Tools and Events catalogs.

  • Closed enterprise IDPs (Entra, Okta, CyberArk, etc.) as standalone products. They appear here only as components of agent-fabric solutions.

  • Closed agent platforms that bundle the fabric with a product (ChatGPT, Claude.ai, Bedrock-as-product). The substrate is mentioned where relevant but the products themselves don’t get separate entries.

How to read this catalog

Part 1 (“The Narratives”) is conceptual orientation: what fabric means and what it isn’t, the five sub-layers, the cloud-vendor blueprint pattern, the sandboxing tier model, agent identity and federation, and the constitutional-substrate design pattern from the AWS reference. Five diagrams sit in Part 1; everything in Part 2 is text and code.

Part 2 (“The Substrates”) is reference material organized by section. Each section opens with a short essay on what its entries have in common and how they relate to alternatives. Representative substrates are presented in the same Fowler-style template used by the prior four catalogs. The entries are not meant to be read front-to-back; jump in via the table of contents to whatever matches the task at hand.

Part 1 — The Narratives

Five short essays frame the design space for agent fabric. The reference entries in Part 2 assume the vocabulary established here.

Chapter 1. What Fabric Means

“Fabric” in the agent-infrastructure context names the layer underneath orchestration: the substrate of compute, identity, networking, registry, and sandbox concerns that every agent system depends on whether or not it has a name for them. The term gained currency in 2025–2026 alongside the explicit recognition that AI agents are a third class of runtime actor, distinct from humans (handled by the identity fabric) and from applications (handled by the app fabric). Strata Identity’s framing is now the standard: three sibling fabrics, knit together by Identity Orchestration.

Figure: The three fabrics. Identity, App, and Agent fabrics are sibling concerns at the same architectural level. Identity Orchestration glues them together.

The identity fabric is the older and more mature of the three. It abstracts over IDPs (Okta, Entra, Auth0, Keycloak, CyberArk), normalizes assurance levels and authentication mechanisms, and makes login flows portable across the application stack. The app fabric extends the same idea to services and APIs: consistent identity governance, Zero-Trust controls, and access visibility regardless of where the application runs.

The agent fabric is the new entrant. Its job is structurally the same as the other two --- a unifying layer that imposes consistent identity, policy, and observability --- but the runtime actor is now an AI agent, which has properties human and application identities don’t. Agents spin up and down rapidly. They span LLM frameworks, API runtimes, CI/CD pipelines, and multiple clouds simultaneously. They act on behalf of humans or on their own, often without clear consent boundaries. They have OAuth scopes that grow over time. And there will be a lot of them --- Strata’s often-cited estimate is 80× more agents than human users within two years.

The fabric model also operates inside a single deployment, not just across them. Inside one organization, the agent fabric typically has five visible sub-concerns: compute (where the agent runs), identity (how it authenticates), network (how it communicates), registry (how it’s discovered), and sandbox (how it’s contained when it executes untrusted code). The remainder of this chapter walks through those five briefly, anchoring each with the representative technologies.

Figure: The five fabric sub-layers. Compute, Identity, Network, Registry, Sandbox. Most cloud-vendor reference architectures address all five.

Compute is the most visible: containers on Kubernetes, microVMs on Firecracker, serverless functions on Lambda, dedicated agent runtimes like AWS Bedrock AgentCore. Identity is the most contested: who is an agent, in IDP terms? A SPIFFE workload identity that issues short-lived OIDC tokens? A service account in the cloud’s IAM? A first-class user record in the IDP, with audit trails like a human’s? Different platforms answer differently; agreement is emerging that the agent should have a verifiable identity object in the IDP, with scopes and TTL like any other principal.

Network covers how agents reach each other and the services they consume. Service-mesh patterns (Istio, Linkerd, App Mesh) transfer cleanly to inter-agent traffic; specialized layers like the Pilot Protocol issue agents encrypted tunnels and virtual addresses so they can communicate without exposing routable IPs. Registry is the answer to “which agents exist and what can they do”: the MCP Registry for tools, the Strata-style agent fabric registry for identity bindings and scopes, internal capability catalogs for what each agent is authorized to do. Sandbox is the containment story --- covered in detail in Chapter 3.

Chapter 2. The Cloud-Vendor Substrates

The hyperscalers each ship reference architectures for agent fabrics. The architectures look different in their details but converge on the same primitives, which makes a working knowledge of one architecture more transferable than the marketing suggests.

AWS’s Sample Agentic Fabric --- the “Arbiter” repository, released as open source in 2026 --- is the most ambitious of the public references. EventBridge is the inter-agent bus (rebranded in the architecture as the “Neural Weave”); SQS carries async task queues; DynamoDB holds twelve different tables of agent metadata, workflow state, and governance ledger entries; S3 stores the code of dynamically-generated worker agents; Lambda hosts the orchestrator (the Arbiter), the worker wrapper, and the fabricator (the agent that generates new agents). The architecture is wired up with CDK in TypeScript. The model under it is governance-first: every dispatch passes through a deterministic Control-Surface Band that evaluates authority scopes and composition contracts before letting the agent execute. The constitutional substrate pattern is covered in Chapter 5.

Microsoft’s Agent Framework targets the same architectural problem from the .NET/Python side. The repo provides building blocks for orchestrating and safely deploying production multi-agent workflows; it integrates with Azure AI Foundry for hosting and observability, with Entra for identity, with Service Bus for the event substrate, and with Container Apps for compute. The Azure Agentic Fabric App Sample is a step-by-step variant that targets Microsoft Fabric (the analytics platform), wiring up a multi-agent framework that operates against Fabric workloads via dev-container infrastructure.

GCP’s position is comparable but less consolidated in a single sample: Vertex AI Agent Builder provides the agent hosting; Cloud Run jobs and Cloud Tasks handle async work; Pub/Sub is the event substrate; the Workload Identity Pool handles agent identity; Cloud Logging and Cloud Trace handle observability. The reference architectures are documented but less coherent than AWS’s single-repo Arbiter or Microsoft’s Agent Framework.

The pattern across all three: a managed event bus connects agents asynchronously; a queue absorbs back-pressure; a serverless or container runtime hosts the agent code; a NoSQL store holds agent state; an identity provider supplies short-lived credentials; a deterministic policy layer governs access; and an observability stack records everything. The names differ (EventBridge, Service Bus, and Pub/Sub for the event bus; SQS, Service Bus Queues, and Cloud Tasks for the queue; DynamoDB, Cosmos DB, and Firestore for the state store), but the substrate is the same. A team that builds on one can usually port to another with mechanical work rather than architectural rework.
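The cross-cloud correspondence described above can be written down as a small lookup table. This is a minimal sketch: the service names come from the paragraph above, while the dictionary layout and the helper name `port_plan` are illustrative assumptions, not part of any vendor SDK.

```python
# Role-to-service mapping taken from the chapter text. The structure and the
# helper below are hypothetical; they only make the correspondence explicit.
SUBSTRATE: dict[str, dict[str, str]] = {
    "event_bus": {"aws": "EventBridge", "azure": "Service Bus", "gcp": "Pub/Sub"},
    "queue": {"aws": "SQS", "azure": "Service Bus Queues", "gcp": "Cloud Tasks"},
    "state_store": {"aws": "DynamoDB", "azure": "Cosmos DB", "gcp": "Firestore"},
}


def port_plan(source: str, target: str) -> dict[str, tuple[str, str]]:
    """Return, for each role, the (source service, target service) pair to swap."""
    return {role: (names[source], names[target]) for role, names in SUBSTRATE.items()}


if __name__ == "__main__":
    for role, (old, new) in port_plan("aws", "gcp").items():
        print(f"{role}: {old} -> {new}")
```

A table like this is only an inventory of what must be swapped; the semantics of each service (ordering, delivery guarantees, consistency) still need to be checked per role.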

Chapter 3. Sandboxing Tiers

Agents that generate and execute code present a containment problem that pre-agent applications didn’t face at the same scale: the code is created at runtime by a probabilistic process; verification before execution is hard; the blast radius of a bad call ranges from a wasted token to a deleted production database. The dominant operational response is layered sandboxing, with five tiers in regular use.

Figure: Sandboxing tiers. Five tiers, ranging from process-level isolation to dedicated hardware. Match the tier to the trust level.

Tier 1 is no sandbox at all: the agent runs in the same OS process as the surrounding application. Boot time is negligible (microseconds), and there is no isolation between agent and application, because the operating system’s process boundary does not separate code running inside the same process. This tier is appropriate only for fully trusted agents written by the same team that operates the system.

Tier 2 is the container: Linux namespaces and cgroups, packaged by Docker or containerd, scheduled by Kubernetes. The boot time is ~100 milliseconds; the isolation is significantly stronger than process-level but doesn’t survive a kernel-exploit-class attack. Most agent stacks default here: it’s well-understood, the tooling is mature, and the trade-off is acceptable for most workloads.

Tier 3 is the microVM: a lightweight virtual machine via Firecracker, Cloud Hypervisor, or Kata Containers. Boot time is on the order of 125–200 milliseconds (Firecracker advertises startup in as little as 125 ms); the isolation is hypervisor-grade, which is dramatically stronger than container-level. This is where E2B, Modal sandboxes, and Fly Machines live. For untrusted agent code at scale, this is the right answer: nearly container-fast, nearly VM-isolated.

Tier 4 is the trusted-VM-runner: a full hypervisor with attestation, audit logging, and a kernel-enforced communication boundary between the host and the guest. Boot times are ~1–3 seconds. GitHub’s Agentic Workflows substrate falls here: a runner architecture purpose-built to shield enterprise environments from untrusted agent code execution. The trade-off is operational complexity and slower boot in exchange for the strongest software-level isolation.

Tier 5 is dedicated hardware: bare-metal or per-customer host, with physical separation between agents and the hosting infrastructure. Boot times are measured in minutes. The use case is regulated workloads (gov clouds, financial services compliance) where physical separation is a procurement requirement.

The operational discipline is to match the tier to the trust level. Agents written by trusted internal teams running well-tested code can live in Tier 1 or Tier 2. Agents generating code at runtime from a user’s prompt should be in Tier 3 by default; pushing them to Tier 2 invites the kind of incident where a clever prompt injection escalates to host compromise. Tier 4 and Tier 5 are for cases where the regulatory or compliance posture demands them.
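The tier-matching rule above can be expressed as a small decision function. This is a sketch under stated assumptions: the `Workload` fields and the function name `recommended_tier` are hypothetical, and the mapping simply encodes the guidance in this chapter rather than any vendor's policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an agent workload's trust posture."""
    trusted_author: bool                 # written and tested by the operating team
    runtime_generated_code: bool         # code produced at runtime, e.g. from a user prompt
    attestation_required: bool = False   # compliance demands an attested, audited runner
    physical_separation_required: bool = False  # procurement demands dedicated hardware


def recommended_tier(w: Workload) -> int:
    """Map a workload to a sandbox tier following the chapter's guidance."""
    if w.physical_separation_required:
        return 5  # dedicated hardware
    if w.attestation_required:
        return 4  # trusted-VM-runner
    if w.runtime_generated_code or not w.trusted_author:
        return 3  # microVM by default for untrusted or generated code
    return 2      # container; the text also permits Tier 1 for fully trusted agents


if __name__ == "__main__":
    print(recommended_tier(Workload(trusted_author=True, runtime_generated_code=True)))
```

Returning Tier 2 rather than Tier 1 for trusted code is a conservative choice made for this sketch; the chapter allows either.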

Chapter 4. Agent Identity and Federation

The hardest unsolved problem in the agent fabric is identity. Humans authenticate via an IDP; applications authenticate via service accounts or workload identities; agents are an awkward third case. They are too autonomous to share a human’s identity (the audit trail loses meaning); they are too dynamic for static service accounts (an agent that spins up and down thousands of times a day can’t reasonably own a long-lived secret); they routinely cross cloud and IDP boundaries (ChatGPT in Azure calling LangChain in AWS calling CrewAI on-prem). The agent fabric’s job is to give each agent a verifiable identity that travels with it.

Figure: Agent identity federation across clouds. An agent in cloud A authenticates against a service in cloud B via short-lived OIDC tokens and standardized token exchange (RFC 8693).

The emerging answer borrows from workload identity patterns developed for service-to-service authentication. SPIFFE (Secure Production Identity Framework for Everyone) and its reference implementation SPIRE issue short-lived workload identities to agents based on attested platform properties --- the agent’s container, node, namespace, and image hash. The workload identity becomes the agent’s native principal. AWS’s IAM Roles for Service Accounts (IRSA), GKE Workload Identity, and Azure’s Workload Identity Federation provide cloud-native equivalents.
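The attestation step described above can be sketched as selector matching: an identity is issued only if every selector on a registration entry has been attested for the workload. This is a simplified model, not SPIRE's API; the class, the function, the trust domain `example.org`, and the selector strings (written in the style of SPIRE's Kubernetes selectors) are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistrationEntry:
    """Hypothetical registration entry: an identity plus the selectors it requires."""
    spiffe_id: str
    selectors: frozenset  # every selector must be attested for the entry to match


def matching_ids(attested: set, entries: list) -> list:
    """Return the SPIFFE IDs of all entries whose selectors are a subset of `attested`."""
    return sorted(e.spiffe_id for e in entries if e.selectors <= attested)


# Illustrative entries for two agents in the same namespace.
ENTRIES = [
    RegistrationEntry(
        "spiffe://example.org/agent/claims",
        frozenset({"k8s:ns:agents", "k8s:sa:claims-agent"}),
    ),
    RegistrationEntry(
        "spiffe://example.org/agent/payments",
        frozenset({"k8s:ns:agents", "k8s:sa:payments-agent"}),
    ),
]

# Properties the platform attested for one running workload.
ATTESTED = {"k8s:ns:agents", "k8s:sa:claims-agent"}

if __name__ == "__main__":
    print(matching_ids(ATTESTED, ENTRIES))
```

The useful property is that the agent never presents a secret: its identity follows from where and how it runs.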

Cross-cloud federation is typically handled with OAuth 2.0 token exchange (RFC 8693). The agent in cloud A presents its native workload identity to a security token service (STS) in cloud B; the STS exchanges it for a cloud-B-compatible token after policy evaluation; the agent uses that token to call a service in cloud B. The pattern works regardless of which cloud is which: Azure to AWS, AWS to GCP, on-prem to any cloud. The Strata Agent Fabric, Auth0’s agent-identity features, and several emerging products commercialize this pattern with management UIs on top.
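For concreteness, the request an agent sends to the STS in this flow is a form-encoded POST body with the parameters RFC 8693 defines. The sketch below only builds that body; the function name, the audience URL, and the scope value are hypothetical, and a real STS may require additional client authentication that is not shown.

```python
from urllib.parse import parse_qs, urlencode

# Parameter values defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_body(subject_token: str, audience: str, scope: str) -> str:
    """Build the form-encoded body of an RFC 8693 token exchange request."""
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,          # the agent's native workload token
        "subject_token_type": JWT_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,                    # the cloud-B service the agent will call
        "scope": scope,
    }
    return urlencode(params)


if __name__ == "__main__":
    body = build_token_exchange_body(
        subject_token="<workload-jwt>",              # placeholder, not a real token
        audience="https://claims.example.internal",  # hypothetical target service
        scope="claims.read",                         # hypothetical scope
    )
    print(parse_qs(body)["grant_type"][0])
```

The body would be sent with `Content-Type: application/x-www-form-urlencoded` to the STS token endpoint; the response carries the exchanged token and its lifetime.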

Below the federation layer, the agent fabric registry tracks every binding: which agent has which identity, which scopes it holds, when those scopes expire, who delegated them, what audit trail accompanies each issuance. Without the registry, the agent population becomes invisible to security teams --- the “shadow agents” problem Strata describes, where over-permissioned agents accumulate without oversight.
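A registry binding of the kind described above can be modeled as a record plus a review pass that surfaces the bindings a security team would want to look at. This is a minimal sketch: the `Binding` fields mirror the attributes listed in the paragraph, while the field names, the `review_flags` helper, and the flag labels are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Binding:
    """Hypothetical registry record tying an agent to an identity and its scopes."""
    agent_id: str
    identity: str
    scopes: frozenset
    delegated_by: str
    issued_at: datetime
    expires_at: datetime


def review_flags(bindings: list, now: datetime, max_ttl: timedelta) -> dict:
    """Return {agent_id: [reasons]} for bindings that need attention."""
    flags = {}
    for b in bindings:
        reasons = []
        if b.expires_at <= now:
            reasons.append("expired")
        if b.expires_at - b.issued_at > max_ttl:
            reasons.append("ttl_too_long")
        if not b.delegated_by:
            reasons.append("no_delegator")
        if reasons:
            flags[b.agent_id] = reasons
    return flags


NOW = datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc)
BINDINGS = [
    Binding("extractor", "spiffe://example.org/agent/extractor", frozenset({"docs.read"}),
            "alice", NOW - timedelta(minutes=5), NOW + timedelta(minutes=10)),
    Binding("legacy-bot", "svc-legacy", frozenset({"docs.read", "docs.write"}),
            "", NOW - timedelta(days=2), NOW + timedelta(days=28)),
    Binding("summarizer", "spiffe://example.org/agent/summarizer", frozenset({"docs.read"}),
            "bob", NOW - timedelta(minutes=20), NOW - timedelta(minutes=5)),
]

if __name__ == "__main__":
    print(review_flags(BINDINGS, NOW, max_ttl=timedelta(minutes=15)))
```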

Two pragmatic recommendations for teams building today. First, give every agent a verifiable IDP identity from the start, even if it costs operational overhead --- retrofitting identity into a running agent fleet is painful. Second, use short-lived tokens (minutes, not hours) and rotate aggressively; long-lived agent credentials are the agent-era equivalent of the shared-account password problem. The fabric exists in large part to make these practices feasible at scale.

Chapter 5. The Constitutional Substrate

The most sophisticated agent-fabric design pattern to enter the public discourse in 2026 is the constitutional substrate, introduced in the AWS Sample Agentic Fabric (“Arbiter”) reference. The premise is that the existing answers for governing agent behavior --- LLM-based oversight, prompt-engineered guardrails, post-hoc auditing --- are compensatory rather than structural. They try to catch bad actions after the agent decides on them. The constitutional substrate moves the governance into the dispatch path: before the agent executes any action, a deterministic engine evaluates the action against an explicit authority model, and denied actions never run.

Figure: The constitutional substrate. AWS Sample Agentic Fabric: every agent dispatch passes through a deterministic Control-Surface Band that evaluates authority scopes and composition contracts.

The pattern has four governance primitives. Authority Scopes are typed tuples describing what an agent is allowed to do (the action, the target, the conditions). Composition Contracts describe how scopes combine when multiple authorities apply to the same action (the four canonical patterns are scope-based, priority, conjunction, and state-aware with monotonic reduction). Arbitration Patterns resolve conflicts between scopes deterministically. The Legibility Ledger is a write-once audit trail of every governance evaluation, the inputs it considered, and the resolution it produced.

Three architectural commitments make the pattern work. First, the governance engine makes no LLM calls --- it’s pure deterministic Python that evaluates structured tuples. This satisfies the independence requirement: the mechanism governing the agents is architecturally separate from the agents it governs. Second, residual authority defaults to denial --- actions not explicitly permitted are blocked, not allowed. This trades some operational friction for the property that gaps in the authority graph surface as denied actions rather than silent overreach. Third, the governance engine is deployed as a Lambda Layer in each consumer stack rather than as a service, which avoids cross-stack coupling issues and reinforces architectural independence.
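The first two commitments can be illustrated in a few lines: a deterministic check with no model calls, where anything not explicitly granted is denied, and every evaluation is appended to a ledger. This is a sketch of the idea only; the class and function names are hypothetical and the code is not taken from the AWS repository.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """Hypothetical explicit permission: who may do what to which target."""
    agent: str
    action: str
    target: str


@dataclass
class GovernanceEngine:
    grants: frozenset
    ledger: list = field(default_factory=list)  # append-only record of evaluations

    def evaluate(self, agent: str, action: str, target: str) -> bool:
        request = Grant(agent, action, target)
        permitted = request in self.grants  # default deny: unlisted requests fail
        self.ledger.append((request, "permit" if permitted else "deny"))
        return permitted


def dispatch(engine: GovernanceEngine, agent: str, action: str, target: str, run):
    """Run the action only if governance permits it; denied actions never execute."""
    if not engine.evaluate(agent, action, target):
        return None
    return run()


if __name__ == "__main__":
    engine = GovernanceEngine(frozenset({Grant("payer", "payment.authorize", "customer.c-42")}))
    print(dispatch(engine, "payer", "payment.authorize", "customer.c-42", lambda: "paid"))
    print(dispatch(engine, "payer", "payment.refund", "customer.c-42", lambda: "refunded"))
    print(len(engine.ledger))
```

Because the check sits in the dispatch path, a gap in the grant set shows up as a denied action in the ledger rather than as silent overreach.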

The pattern’s most interesting property is its evolutionary mechanism. When the governance engine encounters a conflict it can’t resolve, it escalates to a human; the human’s resolution is encoded as case law in the authority model and applied deterministically on recurrence. Over time, the policy graph grows richer; the rate of escalations falls; the system’s ability to operate without human intervention increases without the corresponding governance loosening. This is the architectural inverse of the typical agent-system trajectory, where the governance starts strict and gets loosened over time as friction accumulates.
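The escalate-then-remember loop described above can be sketched as a precedent table: the first time a conflict appears a human decides, and the same conflict is resolved from the table afterwards. The `CaseLaw` class, the conflict key format, and the `ask_human` callback are illustrative assumptions, not the repository's implementation.

```python
class CaseLaw:
    """Hypothetical precedent store: human rulings applied deterministically on recurrence."""

    def __init__(self, ask_human):
        self._ask_human = ask_human   # callback that returns a human's ruling
        self._precedents = {}         # conflict key -> recorded ruling
        self.escalations = 0          # how many times a human was consulted

    def resolve(self, conflict):
        if conflict not in self._precedents:
            self.escalations += 1
            self._precedents[conflict] = self._ask_human(conflict)
        return self._precedents[conflict]


if __name__ == "__main__":
    # The supervisor's policy here is a stand-in for a real review step.
    law = CaseLaw(ask_human=lambda conflict: "deny")
    key = ("payment.authorize", "amount_over_limit")
    print(law.resolve(key), law.resolve(key), law.escalations)
```

The escalation counter is the quantity the chapter expects to fall over time as the precedent table grows.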

The constitutional substrate is one design pattern, not the only one, but the AWS reference is the most fully-worked-out public example as of mid-2026. Reading the source (src/governance/engine.py, src/governance/models.py, src/supervisor/index.py) is the fastest way to understand what governance-first agent fabric design looks like in practice. Teams designing their own fabric do not need to adopt the full pattern; the four primitives --- Authority Scopes, Composition Contracts, Arbitration Patterns, Legibility Ledger --- transfer to any architecture that wants deterministic governance on every agent dispatch.

Part 2 — The Substrates

Eight sections follow. Each opens with a short essay on what its entries have in common and how they relate to alternatives. Representative substrates are presented in the same Fowler-style template used by the prior four catalogs.

Sections at a glance

  • Section A --- Cloud reference architectures

  • Section B --- Sandbox runtimes

  • Section C --- Agent identity and federation

  • Section D --- Service mesh and networking

  • Section E --- Registries and discovery

  • Section F --- Kubernetes-shaped fabrics

  • Section G --- Personal and modular fabrics

  • Section H --- Curation hubs

Section A — Cloud reference architectures

Open-source reference implementations of agent fabric on AWS, Azure, and Microsoft Fabric

Four references dominate the public discourse as of mid-2026. AWS’s Sample Agentic Fabric (the “Arbiter” repo, MIT-0) is the most ambitious --- a full CDK-deployable governance-first agent fabric demonstrating the constitutional-substrate pattern. AWS’s Sample Agentic Platform is the companion broader-scope monorepo showing multiple containerized compute choices and how to navigate them. Microsoft’s Agent Framework targets the same architectural problem from the .NET and Python side. The Azure Agentic Fabric App Sample is the Microsoft Fabric (analytics platform) integration variant.

The shared property: each is an opinionated wiring of substrate primitives (event bus, async queue, container or serverless compute, identity provider, observability) into a coherent agent platform. Teams building their own fabric do not need to adopt any reference in full; the references are pedagogical artifacts more than products. Read them as designs to learn from, not as code to deploy unmodified.

AWS Sample Agentic Fabric (“Arbiter”)

Source: github.com/aws-samples/sample-agentic-fabric (MIT-0, Python + TypeScript CDK)

Classification: Constitutional substrate reference architecture for AWS.

Intent

Demonstrate a complete governance-first agent fabric on AWS, with deterministic policy evaluation on every dispatch, dynamically-generated worker agents, and explicit authority modeling.

Motivating Problem

Most reference architectures for agent fabrics demonstrate plumbing: how an event bus connects to a Lambda that calls an LLM. They don’t demonstrate governance. The Arbiter is the public exception: a CDK-deployable substrate that wires up event routing, async queues, dynamic agent generation, and deterministic governance into a single coherent design --- with the governance treated as architecturally first-class rather than as a feature bolted onto an existing platform.

How It Works

Three coordinating agent roles run on Lambda: the Arbiter (the supervisor; uses the Bedrock Converse API to select worker agents via an agent-as-tool pattern; dispatches via SQS; tracks fan-out/fan-in completion); Workers (dynamically-generated agents executed by a generic wrapper that downloads code from S3 and injects a governance tool handler); the Fabricator (generates new agent code with the Strands Agents SDK when no existing agent can handle a task, uploads to S3, registers in the Fabric Index).
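The fan-out/fan-in bookkeeping the Arbiter performs can be sketched as a tracker that records which worker tasks were dispatched and which have reported back. This is an illustrative model, not the repository's code; the class and method names are hypothetical, and duplicate completions are tolerated because SQS standard queues deliver at least once.

```python
from dataclasses import dataclass, field


@dataclass
class FanOutTracker:
    """Hypothetical completion tracker for one workflow's worker tasks."""
    expected: set = field(default_factory=set)
    results: dict = field(default_factory=dict)

    def fan_out(self, task_ids) -> None:
        """Register the tasks dispatched to workers."""
        self.expected.update(task_ids)

    def record(self, task_id: str, result) -> bool:
        """Store a worker result; return True once every expected task has reported."""
        if task_id not in self.expected:
            raise KeyError(f"unknown task {task_id!r}")
        self.results.setdefault(task_id, result)  # first result wins; repeats are ignored
        return self.done

    @property
    def done(self) -> bool:
        return bool(self.expected) and len(self.results) == len(self.expected)


if __name__ == "__main__":
    tracker = FanOutTracker()
    tracker.fan_out(["extract", "fraud-check"])
    print(tracker.record("extract", "ok"))
    print(tracker.record("extract", "ok"))       # duplicate delivery
    print(tracker.record("fraud-check", "clear"))
```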

Event routing happens via EventBridge --- renamed in the architecture as the “Neural Weave.” Twelve DynamoDB tables hold agent registry, workflow state, worker state, tool configuration, five governance tables (authority units, composition contracts, case law, constitutional layers, governance ledger), and two memory tables (agent metrics, workflow outcomes). S3 holds versioned agent code. SNS handles governance escalation notifications.

The Control-Surface Band is the governance layer, deployed as a Lambda Layer (not a service) and so architecturally independent from the agents it governs. It evaluates every dispatch against four primitives --- Authority Scopes, Composition Contracts, Arbitration Patterns (scope-based, priority, conjunction, and state-aware with monotonic reduction), and the Legibility Ledger --- with no LLM calls and a default of denial for actions not explicitly permitted. Mandatory constitutional review applies to every permit; two global invariants are seeded by default (no_irreversible_action_without_audit_trail and no_scope_expansion_under_unconfirmed_state).

Governance is in bypass mode (GOVERNANCE_BYPASS=true) on initial deploy so the system runs as a standard Supervisor pattern; flipping the flag activates the full evaluation. Conflicts that the engine can’t resolve escalate to humans; the human resolution is encoded as case law and applied deterministically on recurrence.
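The bypass switch can be pictured as a guard in front of the evaluation call. The sketch below assumes that only the literal value `true` (case-insensitive) enables bypass and that an unset variable means governance is enforced; how the repository actually parses `GOVERNANCE_BYPASS` is not confirmed here, and both function names are hypothetical.

```python
import os


def bypass_enabled(env=None) -> bool:
    """Assumption: only the literal 'true' enables bypass; anything else enforces."""
    env = os.environ if env is None else env
    return env.get("GOVERNANCE_BYPASS", "").strip().lower() == "true"


def dispatch_allowed(evaluate, request, env=None) -> bool:
    """Skip evaluation in bypass mode; otherwise defer to the governance check."""
    if bypass_enabled(env):
        return True
    return bool(evaluate(request))


if __name__ == "__main__":
    deny_all = lambda request: False
    print(dispatch_allowed(deny_all, "payment.authorize", {"GOVERNANCE_BYPASS": "true"}))
    print(dispatch_allowed(deny_all, "payment.authorize", {"GOVERNANCE_BYPASS": "false"}))
```

Treating a missing or malformed flag as "enforce" keeps a configuration mistake from silently disabling governance.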

When to Use It

Teams building agent fabrics on AWS who want a governance-first design. Architects studying the constitutional-substrate pattern as a design reference. Read-and-port even if not deploying the exact stack --- the four governance primitives transfer to any architecture.

Alternatives --- AWS Sample Agentic Platform for a broader-scope reference with multiple compute choices but less governance focus. Microsoft Agent Framework when the target stack is .NET / Azure. Roll-your-own when the governance model needs to differ substantially from the constitutional pattern.

Sources

  • github.com/aws-samples/sample-agentic-fabric

  • Aaron Sempf, “Architecting Autonomy” series (aaronsempf.substack.com)

Example

A claim-processing agent fabric for an insurance back-office. The Arbiter receives claim events via EventBridge and selects from a fleet of specialized worker agents (document extraction, fraud detection, payment authorization). The Fabricator generates a new agent the first time it encounters a claim type that no existing agent handles. The Control-Surface Band denies any payment-authorization action for which the agent does not hold the corresponding Authority Scope; escalations are reviewed by a claims supervisor, and the resolution is encoded as case law. Every dispatch and every governance decision is recorded in the Legibility Ledger for compliance audit.

Example artifacts

Schema / config.

    # Authority Scope (Python dataclass; src/governance/models.py)
    from __future__ import annotations  # ConflictResolution is not shown in this excerpt

    from dataclasses import dataclass
    from datetime import datetime
    from typing import Literal


    @dataclass
    class AuthorityScope:
        action: str        # e.g. "payment.authorize"
        target: str        # e.g. "customer.{customer_id}"
        conditions: dict   # e.g. {"amount_lte": 10000, "region": "us"}
        delegated_by: str
        expires_at: datetime


    # Composition Contract
    @dataclass
    class CompositionContract:
        pattern: Literal["scope_based", "priority", "conjunction", "state_aware"]
        scopes: list[AuthorityScope]
        resolution: ConflictResolution
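To show how a scope shaped like the schema above might be checked against a request, here is a minimal sketch. The matching semantics are assumptions made for illustration: the `{customer_id}` placeholder is filled from the request context, a condition key ending in `_lte` is read as "less than or equal", and any other key is an equality check. The `permits` function is hypothetical and is not the repository's evaluator.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AuthorityScope:
    action: str
    target: str
    conditions: dict
    delegated_by: str
    expires_at: datetime


def permits(scope: AuthorityScope, action: str, target: str, context: dict, now: datetime) -> bool:
    """Return True only if action, target, expiry, and every condition all match."""
    if action != scope.action or now >= scope.expires_at:
        return False
    try:
        expected_target = scope.target.format(**context)  # fill {customer_id} from context
    except (KeyError, IndexError):
        return False
    if target != expected_target:
        return False
    for key, limit in scope.conditions.items():
        if key.endswith("_lte"):                 # assumed suffix convention
            value = context.get(key[: -len("_lte")])
            if value is None or value > limit:
                return False
        elif context.get(key) != limit:          # plain keys are equality checks
            return False
    return True


NOW = datetime(2026, 5, 1, tzinfo=timezone.utc)
SCOPE = AuthorityScope(
    action="payment.authorize",
    target="customer.{customer_id}",
    conditions={"amount_lte": 10000, "region": "us"},
    delegated_by="claims-supervisor",
    expires_at=datetime(2026, 5, 2, tzinfo=timezone.utc),
)

if __name__ == "__main__":
    ctx = {"customer_id": "c-42", "amount": 2500, "region": "us"}
    print(permits(SCOPE, "payment.authorize", "customer.c-42", ctx, NOW))
```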

Setup.

    # Deploy the full stack:
    cd app
    npm install
    npm run build
    ENVIRONMENT=dev npx cdk bootstrap --profile <your-profile>
    ENVIRONMENT=dev npx cdk deploy --all \
      --require-approval never --profile <your-profile>

    # Submit a test task via EventBridge:
    aws events put-events --entries '[{
      "Source": "task.request",
      "DetailType": "System-Task",
      "Detail": "{\"task\": \"Create a greeting agent that says hello\"}",
      "EventBusName": "agentic-fabric-dev"
    }]'
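The same test event can be assembled in Python, which avoids hand-escaping the nested JSON in `Detail`. This sketch only builds the entry; the helper name `build_task_entry` is hypothetical, and the field values are copied from the command above.

```python
import json


def build_task_entry(task: str, bus_name: str = "agentic-fabric-dev") -> dict:
    """Build one EventBridge entry; Detail must be a JSON-encoded string."""
    return {
        "Source": "task.request",
        "DetailType": "System-Task",
        "Detail": json.dumps({"task": task}),
        "EventBusName": bus_name,
    }


ENTRIES = [build_task_entry("Create a greeting agent that says hello")]

if __name__ == "__main__":
    # The printed JSON can be passed to `aws events put-events --entries`.
    print(json.dumps(ENTRIES, indent=2))
```

With boto3 installed and credentials configured, the same list could be passed as `Entries=ENTRIES` to the EventBridge client's `put_events` call.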

AWS Sample Agentic Platform

Source: github.com/aws-samples/sample-agentic-platform (broader-scope companion repo)

Classification: Multi-compute reference monorepo for AWS.

Intent

Demonstrate the operational shape of an agent platform across multiple containerized compute choices --- ECS, EKS, Bedrock AgentCore --- with documentation structures designed for coding agents to navigate.

Motivating Problem

The Arbiter sample focuses tightly on the governance design. A broader question that AWS’s Sample Agentic Platform answers: when you build an agent fabric on AWS, which compute primitive should each component run on? Bedrock AgentCore for the agent runtime itself, ECS for the supporting services, EKS for the heavy-traffic API layer? The repo is structured as a navigable monorepo with explicit structural documentation so that coding agents can read it and propose changes; the design philosophy itself demonstrates a fabric property (agent-readable infrastructure).

How It Works

The repo contains multiple example deployments across AWS compute primitives. Each deployment is independently buildable but shares common platform components: identity (via Cognito or Entra), event bus (EventBridge), telemetry (OpenTelemetry into X-Ray and CloudWatch), and shared infrastructure (VPCs, security groups, IAM roles).

The structural documentation pattern is the distinctive feature. Each directory carries a README that explains its purpose, its dependencies, and how to extend it; the documentation is structured for both human readers and for coding agents (Claude Code, Codex) operating on the codebase. The premise is that a maintainable agent platform must be readable by the agents that operate on it.

When to Use It

Teams choosing among AWS compute primitives for their agent platform. Cases where the question “should this agent run on Lambda, ECS, EKS, or AgentCore” needs a worked example with trade-off discussion. Reference for the agent-readable documentation pattern.

Alternatives --- the Arbiter sample (above) when the question is governance design. Direct cloud-vendor documentation when the question is about a single service.

Sources

  • github.com/aws-samples/sample-agentic-platform

Microsoft Agent Framework

Source: github.com/microsoft/agent-framework (.NET and Python; MIT)

Classification: Multi-agent framework with built-in safe deployment primitives.

Intent

Provide Microsoft’s official building blocks for orchestrating and safely deploying production multi-agent workflows on the Azure stack.

Motivating Problem

Microsoft’s agent product surface has grown organically --- Copilot Studio, Semantic Kernel, AutoGen, Azure AI Foundry --- with each addressing a different slice of the agent problem. The Agent Framework is the unifying SDK: a common abstraction over those product surfaces with explicit support for production deployment concerns (identity, observability, sandboxing) baked into the framework rather than left to the application.

How It Works

The framework provides typed agent and workflow abstractions with native support for both .NET and Python. Agents have role-defined behavior, optional tool access, and pluggable model backends (Azure OpenAI, Bedrock, OpenAI direct, etc.). Workflows orchestrate multi-agent collaborations with deterministic transitions and built-in state persistence.

Production deployment integrations: Entra ID for agent identity (so each agent shows up as a verifiable principal); Application Insights and OpenTelemetry for observability with the GenAI semantic conventions; Azure AI Foundry for hosted execution with sandboxing; Service Bus for event routing; Container Apps for compute. The framework hides the wiring; the developer writes agent and workflow definitions in code.

When to Use It

Teams on the Microsoft enterprise stack (Azure, Entra, M365) who want a vendor-supported framework for production agent workflows. Cases where Semantic Kernel and AutoGen lessons need to be combined under one SDK. Enterprise deployments where identity into Entra and observability into Application Insights are requirements rather than nice-to-haves.

Alternatives --- LangGraph for the cross-cloud open-source equivalent. AutoGen alone when the agent surface is purely conversational. Semantic Kernel when the focus is plugins-and-prompts rather than multi-agent orchestration.

Sources

  • github.com/microsoft/agent-framework

Example

An enterprise document-review agent fabric: a workflow of three agents (extractor, classifier, summarizer) defined with the Agent Framework, each with an Entra-issued identity scoped to specific SharePoint sites; Application Insights traces every step; the workflow runs on Azure Container Apps with auto-scaling driven by Service Bus queue depth. The same workflow runs in a developer sandbox via Azure Container Apps local emulation.

Azure Agentic Fabric App Sample

Source: github.com/Azure-Samples/agentic-app-with-fabric (Microsoft Fabric integration)

Classification Reference architecture for integrating multi-agent frameworks with Microsoft Fabric (analytics platform).

Intent

Demonstrate the dev-container pattern for building agents that operate directly against Microsoft Fabric workloads: lakehouses, semantic models, notebooks, pipelines.

Motivating Problem

Microsoft Fabric is Microsoft’s unified analytics platform (bringing together Power BI, Synapse, and Data Factory capabilities), with lakehouses, semantic models, real-time intelligence, and data engineering on a single surface. Agents that operate against Fabric workloads (“summarize this dashboard,” “investigate this anomaly,” “draft this report”) need authentication into Fabric, access to its APIs, and a runtime that’s allowed to read the data. The Agentic Fabric App Sample wires this up as a dev-container template.

How It Works

The sample is a dev container (.devcontainer/) that pre-configures a Python and .NET environment with the Agent Framework, Fabric REST API clients, and Entra authentication. The included sample agent demonstrates reading from a Fabric lakehouse, querying a semantic model, and producing a structured output. Deployment to Azure (Container Apps, Functions, or VMs) is documented; the dev container is the on-ramp.

The pattern transfers beyond Microsoft Fabric: a dev-container template that pre-wires identity, the relevant SDK, the deployment story, and a sample agent is a reusable shape for any vendor’s agent integration. The Azure sample is the canonical example for Fabric; equivalent dev-containers exist for Databricks, Snowflake, and Salesforce agent integrations.

When to Use It

Organizations using Microsoft Fabric who want to add agents that operate on Fabric data. Teams looking for the dev-container pattern as a reusable agent-onboarding template.

Alternatives --- building from scratch when the Fabric integration needs significant customization. Other vendor agent SDKs (Databricks, Snowflake) when the data platform is different.

Sources

  • github.com/Azure-Samples/agentic-app-with-fabric

Section B — Sandbox runtimes

Where untrusted agent code actually runs --- microVMs, container-based, and trusted-VM-runner substrates

Chapter 3 of Part 1 established the five tiers of sandboxing. This section documents three representative runtimes at the two tiers that matter most for production agents: E2B (Tier 3, Firecracker microVMs, the dominant general-purpose sandbox for AI agents), Modal sandboxes (Tier 3, broader serverless surface with GPU support), and GitHub’s Agentic Workflows substrate (Tier 4, trusted-VM-runners purpose-built for CI-shaped agent workloads). Each is covered briefly here with a cross-reference to the more detailed treatment in the Tools Catalog (Vol 3, Section C).

All three serve the same operational purpose: contain untrusted code so a wrong call costs a sandbox but not the user’s system. The differences are scale (Modal for GPU-heavy work), boot speed (E2B at ~200 ms, GitHub runners at seconds), and trust model (Firecracker hypervisor vs. trusted-runner attestation).

E2B (Firecracker microVMs)

Source: e2b.dev (Apache-2; Python and TypeScript SDKs)

Classification Tier 3 sandbox — microVM runtime for AI agents.

Intent

Provide secure, fast-booting sandboxes for AI-generated code using Firecracker microVMs, suitable for executing untrusted code without compromising the host.

Motivating Problem

For agents that generate and run code, the sandbox is the safety boundary. Process-level isolation is insufficient (the process can be the agent’s host application); container isolation is the default but doesn’t survive kernel-class attacks. E2B provides the right answer for most production agents: Firecracker-based microVMs that boot in roughly 200 milliseconds, give hypervisor-grade isolation, and run a configurable image with Python and JavaScript pre-installed.

How It Works

Sandbox.create() launches a Firecracker microVM in a managed cloud (or in BYOC deployments on the user’s own AWS or GCP). The SDK exposes commands.run for shell commands and run_code for stateful Python or JavaScript with Jupyter-kernel persistence between calls. Filesystem, network (subject to allowlists), and resource limits are all configurable. State persists within a sandbox; sandboxes are reused for performance.
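The stateful behaviour of run_code (variables defined in one call are visible in the next) is easy to misread, so a toy illustration may help. The sketch below is not the E2B SDK and provides no isolation whatsoever: `LocalStatefulSession` is a hypothetical local stand-in that runs code in the current process, and exists only to show the persistence semantics.

```python
import contextlib
import io


class LocalStatefulSession:
    """Toy stand-in for a stateful code-execution session.

    WARNING: this uses exec() in the current process and isolates nothing.
    It only illustrates how state carries over between run_code calls.
    """

    def __init__(self) -> None:
        self._namespace: dict = {}  # survives across run_code calls

    def run_code(self, source: str) -> str:
        buffer = io.StringIO()
        # Capture anything the snippet prints so it can be returned to the caller.
        with contextlib.redirect_stdout(buffer):
            exec(source, self._namespace)
        return buffer.getvalue()


session = LocalStatefulSession()
session.run_code("total = 40")                         # first call defines a variable
output = session.run_code("total += 2\nprint(total)")  # second call still sees it
```

In a real deployment the same two calls would go to a session running inside the microVM, which is what makes it acceptable to run untrusted snippets this way.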

Beyond the core sandbox, E2B Desktop provides a graphical Linux desktop reachable from Anthropic’s Computer Use or OpenAI’s Surf agents --- the same Tier 3 isolation, now with a screen. Fragments is the open-source template for building Claude-Artifacts-style code-generation apps on top.

When to Use It

Any production agent that executes user-provided or LLM-generated code. Computer-use agents that need an isolated desktop. Anywhere the sandbox is the security boundary, not a performance optimization.

Alternatives --- Modal sandboxes for the same isolation with GPUs. Anthropic’s code_execution tool when running inside the Claude API. GitHub Agentic Workflows substrate (below) for the trusted-VM-runner pattern in CI workloads. Covered in more detail in Tools Catalog Section C.

Sources

  • e2b.dev

  • github.com/e2b-dev/E2B

GitHub Agentic Workflows substrate

Source: github.blog Under the hood: security architecture of GitHub Agentic Workflows (2026)

Classification Tier 4 sandbox — trusted-VM-runner for CI-shaped agent workloads.

Intent

Shield enterprise environments from untrusted agent code execution by running agents in attested, kernel-enforced VM runners with hardened communication boundaries.

Motivating Problem

GitHub’s product offering of agentic workflows --- letting customer agents run inside the GitHub-hosted CI environment with access to repositories, secrets, and the GitHub API --- has a security profile dramatically different from a general E2B sandbox. The agent runs against the customer’s real code, with the customer’s real credentials, in an environment connected to the customer’s production deployment pipeline. The blast radius of a successful attack is everything the customer’s CI can touch. GitHub’s substrate addresses this with attested VM runners and kernel-enforced isolation between the agent and the host.

How It Works

Each agentic-workflow run launches a fresh VM with attestation (the runner proves it’s the genuine GitHub-published image, not a tampered one). Communication between the agent inside the VM and the host runner is constrained by an explicit interface; the agent cannot make arbitrary system calls. Network egress is filtered to an allowlist; secrets injection is scoped to the specific workflow step; the workspace is wiped after every run.
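The egress-allowlist idea can be sketched in a few lines. This is an illustrative check only, not GitHub’s implementation: the allowlist entries and the `egress_allowed` helper are hypothetical, and a real substrate enforces the rule at the network layer rather than in agent-visible code.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; an entry starting with "*." matches any subdomain.
EGRESS_ALLOWLIST = ("api.github.com", "*.githubusercontent.com")


def egress_allowed(url: str, allowlist=EGRESS_ALLOWLIST) -> bool:
    """Return True only if the URL's host matches an allowlist entry exactly
    or is a subdomain of a wildcard entry."""
    host = (urlsplit(url).hostname or "").lower()
    for entry in allowlist:
        if entry.startswith("*."):
            # entry[1:] keeps the leading dot, so "evil-githubusercontent.com"
            # and "githubusercontent.com.evil.io" do not match.
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False
```

Matching on the parsed hostname rather than on the raw URL string is the design point: substring checks are the usual way such filters get bypassed.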

The architectural deep-dive (the GitHub blog post in 2026) documents the isolation design patterns: kernel-enforced communication boundaries, attestation hooks, secrets-handling that prevents agent observation, audit logging at the host level. The patterns are educational for any team building Tier 4 sandboxing themselves, even if they’re not using GitHub’s product.

When to Use It

Customers using GitHub-hosted CI who want to add agentic workflows with appropriate safety properties. Teams architecting their own Tier 4 sandbox solution --- read the design as a reference.

Alternatives --- E2B or Modal for the lower-tier microVM cases. Self-built trusted-VM-runners on hyperscaler infrastructure (Nitro Enclaves on AWS, Confidential VMs on GCP, Confidential Computing on Azure) when the agent needs to run on the customer’s own infrastructure with comparable isolation guarantees.

Sources

  • github.blog Under the hood: security architecture of GitHub Agentic Workflows

Modal sandboxes

Source: modal.com/docs/guide/sandbox

Classification Tier 3 sandbox with GPU support.

Intent

Provide AI-agent sandboxes with the same isolation tier as E2B but with the full Modal compute surface --- GPUs, custom images, persistent volumes, scheduled functions.

Motivating Problem

E2B handles general-purpose code execution well. Workloads that need GPUs (training a small model, running CUDA kernels, fine-tuning) outgrow E2B’s headline product surface. Modal’s sandbox primitives provide the same isolation tier with GPU access from T4 through H100 and the broader Modal serverless surface for everything else.

How It Works

Detailed in Tools Catalog Section C. Sandbox.create() launches an ephemeral container with file-system snapshots; a sandbox’s lifetime is set by a configurable timeout (up to 24 hours per Modal’s documentation); the agent calls the sandbox like a local Python function while Modal handles container lifecycle.

When to Use It

GPU-heavy or memory-heavy AI agent workloads. Teams already using Modal for non-AI workloads. See Tools Catalog Section C for fuller treatment.

Sources

  • modal.com/docs/guide/sandbox

Section C — Agent identity and federation

How agents authenticate, get authorized, and have their identity travel across clouds

Chapter 4 of Part 1 framed the agent identity problem. This section documents three representative substrates: Strata’s Agent Fabric product (the commercial Identity Orchestration angle), SPIFFE/SPIRE (the open-source workload-identity standard that’s become the de facto answer for agents inside the data center), and OAuth-for-agents patterns (PKCE, dynamic client registration, RFC 8693 token exchange) for the cross-cloud federation case.

The category is moving fast. None of these substrates is fully settled; expect significant evolution over 2026–2027. What is stable across all the contenders is the pattern: give every agent a verifiable identity object, short-lived tokens, and a registry-backed audit trail.

Strata Agent Fabric (Identity Orchestration)

Source: strata.io/maverics-platform/identity-orchestration-for-ai-agents/ (commercial product)

Classification Commercial agent fabric identity control plane.

Intent

Provide an identity security control plane purpose-built for AI agents: discovery, registry, OAuth scope auditing, IDP binding, federated trust across clouds and platforms.

Motivating Problem

AI agents in an enterprise span many platforms: ChatGPT in Azure, LangChain in AWS, CrewAI on-premises, agents in GitHub Actions. Each has its own identity model; none of them surface to the security team as a coherent population. The Strata Agent Fabric makes the agent population visible: every agent gets discovered as it spins up, registered with its identity binding and scopes, categorized by risk, and audited like a human user.

How It Works

The Strata platform sits alongside the customer’s identity fabric (covering humans and IDPs) and app fabric (covering applications), connected through Identity Orchestration. Agents are discovered programmatically across LLM frameworks, API runtimes, CI/CD environments, and cloud-native services. Each discovered agent gets a registry entry tracking: agent identity in the IDP (Entra, Okta, CyberArk, Descope, Transmit, Auth0, Keycloak), scopes and permissions, intent and function, TTL and revocation, audit trails, risk level.

Federated trust is enforced via Identity Orchestration: when an agent in one cloud authenticates against a service in another, the orchestration layer evaluates the request against the agent’s scopes, the target service’s policy, and the current risk profile. Denied requests don’t reach the target service. Approved requests are logged with the full identity chain.
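The three-way evaluation described above (agent scopes, target policy, current risk) can be sketched as a small decision function. This is a hedged illustration, not Strata’s product logic: the `AccessRequest` fields, the numeric risk scale, and the `SERVICE_MAX_RISK` table are all assumptions, and the sketch logs denials as well as approvals for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    agent_id: str
    agent_scopes: frozenset
    target_service: str
    required_scope: str
    risk_score: float  # 0.0 (benign) to 1.0 (high risk); the scale is an assumption


# Hypothetical per-service policy: the highest risk score each target accepts.
SERVICE_MAX_RISK = {"billing-api": 0.3, "docs-search": 0.8}


def evaluate(request: AccessRequest, audit_log: list) -> bool:
    """Allow only if target policy, scope, and risk all pass; log the decision."""
    max_risk = SERVICE_MAX_RISK.get(request.target_service)
    if max_risk is None:
        allowed, reason = False, "unknown target service"
    elif request.required_scope not in request.agent_scopes:
        allowed, reason = False, "missing scope"
    elif request.risk_score > max_risk:
        allowed, reason = False, "risk above target threshold"
    else:
        allowed, reason = True, "ok"
    audit_log.append({
        "agent": request.agent_id,
        "target": request.target_service,
        "allowed": allowed,
        "reason": reason,
    })
    return allowed
```

The caller forwards the request to the target service only when `evaluate` returns True, which is how a denied request never reaches the target.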

Strata’s framing of the three fabrics (identity / app / agent) is now widely cited as the canonical architecture for enterprise agent identity. The product is the commercialization of that framing.

When to Use It

Enterprises operating agents across multiple clouds and IDPs who need unified identity governance and audit. Cases where the security and compliance teams need a single source of truth about which agents exist and what they’re permitted to do. Industries with strict identity requirements (financial services, healthcare, government).

Alternatives --- cloud-vendor IDPs alone (Entra, Okta) when the agent population fits inside one platform; building your own registry on top of SPIFFE/SPIRE when the team prefers open-source self-hosting. The decision is mostly whether the commercial control plane is worth the cost relative to building.

Sources

  • strata.io/maverics-platform/identity-orchestration-for-ai-agents/

  • strata.io/blog/agentic-identity/agent-fabrics-registries-central-2b/

SPIFFE / SPIRE workload identity

Source: spiffe.io (CNCF graduated project; Apache-2)

Classification Workload identity standard and reference implementation.

Intent

Issue cryptographically-attested, short-lived identities to workloads (containers, VMs, agents) based on attested platform properties, enabling identity-based authentication without long-lived secrets.

Motivating Problem

Long-lived secrets are the enemy of secure agent deployment: they leak, they don’t rotate, they accumulate scope over time. SPIFFE’s answer is to bind identity to the platform: the agent’s container, node, namespace, and image hash become the inputs to identity issuance. SPIRE (the SPIFFE reference implementation) attests these properties at workload start and issues a short-lived SVID (SPIFFE Verifiable Identity Document) that the agent uses to authenticate. No secret to leak; the identity is recomputed every time.

How It Works

A SPIRE server runs in the cluster, holding the trust domain and the issuance policy. SPIRE agents on each node attest local workloads against the server’s registration entries (which describe how to identify a legitimate workload --- e.g., “running as UID 1000 in namespace foo with image hash X”). Attested workloads receive SVIDs (X.509 certificates or JWTs); these are short-lived (minutes to an hour) and auto-rotated.
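Matching a workload against registration entries can be pictured as a subset test on selectors. The sketch below is a simplified model, not SPIRE’s code or API: `RegistrationEntry`, `attest`, the example trust domain, and the selector strings (which only loosely follow SPIRE’s type:key:value style) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistrationEntry:
    spiffe_id: str
    selectors: frozenset      # every selector must be observed on the workload
    ttl_seconds: int = 3600   # short-lived by design; value is illustrative


# Hypothetical entry mirroring "UID 1000 in namespace foo with image hash X".
ENTRIES = [
    RegistrationEntry(
        spiffe_id="spiffe://example.org/ns/foo/agent/summarizer",
        selectors=frozenset({"unix:uid:1000", "k8s:ns:foo", "docker:image_id:X"}),
    ),
]


def attest(workload_selectors: set, entries=ENTRIES) -> Optional[RegistrationEntry]:
    """Return the first entry whose selectors are all present on the workload,
    or None if the workload matches no entry and gets no identity."""
    for entry in entries:
        if entry.selectors <= workload_selectors:
            return entry
    return None
```

A workload may carry more selectors than an entry requires; it only fails attestation when a required selector is missing, which is why a changed image hash or namespace yields no SVID.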

The pattern works across clouds via SPIFFE federation: trust domains in different clouds can be configured to trust each other’s SVIDs, allowing an agent in one cloud to authenticate against a service in another without going through a central IDP. The major cloud workload-identity products (AWS IRSA, GCP Workload Identity, Azure Workload Identity Federation) interoperate with SPIFFE to varying degrees, typically through OIDC federation of JWT-SVIDs rather than native SPIFFE support.

When to Use It

Self-hosted or on-prem agent fabrics where the team owns the identity issuance. Kubernetes-shaped deployments where every pod can be a SPIFFE workload. Cases where eliminating long-lived secrets is a top operational requirement.

Alternatives --- cloud-vendor workload identity products (AWS IRSA, GCP Workload Identity, Azure Workload Identity Federation) when running on a single cloud and the team doesn’t need cross-cloud portability. Strata Agent Fabric when the requirement is identity governance with a UI rather than identity issuance.

Sources

  • spiffe.io

  • github.com/spiffe/spire

OAuth-for-agents patterns (PKCE, DCR, RFC 8693)

Source: OAuth 2.1, RFC 7636 (PKCE), RFC 7591 (Dynamic Client Registration), RFC 8693 (Token Exchange)

Classification Standards-based authentication patterns for agents.

Intent

Use existing OAuth 2.x standards --- specifically PKCE, Dynamic Client Registration, and Token Exchange --- to handle the agent’s authentication and authorization story without inventing new protocols.

Motivating Problem

Many agent platforms have re-invented authentication for agents from scratch, producing one-off integrations per agent / per service. OAuth 2.x already has the primitives the agent case needs: PKCE (RFC 7636) for the public-client case where a long-lived secret is impractical; Dynamic Client Registration (RFC 7591) for agents that spin up at runtime and need to register with an authorization server; Token Exchange (RFC 8693) for the cross-cloud federation case where a token in one trust domain becomes a token in another. The pattern is to use these in combination, not to invent something new.

How It Works

PKCE replaces the static client-secret with a per-request proof of possession (code_verifier and code_challenge), so an agent without a stable secret can still complete the OAuth dance. Dynamic Client Registration lets a fabric programmatically register new agents with an authorization server, getting back a client_id without manual provisioning. Token Exchange takes a token issued in one domain (an agent’s SPIFFE SVID, say) and returns a token valid in another domain after policy evaluation.

The pattern in production: an agent gets a SPIFFE SVID from SPIRE on startup; uses Dynamic Client Registration with an authorization server (Entra, Okta, Auth0, Keycloak) to register itself; performs a PKCE flow to obtain access tokens for downstream services; uses Token Exchange when reaching across clouds. The OAuth-Protected MCP Server pattern (e.g., the AWS Bedrock AgentCore MCP integration referenced by Strata) is the canonical example: short-lived JWTs (24-second TTL), scoped to specific MCP operations, exchanged via RFC 8693 from the agent’s native identity.
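The registration and exchange steps in that flow reduce to two well-defined payloads. The sketch below builds them with the standard library only and sends nothing over the network; the parameter names come from RFC 7591 and RFC 8693, while the helper names, the agent name, and the example URLs and scope are assumptions.

```python
from urllib.parse import parse_qs, urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def registration_metadata(agent_name: str, redirect_uri: str) -> dict:
    """RFC 7591 client metadata for a public client (PKCE, no client secret)."""
    return {
        "client_name": agent_name,
        "redirect_uris": [redirect_uri],
        "grant_types": ["authorization_code"],
        "response_types": ["code"],
        "token_endpoint_auth_method": "none",
    }


def token_exchange_body(subject_jwt: str, audience: str, scope: str) -> str:
    """Form-encoded body for an RFC 8693 token-exchange request.

    subject_jwt is the agent's existing credential (a JWT-SVID, for example).
    """
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_jwt,
        "subject_token_type": JWT_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,
        "scope": scope,
    })
```

The metadata dict would be posted as JSON to the authorization server’s registration endpoint, and the form body to its token endpoint; both endpoints are server-specific and are deliberately left out of the sketch.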

When to Use It

Anywhere standards-based agent authentication is preferred over vendor-specific schemes. The default for agent-to-MCP-server authentication. Any cross-cloud federation case where Token Exchange is the right primitive.

Alternatives --- mTLS-only for service-mesh-internal traffic where the cert is the identity. Vendor-specific IAM (AWS Sig V4, Azure AD JWT) when staying inside one cloud and the standards add unnecessary complexity.

Sources

  • datatracker.ietf.org/doc/html/rfc7636 (PKCE)

  • datatracker.ietf.org/doc/html/rfc7591 (Dynamic Client Registration)

  • datatracker.ietf.org/doc/html/rfc8693 (Token Exchange)

Section D — Service mesh and networking

How agents reach each other and the services they consume

The networking layer is the least-developed of the five fabric sub-concerns as of mid-2026. Two patterns are emerging: traditional service-mesh products (Istio, Linkerd, AWS App Mesh) applied to inter-agent traffic, which transfers cleanly but doesn’t add agent-specific features; and purpose-built agent networking layers like Pilot Protocol that issue agents encrypted tunnels and virtual addresses for communication that doesn’t expose routable IPs. The category will likely consolidate over the next 18 months.

Service mesh for agents (Istio / Linkerd / App Mesh)

Source: istio.io, linkerd.io, aws.amazon.com/app-mesh

Classification General service mesh applied to inter-agent traffic.

Intent

Use a standard service mesh (Istio, Linkerd, App Mesh, Consul Connect) to provide mTLS, traffic management, and observability for agent-to-agent and agent-to-service communication.

Motivating Problem

For agent fabrics on Kubernetes, the network sub-concern is mostly solved by adopting a service mesh that’s already running for the rest of the platform. mTLS between agents and services, traffic shifting for canary deploys of new agent versions, fine-grained authorization policies tied to workload identity --- the same patterns the service mesh provides for microservice traffic transfer directly to agent traffic.

How It Works

A sidecar proxy (Envoy in Istio, the Linkerd2-proxy in Linkerd) is injected into each agent pod and handles mTLS, observability, and policy enforcement. The agent code remains unaware of the mesh; identity, encryption, and routing are operational concerns of the mesh layer. SPIFFE-compatible meshes (Istio, Consul Connect) integrate cleanly with the workload-identity story from Section C.

Specific agent-adjacent features are starting to appear: AuthorizationPolicies tied to OAuth scopes the agent holds; per-agent rate limiting; tail-sampled traces that capture the full inter-agent call chain. These are mostly natural extensions of existing mesh features, not separate products.

When to Use It

Kubernetes-shaped agent fabrics where a service mesh is already running. Teams that want mTLS between agents without writing the certificate-management code themselves. Multi-cluster agent deployments where the mesh handles cross-cluster traffic.

Alternatives --- cloud-vendor mesh products (App Mesh, GCP Anthos Service Mesh) when on a single cloud. mTLS hand-rolled into each agent when the operational tax of a service mesh isn’t justified.

Sources

  • istio.io

  • linkerd.io

Pilot Protocol (purpose-built agent networking)

Source: Referenced in kyrolabs/awesome-agents as an agent networking standard

Classification Agent-specific network layer.

Intent

Issue agents encrypted tunnels and virtual addresses so they can communicate across networks and clouds without exposing routable IPs or requiring traditional service-discovery infrastructure.

Motivating Problem

Agents that run on personal devices, behind NATs, or across multiple clouds face the same problem peer-to-peer applications have always faced: how do two agents talk to each other when neither is directly reachable. Service mesh products assume Kubernetes; cloud-vendor mesh products assume one cloud. The Pilot Protocol and similar emerging standards target the heterogeneous case: an agent on a developer’s laptop talking to an agent in a partner organization’s cloud, with neither side needing to expose a public IP.

How It Works

Each agent registers with the Pilot Protocol relay infrastructure and receives a virtual address (analogous to a Tailscale or Twingate address). Connections between agents flow through the relay with end-to-end encryption; the relay never sees plaintext. The pattern is essentially WireGuard-style mesh networking adapted for the agent-population shape.

The protocol is one of several emerging in this space; the standard hasn’t consolidated as of mid-2026. The pattern --- give every agent a virtual address, route through an encrypted overlay, expose nothing on the public internet --- will likely become a standard fabric primitive within a year or two.

When to Use It

Heterogeneous agent deployments where agents span personal devices, multiple clouds, and partner organizations. Privacy-sensitive workloads where exposing agent IPs is unacceptable. Early adopters who can tolerate moving standards.

Alternatives --- Tailscale or Twingate for the same shape without agent-specific framing. VPN-and-firewall for the classic enterprise pattern. Wait-and-see when the use case isn’t pressing --- the category will consolidate.

Sources

  • github.com/kyrolabs/awesome-agents

Section E — Registries and discovery

How agents find each other, find their tools, and find what they’re authorized to do

The registry is the fabric’s answer to discovery. Three kinds matter operationally: tool registries (the MCP Registry and its alternatives) which tell agents what tools exist, agent registries (the Strata-style agent fabric registry from Chapter 4) which tell the platform what agents exist, and capability catalogs which tell governance what each agent is authorized to do. The first two are covered in detail in the prior volumes; the third is emerging.

MCP Registry (cross-reference)

Source: registry.modelcontextprotocol.io

Classification Tool registry maintained by the MCP community.

Intent

Index verified, production-grade MCP servers and tools so agents can discover capabilities at runtime.

Motivating Problem

Covered in detail in Tools Catalog Appendix D. The relevant fabric-level point: the MCP Registry is the canonical example of a public agent-facing registry. Other categories (agent identity registries, capability catalogs) are evolving to match similar patterns --- a public hub indexes the population, vendors publish their offerings, and clients (agents) consume the index at runtime to discover what’s available.

How It Works

Detailed in Tools Catalog. The fabric-level question is whether your agent platform connects to a public registry, hosts a private one, or both. Most production deployments use both: a public registry for community-published tools and a private registry for internal tools and policy.

When to Use It

Discovery of MCP servers and tools. See Tools Catalog Appendix D for the full discussion.

Sources

  • registry.modelcontextprotocol.io

Agent fabric registry (private)

Source: Pattern from Strata and from the AWS Arbiter Fabric Index

Classification Private agent registry for identity, scope, and capability tracking.

Intent

Maintain a source of truth for which agents exist in the deployment, their identity bindings, scopes, intent, TTL, audit trail, and risk level.

Motivating Problem

Public tool registries solve the discovery half. The other half is the agent population itself: who are the agents running right now, who do they act on behalf of, what scopes do they hold, when do their tokens expire, what have they done. Without this, the agent population is invisible to security and operations. With it, the agent fabric becomes auditable and governable.

How It Works

The implementation depends on the platform. The Strata Agent Fabric provides this as a commercial product with a UI. The AWS Arbiter Fabric Index is a DynamoDB table (the agent_register table) tracking identity bindings, scopes, and capability metadata, with the Control-Surface Band reading from it on every dispatch. Roll-your-own options exist on top of any datastore.

The shape across implementations: a per-agent record with identity (bound to IDP), scopes (current OAuth grants), intent (“purchase assistant”, “build bot”), TTL (when the agent’s authorization expires), audit trail (recent actions), risk (computed from behavior). The registry is consulted by governance, by observability, by compliance reporting; it’s a hub artifact.
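The per-agent record translates directly into a data structure. The following is a minimal in-memory sketch, assuming field names derived from the list above; it is not the Strata schema or the Arbiter table layout, and a production registry would sit on a durable datastore.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class AgentRecord:
    agent_id: str
    idp_subject: str          # identity as bound in the IDP
    scopes: set               # current OAuth grants
    intent: str               # e.g. "purchase assistant", "build bot"
    issued_at: datetime
    ttl: timedelta            # how long the authorization lasts
    risk: str = "low"         # would be computed from behavior in practice
    audit_trail: list = field(default_factory=list)

    @property
    def expires_at(self) -> datetime:
        return self.issued_at + self.ttl

    def is_active(self, now: datetime) -> bool:
        return now < self.expires_at

    def record_action(self, action: str, now: datetime) -> None:
        self.audit_trail.append((now.isoformat(), action))


class AgentRegistry:
    """Hypothetical in-memory registry keyed by agent_id."""

    def __init__(self) -> None:
        self._records: dict = {}

    def register(self, record: AgentRecord) -> None:
        self._records[record.agent_id] = record

    def active_agents(self, now: datetime) -> list:
        return sorted(r.agent_id for r in self._records.values() if r.is_active(now))
```

Governance, observability, and compliance reporting would all read from the same records, which is what makes the registry a hub artifact rather than a per-team list.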

When to Use It

Any agent fabric serious enough to need governance and audit. Required for regulated workloads. Increasingly required for non-regulated workloads as agent populations grow past dozens.

Alternatives --- putting up with invisibility for prototypes and pilots. The registry is the foundational compliance artifact for production deployments.

Sources

  • strata.io/blog/agentic-identity/agent-fabrics-registries-central-2b/

  • github.com/aws-samples/sample-agentic-fabric (see agent_register table)

Section F — Kubernetes-shaped fabrics

Agent fabrics that adopt Kubernetes as the substrate --- KEDA, Bedrock AgentCore

Agent fabrics built on Kubernetes inherit most of its operational story: networking via service mesh, identity via workload identity, observability via the same OpenTelemetry stack as the rest of the platform. The agent-specific additions are mostly around event-driven scaling (so an agent population can grow to handle bursts without overprovisioning) and around agent-as-pod patterns (each agent is a pod with the right policies, identity, and sandbox). Two representative substrates: KEDA for event-driven autoscaling and AWS Bedrock AgentCore as the managed agent-runtime offering.

KEDA (Kubernetes Event-driven Autoscaling)

Source: keda.sh (CNCF graduated; Apache-2)

Classification Kubernetes scaler driven by external events.

Intent

Scale Kubernetes deployments up and down in response to external event sources (queue depth, message backlog, HTTP request rate) rather than CPU/memory metrics alone.

Motivating Problem

Agent fabrics on Kubernetes face a load shape that traditional HPA (Horizontal Pod Autoscaler) handles poorly: bursts of work arrive as events on a queue or stream, the pod count needs to follow the queue depth rather than CPU, and idle periods should scale to zero. KEDA solves this with a scaler-per-event-source model: an SQS scaler watches an SQS queue, a Kafka scaler watches a Kafka topic, a Cron scaler triggers on schedule, and so on. Pod counts follow the event source’s metric.

How It Works

A ScaledObject resource binds a Kubernetes deployment to one or more scalers. Each scaler watches an external metric source and reports the current value; KEDA scales the deployment according to the desired-pods-per-metric-value relationship. Scale-to-zero is supported (the deployment scales to zero replicas when there’s no work, then back up when new events arrive). Over 70 built-in scalers cover most cloud and on-prem event sources.

For agent fabrics: the trigger that activates an agent (from Vol 4) is often the same event source that should drive its scaling. An SQS-triggered worker agent should have its pod count follow the SQS queue depth, not its CPU. KEDA makes this declarative.
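The relationship between queue depth and pod count is simple proportional arithmetic. The sketch below is a simplified model of that relationship, not KEDA’s code: the function name, the per-pod target, and the replica bounds are assumptions, and real behaviour also depends on polling intervals, cooldown periods, and stabilization windows.

```python
import math


def desired_replicas(queue_depth: int, target_per_pod: int,
                     min_replicas: int = 0, max_replicas: int = 20) -> int:
    """Replicas needed so each pod handles at most target_per_pod queued messages."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    if queue_depth <= 0:
        # No backlog: fall to the floor, which is zero when scale-to-zero is enabled.
        return min_replicas
    wanted = math.ceil(queue_depth / target_per_pod)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))
```

With a target of 5 messages per pod, a backlog of 23 messages asks for 5 pods, and an empty queue asks for none; a CPU-based autoscaler has no equivalent signal while the pods are idle or absent.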

When to Use It

Kubernetes-based agent fabrics with event-driven workloads. Cases where the agent pool size should follow the work backlog. Scale-to-zero workloads (cost-sensitive deployments).

Alternatives --- HPA when the metric is CPU/memory-shaped. Cloud-specific autoscalers (AWS Application Auto Scaling, Azure Service Bus auto-scale) when not on Kubernetes. Manual sizing when the load is predictable.

Sources

  • keda.sh

  • github.com/kedacore/keda

AWS Bedrock AgentCore

Source: aws.amazon.com/bedrock/agentcore (managed agent runtime)

Classification Managed agent runtime on AWS.

Intent

Provide a managed runtime for AI agents on AWS with sandbox isolation, identity, observability, and integration with the AWS Bedrock model surface --- the equivalent of a managed Kubernetes for agents.

Motivating Problem

For teams that don’t want to operate their own agent fabric on EKS or build their own substrate, AgentCore is AWS’s managed offering: bring agent code (written against the Strands Agents SDK or a compatible framework), deploy to AgentCore, get sandbox isolation, identity into AWS IAM, traces into X-Ray, and integration with Bedrock’s model surface and Knowledge Bases. The operational tax shifts from the agent platform to AWS.

How It Works

AgentCore runs agents in managed sandboxes with the same isolation properties as E2B-style microVMs. Each agent is identified to AWS IAM (visible as a principal in audit logs); cross-service calls go through standard AWS auth. The Bedrock Converse API is the model surface; Knowledge Bases provide RAG; Guardrails provide content moderation.

Integration with the broader AWS ecosystem is the differentiator. AgentCore plugs into EventBridge for event triggers, into S3 for file ingestion, into Step Functions for orchestration, into Bedrock Knowledge Bases for retrieval. The agent code is mostly the same as it would be elsewhere; the operational surface is AWS-shaped.

When to Use It

Teams running on AWS who want managed agent infrastructure. Cases where the AWS-native integration is more valuable than cross-cloud portability. Workloads where shifting operational tax to AWS is worth the cost.

Alternatives --- self-hosted on EKS for full control. The AWS Arbiter Sample (Section A) when the governance design is critical. Cross-cloud frameworks (LangGraph, CrewAI) when portability matters more than AWS-native integration.

Sources

  • aws.amazon.com/bedrock/agentcore

Section G — Personal and modular fabrics

Daniel Miessler’s Personal AI Infrastructure and Fabric Prompts Framework

Two related open-source projects by Daniel Miessler occupy a distinctive niche: agent fabric scoped to one person, designed to capture the human’s complete context (notes, calendars, conversations, search history) into a coherent system that AI agents can operate against. The Personal AI Infrastructure repo (PAI) is the systems layout; the Fabric Prompts Framework is the modular CLI tool for composing reusable prompts and workflows. Together they’re a working example of agent fabric for individuals --- different operational concerns than enterprise fabric, but the same conceptual layer.

Personal AI Infrastructure (Daniel Miessler)

Source: github.com/danielmiessler/Personal_AI_Infrastructure

Classification Personal-scale agent fabric architecture.

Intent

Capture a single human’s complete context (notes, calendar, communications, search history, browsing) into a unified data substrate that AI agents can operate against, with explicit memory lifecycles, context-priming pipelines, and background scripts that maintain the substrate over time.

Motivating Problem

Enterprise agent fabrics solve the multi-user, multi-cloud problem. The single-user problem has different requirements: there’s only one principal (you), there’s no IDP federation, the substrate runs on a personal device or a personal cloud instance, and the dominant question is how to make a lifetime’s worth of personal context queryable by agents without losing privacy or coherence. PAI proposes one answer: a layout, a set of conventions, and a small set of scripts that together constitute a personal fabric.

How It Works

PAI is a repository structure plus a set of conventions, not a deployed product. The structure organizes the user’s context into modular concerns: identity (who you are, your stated goals and values), memory (long-term notes and observations), context priming (how to brief an agent on a task), background scripts (cron-style maintenance jobs that keep the substrate up to date). The Fabric Prompts Framework (separate repo, below) is the canonical companion for prompt composition.

Agents --- typically the user’s own Claude Code or local LLM --- operate against the PAI substrate by reading the appropriate context and producing actions. The user manages the substrate manually for the parts that matter and lets agents update the parts that don’t. The pattern’s strongest property is its transparency: everything lives in plaintext markdown the user can read and edit.
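The context-priming step can be sketched as follows. This is an illustration under stated assumptions, not PAI's actual layout: the folder names (identity, memory, context), the function names, and the character budget are all hypothetical. It shows the general idea of assembling plaintext markdown into a briefing an agent reads before starting a task.

```python
import tempfile
from pathlib import Path

# Hypothetical folder names; the real PAI repository defines its own structure.
SECTIONS = ("identity", "memory", "context")


def prime_context(root: Path, task: str, max_chars: int = 4000) -> str:
    """Concatenate the markdown files under each section into one briefing."""
    parts = [f"# Task\n{task}"]
    for section in SECTIONS:
        folder = root / section
        if not folder.is_dir():
            continue  # a missing section is simply skipped
        for md_file in sorted(folder.glob("*.md")):
            text = md_file.read_text(encoding="utf-8").strip()
            parts.append(f"# {section}/{md_file.name}\n{text}")
    # Crude budget: truncate rather than rank; a real pipeline would select.
    return "\n\n".join(parts)[:max_chars]


def build_demo_briefing() -> str:
    """Create a throwaway substrate and prime a briefing from it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "identity").mkdir()
        (root / "identity" / "goals.md").write_text("Ship the catalog.", encoding="utf-8")
        (root / "memory").mkdir()
        (root / "memory" / "2026-05.md").write_text("Met with the infra team.", encoding="utf-8")
        return prime_context(root, "Draft a weekly update")
```

Because every input is a readable file, the user can inspect exactly what the agent was told, which is the transparency property described above.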

When to Use It

Individuals who want to invest in a personal agent fabric and are willing to maintain it manually. Power users of AI tools who want their context to follow them across applications. Privacy-conscious users who want to run agents against personal data without putting it on a vendor’s cloud.

Alternatives --- commercial personal AI products (Rewind, Personal AI, Friend) when the maintenance burden of PAI is unappealing. The Anthropic memory tool (Tools Catalog Section A) when one-conversation memory is enough.

Sources

  • github.com/danielmiessler/Personal_AI_Infrastructure

  • danielmiessler.com/blog/personal-ai-infrastructure

Fabric Prompts Framework

Source: github.com/danielmiessler/fabric (open-source; Go)

Classification: Modular prompt composition framework for CLI use.

Intent

Provide reusable prompt units (“patterns”) and workflows that compose directly on the command line, with each pattern shipping as a self-contained markdown file.

Motivating Problem

LLM prompts in practice become a mess: long, ad-hoc strings sprinkled through scripts and config files, hard to reuse, hard to version, hard to share. Fabric’s answer is to treat prompts as composable, versionable units: each pattern is a directory containing a single markdown system file that describes the task, the model, and the expected output; the CLI runs any pattern against any input with one command. The community has contributed over 200 patterns covering summarization, extraction, analysis, and code review tasks.

How It Works

Install the fabric CLI. Patterns live in a versioned patterns/ directory; each pattern is a subdirectory holding one markdown file. Running fabric --pattern summarize_essay --input my_document.md pipes the document through the named pattern and returns the LLM’s output. Patterns can be chained with ordinary shell pipes: fabric --pattern A | fabric --pattern B feeds the output of A into B, composing multiple structural transforms.

The framework deliberately stays at the CLI layer rather than building a platform. Integration with PAI is via shell scripts and pipes; integration with other systems is via stdin/stdout. The simplicity is the design feature.
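The composition model can be sketched in a few lines. This is not Fabric's implementation (Fabric is written in Go and calls a real model); it is a toy under stated assumptions: patterns are held in an in-memory dict rather than read from disk, the pattern names and prompt texts are hypothetical, and the model is a stand-in function so that the chaining logic is visible on its own.

```python
from typing import Callable, Dict, Iterable

# (system_prompt, user_input) -> output; stands in for a real LLM call.
Model = Callable[[str, str], str]

# Hypothetical patterns; on disk each would be one markdown file.
PATTERNS: Dict[str, str] = {
    "summarize": "Summarize the input.",
    "extract_actions": "List action items from the input.",
}


def run_pattern(patterns: Dict[str, str], name: str, text: str, model: Model) -> str:
    """Apply one named pattern to the input text."""
    return model(patterns[name], text)


def run_chain(patterns: Dict[str, str], names: Iterable[str], text: str, model: Model) -> str:
    """Feed each pattern's output into the next, like a shell pipeline."""
    for name in names:
        text = run_pattern(patterns, name, text, model)
    return text


def tagging_model(system_prompt: str, user_input: str) -> str:
    """Fake model that wraps its input so the order of application is visible."""
    return f"<{system_prompt}>{user_input}"
```

With the fake model, run_chain(PATTERNS, ["summarize", "extract_actions"], "notes", tagging_model) shows the second pattern receiving the first pattern's output, which is the same contract the stdin/stdout pipe provides.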

When to Use It

Personal AI workflows where reusable prompts matter and CLI composition is preferred. Pipelines where prompts need to be versioned alongside other code. Teams that want a lightweight alternative to LangChain-style prompt management.

Alternatives --- LangChain prompt templates when the rest of the stack is LangChain-shaped. PromptLayer for prompt versioning with a UI. Vanilla shell scripts when the workflow is small enough.

Sources

  • github.com/danielmiessler/fabric

  • helpnetsecurity.com/2024/02/14/fabric-open-source-ai-framework/

Section H — Curation hubs

Community-maintained indexes of the agent fabric ecosystem

Two community-maintained lists cover the agent fabric ecosystem broadly. mfornos/awesome-agentic-ai catalogs open-source standards, compute specifications, and coordination fabric infrastructures with comprehensive scope. kyrolabs/awesome-agents focuses on the agent ecosystem with attention to networking and communication layers (Pilot Protocol and similar). Both are useful as discovery tools; like all awesome-X lists, they’re best treated as starting points for further evaluation, not as endorsements.

Awesome Agentic AI (mfornos)

Source: github.com/mfornos/awesome-agentic-ai

Classification: Community-maintained agent infrastructure index.

Intent

Catalog open-source standards, compute specifications, and coordination fabric infrastructures for the agent ecosystem.

Motivating Problem

The agent fabric ecosystem moves faster than any one document can track. mfornos’s list has emerged as the most actively maintained breadth-first index of fabric-relevant projects, covering standards bodies, reference implementations, identity layers, sandboxing options, and coordination patterns.

How It Works

A categorized list in a single README.md, updated by PR. Categories cover standards (MCP, A2A, AAIF), platforms (LangGraph, CrewAI, AutoGen, etc.), runtimes (E2B, Modal), and identity (SPIFFE, Strata). The cataloging is descriptive rather than evaluative.
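For the quarterly check-in use case, a list of this shape is easy to index mechanically. The sketch below assumes a common awesome-list layout (second- or third-level headings for categories, one markdown link per bullet); the actual README may be organized differently, and the sample text and URLs are made up for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

HEADING = re.compile(r"^#{2,3}\s+(.*\S)\s*$")             # "## Category"
ENTRY = re.compile(r"^\s*[-*]\s+\[([^\]]+)\]\(([^)]+)\)")  # "- [name](url)"


def index_readme(markdown: str) -> Dict[str, List[Tuple[str, str]]]:
    """Group (name, url) entries under the nearest preceding category heading."""
    categories: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    current = None
    for line in markdown.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = heading.group(1)
            continue
        entry = ENTRY.match(line)
        if entry and current is not None:
            categories[current].append((entry.group(1), entry.group(2)))
    return dict(categories)


# Made-up sample in the assumed layout; URLs are placeholders.
SAMPLE = """# Awesome Example
## Standards
- [MCP](https://example.invalid/mcp) - protocol
- [A2A](https://example.invalid/a2a) - protocol
## Runtimes
- [E2B](https://example.invalid/e2b) - sandbox
"""
```

Comparing the per-category counts from index_readme across two snapshots of the README gives a rough signal of whether a category is growing, consolidating, or stalled.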

When to Use It

Initial discovery when starting to build an agent fabric. Quarterly check-ins to see what’s new. Sanity check on whether a category has consolidated or remains fragmented.

Sources

  • github.com/mfornos/awesome-agentic-ai

Awesome Agents (kyrolabs)

Source: github.com/kyrolabs/awesome-agents

Classification: Community-maintained agent ecosystem index with networking focus.

Intent

Index the agent ecosystem with particular attention to networking, communication, and control-plane infrastructure.

Motivating Problem

The networking sub-concern of agent fabric is under-represented in most awesome-X lists. kyrolabs/awesome-agents fills that gap, indexing protocols (Pilot Protocol), control planes for orchestration, virtual-address systems, and identity bridges. Useful when the question is specifically about the network layer.

How It Works

Same shape as the mfornos list --- categorized README maintained by PR. The networking and control-plane categories are more developed than in the mfornos list; the framework / runtime categories are comparable.

When to Use It

Discovery for networking-layer agent components. Cross-referencing with mfornos for breadth.

Sources

  • github.com/kyrolabs/awesome-agents

Appendix A --- Fabric Sub-layer Reference Table

Cross-reference between the five fabric sub-layers and their representative technologies. Use this to quickly orient on which component answers which question.

Sub-layer | Answers | Representative technologies
Compute | Where agents run | Containers (K8s), microVMs (Firecracker), Lambda, Bedrock AgentCore, Container Apps
Identity | Who agents are | SPIFFE/SPIRE, IRSA, Entra Workload Identity, OAuth 2.x (PKCE, RFC 8693), Strata Agent Fabric
Network | How agents talk | Istio/Linkerd/App Mesh (mTLS), Pilot Protocol (encrypted tunnels), Tailscale/Twingate
Registry | How agents are found | MCP Registry, agent fabric registry (Strata, AWS Fabric Index), capability catalogs
Sandbox | How agents are contained | Process (T1), Container (T2), microVM (T3, E2B/Modal), Trusted-VM-Runner (T4, GitHub), Dedicated HW (T5)
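One way to use the table as a design checklist is to record a choice per sub-layer and flag whatever is still unanswered. The sketch below is illustrative only; the class and function names are hypothetical, and the example values are simply entries taken from the table.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FabricChoices:
    """One slot per fabric sub-layer; None means the question is still open."""
    compute: Optional[str] = None   # where agents run
    identity: Optional[str] = None  # who agents are
    network: Optional[str] = None   # how agents talk
    registry: Optional[str] = None  # how agents are found
    sandbox: Optional[str] = None   # how agents are contained


def unanswered(choices: FabricChoices) -> List[str]:
    """Return the sub-layers with no recorded choice, in table order."""
    return [f.name for f in fields(choices) if getattr(choices, f.name) is None]


draft = FabricChoices(
    compute="Bedrock AgentCore",
    identity="SPIFFE/SPIRE",
    sandbox="microVM (T3)",
)
```

Here unanswered(draft) reports the network and registry sub-layers as open, which makes the gaps in a design explicit before any product is chosen.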

Appendix B --- The Five-Volume Series Complete

This catalog completes a five-volume series describing agentic AI at five levels of abstraction. Read top-down (Vol 1 → Vol 5) for the conceptual flow; read bottom-up (Vol 5 → Vol 1) for what an architect builds first.

  • Volume 1 --- Patterns of AI Agent Workflows --- the temporal patterns by which LLM calls, tools, and sub-agents compose. The vocabulary for how an agent run is structured in time.

  • Volume 2 --- The Claude Skills Catalog --- the SKILL.md-format instruction packs that tell the model when and how to use tools. The vocabulary for the model’s domain knowledge.

  • Volume 3 --- The AI Agent Tools Catalog --- the function-calling primitives that the agent invokes. The vocabulary for what the agent can do.

  • Volume 4 --- The AI Agent Events & Triggers Catalog --- the mechanisms by which agents are activated. The vocabulary for what makes the agent run.

  • Volume 5 --- The AI Agent Fabric Catalog (this volume) --- the infrastructure substrate beneath orchestration. The vocabulary for where the agent lives.

The five levels are independent but composable. A production agent system makes choices at all five: a pattern (e.g. evaluator-optimizer from Vol 1), the skills that load when relevant (Vol 2), the tools the agent invokes (Vol 3), the events that trigger the agent in the first place (Vol 4), and the fabric on which all of it runs (this volume). Re-reading the prior volumes through the fabric lens is recommended once: most operational concerns documented across the series ultimately resolve to fabric properties.

A second observation worth keeping: the five layers don’t evolve at the same rate. Patterns are essentially stable (the same evaluator-optimizer loop will look much the same in 2027). Skills, tools, and events evolve quarterly. Fabric --- especially the identity and governance sub-layers --- is where the most active design work is happening in 2026, and where the catalog is most likely to be partially obsolete within a year. Plan for evolution at the fabric layer; expect stability above it.

Appendix C --- The Constitutional Substrate Pattern

The Architecting Autonomy series by Aaron Sempf (aaronsempf.substack.com), which produced the AWS Sample Agentic Fabric, articulates a design pattern worth abstracting from the AWS specifics. The pattern can be applied to any agent fabric regardless of cloud or framework choice; the four governance primitives carry over directly, as do the five principles that follow.

  1. Authority as a first-class primitive --- not an afterthought. Every action an agent can take must trace back to an explicit Authority Scope. Implicit permissions accumulate ungoverned overreach; explicit permissions surface as denied requests when they’re missing, which is a feature.

  2. Deterministic governance, not LLM-based oversight. The mechanism that decides what’s allowed should be a small, auditable, deterministic engine --- not another LLM. LLMs are appropriate for many things; deciding whether an action is authorized is not one of them.

  3. Residual authority defaults to denial. Actions not covered by an Authority Scope are denied. The operational friction of “surfacing every gap” is the price of structural correctness.

  4. Write-once audit (the Legibility Ledger). Every governance decision --- permit, deny, escalate --- is logged immutably. Compliance, debugging, and the case-law mechanism all depend on this artifact.

  5. Case law for evolution. Conflicts the engine can’t resolve escalate to humans; the resolution is encoded as case law and applied deterministically on recurrence. The system’s governance improves through accumulated cases rather than through code changes.
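The principles above can be sketched as a small deterministic engine. This is a toy under stated assumptions, not the AWS Arbiter implementation: the class names, the prefix-matching rule, and the ledger format are all hypothetical, and the hash chain only makes tampering detectable rather than making the log truly write-once. Escalation is reduced to a human recording a ruling that the engine then replays.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AuthorityScope:
    """Explicit grant: this agent may take this action on resources under a prefix."""
    agent: str
    action: str
    resource_prefix: str


@dataclass
class Ledger:
    """Append-only log; each entry hashes the previous one (tamper-evident)."""
    entries: List[dict] = field(default_factory=list)

    def append(self, record: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else ""
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({**record, "hash": digest})


@dataclass
class Governor:
    scopes: List[AuthorityScope]
    ledger: Ledger = field(default_factory=Ledger)
    case_law: Dict[Tuple[str, str, str], str] = field(default_factory=dict)

    def decide(self, agent: str, action: str, resource: str) -> str:
        key = (agent, action, resource)
        if key in self.case_law:
            decision = self.case_law[key]  # a prior human ruling, replayed as-is
        elif any(s.agent == agent and s.action == action
                 and resource.startswith(s.resource_prefix) for s in self.scopes):
            decision = "permit"            # covered by an explicit scope
        else:
            decision = "deny"              # residual authority defaults to denial
        self.ledger.append({"agent": agent, "action": action,
                            "resource": resource, "decision": decision})
        return decision

    def record_ruling(self, agent: str, action: str, resource: str, decision: str) -> None:
        """Encode a human resolution as case law and log it."""
        self.case_law[(agent, action, resource)] = decision
        self.ledger.append({"agent": agent, "action": action, "resource": resource,
                            "decision": decision, "source": "human ruling"})
```

No model is consulted anywhere in decide: the same inputs always produce the same decision, and every decision leaves a ledger entry, which is what makes the engine auditable.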

The pattern is one design philosophy among several; it’s not the only valid answer. But the four primitives --- Authority Scopes, Composition Contracts, Arbitration Patterns, Legibility Ledger --- are sufficiently universal that any agent fabric serious about governance should have analogues, whether or not it adopts the names. The AWS Arbiter source (src/governance/) is the recommended reference implementation to study.

Appendix D --- Discovery and Awesome Lists

Hubs that consolidate the current state of the agent fabric ecosystem:

  • mfornos/awesome-agentic-ai --- broad index of open-source standards, compute specifications, and coordination fabric infrastructures.

  • kyrolabs/awesome-agents --- networking-focused index with attention to encrypted tunnels (Pilot Protocol) and persistent control planes.

  • GitHub Topic: agentic-orchestration --- the official tag aggregating new orchestration repos; relevant to the fabric layer when the orchestrator dictates fabric choices.

  • CNCF Cloud Native AI Landscape --- the CNCF’s map of cloud-native AI projects; useful for the substrate-as-CNCF-project view.

  • Cloud-vendor reference architecture catalogs --- aws.amazon.com/architecture, learn.microsoft.com/azure/architecture, cloud.google.com/architecture --- increasingly include agent-specific reference designs.

Three pragmatic rules. First, the fabric layer moves faster than any one list can track --- use lists for discovery, then go to the source repository to evaluate currency. Second, agent fabric is the part of the stack where vendor incentives create the most lock-in risk; prefer projects with portable identity and observability stories even if specific compute choices are vendor-specific. Third, governance and identity are where investment compounds; investing in either pays off across many agent generations, while specific compute choices tend to be replaceable.

Appendix E --- Omissions

This catalog covers about 18 substrates across 8 sections. The agent fabric ecosystem is wider; a non-exhaustive list of what isn’t here:

  • General Kubernetes and cloud-native infrastructure (Helm, ArgoCD, Crossplane, Terraform) when used without an agent-specific framing. The patterns transfer but the agent-specific angles don’t.

  • Standalone IDPs (Okta, Entra, Auth0, Keycloak, CyberArk, Descope, Transmit) as products. They’re components of agent-fabric solutions but the IDP itself isn’t the agent-fabric story.

  • Service mesh products in isolation (Istio, Linkerd, Consul Connect) when the agent angle is minimal. They appear in Section D as fabric components, not as standalone entries.

  • Closed agent platforms that bundle the fabric with a product surface (OpenAI Assistants, Anthropic Managed Agents, vendor-specific copilot platforms). The fabric properties are real but inaccessible separately.

  • Specialized agent fabrics for niche compute (edge agents, mobile agents, IoT agents). Each has its own substrate considerations; the patterns transfer but the inventory is large.

  • Vendor-specific identity products in the agent space (Auth0 Fine Grained Authorization, Cerbos, OPA-as-a-product) when the OAuth-pattern entry in Section C covers the architectural concept.

Appendix F --- A Note on the Moving Target

Anthropic published MCP in November 2024. The Linux Foundation’s AAIF took over MCP governance in December 2025. AWS published the Sample Agentic Fabric (Arbiter) in 2026. Microsoft’s Agent Framework reached production status in 2025–2026. Strata’s Agent Fabric framing entered the canonical conversation in mid-2025. The Constitutional Substrate design pattern is months old as of this writing. The category is moving fast; this catalog captures a snapshot in mid-2026.

The deepest structural fact to internalize: the five sub-layers --- compute, identity, network, registry, sandbox --- are stable even though specific implementations move. An agent fabric always answers the same five questions about where agents run, who they are, how they talk, how they’re found, and how they’re contained. Choose well at each sub-layer and the specific product becomes replaceable.

Combined with the five volumes of the series --- Patterns, Skills, Tools, Events, Fabric --- a working architect now has the vocabulary to talk about agentic AI at every layer: timing, instruction, primitive, activation, substrate. The technical landscape will keep shifting; the conceptual vocabulary will not.

--- End of The AI Agent Fabric Catalog v0.1 ---

--- End of the Five-Volume Series ---