In plain terms: The AI cannot do whatever it wants. Every move it makes is a defined capability drawn from a fixed, versioned catalog. There is no “prompt-the-model-and-see-what-happens” path anywhere in Forge.

What a capability is

A capability is a named, versioned, rule-bound move the AI is allowed to make. Each one has:
  • a typed input (for example: “here is a fuel system geometry, here are the candidate USCG sections”)
  • a typed output (for example: “here is the applicability assessment with citations”)
  • a confidence threshold
  • a risk tier
  • an autonomy bound (observe-only / draft-only / propose-with-review / execute-after-approval)
Examples: “retrieve prior similar designs,” “draft a rule applicability assessment,” “compute a clearance,” “propose a material substitution.”
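
As an illustration only, the five properties listed above could be captured in a small immutable record. This is a minimal sketch, not Forge's actual schema: the class names, field names, version string, threshold scale, and risk-tier label are all assumptions made for the example. Only the four autonomy bounds and the example capability name come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class AutonomyBound(Enum):
    # The four bounds named in the text.
    OBSERVE_ONLY = "observe-only"
    DRAFT_ONLY = "draft-only"
    PROPOSE_WITH_REVIEW = "propose-with-review"
    EXECUTE_AFTER_APPROVAL = "execute-after-approval"


@dataclass(frozen=True)
class ApplicabilityInput:
    # Hypothetical typed input: a geometry reference plus candidate sections.
    fuel_system_geometry: str
    candidate_sections: Tuple[str, ...]


@dataclass(frozen=True)
class ApplicabilityAssessment:
    # Hypothetical typed output: an assessment with its citations.
    summary: str
    citations: Tuple[str, ...]


@dataclass(frozen=True)
class Capability:
    name: str
    version: str
    input_type: type
    output_type: type
    confidence_threshold: float  # assumed scale: 0.0 to 1.0
    risk_tier: str               # assumed labels, e.g. "low", "medium", "high"
    autonomy_bound: AutonomyBound


# Illustrative catalog entry; every value here is invented for the sketch.
draft_rule_applicability = Capability(
    name="draft a rule applicability assessment",
    version="1.0.0",
    input_type=ApplicabilityInput,
    output_type=ApplicabilityAssessment,
    confidence_threshold=0.9,
    risk_tier="medium",
    autonomy_bound=AutonomyBound.DRAFT_ONLY,
)
```

Freezing the dataclass mirrors the idea that an entry is fixed once published: changing a capability means creating a new version rather than mutating the existing record.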

What agents can and cannot do

Agents in Forge string capabilities together, but they are orchestrators over governed capabilities, not free-form prompts. They cannot:
  • invent new capabilities at runtime
  • use a capability outside its declared autonomy bound
If a task is not in the catalog, it cannot be done until the catalog is extended through a governed process. Capability changes go through a 12-point review checklist before they ship.
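
The two restrictions above can be sketched as a check that runs before any invocation. This is a hedged illustration under stated assumptions: the `authorize` function, the exception names, the catalog contents, and the bounds assigned to each entry are hypothetical, and treating the four autonomy bounds as an ordered scale from least to most autonomous is also an assumption, not something the text states.

```python
from enum import IntEnum
from types import MappingProxyType


class Autonomy(IntEnum):
    # Assumed ordering, least to most autonomous.
    OBSERVE_ONLY = 0
    DRAFT_ONLY = 1
    PROPOSE_WITH_REVIEW = 2
    EXECUTE_AFTER_APPROVAL = 3


class UnknownCapabilityError(LookupError):
    """Raised when the requested capability is not in the catalog."""


class AutonomyBoundError(PermissionError):
    """Raised when a request exceeds the capability's declared bound."""


# Read-only view: entries can be looked up but not added at runtime.
# The bounds assigned here are invented for the example.
CATALOG = MappingProxyType({
    ("compute a clearance", "1.0.0"): Autonomy.EXECUTE_AFTER_APPROVAL,
    ("propose a material substitution", "1.0.0"): Autonomy.PROPOSE_WITH_REVIEW,
})


def authorize(name: str, version: str, requested: Autonomy) -> Autonomy:
    """Return the declared bound if the request is allowed, else raise."""
    key = (name, version)
    if key not in CATALOG:
        raise UnknownCapabilityError(f"{name!r} v{version} is not in the catalog")
    bound = CATALOG[key]
    if requested > bound:
        raise AutonomyBoundError(
            f"{name!r} is bound to {bound.name}, but {requested.name} was requested"
        )
    return bound
```

In this sketch, extending the catalog requires shipping a new mapping through whatever release process governs it; an agent holding `CATALOG` gets a `TypeError` if it tries to assign a new entry.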

Why it is built this way

This is a structural defense against the failure mode where a model confidently fabricates a result because nothing in the architecture would catch it. By making every AI action the invocation of a catalog entry, behavior becomes boxable, auditable, and reversible. If a capability behaves badly, it gets rolled back like any other versioned software change, not patched at runtime with a different prompt.

Confidence and abstention (L5.5)

Sitting just above the catalog is the confidence filter. It catches low-confidence outputs before they reach formal verification. The posture is “abstain over assert.” If the agent is not confident, it asks for more context, retrieves more sources, or routes to human review. This is abstention, and it is treated as a quality behavior with a target above 98%.
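
A minimal sketch of the “abstain over assert” posture follows. It assumes a numeric confidence score compared against a per-capability threshold. The function name, the `FilterResult` type, and the way the fallback action is chosen are hypothetical; only the three fallback behaviors and the output field names `confidence_score`, `abstention_reason`, and `retry_context_request` come from this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AbstainAction(Enum):
    # The three fallback behaviors named in the text.
    ASK_FOR_CONTEXT = "ask for more context"
    RETRIEVE_MORE_SOURCES = "retrieve more sources"
    ROUTE_TO_HUMAN_REVIEW = "route to human review"


@dataclass(frozen=True)
class FilterResult:
    passed: bool
    confidence_score: float
    action: Optional[AbstainAction] = None
    abstention_reason: Optional[str] = None
    retry_context_request: Optional[str] = None


def confidence_filter(
    confidence_score: float,
    threshold: float,
    fallback: AbstainAction = AbstainAction.ROUTE_TO_HUMAN_REVIEW,
    retry_context_request: Optional[str] = None,
) -> FilterResult:
    """Pass confident outputs through; otherwise abstain with a stated reason."""
    if confidence_score >= threshold:
        return FilterResult(passed=True, confidence_score=confidence_score)
    reason = (
        f"confidence {confidence_score:.2f} is below threshold {threshold:.2f}"
    )
    return FilterResult(
        passed=False,
        confidence_score=confidence_score,
        action=fallback,
        abstention_reason=reason,
        retry_context_request=retry_context_request,
    )
```

The point of the sketch is that a low score never produces an asserted answer: the only outcomes are a pass or an abstention that carries its own reason and, optionally, a request for the missing context.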

What L5 outputs

  • capability_invocation
  • agent_proposal
  • structured_output
  • plus, from L5.5: confidence_score, abstention_reason, retry_context_request