Summary
We turned the rules-and-retrieval foundation into governed engineering decisions. The system can now answer “which rules apply to this vessel,” with citations, and record that answer as a signed, auditable decision. We shipped compliance-gap analysis that only ever proposes findings for a human, closed the loop from AI proposal to human-approved signed record, built the safeguard for when a rule changes, and started turning the interpretive layer a shipyard owner described into real structure. With that, M2, governed decisions, is complete.
What we shipped
Milestone M2.
- A citation manager that guarantees every rule claim links back to its source, enforced as a gate, plus rules search that returns the likely-applicable rules for a vessel ranked by relevance, each with its citation and confidence, completing the Rules & Retrieval milestone.
- Prior-vessel memory: a yard’s own past vessels captured as private, searchable precedent, the basis for “we have built something like this before.”
- The decision ledger made append-only and tamper-evident, with a gate runner that blocks rather than silently skips and a single choke point for emitting a decision.
- Compliance-gap analysis: the AI compares what a rule requires against the evidence a vessel actually has and flags the gap as a possible finding for review. It never issues a compliance verdict on its own. A companion evidence-matching capability lines up existing evidence against what each requirement asks for. Both are review-only and citation-checked.
- A review queue and promotion path: a proposal lands in a yard-owned staging area, a person reviews it, and only an approved item becomes a permanent record and signed engineering decision, written through the single choke point that enforces the gates, the human approval, and the signature together.
- The first operator signing tool: a command-line surface where a real person enrolls as a signer, signs an approved decision, and verifies that it replays exactly from what was stored. Local key custody keeps each signer’s keys on disk behind a swappable seam, so managed or hardware-backed storage can replace it later. One signature covers the whole record: who decided, who approved, and what was approved.
- A supersession record and re-evaluation queue: when a rule is superseded, every signed decision that relied on it is found and flagged for re-approval, matched by a stable citation path that survives renumbering. The original signed decision is never edited. Its as-approved state is kept forever and the change is recorded alongside it. Routing is graded: some superseded rules grandfather automatically, others raise an engineer-review task, others force a re-approval.
- The start of the interpretive layer as structure: authority tiers on every source (codified regulation, NVIC, marine technical note, policy letter, Marine Safety Manual), interpretation links recording that one document interprets or clarifies another, and an interpretive-guidance reader that parses a Coast Guard circular into citable rules.
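The ledger and gate-runner behaviour in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped implementation: the names `Ledger`, `GateBlocked`, `emit`, `verify`, and the two example gates are hypothetical, and a SHA-256 hash chain is swapped in here as one common way to make an append-only log tamper-evident. The report does not say which mechanism the real ledger uses.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class GateBlocked(Exception):
    """Raised when a gate fails or cannot run; nothing is appended."""


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always produces the same hash.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


@dataclass
class Ledger:
    gates: list[Callable[[dict], bool]]
    _entries: list[dict] = field(default_factory=list)

    def emit(self, payload: dict) -> dict:
        """Single choke point: every gate must pass before an append."""
        for gate in self.gates:
            name = getattr(gate, "__name__", repr(gate))
            try:
                passed = gate(payload)
            except Exception as exc:
                # A gate that cannot run blocks; it is never silently skipped.
                raise GateBlocked(f"{name} errored: {exc}") from exc
            if not passed:
                raise GateBlocked(f"{name} rejected the decision")
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {"prev": prev, "payload": dict(payload),
                 "hash": _digest(prev, payload)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != _digest(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True


# Hypothetical example gates.
def has_citations(payload: dict) -> bool:
    return bool(payload.get("citations"))


def has_human_approval(payload: dict) -> bool:
    return bool(payload.get("approved_by"))
```

In this sketch a decision with no citations or no approver raises `GateBlocked` and leaves the ledger unchanged, and editing a stored entry afterwards makes `verify()` return `False`.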
What we learned
- A confident “this complies” answer the system cannot actually prove is the exact failure we are building against. Deciding whether a rule is satisfied is a matter of judgment, so we deliberately avoided producing a tidy automatic compliance verdict.
- The official MSC Plan Review Guides are “unofficial cheat sheets” a reviewer leans on, useful as an index into the rules, never legally binding. So the structure keeps the cheat sheet and the actual authority clearly apart.
- Keeping the signed record immutable shaped the whole design. The current status of a decision lives in a separate projection, so a later change never overwrites the history of what was approved and why.
- A code review caught a cross-tenant leak where a global rule change was about to expose one yard’s list of affected decisions to others. We fixed it before it could matter and extended the immutable-isolation tests to cover the new tables.
- A design-basis letter, the engineering justification a yard writes for a novel vessel with no prescriptive rule, is already expressible as one of our signed decisions, so it needs no new artifact type.
- A second field meeting, with Conrad’s design and estimating leads, described the interpretive layer back to us in their own words. They added that local Coast Guard offices interpret the same rule differently by region, so an interpretation can depend on the issuing office, not only on the vessel.
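The immutable-record lesson above, together with graded supersession routing and the tenant-scoping fix, can be illustrated with a small sketch. Everything named here is an assumption made for illustration: `SignedDecision`, `Supersession`, `Routing`, `project_status`, and the status strings are hypothetical, and the real projection may be shaped quite differently.

```python
from dataclasses import dataclass
from enum import Enum


class Routing(Enum):
    GRANDFATHER = "grandfather"
    ENGINEER_REVIEW = "engineer_review"
    REAPPROVAL = "reapproval"


# Least to most demanding; the most demanding routing wins for a decision.
SEVERITY = [Routing.GRANDFATHER, Routing.ENGINEER_REVIEW, Routing.REAPPROVAL]


@dataclass(frozen=True)  # frozen: the as-approved record cannot be edited
class SignedDecision:
    decision_id: str
    tenant: str
    citation_paths: tuple[str, ...]


@dataclass(frozen=True)
class Supersession:
    citation_path: str  # stable path, assumed to survive renumbering
    routing: Routing


def project_status(decisions, supersessions, tenant):
    """Derive current status for one tenant without touching the records."""
    by_path = {s.citation_path: s.routing for s in supersessions}
    status = {}
    for decision in decisions:
        if decision.tenant != tenant:
            continue  # never expose another yard's affected decisions
        hits = [by_path[p] for p in decision.citation_paths if p in by_path]
        if hits:
            status[decision.decision_id] = max(hits, key=SEVERITY.index).value
        else:
            status[decision.decision_id] = "current"
    return status
```

The point of the design is that status is computed from the signed records plus the supersession records, so a rule change adds information alongside a decision instead of rewriting it.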
Blockers & open questions
- The compliance-gap capability is built and tested end to end but has little real data to work on until the Plan Review Guides are ingested and the vessel side is populated. The vessel side in turn depends on the geometry work in M3. Connecting evidence to the vessel it belongs to is left for M3, where design objects and CAD data live.
- Production key custody (managed key services, hardware-backed signing) is deliberately deferred. The local keystore fits the current local workflow, and the swappable seam is what keeps a development shortcut from quietly becoming the production path.
- The signing tool consumes an already-approved decision rather than offering a full review screen. The richer review-and-approve interface waits for the structured web UI in a later milestone.
- The NVIC 9-97 proof case (structural fire protection, where reading the regulation alone gives the wrong answer) is still gated on a walkthrough with the owner. We built the structure and tested it on synthetic interpretive text; we will not claim the finished regulation-to-guidance proof until the source and walkthrough are in hand.
- How to represent regional office differences is still open. The link model has room for an office qualifier; the decision waits until the interpretation graph is actually traversed in M3.
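To make the open question about regional offices concrete, here is one possible shape for a link with an optional office qualifier. It is only a sketch and does not anticipate the decision deferred to M3: `InterpretationLink`, its field names, `applicable_links`, and the office identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InterpretationLink:
    source_id: str   # the interpreting document
    target_id: str   # the rule or document being interpreted
    relation: str    # "interprets" or "clarifies"
    office: Optional[str] = None  # None means not tied to a specific office


def applicable_links(links, target_id, office):
    """Links for a target that are general or issued by the given office."""
    return [
        link for link in links
        if link.target_id == target_id and link.office in (None, office)
    ]
```

With this shape, an unqualified link applies everywhere, while an office-qualified link only surfaces when the query names that office.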
Next week
- Begin the geometry baseline (M3), starting with the breakdown the yard actually works in: the work package against a product-oriented structure, then the data backbone for measurements.