How CoverProof achieves Section 250 compliance

Six steps from SM&CR register import to PDF/A-3B board evidence pack. Every step documented, every decision auditable, every timestamp server-recorded.

TL;DR

CoverProof takes your SM&CR register, uses AI (Claude) to classify who is in scope for Section 250, requires human approval before sending any declaration, delivers declarations via zero-login unique links, tracks completion in real time, and generates a PDF/A-3B evidence pack with a SHA-256 cryptographic hash and immutable audit trail. The full process takes under 1 hour for a typical firm.

The six-step compliance process

Import your FCA register extract

2–5 minutes

Upload your SM&CR register in one of two formats: the FCA standard 17-file pipe-delimited format exported from the FCA Register Extract Service (RES), or a simplified CSV with individual names, roles, and function codes. The import process validates the data structure and flags malformed rows before any classification runs.

What we check: FRN consistency, duplicate IRNs, malformed function codes, missing mandatory fields. What we do not check: the accuracy of your source data — CoverProof is a classification and workflow tool, not an auditor of your HR records.
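The structural checks listed above could be sketched roughly as follows. This is an illustrative Python sketch for the simplified CSV path, not CoverProof's implementation: the column names (`frn`, `irn`, `name`, `function_code`) and the function-code pattern are assumptions made for the example.

```python
import csv
import io
import re

# Hypothetical column names and code pattern, for illustration only.
REQUIRED = ("frn", "irn", "name", "function_code")
FUNCTION_CODE = re.compile(r"^(SMF|CF)\d{1,2}[a-z]?$")


def validate_register(csv_text: str) -> list[str]:
    """Return human-readable problems; an empty list means the structure is clean."""
    problems: list[str] = []
    seen_irns: set[str] = set()
    frns: set[str] = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        missing = [c for c in REQUIRED if not (row.get(c) or "").strip()]
        if missing:
            problems.append(f"line {line_no}: missing {', '.join(missing)}")
            continue
        frns.add(row["frn"].strip())
        irn = row["irn"].strip()
        if irn in seen_irns:
            problems.append(f"line {line_no}: duplicate IRN {irn}")
        seen_irns.add(irn)
        if not FUNCTION_CODE.match(row["function_code"].strip()):
            problems.append(f"line {line_no}: malformed function code")
    # A single-firm register should carry exactly one FRN.
    if len(frns) > 1:
        problems.append(f"inconsistent FRNs: {sorted(frns)}")
    return problems
```

Note that the sketch only inspects structure. Like the import step it illustrates, it makes no judgement about whether the source data is accurate.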

Deterministic classification pipeline against the s.250(3) test — compliance officer review required

Under 2 minutes for 500 individuals

Each individual is run through a deterministic, versioned classification pipeline — not a free-form chat with an LLM. The pipeline applies the verbatim s.250(3) statutory test (does this person play a significant role in (a) the making of decisions about how the whole or a substantial part of the activities of the organisation are to be managed or organised, or (b) the managing or organising of the whole or a substantial part of those activities?) alongside an SM&CR role taxonomy, optional FCA Register cross-reference, and a tiered confidence model with explicit thresholds. Every layer is versioned and fingerprinted; identical inputs reproduce the same verdict on demand, with one documented exception — Medium-tier rows are intentionally re-sampled three times at non-zero temperature so disagreement itself can flag them for review.

Inside the pipeline.
Parameters: Claude Sonnet 4.6, temperature 0, output bound to a Zod schema (structured JSON, not free text).
Versioning: every classification is pinned to a methodology version (e.g. s250-v12); the system prompt is SHA-256 hashed and a silent edit produces a mismatch the worker detects on next run.
Prompt engineering: role-taxonomy guidance distinguishes SMF holders, Certification Regime employees, and out-of-scope administrators; statute extracts and reasoning scaffolding are version-controlled artefacts, not free text in a config file.
Confidence tiers with explicit routing: High Confidence (auto-passed, still reviewable); Moderate Confidence (routed to your reviewer queue); Low Confidence, Review Required (mandatory review before any declaration is sent).
Self-consistency check: Medium-tier rows are deliberately re-sampled three times at non-zero temperature; intra-model disagreement is the signal we want and itself escalates the row to human review.
Caching: inputs are SHA-256 hashed and identical hashes return the cached verdict instead of re-invoking the model, giving provable equality on demand.
Pre-merge gate: every methodology version is run against a versioned benchmark suite; a regression blocks the version bump.
Output per individual: exposure likelihood (HIGH/MEDIUM/LOW), confidence score 0–100 and tier, reasoning steps, plain-English rationale, and uncertainty flags.

Your compliance team makes the final call on every individual.

Review the gap report and approve declarations

10–30 minutes depending on register size

The gap report shows every individual's provisional AI classification — High, Medium, or Low exposure likelihood — with a plain-English explanation of the reasoning. High-exposure individuals are recommended for immediate action; Medium-confidence classifications are flagged for mandatory human review. Compliance staff can review and override any AI classification — upgrading, downgrading, or confirming the provisional rating with a documented reason for any change. Override decisions are recorded in the audit trail.

CoverProof does not send any declaration without explicit approval. The review step is mandatory. This is not a configuration option.

Send zero-login declarations

Minutes — declarations send in parallel

Once approved, CoverProof generates a cryptographically random zero-login URL for each declaration recipient. The URL contains an unguessable server-stored token (UUIDv4 via crypto) that resolves to one specific declaration; the server checks expiry, cancellation, and single-use state before accepting a submission. Recipients click the link, review the declaration text, and submit. No account creation. No password. No friction.
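As a rough illustration of the token lifecycle described above, the following Python sketch issues a UUIDv4 token and applies the expiry, cancellation, and single-use checks on submission. The names `Declaration`, `issue`, and `submit`, the in-memory `store`, and the returned status strings are all hypothetical; only the checks themselves and the 30-day default expiry come from this page.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Declaration:
    token: str
    expires_at: datetime
    cancelled: bool = False
    submitted_at: Optional[datetime] = None


def issue(now: datetime, valid_days: int = 30) -> Declaration:
    # uuid4() takes its random bits from os.urandom, the operating system's CSPRNG.
    return Declaration(token=str(uuid.uuid4()), expires_at=now + timedelta(days=valid_days))


def submit(store: dict, token: str, now: datetime) -> str:
    """Accept a submission only if the token is known, live, and unused."""
    decl = store.get(token)
    if decl is None:
        return "unknown"
    if decl.cancelled:
        return "cancelled"
    if now >= decl.expires_at:
        return "expired"
    if decl.submitted_at is not None:
        return "already_submitted"
    decl.submitted_at = now  # server-side timestamp, never supplied by the client
    return "accepted"
```

Because the token is only a lookup key into server-side state, revoking or expiring a declaration needs no change to the link itself.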

Declaration emails are sent via Resend with delivery tracking. CoverProof records: send timestamp (UTC), delivery confirmation, open event, completion timestamp, IP address at completion. All timestamps are server-side — not self-reported by the recipient.

Monitor completion with RAG tracking

Ongoing — dashboard view at any time

Every declaration has a RAG (Red/Amber/Green) status. Green: completed. Amber: sent but not yet completed, within the expiry window. Red: not completed and approaching or past expiry. CoverProof sends automated expiry reminders and allows one-click re-sending to non-responders. The evidence pack records every re-send attempt — creating a contemporaneous record of the firm's documented compliance activity.

Declaration expiry periods are configurable. The default is 30 days from send date. Expired declarations that have not been completed trigger an automatic alert to the compliance team.
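The RAG rules above reduce to a small function. The sketch below is illustrative only: the page does not say how close to expiry a declaration must be before it counts as "approaching", so the 7-day `APPROACHING` window is an assumption, as is the function name `rag_status`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical window: how close to expiry counts as "approaching" is not stated.
APPROACHING = timedelta(days=7)


def rag_status(
    sent_at: datetime,
    completed_at: Optional[datetime],
    now: datetime,
    expiry: timedelta = timedelta(days=30),  # default expiry period from this page
) -> str:
    if completed_at is not None:
        return "GREEN"  # completed
    if now >= sent_at + expiry - APPROACHING:
        return "RED"    # approaching or past expiry
    return "AMBER"      # sent, not completed, comfortably inside the window
```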

Download your PDF/A-3B board evidence pack

Seconds — generated on demand

CoverProof generates a PDF/A-3B document (ISO 19005-3 compliant) containing: the gap analysis report with all classifications and overrides; a declaration status log showing every send, re-send, completion, and expiry event; the SHA-256 document hash recorded in the CoverProof audit database at generation time; an RFC 3161 Timestamp Token from an RFC 3161-compliant TSP providing a third-party time anchor; and an immutable XML audit log embedded in the PDF.

PDF/A-3B is the archival standard for long-term document integrity. The SHA-256 hash recorded at generation allows verification at any future point that the file has not been modified. The RFC 3161 trusted timestamp provides an independent time anchor from an RFC 3161-compliant Trust Service Provider — meaning tamper-evidence does not rely solely on CoverProof's database. These properties are designed to meet the documentation standards required in legal and regulatory proceedings. Courts ultimately determine admissibility — the pack provides the strongest evidential foundation currently achievable for this type of compliance record.

Inside the classification pipeline

Every classification is the output of a versioned pipeline with explicit rules, thresholds, and reproducibility guarantees. Here is what runs on every row before a verdict is recorded.

01 · Statutory and taxonomic anchoring

The verbatim s.250(3) functional test is embedded in the prompt as a version-controlled artefact — not free text in a config file.
SM&CR role taxonomy maps every row against SMF holders, Certification Regime employees, and out-of-scope administrators, with explicit reasoning scaffolding for each.
Optional FCA Register cross-reference verifies live function codes against the authoritative source when API access is configured.

02 · Determinism, by design

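Temperature 0: sampling randomness is switched off, so the same prompt is expected to produce the same response; the input cache below is what makes a repeated verdict exactly identical.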
Structured Zod output schema (typed JSON, not free text) — no parsing ambiguity, no fallback to a chat response.
Methodology version pinned per classification; the system prompt is SHA-256 fingerprinted so any silent prompt edit fails the worker on the next run.
Input SHA-256 cache — identical inputs return the cached verdict without re-invoking the model. Provable equality on demand for any future audit.
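A minimal sketch of an input-hash cache of this kind is shown below. It is an illustration under stated assumptions, not the production code: the canonical JSON encoding, the sorting of function codes, and the names `input_hash` and `VerdictCache` are all hypothetical, and `classify` stands in for the model call.

```python
import hashlib
import json


def input_hash(name: str, role_title: str, function_codes: list, firm_context: str) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal inputs give equal bytes.
    # Assumption: the order of function codes is not meaningful, so they are sorted.
    payload = json.dumps(
        {
            "name": name,
            "role_title": role_title,
            "function_codes": sorted(function_codes),
            "firm_context": firm_context,
        },
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class VerdictCache:
    def __init__(self, classify):
        self._classify = classify  # stand-in for the model call
        self._store = {}
        self.model_calls = 0

    def get(self, **inputs):
        key = input_hash(**inputs)
        if key not in self._store:
            self.model_calls += 1
            self._store[key] = self._classify(inputs)
        return key, self._store[key]
```

A real cache would presumably also key on the methodology version, so that a version bump never serves a verdict produced under an older pipeline; that is an inference, not something this page states.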

03 · Confidence tiers with explicit routing thresholds

High Confidence — auto-passed, still reviewable from the queue.
Moderate Confidence — routed to the compliance reviewer queue.
Low Confidence — Review Required — mandatory human review before any declaration is sent.
Medium-tier self-consistency check — Medium rows are deliberately re-sampled three times at non-zero temperature; intra-model disagreement is itself the signal and escalates the row to human review.
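The tier routing above amounts to a threshold lookup on the confidence score. In the sketch below the boundary values 85 and 60 are placeholders, since the page states that the thresholds are fixed but does not publish them; the function name `route` and the route labels are likewise illustrative.

```python
# Hypothetical boundaries: the real thresholds are fixed but not published here.
HIGH_MIN = 85
MODERATE_MIN = 60


def route(confidence: int) -> tuple:
    """Map a 0-100 confidence score to (tier, route)."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence >= HIGH_MIN:
        return ("High Confidence", "auto_passed")       # still reviewable
    if confidence >= MODERATE_MIN:
        return ("Moderate Confidence", "reviewer_queue")
    return ("Low Confidence", "review_required")        # mandatory human review
```

Routing only decides where a row lands in the review workflow. It is separate from the approval gate, which applies to every declaration regardless of tier.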

04 · Pre-merge benchmark gate + post-deployment drift monitoring

Every candidate methodology version is run against a versioned suite of SM&CR fixtures before it ships. A regression on any fixture blocks the version bump.
After deployment, a deterministic 5% sample is re-classified by a judge model nightly; disagreement above a fixed threshold triggers an internal alert.
Every classification carries the methodology version it was produced under — an evidence pack a year from now is reproducible against the exact pipeline that generated it.
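One common way to make a sample deterministic is to hash each row's identifier and keep a fixed slice of the hash space, so the same rows are selected on every run. The sketch below shows that technique together with a disagreement-rate check. It is an assumption about how such a monitor could work, not a description of CoverProof's code; the 0.10 threshold is one reading of the 10pp figure given elsewhere on this page.

```python
import hashlib

SAMPLE_PERCENT = 5
ALERT_THRESHOLD = 0.10  # assumed reading of the 10pp disagreement-rate threshold


def in_nightly_sample(classification_id: str) -> bool:
    # Hash the id into one of 100 buckets; the same id always lands in the same bucket.
    digest = hashlib.sha256(classification_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < SAMPLE_PERCENT


def drift_alert(pairs: list) -> bool:
    """pairs holds (original_verdict, judge_verdict) for each sampled row."""
    if not pairs:
        return False
    disagreements = sum(1 for original, judge in pairs if original != judge)
    return disagreements / len(pairs) > ALERT_THRESHOLD
```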

05 · Human review gate — not a configuration option

Every declaration requires explicit compliance-officer approval before it is sent. The review step cannot be disabled.
Overrides are logged with reason, timestamp, and reviewer identity — recorded in the audit trail and embedded in the evidence pack.
The reviewer queue is part of the product, not an add-on or a configuration toggle.

The pipeline is the product. The AI is one component inside it.

On scope

The pipeline reasons over the structured data your register actually contains — role title, function codes, seniority indicators. Individual-level context (day-to-day responsibilities, employment contracts) is the compliance officer's domain on review.
The s.250(3) test is a functional test that courts apply to the facts of each case. CoverProof produces a structured risk-screening verdict against that test — one input to your compliance decision, not a legal opinion.
CoverProof complements independent legal advice on your firm's specific Section 250 position. It is not a substitute for it.

Technical specification

AI classification model	Claude Sonnet (Anthropic) — temperature 0, methodology version pinned per classification
Output format	Structured JSON via Anthropic SDK zodOutputFormat
Human review required	Yes — mandatory before any declaration is sent
Evidence pack format	PDF/A-3B (ISO 19005-3)
Tamper detection	SHA-256 cryptographic hash, database-recorded at generation
Trusted timestamp	RFC 3161 Timestamp Token from RFC 3161-compliant TSP — third-party anchor independent of CoverProof database
Audit trail location	Embedded as XML attachment in PDF/A-3B document
Event timestamps	Server-side UTC, not client-reported
Declaration delivery	Cryptographically random unique URL per recipient (UUIDv4 via crypto)
Delivery tracking	Send, delivery, open, completion — all server-recorded
Data residency	GDPR-compliant data processing with appropriate international transfer safeguards
Multi-tenancy isolation	PostgreSQL Row Level Security — tenant-isolated by design
FCA register source	Live FCA Developer API (when configured) — name-based individual lookup against approved persons register

Methodology questions

How does CoverProof determine who is in scope for Section 250?

CoverProof uses Claude (Anthropic's AI model) to assess each individual against the verbatim Section 250(3) statutory test: does the person play a significant role in (a) the making of decisions about how the whole or a substantial part of the activities of the organisation are to be managed or organised, or (b) the managing or organising of the whole or a substantial part of those activities? The test covers all organisational activities, not only financial ones, and applies regardless of FCA approval status. The AI reasons over role title, function codes, and seniority indicators. Every Medium- and Low-confidence classification is flagged for human review. Your compliance team makes the final decision: CoverProof does not send declarations without explicit approval.

What documentation standard do CoverProof evidence packs meet?

Four factors: (1) PDF/A-3B format — the ISO 19005-3 standard for archival documents with embedded metadata; (2) SHA-256 cryptographic hash — recorded in the CoverProof database at generation time, allowing verification that the document has not been modified; (3) RFC 3161 trusted timestamp — a Timestamp Token from an RFC 3161-compliant Trust Service Provider (TSP) is obtained at generation time, providing a third-party anchor to the audit chain that is independent of CoverProof's database; (4) Immutable audit trail — every declaration event is timestamped in UTC server-side and embedded as XML in the PDF. These properties are designed to meet the documentation standards required in legal and regulatory proceedings. Courts determine admissibility on the facts of each case — the pack provides the strongest evidential foundation currently achievable for this type of compliance record.

Can compliance staff override the AI classification?

Yes. Compliance staff can review and override any AI classification — upgrading or downgrading the provisional High/Medium/Low rating with a documented reason for the change. Every override is recorded in the audit trail with the user, timestamp, and stated reason. The evidence pack includes the full override log.

What happens if a declaration recipient does not respond?

CoverProof tracks non-response and allows one-click re-sending. Every re-send attempt is recorded with a timestamp. The evidence pack includes the full re-send history — a contemporaneous record of the firm's documented compliance activity, material to prosecutorial discretion even when a declaration is not ultimately obtained.

How does CoverProof integrate with the FCA Register?

When FCA API access is configured (using the free FCA Developer API at register.fca.org.uk/Developer), CoverProof cross-references each individual against the live FCA approved persons register during classification — verifying their registration status and SMF function codes against the authoritative source. This enrichment is passed to the AI classifier as additional context. Bulk import via the FCA Register Extract Service (RES) is on the product roadmap. Until FCA API access is configured, classifications proceed using the data in your uploaded register.

Same input → same output (and how to verify it)

Reproducibility is a load-bearing claim for an audit-grade compliance tool. If two classifications of the same individual on different days returned different verdicts without a methodology change, the evidence pack would be unverifiable. CoverProof controls this with five levers:

Temperature 0. The model is invoked with temperature 0, so the same prompt produces the same response (subject to the next two levers).
Model pinning. The model id is recorded with the classification (e.g. claude-sonnet-4-6). A model upgrade is a methodology change, not a transparent swap.
Methodology version pinning. Every classification records the methodology version (e.g. s250-v12) it was produced under. The version pins the prompt content, the statute extract, the schema, and the verification gates.
Prompt fingerprinting. The system prompt is SHA-256 hashed and stored alongside the methodology version. Any silent edit produces a mismatch detected by the worker on next run.
Input hash cache. The user-facing inputs (name, role title, function codes, firm context) are SHA-256 hashed before the model call. An identical hash returns the cached verdict rather than re-invoking the model — provable equality, no second call needed.
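The prompt-fingerprinting lever can be illustrated with a short check that a worker might run before classifying. This is a sketch under assumptions: the `registry` mapping, the exception class, and the function names are hypothetical, and the prompt text in the usage line is a placeholder.

```python
import hashlib


class PromptFingerprintMismatch(RuntimeError):
    """Raised when the prompt no longer matches its recorded fingerprint."""


def fingerprint(prompt: str) -> str:
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def assert_prompt_unchanged(methodology_version: str, prompt: str, registry: dict) -> str:
    # registry maps a methodology version (e.g. "s250-v12") to the hash stored at release.
    expected = registry[methodology_version]
    actual = fingerprint(prompt)
    if actual != expected:
        raise PromptFingerprintMismatch(
            f"{methodology_version}: expected {expected[:12]}, got {actual[:12]}"
        )
    return actual
```

For example, with `registry = {"s250-v12": fingerprint("PROMPT TEXT")}`, calling `assert_prompt_unchanged("s250-v12", "PROMPT TEXT", registry)` returns the hash, while even a single added space in the prompt raises the mismatch error.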

How to verify on your data: in your gap-analysis review table, click any classified row to expand it. The row shows its input hash, methodology version, and model id. Re-importing the same SM&CR file does not produce new classifications — the file’s SHA-256 is detected and the import is short-circuited. The same row would yield the same verdict on demand.

The one deliberate exception

MEDIUM-tier classifications — the ambiguous middle band — are deliberately re-sampled three times at a non-zero temperature to detect intra-model disagreement. The disagreement itself is the signal we want; rows where the samples disagree are flagged review_required and escalate for human review. This is the only non-deterministic step in the pipeline, and it is non-deterministic by design.
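The self-consistency step can be sketched as a small loop. The sample count (3) and temperature (0.3) are the values stated elsewhere on this page; the function name, the return shape, and the `sample` callable standing in for a model call are assumptions made for the example.

```python
from typing import Callable

K_SAMPLES = 3             # sample count stated on this page
SAMPLE_TEMPERATURE = 0.3  # temperature stated on this page


def self_consistency(sample: Callable[[float], str]) -> dict:
    """Draw K_SAMPLES verdicts; any disagreement flags the row for review."""
    verdicts = [sample(SAMPLE_TEMPERATURE) for _ in range(K_SAMPLES)]
    unanimous = len(set(verdicts)) == 1
    return {"verdicts": verdicts, "review_required": not unanimous}
```

The sketch treats anything short of unanimity as disagreement, which is the strictest possible reading; whether the product uses unanimity or a majority rule is not stated here.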

Scope of CoverProof

An honest scope statement is more useful than a long feature list. Here is what CoverProof does, and where it deliberately stops — so a compliance director can place it correctly inside their existing process.

This methodology has not been reviewed by external legal counsel. CoverProof does not provide legal advice; firms should seek independent advice on their s.250 compliance obligations.

Not legal advice. CoverProof is a structured workflow that documents your firm’s reasonable steps under Section 250. It does not, and is not authorised to, advise on the application of the law to your facts. Your qualified legal advisers do that.
Not a regulatory determination. The AI classification produces a risk-screening verdict against the s.250(3) functional test. It is not a finding by the FCA, by a court, or by any other competent authority. Treat it as one input to your decision, not the decision.
Not a substitute for the s.250 functional test. The statutory test is fact-dependent — actual responsibilities, actual decision-making authority, actual scope of activities. CoverProof reasons over role title, function codes, and seniority indicators because those are the data we have. The model cannot see the individual’s employment contract or actual day-to-day responsibilities.
The s.250 test and SM&CR approval are distinct questions. Whether an individual meets the s.250(3) functional “senior manager” test, and whether they hold an FCA Senior Management Function, are two independent questions. Holding an SMF does not by itself settle the s.250(3) test, and lacking one does not by itself make a person s.250-exposed. Identifying an SM&CR-uncovered significant-role individual as a gap is a reasoned inference from the distance between those two questions — not an equation written into the statute. A court decides the functional test on the facts.
Not per-firm tuned. Thresholds (confidence tiers, drift alert cutoff, self-consistency escalation) are fixed across all firms. Per-firm tuning would weaken the “uniform methodology” defence in any future challenge — it would let an adversary argue that a verdict turned on a configurable knob rather than on the statutory test.
Not retroactively revised. A classification produced under methodology version v12 stays a v12 classification. If we ship v13, prior rows are not silently re-classified. A board evidence pack generated under v12 is still verifiable as a v12 record years later.
Designed to sit alongside your compliance review. Every CoverProof classification flows through your team’s reviewer queue before any declaration is sent. The internal benchmark catches drift in the model on known scenarios; your reviewers catch anything specific to your firm’s register. The two layers together are the product.
Not a boundary that excludes our sub-processors. Your data is processed by the sub-processors listed in the Trust Centre; the Data Processing Agreement (DPA) documents the terms and the safeguards.

When the sources disagree

CoverProof reads three sources of truth about who is at your firm: the SM&CR register you upload, the live FCA Register (queried through the FCA Developer API when access is configured), and any in-product corrections the compliance officer (CO) makes. These rarely agree perfectly. CoverProof never silently reconciles a disagreement; the policy is explicit.

SM&CR upload — your firm’s declared truth. What you uploaded is what your compliance officer has signed off as in scope today.
FCA Register — what the regulator records. The FCA’s view of approved-persons status at the firm. Useful for cross-checking SMF approval, but it does not capture everyone who meets the s.250(3) functional test.
CO in-product corrections — final. When the CO confirms, overrides, or excludes a row in the review table, that decision becomes the audit-trail record for the gap analysis.

When the SM&CR upload and the FCA Register disagree on who is at the firm, both populations are surfaced as in-scope for review. Names appearing only in the SM&CR upload are reviewed alongside names appearing only in the FCA Register. The CO reconciles either by accepting the row (confirm), substituting a different verdict (override), or removing it from the declaration list (exclude).

The reconciliation is recorded in the audit trail. A 2028 auditor reading the evidence pack can see which source each individual was first surfaced from and which decision the CO made.
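The surfacing rule described above is a set union with each name tagged by where it came from, followed by a logged CO decision. The sketch below is illustrative: matching on name alone is a simplification (a real system would more plausibly match on IRN), and the function names, source labels, and audit-record fields are hypothetical.

```python
ALLOWED_DECISIONS = {"confirm", "override", "exclude"}


def surface_for_review(upload: set, fca_register: set) -> dict:
    """Union of both populations, tagged by source. Nobody is silently dropped."""
    surfaced = {}
    for name in sorted(upload | fca_register):
        if name in upload and name in fca_register:
            surfaced[name] = "both"
        elif name in upload:
            surfaced[name] = "upload_only"
        else:
            surfaced[name] = "fca_register_only"
    return surfaced


def record_decision(audit_log: list, name: str, first_source: str,
                    decision: str, reviewer: str, at_utc: str) -> None:
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    # Append-only: the log keeps which source surfaced the row and what the CO decided.
    audit_log.append({"name": name, "first_source": first_source,
                      "decision": decision, "reviewer": reviewer, "at_utc": at_utc})
```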

Fixed thresholds, by design

Several numbers in CoverProof are configurable in the implementation but not in the product. We use the same thresholds for every firm and we do not let them be tuned at the firm or contract level.

The confidence tier boundaries that drive review-required escalation.
The 10pp disagreement-rate alert threshold on the drift monitor (see Quality & drift).
The MEDIUM-tier self-consistency sample count (k=3) and temperature (0.3).
The counsel-review trigger cutoffs (see below).

Why fixed: per-firm tuning would let an adversary later argue that a verdict turned on a configurable knob rather than on the statutory test. The uniform-methodology defence is part of the product’s value — trading it for configurability would be a downgrade, not a feature.

When the product recommends counsel review

CoverProof surfaces a non-blocking “Consider commissioned counsel review” banner on the gap-analysis page when one or more of three fixed conditions hold. The triggers are deterministic, evaluated on each page load, and disclosed here in full. Counsel review is always available; the product flags specific situations where it is worth the cost of asking.

An AI low-confidence row was confirmed without override. The override path is how the CO records disagreement with the AI. Confirming a low-confidence row without using it means the CO accepted the AI’s uncertainty as the firm’s position. Counsel can advise whether that is the right call for the specific boundary case.
Deadline pressure. More than five declarations have bounced, or more than 20% remain pending within 14 days of the 29 June 2026 deadline. The board pack generated in this state may understate residual exposure; counsel input ensures the pack accurately reflects the programme’s actual coverage.
A low-confidence individual is being declared rather than excluded. The firm has staked part of its defence on an uncertain AI call. Counsel sign-off before the evidence pack is minuted is recommended.

The banner is advisory and never blocks declarations or evidence-pack generation. It does not replace your firm’s standing relationship with qualified counsel.
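Because the three triggers are deterministic, they can be written as a pure function of the review state. The sketch below is one possible reading, with several assumptions: the `Row` fields and reason labels are hypothetical, the bounce count is treated as applying at any time, and the 20% pending test is applied only from 14 days before the deadline onwards.

```python
from dataclasses import dataclass
from datetime import date, timedelta

DEADLINE = date(2026, 6, 29)


@dataclass
class Row:
    tier: str       # "high", "moderate", or "low" confidence
    decision: str   # "confirm", "override", "exclude", or "pending"
    declared: bool  # True if the individual is on the declaration list


def counsel_triggers(rows: list, bounced: int, pending: int, sent: int, today: date) -> list:
    reasons = []
    if any(r.tier == "low" and r.decision == "confirm" for r in rows):
        reasons.append("low_confidence_confirmed")
    near_deadline = today >= DEADLINE - timedelta(days=14)
    # The sent > 0 guard avoids dividing by zero before anything has been sent.
    if bounced > 5 or (near_deadline and sent > 0 and pending / sent > 0.20):
        reasons.append("deadline_pressure")
    if any(r.tier == "low" and r.declared for r in rows):
        reasons.append("low_confidence_declared")
    return reasons
```

An empty list means no banner. Any non-empty list would show the advisory banner without blocking anything.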

Primary sources

The statutory and regulatory materials this methodology relies on. Every claim about scope, applicability, or the s.250(3) test should be verified against these directly.

Crime and Policing Act 2026, s.250 — legislation.gov.uk — the verbatim enacted text of the senior-manager attribution provision.
FCA — Senior Managers and Certification Regime (SM&CR) — the regulatory framework that defines the SM&CR baseline against which the s.250(3) test is gap-analysed.
FCA Handbook — SYSC — primary source for Senior Manager Function (SMF) definitions used in classification.
FCA Financial Services Register — the authoritative public record of FCA-approved persons; used at classification time when FCA API access is configured.
FCA Developer Portal — free authenticated API access to the Register, used by CoverProof for live SMF cross-referencing during gap analysis.

The evidence standard: PDF/A-3B

PDF/A-3B (ISO 19005-3) is the international standard for archival documents intended for long-term preservation. It requires: embedded metadata, no external dependencies (fonts, images, and content must be embedded), and a self-describing document structure. CoverProof generates to this standard because evidence packs may be required in proceedings years after the 29 June 2026 deadline.

The SHA-256 hash is recorded in the database at generation time. If your evidence pack is ever challenged, for example if someone claims the document was altered after the fact, you can recompute the file's SHA-256 and show that it matches the recorded hash, which demonstrates the file is byte-for-byte unmodified. This is the same tamper-evidence mechanism used in legal e-discovery.
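Checking a pack against its recorded hash needs nothing beyond a standard library. The following Python sketch shows the general technique; the function names are illustrative, and how the recorded hash is retrieved from CoverProof is outside its scope.

```python
import hashlib
import hmac


def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Stream the file in chunks so large packs never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path: str, recorded_hex: str) -> bool:
    # Any single-byte change to the file produces a different digest.
    return hmac.compare_digest(sha256_file(path), recorded_hex.strip().lower())
```

A match shows that the file in hand is byte-identical to the one hashed at generation. It says nothing on its own about when that happened, which is the gap the RFC 3161 timestamp is there to fill.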

The first gap analysis is free — see your Section 250 exposure in under 10 minutes.
