Governance

The Decision Log as an AI Model Governance Artefact

April 17, 2026

The decision log is the shortest artefact in AI governance and often the most valuable. It records, in a structured way, the decisions made about a model: why the model was built, what alternatives were considered, what data was used, what risks were identified, what controls were imposed, and what acceptance criteria were applied. When a regulator asks why your firm is using this model for this purpose, the decision log is the document you open first.

This post covers the decision log as a governance artefact for AI models, informed by the EU AI Act's obligations for high-risk systems, the PRA's Supervisory Statement SS1/23 on model risk management, and the emerging supervisory expectations under DORA for the operational resilience of AI systems. The patterns apply to any model used in a regulated decision path.

Why the decision log exists as a category

Firms produce many decision documents: product approvals, policy decisions, architectural choices, vendor selections. In traditional model risk management, these are scattered across committee minutes, review documents, email threads and memoranda. The decision log consolidates them into one structured record, organised by decision rather than by committee or document type.

The consolidation matters for three reasons. First, auditability: one authoritative record is easier to defend than a forensic reconstruction. Second, consistency: the structure enforces a quality bar on every decision. Third, retrievability: when a decision is challenged months or years later, finding the record is trivial rather than archaeological.

For the broader context on how decisions sit in AI governance, see our posts on decision rights in AI-native operating models and embedded governance vs bolt-on.

What a good decision log entry contains

Each entry in the log has eight fields. The fields are the quality bar. An entry missing a field is incomplete.

1. Decision reference

A stable identifier that does not change once assigned. Decisions are often referenced from other artefacts (the model risk assessment, the change impact assessment, the audit working papers), so a reference that changes or is reused breaks those links.

2. Decision summary

One or two sentences describing what was decided. Short enough to be scannable. Specific enough to be unambiguous. "Approved the use of the credit scoring model v4.2 for unsecured lending to UK retail customers with transaction values between 500 and 25,000 GBP, subject to the controls in section X."

3. Context and rationale

Why the decision was needed. What the business case was. What constraints applied. The context is usually short (one to three paragraphs) and answers the question "why are we doing this".

4. Alternatives considered

What other options were evaluated and why they were not selected. This is the field most commonly absent from weak decision logs. Without alternatives, the log records a choice but not a deliberation. Regulators and auditors value the deliberation evidence because it signals genuine consideration rather than post-facto rationalisation.

5. Assumptions

What assumptions underpinned the decision. Assumptions change. Recording them makes the decision testable when the assumptions are questioned later. "Assumed monthly volume of 200,000 applications based on current pipeline; decision may need revisiting if volume exceeds 400,000, given operational processing capacity."

6. Risks and controls

What risks were identified and what controls are in place to manage them. This is where model risk management discipline shows up. Identified risks (bias, drift, data quality, adverse selection, gaming) with the specific controls (monitoring metrics, review cadence, override protocols) that manage them.

7. Decision maker and governance route

Who made the decision and under what authority. The accountable individual (named, not a committee) and the governance forum that approved or endorsed the decision. Under SMCR in the UK, this ties to the named SMF with the relevant accountability.

8. Evidence and supporting artefacts

References to the underlying documents: model validation report, data quality assessment, fairness analysis, stress test results, pricing impact analysis. The decision log is the index. The evidence lives elsewhere. The log references it.
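The eight fields can be treated as a schema and the "missing field means incomplete" rule checked mechanically. The following is a minimal sketch, not a prescribed implementation: the class name, attribute names and the `missing_fields` helper are assumptions made for illustration, and the sample values are placeholders.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DecisionLogEntry:
    """One log entry. Attribute names are illustrative, one per field above."""
    reference: str = ""
    summary: str = ""
    context_and_rationale: str = ""
    alternatives_considered: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    risks_and_controls: dict[str, str] = field(default_factory=dict)  # risk -> control
    decision_maker_and_route: str = ""
    evidence: list[str] = field(default_factory=list)  # references only; artefacts live elsewhere


def missing_fields(entry: DecisionLogEntry) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    missing = []
    for f in fields(entry):
        value = getattr(entry, f.name)
        if isinstance(value, str):
            value = value.strip()
        if not value:
            missing.append(f.name)
    return missing


# Placeholder draft: four fields populated, four left empty.
draft = DecisionLogEntry(
    reference="DEC-EXAMPLE-001",
    summary="Approved model X for purpose Y.",
    context_and_rationale="Existing model shows increasing drift.",
    decision_maker_and_route="Named accountable individual; endorsed by committee.",
)
print(missing_fields(draft))
```

A check like this only catches empty fields. Whether the alternatives are meaningful still needs a human reviewer.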

Specific decision types for AI models

Not all model decisions are the same. A decision log covers several distinct decision types, each with slightly different content expectations.

Model selection decisions

The decision to use this specific model for this specific purpose. The alternatives considered section is critical (what other models or approaches were evaluated). The rationale section explains why this model's characteristics are the best fit. The risks section covers model-specific risks (training data provenance, fairness profile, explainability level, drift susceptibility).

Scope and use decisions

The decision to use the model for specific customer segments, product types, or decision paths. Scope changes are decisions in their own right. A model approved for unsecured lending between 500 and 10,000 GBP being extended to 25,000 GBP is a new decision requiring new rationale.
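One way to make "scope changes are decisions" operational is to compare a proposed scope against the approved one and require a new log entry whenever the proposal falls outside it. This is a sketch under assumptions: the `Scope` type, its attributes and `needs_new_decision` are hypothetical names, and the product and segment labels are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """Approved or proposed scope of use (hypothetical structure)."""
    product: str
    segment: str
    min_amount_gbp: int
    max_amount_gbp: int


def needs_new_decision(approved: Scope, proposed: Scope) -> bool:
    """True when the proposed scope is not fully contained in the approved scope."""
    contained = (
        proposed.product == approved.product
        and proposed.segment == approved.segment
        and proposed.min_amount_gbp >= approved.min_amount_gbp
        and proposed.max_amount_gbp <= approved.max_amount_gbp
    )
    return not contained


approved = Scope("unsecured lending", "UK retail", 500, 10_000)
extended = Scope("unsecured lending", "UK retail", 500, 25_000)
narrowed = Scope("unsecured lending", "UK retail", 1_000, 5_000)

print(needs_new_decision(approved, extended))  # True: the upper limit exceeds the approved 10,000 GBP
print(needs_new_decision(approved, narrowed))  # False: fully inside the approved range
```

Whether a narrowing of scope should also trigger a new entry is a policy choice. The sketch only treats extensions as new decisions.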

Control and monitoring decisions

The decision to impose specific controls (human-in-the-loop thresholds, override policies, champion-challenger structures, monitoring metrics). These decisions are often where the operational detail lives and where the alignment with the EU AI Act's high-risk provisions is demonstrated.

Threshold and parameter decisions

The decision to set specific thresholds (cut-off scores, confidence bands, override rates). Threshold decisions are often revisited. The log preserves the rationale for each threshold at the time it was set, which is essential when the threshold is questioned later.
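Preserving the rationale "at the time it was set" implies that threshold decisions are appended rather than overwritten, so the value in force on any past date can be recovered with its reasoning. A minimal sketch follows, assuming a hypothetical `ThresholdDecision` record. The references, values and rationales are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ThresholdDecision:
    """One threshold decision; earlier records are never edited or deleted."""
    decision_reference: str
    parameter: str
    value: float
    effective_from: date
    rationale: str


def threshold_in_force(
    history: list[ThresholdDecision], parameter: str, on: date
) -> Optional[ThresholdDecision]:
    """Return the most recent decision for `parameter` effective on or before `on`."""
    candidates = [
        d for d in history if d.parameter == parameter and d.effective_from <= on
    ]
    return max(candidates, key=lambda d: d.effective_from, default=None)


# Hypothetical history for a cut-off score.
history = [
    ThresholdDecision("THR-EXAMPLE-001", "cut_off_score", 620.0, date(2025, 1, 1),
                      "Set at initial approval against target bad rate."),
    ThresholdDecision("THR-EXAMPLE-002", "cut_off_score", 640.0, date(2026, 6, 1),
                      "Raised after portfolio review."),
]

in_force = threshold_in_force(history, "cut_off_score", date(2025, 9, 30))
print(in_force.decision_reference, in_force.value, in_force.rationale)
```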

Retirement and replacement decisions

The decision to retire a model or replace it with a new one. Often under-documented. The replacement decision needs its own rationale, its own alternatives analysis, and its own risk assessment.

Worked example: a credit scoring model approval

Consider a bank approving a new gradient-boosted credit scoring model for unsecured lending.

Decision reference: MOD-APR-2026-012

Decision summary: Approved the use of credit scoring model CS-UNSEC-v4.2 for unsecured lending applications between 500 and 15,000 GBP from UK retail customers, subject to the controls and monitoring described under Risks and controls below. Effective from 2026-06-01 for a one-year review period.

Context and rationale: The existing model (CS-UNSEC-v3.8) has shown increasing drift since Q3 2025, as reflected in the monthly PSI metrics and the declining Gini coefficient. The proposed model v4.2 is a gradient-boosted refresh trained on the 2022 to 2025 portfolio and validated under the internal framework equivalent to SR 11-7, with a 6 percent improvement in Gini on the holdout sample. The refresh is aligned with the credit strategy set out in the 2026 Credit Risk Committee pack.

Alternatives considered:

Retraining the existing logistic regression model: rejected due to structural inability to capture non-linear relationships that the portfolio now exhibits.
Adopting vendor scorecard X: rejected due to transparency limitations and the vendor's refusal to provide sufficient documentation for internal model risk management.
Extending the current model's scope rather than replacing: rejected because extension would not address the observed drift.
Delaying the refresh to Q4 2026 to align with the data architecture programme: rejected because the current drift is already at the high end of the tolerance range and further delay would require a risk acceptance.

Assumptions:

Production volumes remain within the range used for stress testing (up to 400,000 applications per month).
The underlying data pipeline maintains quality at or above the SLA thresholds specified in the data quality assessment.
The regulatory environment does not impose changes within the review period that would require model redevelopment.

Risks and controls:

Bias risk on protected characteristics: controlled by monthly fairness analysis per the fairness framework, with a 2 percentage point threshold on the absolute difference in adverse action rates between demographic groups.
Drift risk: controlled by weekly PSI monitoring with a 0.15 action threshold and 0.25 escalation threshold.
Data quality risk: controlled by the data quality dashboard, with five data quality metrics monitored daily.
Operational risk on model deployment: controlled by the model deployment pipeline per the MLOps standard, with rollback capability tested.
Human oversight per EU AI Act Article 14: control is a 100 percent human review of declined applications above 10,000 GBP until the first post-deployment review.
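The drift control in the list above can be sketched in a few lines. This uses the standard population stability index, PSI = sum over bins of (actual share - expected share) * ln(actual share / expected share), together with the 0.15 and 0.25 thresholds from the entry. The function names, the small floor applied to empty bins, the treatment of the thresholds as inclusive, and the sample distributions are all assumptions made for illustration.

```python
import math


def population_stability_index(expected, actual, floor=1e-6):
    """PSI between two distributions given as per-bin counts or shares.

    Both inputs must use the same binning. `floor` avoids log(0) for
    empty bins (an illustrative choice, not a mandated one).
    """
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    e_total, a_total = sum(expected), sum(actual)
    psi = 0.0
    for e, a in zip(expected, actual):
        e_share = max(e / e_total, floor)
        a_share = max(a / a_total, floor)
        psi += (a_share - e_share) * math.log(a_share / e_share)
    return psi


def drift_status(psi, action=0.15, escalation=0.25):
    """Map a PSI value onto the action and escalation thresholds (inclusive, by assumption)."""
    if psi >= escalation:
        return "escalate"
    if psi >= action:
        return "action"
    return "within tolerance"


# Illustrative score-band shares: development sample vs. the latest week.
development = [0.25, 0.25, 0.25, 0.25]
latest_week = [0.10, 0.20, 0.30, 0.40]

psi = population_stability_index(development, latest_week)
print(round(psi, 3), drift_status(psi))
```

Recording the threshold values as parameters rather than constants keeps the monitoring code aligned with whatever the log entry in force specifies.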

Decision maker and governance route: Approved by the Chief Risk Officer under delegated authority from the Credit Risk Committee. Endorsed by the Model Risk Committee on 2026-04-10. Noted at the Risk and Compliance Committee on 2026-04-17.

Evidence and supporting artefacts:

Model validation report MV-2026-027
Data quality assessment DQA-2026-014
Fairness analysis FAIR-2026-008
Stress test results ST-CS-2026-003
Pricing impact analysis PIA-2026-006
Deployment plan DPL-2026-019

Each field is concrete. An auditor opening the log 18 months later can follow the trail to every supporting artefact and understand the decision in its full context.

Common failure modes

Failure: the missing alternatives

Entries describe what was decided without describing what was considered. The deliberation evidence is absent. Auditors and regulators infer (correctly) that alternatives were not seriously evaluated. Fix: alternatives is a mandatory field, and reviewers reject entries that do not populate it meaningfully.

Failure: the committee-minutes log

The log is a reconstruction of committee minutes organised by committee rather than by decision. Finding a specific decision requires searching. Fix: the log is organised by decision with a stable reference. Committee minutes are supporting artefacts referenced from the decision, not the decision itself.

Failure: the decision without an owner

Entries are recorded as "decided by the committee" without a named accountable individual. Accountability is diffused. Under SMCR, accountability must be personal. Fix: record a named individual who attests to the decision, with the committee as the endorsement route.

Failure: the un-revisited decision

The decision is recorded and never revisited. Three years later the context has changed, the assumptions are invalid, but the decision is still in force because no one has re-examined it. Fix: decisions with expiry dates or review triggers. Decisions without a review date should be flagged and either given one or closed.
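The flagging rule in this fix is simple to automate. The sketch below assumes a hypothetical `LoggedDecision` record with an optional `review_due` date. The references and dates are illustrative, and treating a review that falls due today as already due is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LoggedDecision:
    """Minimal view of a log entry for review tracking (hypothetical fields)."""
    reference: str
    review_due: Optional[date] = None
    closed: bool = False


def review_flags(decisions: list[LoggedDecision], today: date) -> dict[str, str]:
    """Flag open decisions that have no review date or whose review is due."""
    flags = {}
    for d in decisions:
        if d.closed:
            continue  # closed decisions are no longer in force
        if d.review_due is None:
            flags[d.reference] = "no review date"
        elif d.review_due <= today:
            flags[d.reference] = "review due"
    return flags


# Illustrative entries and dates.
log = [
    LoggedDecision("DEC-EXAMPLE-A", review_due=date(2027, 6, 1)),
    LoggedDecision("DEC-EXAMPLE-B"),                            # never given a review date
    LoggedDecision("DEC-EXAMPLE-C", review_due=date(2025, 3, 31)),
    LoggedDecision("DEC-EXAMPLE-D", closed=True),
]

print(review_flags(log, today=date(2026, 4, 17)))
```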

Failure: the log as an afterthought

The decision was made, the implementation proceeded, the log entry was written retrospectively as an exercise in documentation. The log reflects what happened, not what was deliberated. Fix: the log entry is written as part of the decision process, not after it. A decision is not made until the log entry exists.

Connection to the broader stack

The decision log connects to:

The requirements traceability matrix: decisions often relate to specific requirements.
The RAID log: an accepted risk is a decision, so each risk acceptance is also recorded in the decision log.
The business rules catalogue: rule changes are decisions.
The change request artefacts: formal change requests lead to decisions.
The model risk management framework, the AI governance framework, and the regulatory compliance transformation service.

External references

The NIST AI Risk Management Framework provides the US framework for AI risk documentation. The ICO guidance on AI and data protection covers the UK data protection regulator's expectations. The Federal Reserve SR 11-7 on model risk management remains the canonical model risk reference and has influenced PRA SS1/23.

The short version

The decision log is the index of the AI governance stack. A well-maintained log turns audits, supervisory reviews and internal challenge sessions into straightforward conversations because the reasoning is preserved. A missing or weak log turns every question into a forensic investigation.

The discipline is cheap. Eight fields per entry, written as decisions are made, with real alternatives and named owners. The cost of maintaining it is a fraction of the cost of defending a decision three years later from scattered evidence.

Our AI enablement service and regulatory compliance transformation service include the decision log discipline as part of the AI governance operating model. For the wider embedded governance framing, see governance that accelerates AI deployment.
