CPS 230 for AI: the operational playbook APRA expects you to have

CPS 230 commenced on 1 July 2025. The standard does not mention artificial intelligence, but it regulates every material AI decision your organisation makes.

Published 2026-04-24 · 12 min read · By PolyGovern

The first annual material service provider register was due to APRA by 1 October 2025. Pre-existing contracts must be compliant by 1 July 2026. APRA Chair John Lonsdale told the Australian Banking Association Conference on 24 July 2025 that prudential reviews are running, starting with significant financial institutions and extending to non-SFIs through 2026 and into 2027.

This post converts the standard into artefacts a CRO can act on: a register that classifies foundation-model vendors, tolerance levels that hold up on probabilistic systems, an exit strategy that acknowledges foundation-model lock-in, and a board pack your regulator will recognise.

Key dates

| Date | Milestone |
| --- | --- |
| 1 July 2025 | CPS 230 commenced |
| 1 October 2025 | First annual MSP register due to APRA |
| 10 December 2025 | APRA opened the NTSP consultation |
| 30 January 2026 | NTSP submissions closed |
| 1 July 2026 | Pre-existing service provider contracts must comply |

What CPS 230 requires

CPS 230 consolidates five earlier standards (CPS 231, CPS 232, SPS 231, SPS 232, and HPS 231) and applies to every APRA-regulated entity: ADIs, general and life insurers, private health insurers, RSE licensees, and authorised NOHCs.

Three obligations sit at the core:

  • Operational risk management across every process that delivers a critical operation.
  • Business continuity, including documented tolerance levels and tested severe-but-plausible scenarios.
  • Service provider management, including a maintained material service provider register and credible exit strategies.

The board is accountable. Senior management is responsible for implementation. CPG 230, the supervisory guidance APRA finalised in June 2024, is the document APRA staff read before a prudential review.

Artificial intelligence is not a critical operation

Listing "artificial intelligence" on a register is a category error, because AI is an input, not an operation. The critical operation is the decision it supports: a fraud alert, a credit decision, a claims triage, a member-advice interaction.

| Example | Critical? | Why |
| --- | --- | --- |
| Credit-decisioning model in retail lending | Yes | The decision affects customer access to finance |
| General member chatbot with always-human escalation | Maybe | Depends on whether silent misrouting breaches tolerance |
| Internal GenAI summariser of adviser notes | No | Output is internal, not a regulated decision |

The first job of a CPS 230 AI programme is to make this judgement consistently. The second is to document it, because the opening question in any prudential review is "show me the reasoning."

When ANZ migrated its operational-risk stack into ServiceNow to meet CPS 230, the exercise mapped 19 critical operations across roughly 43,000 supporting resources and 16 non-financial risk themes. That is what a tier-one bank's register looks like in practice.

Classifying AI vendors on the MSP register

APRA defines a material service provider as one the entity relies on for a critical operation, or one that exposes the entity to material operational risk. Third, related, and connected parties all count. APRA released the register template in October 2024 and the first annual submission was due on 1 October 2025. PolyGovern's third-party AI risk practice classifies vendors against the test below.

Modern AI deployments rarely have a two-party structure. They usually have four tiers:

| Tier | Provider | Materiality trigger |
| --- | --- | --- |
| 1 | SaaS the entity contracts with (fraud vendor, wealth platform, claims automation) | Material whenever the SaaS supports a critical operation |
| 2 | Cloud provider hosting the SaaS (AWS, Azure, GCP) | Material where the cloud provider carries availability of the critical operation |
| 3 | Foundation-model API (OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex) | Material where model output meaningfully shapes the decision |
| 4 | Training-data vendors, model hosts, fine-tuning providers | Material where fourth-party failure reaches the critical operation |

CPG 230 is explicit that fourth parties should sit on the register when they are material. That covers the cases where a licensed SaaS silently routes prompts or embeddings to a foundation-model provider the bank never contracted with. Regulated customers of that SaaS carry a documented exposure they did not sign up for, and the register must reflect it.
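
Before applying the five-question test, it helps to see what a register entry has to carry for fourth parties to be visible at all. The sketch below is illustrative only; the `Tier` enum, `ProviderEntry`, and the field names are assumptions for this post, not the APRA register template.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum

class Tier(Enum):
    SAAS = 1              # Tier 1: SaaS the entity contracts with
    CLOUD = 2             # Tier 2: cloud provider hosting the SaaS
    FOUNDATION_MODEL = 3  # Tier 3: foundation-model API behind the SaaS
    FOURTH_PARTY = 4      # Tier 4: training-data vendors, model hosts, fine-tuning providers

@dataclass
class ProviderEntry:
    name: str
    tier: Tier
    critical_operation: str    # the decision the provider supports, in CPS 230 language
    contracted_directly: bool  # False for the silently routed fourth party
    materiality_trigger: str   # why the entry crossed the threshold

@dataclass
class MspRegister:
    entries: list[ProviderEntry] = field(default_factory=list)

    def undisclosed_fourth_parties(self) -> list[ProviderEntry]:
        """Material fourth parties the entity never contracted with directly."""
        return [e for e in self.entries
                if e.tier is Tier.FOURTH_PARTY and not e.contracted_directly]
```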

Five questions for every AI vendor

Two "yes" answers means the provider belongs on the register.

  1. Does the AI output support or constitute a decision on a critical operation?
  2. Is the AI output visible to a customer, a member, or a regulator?
  3. Is the entity unable to substitute the AI provider within the business-continuity tolerance without degrading the critical operation?
  4. Does the AI provider process customer personal information, transaction data, or regulated records?
  5. Is the entity exposed to enforcement, customer harm, or financial impact if the provider fails for 24 hours?

The checklist below expands each question with evidence requirements, scoring guidance, and a sample register row.
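
As a worked illustration of the two-"yes" rule above, here is the scoring as a small Python check. The abbreviated question wording and the `belongs_on_register` function are hypothetical; the downloadable checklist carries the full wording and evidence requirements.

```python
# The five questions, abbreviated. A True answer counts toward materiality;
# question 3 is framed so that inability to substitute scores True.
QUESTIONS = (
    "supports or constitutes a decision on a critical operation",
    "output visible to a customer, member, or regulator",
    "cannot be substituted within business-continuity tolerance",
    "processes customer personal information or regulated records",
    "24-hour failure causes enforcement, customer harm, or financial impact",
)

def belongs_on_register(answers: dict) -> bool:
    """Two or more 'yes' answers put the provider on the MSP register."""
    missing = [q for q in QUESTIONS if q not in answers]
    if missing:
        raise ValueError(f"Unanswered questions: {missing}")
    return sum(answers[q] for q in QUESTIONS) >= 2

# Example: a foundation-model API behind a fraud-scoring SaaS
answers = {
    QUESTIONS[0]: True,   # shapes the fraud decision
    QUESTIONS[1]: False,  # output is internal
    QUESTIONS[2]: True,   # no warm substitute today
    QUESTIONS[3]: True,   # sees transaction data
    QUESTIONS[4]: True,   # 24-hour outage halts the fraud queue
}
assert belongs_on_register(answers)
```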

On the NTSP consultation

APRA opened a targeted consultation on non-traditional service providers in December 2025. Submissions closed on 30 January 2026. The consultation did not carve AI out. Through the first half of 2026, every AI vendor touching a critical operation remains in scope.

Tolerance levels for probabilistic systems

CPS 230 requires tolerance levels across three classical dimensions: maximum tolerable period of disruption, maximum tolerable data loss, and minimum service levels. These assume binary failure. Systems are up or down, data is retained or lost, service levels are met or not met. Large language models do not fail that way.

Four AI failure modes classical tolerance does not model

Output drift

A provider-side model upgrade shifts output distribution. Nothing looks broken. Pass-rate on the entity's golden evaluation set falls by 4% and no one notices for six weeks.

Indirect prompt injection

A retrieved document carries embedded instructions that redirect the model. The operation completes successfully. The wrong person receives the output. See Greshake et al., "Not what you've signed up for" (2023).

Hallucinated factual output

The model asserts non-existent content confidently. Mata v Avianca (2023) in the US and Zhang v Chen (2024) in Canada were the first court sanctions for filings containing fabricated AI citations.

Runaway agentic loops

Token or rate-limit events driven by agent recursion. Both a cost event and an availability event. Neither shows up on a classical service-level dashboard.
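
Of the four failure modes, the runaway loop is the one most amenable to a hard technical control. Below is a minimal sketch of a per-operation budget guard; the `AgentBudgetGuard` class and the limits shown are illustrative assumptions, not recommended thresholds.

```python
import time

class AgentBudgetGuard:
    """Trips before a recursive agent becomes a cost and availability event."""

    def __init__(self, max_steps: int = 20, max_tokens: int = 200_000,
                 max_seconds: float = 120.0):
        self.max_steps = max_steps
        self.max_tokens = max_tokens
        self.max_seconds = max_seconds
        self.steps = 0
        self.tokens = 0
        self.started = time.monotonic()

    def record_step(self, tokens_used: int) -> None:
        self.steps += 1
        self.tokens += tokens_used
        elapsed = time.monotonic() - self.started
        if (self.steps > self.max_steps
                or self.tokens > self.max_tokens
                or elapsed > self.max_seconds):
            # Surface as an operational-risk incident, not just an exception.
            raise RuntimeError(
                f"Agent budget exceeded: steps={self.steps}, "
                f"tokens={self.tokens}, elapsed={elapsed:.0f}s")
```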

The five-dimension tolerance set

For every AI-enabled critical operation, document:

| Dimension | Type | Measured against |
| --- | --- | --- |
| Maximum tolerable period of disruption | Classical | Provider outage, model deprecation, forced migration |
| Maximum tolerable data loss | Classical | Prompt history, fine-tune state, embeddings |
| Minimum service levels | Classical | Throughput, latency, availability |
| Maximum tolerable output drift | AI-specific | Pass-rate on the entity's golden evaluation set |
| Maximum tolerable hallucination rate | AI-specific | LLM-as-a-judge evaluation plus sampled human review |

Classical tolerances catch deterministic failure. The two AI-specific dimensions catch silent drift and hallucinated output, which is where LLMs actually fail. PolyGovern's risk-framework practice sets these tolerances against an entity's existing operational-risk taxonomy.
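
As a sketch of how the two AI-specific tolerances might be checked each reporting period, assuming the entity already collects golden-set pass rates and a sampled hallucination rate: the `AiTolerances` structure and the thresholds in the example are placeholders, not calibrated values.

```python
from dataclasses import dataclass

@dataclass
class AiTolerances:
    baseline_pass_rate: float      # golden evaluation set at calibration time
    max_drift: float               # maximum tolerable fall in pass rate
    max_hallucination_rate: float  # from LLM-as-a-judge plus sampled human review

def tolerance_breaches(t: AiTolerances, current_pass_rate: float,
                       current_hallucination_rate: float) -> list:
    """Returns the breaches that go to the board pack as escalated items."""
    breaches = []
    drift = t.baseline_pass_rate - current_pass_rate
    if drift > t.max_drift:
        breaches.append(f"output drift {drift:.1%} exceeds tolerance {t.max_drift:.1%}")
    if current_hallucination_rate > t.max_hallucination_rate:
        breaches.append(f"hallucination rate {current_hallucination_rate:.1%} "
                        f"exceeds tolerance {t.max_hallucination_rate:.1%}")
    return breaches

# Placeholder calibration: 2% drift tolerance, 1% hallucination tolerance
breaches = tolerance_breaches(
    AiTolerances(baseline_pass_rate=0.94, max_drift=0.02, max_hallucination_rate=0.01),
    current_pass_rate=0.90,            # the silent 4% fall from the drift example
    current_hallucination_rate=0.008,
)
print(breaches)  # ['output drift 4.0% exceeds tolerance 2.0%']
```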

Exit strategy when your dependency is a foundation model

Classical SaaS exit plans assume portability. Foundation-model dependencies break that assumption in three ways:

Fine-tune lock-in

A fine-tuned model does not port to a different provider's base. Re-tuning is new engineering work with new behaviour regressions and new cost.

Embedding lock-in

Vector indexes built against one embedding model are non-transferable. Re-embedding a corpus of millions of documents is a scheduled programme, not a cutover.

Prompt and evaluation lock-in

Prompts and evaluation suites calibrated against one model need recalibration against a replacement. A prompt that reliably produces structured JSON from one model will over- or under-constrain another.

What a defensible exit strategy looks like

  • Dual-model architecture from day one. Two providers able to serve the operation with minor prompt and adapter changes. The second kept warm through weekly regression runs or a small percentage of live traffic.
  • Vector stores decoupled from the embedding provider. Access control, chunking, and retrieval logic in the entity's own code rather than the provider's.
  • AI escrow. The AI analogue of software escrow. Fine-tune weights, prompt libraries, and evaluation harnesses deposited with a third party under release triggers tied to the CPS 230 materiality thresholds.
  • Tolerances calibrated to deprecation events. Major providers have deprecated production checkpoints with nine months' notice or less.
  • A documented minimum viable replacement. The pared-down capability the entity will operate under for the duration of a replacement programme. This is the bridge state, not the target state.

An exit strategy is engineered, not declared. A document without architectural mitigation will not survive a real model deprecation.
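
For the dual-model point above, a minimal sketch of the routing pattern: a provider-agnostic interface, a primary, and a warm secondary fed a small shadow fraction of live traffic so the regression evidence stays current. The `ModelProvider` protocol and `route` function are illustrative, not a specific vendor SDK.

```python
import random
from typing import Protocol

class ModelProvider(Protocol):
    name: str
    def complete(self, prompt: str) -> str: ...

def route(prompt: str, primary: ModelProvider, secondary: ModelProvider,
          shadow_fraction: float = 0.05) -> str:
    """Serve from the primary; mirror a small fraction of traffic to the
    warm secondary so the weekly regression comparison has fresh samples."""
    response = primary.complete(prompt)
    if random.random() < shadow_fraction:
        try:
            shadow = secondary.complete(prompt)
            log_regression_sample(prompt, response, shadow)
        except Exception as exc:
            log_secondary_failure(secondary.name, exc)  # the exit strategy is now stale
    return response

def log_regression_sample(prompt, primary_out, secondary_out):
    ...  # persist for the weekly regression run against the golden evaluation set

def log_secondary_failure(name, exc):
    ...  # a failing warm secondary is itself a tolerance signal
```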

Where AI sits in APRA's broader framework

CPS 230 is one of four intersecting instruments. An entity that implements CPS 230 alone has a compliant operational-risk frame with an incomplete AI control set.

CPS 234 Information Security

The information-security counterpart. APRA's warning to RSE licensees on CPS 234 in mid-2025 previews the CPS 230 supervisory posture.

Financial Accountability Regime

FAR makes end-to-end management of a significant business activity personally attributable. AI-enabled critical operations are significant business activities. An accountable person must be named for each. Personal penalties reach $1.565 million per contravention.

ASIC Report 798

ASIC's October 2024 review of 23 AFS and credit licensees across 624 AI use cases found only 12 licensees had AI policy documents referencing fairness, discrimination, and bias risks. Only 43% had policies requiring AI disclosure to consumers. For dual-regulated entities, REP 798 is the conduct-side companion to CPS 230.

ISO 42001 and the Voluntary AI Safety Standard

Where CPS 230 is silent on AI-specific controls, ISO/IEC 42001:2023 and the Voluntary AI Safety Standard fill the gap. PolyGovern's ISO 42001 certification practice layers both on top of an existing CPS 230 programme.

The board report CPS 230 actually asks for

The quarterly pack needs six elements. Not ten.

  1. MSP register with AI classification flagged, row count, change month-over-month, and any additions or removals that crossed the materiality threshold.
  2. Critical operations enabled by AI with performance against each tolerance dimension and pass-fail on golden evaluation sets.
  3. Output drift and hallucination metrics with monthly trend against the tolerances. Breaches are escalated items, not reported items.
  4. AI incident log covering disruption events, prompt-injection events, and model-deprecation events, with resolution status and lessons applied elsewhere.
  5. Exit-strategy testing results for each material foundation-model dependency, including the minimum viable replacement walk-through.
  6. Material supervisory correspondence: any APRA letter referencing CPS 230 or CPS 234 for an AI operation, and any heightened-supervision flag.
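
One way to keep the six elements recurring in the same shape quarter on quarter is to treat the pack as a structured record. The field names below are illustrative, not a reporting standard.

```python
from dataclasses import dataclass, field

@dataclass
class QuarterlyAiBoardPack:
    """The six elements, in the order the risk committee reads them."""
    msp_register_summary: dict        # row count, AI-flagged rows, month-over-month changes
    critical_operations: list         # per-operation performance against each tolerance dimension
    drift_and_hallucination: dict     # monthly trend vs tolerance; breaches escalate separately
    incident_log: list                # disruption, prompt-injection, model-deprecation events
    exit_strategy_tests: list         # one entry per material foundation-model dependency
    supervisory_correspondence: list = field(default_factory=list)
```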

FAR statements that name AI

An accountable-person statement for an AI-enabled critical operation names four things:

  • The accountable person (usually the CRO, CIO, or a named business-line executive; rarely the CDO or Chief AI Officer).
  • The critical operation in CPS 230 language, aligned to the register.
  • The material service providers in the AI chain, referenced by register entry.
  • The escalation paths for tolerance breaches.

Statements that describe AI as "a distributed capability across the organisation" are not FAR-compliant. A single person owns each AI-enabled critical operation. PolyGovern's board-level AI governance practice drafts the statements with the named accountable persons.

The first prudential review

APRA's supervisory posture is reviews starting with SFIs and extending to non-SFIs through 2026 and 2027. The CPS 234 cadence translates forward: letters first, formal directions next, enforceable undertakings where issues remain unresolved.

APRA member Therese McCarthy Hockey framed the regulator's AI expectation at FINSIA in September 2025: AI can be a useful co-pilot, but it should not be the autopilot.

What to have ready

  • Current MSP register with AI vendors classified against the five-question test, date-stamped, board-approved, plus the fourth-party sub-processor list for material SaaS.
  • Tolerance-level documentation across five dimensions, with evidence of calibration, evidence of testing, and annual review dates attached.
  • Business continuity test results, including the last live test of the AI exit strategy, the minimum viable replacement walk-through, and the last dual-provider substitution drill.
  • Incident response runbook covering the four AI failure modes, integrated into the broader IR process.
  • CPS 234 evidence for AI vendors: penetration tests, SOC 2 Type II reports, and AI-specific addendums.
  • The last four quarterly board AI reports.

Entities that can produce this evidence within a week pass. Those that cannot get a letter.

What to do in April to June 2026

  1. Re-test the MSP register. Run every AI-vendor entry through the five-question test. The entries written before 1 October 2025 pre-date the first review cycle.
  2. Write tolerance levels on the five-dimension set. Output drift and hallucination rate are new. The team will not have baselines. Establish them on production traffic this quarter so the September board pack has real numbers.
  3. Engineer the exit strategy. Dual-provider architecture, AI escrow, minimum viable replacement. Raise any of the three lock-ins without mitigation to the risk committee.
  4. Refresh vendor contracts for 1 July 2026. Sub-processor disclosure is the most common gap. The 24-hour incident notification window is the second.
  5. Rewrite the FAR statements. The statements filed before the AI programme existed are incomplete by construction.
  6. Rebuild the board AI report on the six-element structure. The next pack the risk committee sees should be the first one APRA would recognise in a review.

Two CROs read CPS 230. The one who reads it as an operational-risk standard passes a generic review and fails the AI questions. The one who reads it with the AI overlay (four-tier MSP mapping, the five-dimension tolerance set, an engineered exit strategy, the six-element board pack, FAR statements that name AI operations) passes both.

Free download · PDF · 11 pages

MSP classification checklist for AI vendors

The five-question test with scoring, evidence requirements, a board sign-off block, and review-trigger cadence. Versioned for quarterly review.

CPS 230 AI readiness review

A thirty-minute call with a senior consultant. No deck. A live walkthrough of your current register against the five-question test.
