CPS 230 for AI: the operational playbook APRA expects you to have
CPS 230 commenced on 1 July 2025. The standard does not mention artificial intelligence, but it regulates every material AI decision your organisation makes.
The first annual material service provider register was due to APRA by 1 October 2025. Pre-existing contracts must be compliant by 1 July 2026. APRA Chair John Lonsdale told the Australian Banking Association Conference on 24 July 2025 that prudential reviews are running, starting with significant financial institutions and extending to non-SFIs through 2026 and into 2027.
This post converts the standard into artefacts a CRO can act on: a register that classifies foundation-model vendors, tolerance levels that hold up on probabilistic systems, an exit strategy that acknowledges foundation-model lock-in, and a board pack your regulator will recognise.
Key dates
| Date | Milestone |
|---|---|
| 1 July 2025 | CPS 230 commenced |
| 1 October 2025 | First annual MSP register due to APRA |
| 10 December 2025 | APRA opened NTSP consultation |
| 30 January 2026 | NTSP submissions closed |
| 1 July 2026 | Pre-existing service provider contracts must comply |
What CPS 230 requires
CPS 230 consolidates five earlier standards (CPS 231, CPS 232, SPS 231, SPS 232, and HPS 231) and applies to every APRA-regulated entity: ADIs, general and life insurers, private health insurers, RSE licensees, and authorised NOHCs.
Three obligations sit at the core:
- Operational risk management across every process that delivers a critical operation.
- Business continuity, including documented tolerance levels and tested severe-but-plausible scenarios.
- Service provider management, including a maintained material service provider register and credible exit strategies.
The board is accountable. Senior management is responsible for implementation. CPG 230, the supervisory guidance APRA finalised in June 2024, is the document APRA staff read before a prudential review.
Artificial intelligence is not a critical operation
Listing "artificial intelligence" on a register is a category error, because AI is an input, not an operation. The critical operation is the decision it supports: a fraud alert, a credit decision, a claims triage, a member-advice interaction.
| Example | Critical? | Why |
|---|---|---|
| Credit-decisioning model in retail lending | Yes | The decision affects customer access to finance |
| General member chatbot with always-human escalation | Maybe | Depends on whether silent misrouting breaches tolerance |
| Internal GenAI summariser of adviser notes | No | Output is internal, not a regulated decision |
The first job of a CPS 230 AI programme is to make this judgement consistently. The second is to document it, because the opening question in any prudential review is "show me the reasoning."
When ANZ migrated its operational-risk stack into ServiceNow to meet CPS 230, the exercise mapped 19 critical operations across roughly 43,000 supporting resources and 16 non-financial risk themes. That is what a tier-one bank's register looks like in practice.
Classifying AI vendors on the MSP register
APRA defines a material service provider as one the entity relies on for a critical operation, or one that exposes the entity to material operational risk. Third, related, and connected parties all count. APRA released the register template in October 2024 and the first annual submission was due on 1 October 2025. PolyGovern's third-party AI risk practice classifies vendors against the test below.
Modern AI deployments rarely have a two-party structure. They usually have four tiers:
| Tier | Provider | Materiality trigger |
|---|---|---|
| 1 | SaaS the entity contracts with (fraud vendor, wealth platform, claims automation) | Material whenever the SaaS supports a critical operation |
| 2 | Cloud provider hosting the SaaS (AWS, Azure, GCP) | Material where the cloud provider carries availability of the critical operation |
| 3 | Foundation-model API (OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex) | Material where model output meaningfully shapes the decision |
| 4 | Training-data vendors, model hosts, fine-tuning providers | Material where fourth-party failure reaches the critical operation |
CPG 230 is explicit that fourth parties should sit on the register when they are material. That covers the cases where a licensed SaaS silently routes prompts or embeddings to a foundation-model provider the bank never contracted with. Regulated customers of that SaaS carry a documented exposure they did not sign up for, and the register must reflect it.
Five questions for every AI vendor
Two "yes" answers mean the provider belongs on the register.
1. Does the AI output support or constitute a decision on a critical operation?
2. Is the AI output visible to a customer, a member, or a regulator?
3. Is the entity unable to substitute the AI provider within the business-continuity tolerance without degrading the critical operation?
4. Does the AI provider process customer personal information, transaction data, or regulated records?
5. Is the entity exposed to enforcement, customer harm, or financial impact if the provider fails for 24 hours?
The checklist below expands each question with evidence requirements, scoring guidance, and a sample register row.
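The two-"yes" threshold is simple enough to encode directly, which keeps the judgement consistent across vendor reviews. A minimal sketch follows; the class and field names are illustrative assumptions for this post, not an APRA artefact, and question 3 is scored as the entity's *inability* to substitute the provider within tolerance:

```python
# Sketch of the five-question MSP materiality test described above.
# Field names are illustrative; the two-"yes" threshold comes from the post.
from dataclasses import dataclass

@dataclass
class AIVendorAssessment:
    supports_critical_decision: bool      # Q1: output supports a critical-operation decision
    externally_visible: bool              # Q2: output visible to customer, member, or regulator
    cannot_substitute_in_tolerance: bool  # Q3: no substitute within business-continuity tolerance
    processes_regulated_data: bool        # Q4: personal info, transactions, regulated records
    harm_if_down_24h: bool                # Q5: enforcement/harm/financial impact after 24 hours

def belongs_on_register(a: AIVendorAssessment) -> bool:
    """Two or more 'yes' answers puts the provider on the MSP register."""
    yes_count = sum([
        a.supports_critical_decision,
        a.externally_visible,
        a.cannot_substitute_in_tolerance,
        a.processes_regulated_data,
        a.harm_if_down_24h,
    ])
    return yes_count >= 2
```

Running every vendor entry through the same function, with the five booleans recorded as evidence, is one way to make the "show me the reasoning" conversation short.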
On the NTSP consultation
APRA opened a targeted consultation on non-traditional service providers in December 2025. Submissions closed on 30 January 2026. The consultation did not carve AI out. Through the first half of 2026, every AI vendor touching a critical operation remains in scope.
Tolerance levels for probabilistic systems
CPS 230 requires tolerance levels across three classical dimensions: maximum tolerable period of disruption, maximum tolerable data loss, and minimum service levels. These assume binary failure. Systems are up or down, data is retained or lost, service levels are met or not met. Large language models do not fail that way.
Four AI failure modes classical tolerance does not model
Output drift
A provider-side model upgrade shifts output distribution. Nothing looks broken. Pass-rate on the entity's golden evaluation set falls by 4% and no one notices for six weeks.
Indirect prompt injection
A retrieved document carries embedded instructions that redirect the model. The operation completes successfully. The wrong person receives the output. See Greshake et al., "Not what you've signed up for" (2023).
Hallucinated factual output
The model asserts non-existent content confidently. Mata v Avianca (2023) in the US and Zhang v Chen (2024) in Canada were the first court sanctions for filings containing fabricated AI citations.
Runaway agentic loops
Token or rate-limit events driven by agent recursion. Both a cost event and an availability event. Neither shows up on a classical service-level dashboard.
The five-dimension tolerance set
For every AI-enabled critical operation, document:
| Dimension | Type | Measured against |
|---|---|---|
| Maximum tolerable period of disruption | Classical | Provider outage, model deprecation, forced migration |
| Maximum tolerable data loss | Classical | Prompt history, fine-tune state, embeddings |
| Minimum service levels | Classical | Throughput, latency, availability |
| Maximum tolerable output drift | AI-specific | Pass-rate on the entity's golden evaluation set |
| Maximum tolerable hallucination rate | AI-specific | LLM-as-a-judge evaluation plus sampled human review |
Classical tolerances catch deterministic failure. The two AI-specific dimensions catch silent drift and hallucinated output, which is where LLMs actually fail. PolyGovern's risk-framework practice sets these tolerances against an entity's existing operational-risk taxonomy.
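The two AI-specific dimensions reduce to a small monitoring check: compare the current golden-set pass-rate and hallucination rate against documented tolerances and escalate any breach. A minimal sketch, assuming illustrative threshold values (the standard prescribes none):

```python
# Sketch of monitoring the two AI-specific tolerance dimensions.
# Threshold values here are illustrative assumptions, not prescribed figures.

def golden_set_pass_rate(results: list[bool]) -> float:
    """Fraction of golden evaluation set cases the model currently passes."""
    return sum(results) / len(results)

def check_tolerances(pass_rate: float,
                     hallucination_rate: float,
                     baseline_pass_rate: float = 0.95,
                     max_drift: float = 0.03,
                     max_hallucination_rate: float = 0.01) -> list[str]:
    """Return the breached tolerance dimensions; an empty list means within tolerance."""
    breaches = []
    # Output drift: how far the current pass-rate has fallen from baseline.
    if baseline_pass_rate - pass_rate > max_drift:
        breaches.append("output_drift")
    # Hallucination rate from LLM-as-a-judge plus sampled human review.
    if hallucination_rate > max_hallucination_rate:
        breaches.append("hallucination_rate")
    return breaches
```

Run on every provider model change and on a scheduled cadence, this is what turns the six-week silent drift scenario above into a same-day escalation.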
Exit strategy when your dependency is a foundation model
Classical SaaS exit plans assume three things that foundation-model dependencies break:
Fine-tune lock-in
A fine-tuned model does not port to a different provider's base. Re-tuning is new engineering work with new behaviour regressions and new cost.
Embedding lock-in
Vector indexes built against one embedding model are non-transferable. Re-embedding a corpus of millions of documents is a scheduled programme, not a cutover.
Prompt and evaluation lock-in
Prompts and evaluation suites calibrated against one model need recalibration against a replacement. A prompt that reliably produces structured JSON from one model will over- or under-constrain another.
What a defensible exit strategy looks like
- Dual-model architecture from day one. Two providers able to serve the operation with minor prompt and adapter changes. The second kept warm through weekly regression runs or a small percentage of live traffic.
- Vector stores decoupled from the embedding provider. Access control, chunking, and retrieval logic in the entity's own code rather than the provider's.
- AI escrow. The AI analogue of software escrow. Fine-tune weights, prompt libraries, and evaluation harnesses deposited with a third party under release triggers tied to the CPS 230 materiality thresholds.
- Tolerances calibrated to deprecation events. Major providers have deprecated production checkpoints with nine months' notice or less.
- A documented minimum viable replacement. The pared-down capability the entity will operate under for the duration of a replacement programme. This is the bridge state, not the target state.
An exit strategy is engineered, not declared. A document without architectural mitigation will not survive a real model deprecation.
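The dual-model point can be made concrete with a small sketch: two providers behind one interface, a regression gate that keeps the secondary warm against the golden set, and routing that fails over inside tolerance. The provider protocol and harness here are hypothetical, not a reference to any vendor SDK:

```python
# Illustrative dual-provider pattern: primary plus warm secondary behind one
# interface, with a regression gate run on the secondary (e.g. weekly).
from typing import Callable, Protocol

class ModelProvider(Protocol):
    def complete(self, prompt: str) -> str: ...

def run_regression(provider: ModelProvider,
                   golden_set: list[tuple[str, Callable[[str], bool]]]) -> float:
    """Run the golden evaluation set against a provider; return its pass-rate."""
    passed = sum(1 for prompt, check in golden_set if check(provider.complete(prompt)))
    return passed / len(golden_set)

def route(primary: ModelProvider,
          secondary: ModelProvider,
          prompt: str,
          primary_healthy: bool) -> str:
    """Serve from the primary; fail over to the kept-warm secondary on outage."""
    provider = primary if primary_healthy else secondary
    return provider.complete(prompt)
```

The regression pass-rate of the secondary is itself a board-pack number: it is the evidence that the exit strategy is engineered rather than declared.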
Where AI sits in APRA's broader framework
CPS 230 is one of four intersecting instruments. An entity that implements CPS 230 alone has a compliant operational-risk frame with an incomplete AI control set.
CPS 234 Information Security
The information-security counterpart. APRA's warning to RSE licensees on CPS 234 in mid-2025 previews the CPS 230 supervisory posture.
Financial Accountability Regime
FAR makes end-to-end management of a significant business activity personally attributable. AI-enabled critical operations are significant business activities. An accountable person must be named for each. Personal penalties reach $1.565 million per contravention.
ASIC Report 798
ASIC's October 2024 review of 23 AFS and credit licensees across 624 AI use cases found only 12 licensees had AI policy documents referencing fairness, discrimination, and bias risks. Only 43% had policies requiring AI disclosure to consumers. For dual-regulated entities, REP 798 is the conduct-side companion to CPS 230.
ISO 42001 and the Voluntary AI Safety Standard
Where CPS 230 is silent on AI-specific controls, ISO/IEC 42001:2023 and the Voluntary AI Safety Standard fill the gap. PolyGovern's ISO 42001 certification practice layers both on top of an existing CPS 230 programme.
The board report CPS 230 actually asks for
The quarterly pack needs six elements. Not ten.
1. MSP register with AI classification flagged, row count, change month-over-month, and any additions or removals that crossed the materiality threshold.
2. Critical operations enabled by AI with performance against each tolerance dimension and pass-fail on golden evaluation sets.
3. Output drift and hallucination metrics with monthly trend against the tolerances. Breaches are escalated items, not reported items.
4. AI incident log covering disruption events, prompt-injection events, and model-deprecation events, with resolution status and lessons applied elsewhere.
5. Exit-strategy testing results for each material foundation-model dependency, including the minimum viable replacement walk-through.
6. Material supervisory correspondence: any APRA letter referencing CPS 230 or CPS 234 for an AI operation, and any heightened-supervision flag.
FAR statements that name AI
An accountable-person statement for an AI-enabled critical operation names four things:
- The accountable person (usually the CRO, CIO, or a named business-line executive; rarely the CDO or Chief AI Officer).
- The critical operation in CPS 230 language, aligned to the register.
- The material service providers in the AI chain, referenced by register entry.
- The escalation paths for tolerance breaches.
Statements that describe AI as "a distributed capability across the organisation" are not FAR-compliant. A single person owns each AI-enabled critical operation. PolyGovern's board-level AI governance practice drafts the statements with the named accountable persons.
The first prudential review
APRA's supervisory posture is reviews starting with SFIs and extending to non-SFIs through 2026 and 2027. The CPS 234 cadence translates forward: letters first, formal directions next, enforceable undertakings where issues remain unresolved.
APRA member Therese McCarthy Hockey framed the regulator's AI expectation at FINSIA in September 2025: AI can be a useful co-pilot, but it should not be the autopilot.
What to have ready
Entities that can produce the artefacts above (the register, the tolerance set, exit-test results, and the current board pack) within a week pass the review. Those that cannot get a letter.
What to do in April to June 2026
1. Re-test the MSP register. Run every AI-vendor entry through the five-question test. The entries written before 1 October 2025 pre-date the first review cycle.
2. Write tolerance levels on the five-dimension set. Output drift and hallucination rate are new. The team will not have baselines. Establish them on production traffic this quarter so the September board pack has real numbers.
3. Engineer the exit strategy. Dual-provider architecture, AI escrow, minimum viable replacement. Raise any of the three lock-ins without mitigation to the risk committee.
4. Refresh vendor contracts for 1 July 2026. Sub-processor disclosure is the most common gap. The 24-hour incident notification window is the second.
5. Rewrite the FAR statements. The statements filed before the AI programme existed are incomplete by construction.
6. Rebuild the board AI report on the six-element structure. The next pack the risk committee sees should be the first one APRA would recognise in a review.
Two CROs read CPS 230. The one who reads it as an operational-risk standard passes a generic review and fails the AI questions. The one who reads it with the AI overlay (four-tier MSP mapping, the five-dimension tolerance set, an engineered exit strategy, the six-element board pack, FAR statements that name AI operations) passes both.
Free download · PDF · 11 pages
MSP classification checklist for AI vendors
The five-question test with scoring, evidence requirements, a board sign-off block, and review-trigger cadence. Versioned for quarterly review.
CPS 230 AI readiness review
A thirty-minute call with a senior consultant. No deck. A live walkthrough of your current register against the five-question test.
Book a call
Related services
Third-party AI risk
MSP classification, vendor due diligence, ongoing monitoring.
Risk framework development
CPS 230-aligned AI risk frameworks, tolerance sets, and control libraries.
Board-level AI governance
Agendas, charters, and quarterly AI reports your regulator will recognise.
Regulatory compliance
APRA, ASIC, OAIC, and ACCC. Multi-regulator AI compliance.