Overview
ISO 27001 Controls A.5.29 and A.5.30 require organizations to maintain information security during disruption and ensure ICT readiness for business continuity. Auditors expect operationally detailed plans — not just a high-level BC/DR policy. They want to see a BIA with defined RTO/RPO targets, recovery sequences prioritized by criticality, DR failover procedures, and evidence that the plans have been tested.
This skill produces three artifacts: a Business Continuity Plan (BCP) with BIA-driven recovery sequences, a Disaster Recovery Plan (DRP) with IT system recovery procedures, and an exercise log for tracking tests. Validation tools enforce that every critical function has defined recovery targets and that exercise records are complete.
Unlike the high-level BC/DR procedure produced by the policy generator, this skill produces the operational plans.
For a step-by-step walkthrough of how this skill fits into the full ISMS flow, see the ISO 27001 ISMS workspace tutorial.
What It Covers
- Business Impact Analysis — identifies critical functions with MTD, RTO, RPO, MBCO, dependencies, and impact levels
- BCP — crisis management team, communication tree, recovery sequences per function, alternate operating procedures, and A.5.29 security-during-disruption controls
- DRP — DR activation criteria, IT system recovery priority, failover procedures, backup restoration, infrastructure recovery, return to normal operations
- Exercise program — tabletop, walkthrough, simulation, and full test records with findings and corrective actions
- RTO/RPO validation — enforces RTO < MTD constraint, flags unrealistic targets, cross-checks DRP system coverage
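The core MTD-constraint check can be sketched in a few lines of Python. This is a simplified illustration, not the validator's actual implementation; the dictionary field names are invented, and the values come from the example BIA later in this page:

```python
# Illustrative sketch of the RTO < MTD check; not the validator's real code.
# Field names (mtd_h, rto_h) are invented for this example.
BIA = {
    "SaaS Platform": {"mtd_h": 8, "rto_h": 4},
    "Customer Data Processing": {"mtd_h": 12, "rto_h": 4},
    "Payment Processing": {"mtd_h": 24, "rto_h": 8},
    "Customer Support": {"mtd_h": 48, "rto_h": 24},
}

def rto_violations(bia: dict) -> list[str]:
    # A function must be restorable (RTO) before the business can no longer
    # tolerate its absence (MTD), so RTO must be strictly less than MTD.
    return [name for name, t in bia.items() if t["rto_h"] >= t["mtd_h"]]
```

An RTO equal to its MTD leaves zero margin, which is why the constraint is strict rather than `<=`.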
Example Conversation
You: Create the BCP and DRP for CloudSync Solutions. Our critical functions are the SaaS platform, customer data processing, payment processing via Stripe, and customer support. We run on GCP with GKE clusters in us-east1.
Agent: I’ll start with the Business Impact Analysis to establish recovery priorities. Let me load your organization profile for infrastructure details.
Based on your profile, here’s the BIA I recommend:
| Function | MTD | RTO | RPO | MBCO | Impact Level | Dependencies |
|---|---|---|---|---|---|---|
| SaaS Platform Availability | 8h | 4h | 1h | 50% of API throughput | Critical | GKE, Cloud SQL, Cloud Storage, Okta |
| Customer Data Processing | 12h | 4h | 1h | Read-only access to existing data | Critical | Cloud SQL, Cloud Storage |
| Payment Processing | 24h | 8h | 4h | Manual invoice processing | High | Stripe API, Cloud SQL |
| Customer Support | 48h | 24h | 24h | Email-only support | Medium | Google Workspace, Zendesk |

All RTOs are within their MTD limits. Now let me draft the BCP with recovery sequences, the DRP with IT procedures, and an exercise log.
I’ve drafted all three documents. The BCP includes an A.5.29 section documenting which security controls must remain operational during disruption — for example, Okta authentication stays active even in degraded mode, and Cloud Audit Logs continue running. Let me validate.
Running bcp_rto_rpo_validator…
RESULT: PASS — All 4 critical functions have RTO/RPO defined, owners assigned, dependencies identified, and recovery procedures documented. RTO < MTD constraint satisfied for all functions. DRP cross-check confirms all IT dependencies (GKE, Cloud SQL, Cloud Storage, Okta) have recovery procedures in the DRP.
Running exercise_record_checker…
RESULT: PASS — Exercise log contains 1 planned tabletop exercise entry with date, type, scope, and planned participants. Completed exercise fields are correctly marked as pending.
You: What if our GKE cluster goes down completely? Walk me through the DRP sequence.
Agent: The DRP recovery sequence for a full GKE cluster loss:
1. Declare DR — CTO activates DR per the activation criteria (>2h unplanned outage affecting >50% of users)
2. Security infrastructure first — Verify Okta (IdP) and Cloud Audit Logs are operational before proceeding
3. Restore GKE — Provision new cluster from Terraform state in us-central1 (DR region), apply latest node pool config (~45 min)
4. Restore Cloud SQL — Point-in-time recovery from automated backup to DR region (~30 min)
5. Deploy applications — Apply Kubernetes manifests from GitHub, verify service mesh routing (~20 min)
6. Validate data integrity — Run checksum verification against last known-good backup
7. Update DNS — Switch Cloud DNS to DR region load balancer
8. Recovery security checklist — Rotate service account keys, verify TLS certificates, confirm monitoring alerts active
9. Declare recovered — CTO confirms all critical functions operational within RTO
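The activation criteria quoted above (unplanned outage longer than 2 hours affecting more than 50% of users) reduce to a simple predicate. This is a hedged sketch; the function name and parameters are illustrative, not part of the tool:

```python
# Sketch of the DR activation criteria from the sequence above: declare DR
# after an unplanned outage longer than 2 hours that affects more than 50%
# of users. Function and parameter names are illustrative only.

def should_declare_dr(outage_minutes: int, pct_users_affected: float) -> bool:
    return outage_minutes > 120 and pct_users_affected > 50.0
```

Encoding the criteria as an explicit predicate removes judgment calls during an incident: either both thresholds are crossed or DR is not declared.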
Sample Output Excerpt
Below is a representative excerpt from a generated BCP showing the BIA summary and a recovery procedure.
Business Continuity Plan (excerpt)
Business Impact Analysis Summary
| Function | Owner | MTD | RTO | RPO | MBCO | Impact | Dependencies |
|---|---|---|---|---|---|---|---|
| SaaS Platform | João Silva — CTO | 8h | 4h | 1h | 50% API throughput | Critical | GKE, Cloud SQL, Cloud Storage, Okta |
| Customer Data | João Silva — CTO | 12h | 4h | 1h | Read-only access | Critical | Cloud SQL, Cloud Storage |
| Payments | Maria Santos — CISO | 24h | 8h | 4h | Manual invoicing | High | Stripe, Cloud SQL |
| Support | Carlos Mendes — Support Lead | 48h | 24h | 24h | Email-only | Medium | Google Workspace, Zendesk |
Recovery Procedure — SaaS Platform (RTO: 4 hours)
| Step | Action | Target Time | Responsible | Success Criteria |
|---|---|---|---|---|
| 1 | Activate crisis management team | T+0:15 | CTO | All CMT members notified and on bridge call |
| 2 | Assess scope of disruption | T+0:30 | Platform Engineer | Root cause identified, affected components listed |
| 3 | Verify security infrastructure (Okta, audit logs) | T+0:45 | Security Engineer | IdP operational, logging active |
| 4 | Execute GKE cluster recovery in DR region | T+2:00 | Platform Engineer | Cluster healthy, node pools scaled |
| 5 | Restore Cloud SQL from point-in-time backup | T+2:30 | DBA | Database accessible, data integrity verified |
| 6 | Deploy application workloads | T+3:00 | Platform Engineer | All services running, health checks passing |
| 7 | Switch DNS and validate end-to-end | T+3:30 | Platform Engineer | Users can access platform, API latency normal |
| 8 | Complete recovery security checklist | T+4:00 | Security Engineer | Keys rotated, certs valid, monitoring confirmed |
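As a quick sanity check, the target times in the procedure above can be verified to increase monotonically and to stay within the 4-hour RTO. The snippet is illustrative; the step times are copied from the table:

```python
# Target times (minutes after T+0) for each step of the SaaS Platform
# recovery procedure, copied from the table above.
RTO_MINUTES = 4 * 60
target_minutes = [15, 30, 45, 120, 150, 180, 210, 240]

# Every step must complete within the RTO, and targets must not go backwards.
assert all(t <= RTO_MINUTES for t in target_minutes)
assert target_minutes == sorted(target_minutes)
```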
Information Security During Disruption (A.5.29)
| Security Control | Normal Mode | Degraded Mode | Risk Acceptance |
|---|---|---|---|
| Authentication (Okta SSO) | MFA required for all users | MFA required — no degradation accepted | N/A |
| Encryption at rest (CMEK) | All data encrypted via CMEK | CMEK maintained in DR region | N/A |
| Audit logging | Full Cloud Audit Logs | Logging continues — may have 5-min delay | Accepted by CISO |
| Network segmentation | VPC firewall rules enforced | Same rules applied in DR VPC | N/A |
| Vulnerability scanning | Weekly automated scans | Paused during active DR (max 72h) | Accepted by CISO, resume within 72h |
Extension Tools
bcp_rto_rpo_validator
Validates the BCP for recovery target completeness and consistency:
| Check | What It Does |
|---|---|
| RTO defined | Every critical function has a non-placeholder Recovery Time Objective |
| RPO defined | Every critical function has a non-placeholder Recovery Point Objective |
| MTD constraint | Validates RTO < MTD for each function (ERROR if violated) |
| Owner assigned | Each function has a named responsible person |
| Dependencies identified | System and supplier dependencies are documented |
| Recovery procedure exists | A recovery procedure section exists for the function |
| Unrealistic RTO flagging | WARNING for RTOs under 1 hour on complex multi-component systems |
| DRP cross-check | When DRP path is provided, verifies that IT systems listed as BCP dependencies have corresponding DRP recovery procedures |
| Composite time parsing | Handles RTO/RPO expressed as “4h 30min”, “2 hours”, “30 minutes”, etc. |
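The composite time parsing row can be illustrated with a small duration parser. This is a sketch only; the validator's actual grammar may accept more or fewer formats:

```python
import re

# Minimal duration parser handling formats like "4h 30min", "2 hours",
# "30 minutes". Illustrative only; not the validator's real implementation.
_UNIT_MINUTES = {
    "h": 60, "hr": 60, "hrs": 60, "hour": 60, "hours": 60,
    "m": 1, "min": 1, "mins": 1, "minute": 1, "minutes": 1,
}

def to_minutes(text: str) -> int:
    total = 0
    for value, unit in re.findall(r"(\d+)\s*([a-z]+)", text.lower()):
        total += int(value) * _UNIT_MINUTES[unit]
    return total
```

Normalizing everything to minutes makes cross-field comparisons like RTO < MTD trivial even when the two values use different units.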
exercise_record_checker
Validates exercise log entries for completeness:
| Field | Required For | What It Checks |
|---|---|---|
| Date | Completed exercises | Valid date, not in the future |
| Type | All entries | Tabletop, walkthrough, simulation, or full test |
| Scope | All entries | Non-empty description of what was tested |
| Participants | Completed exercises | Named individuals or teams |
| Findings | Completed exercises | Documented observations (even if “no issues found”) |
| Corrective actions | Completed with findings | Actions identified for each finding |
| Next exercise date | All entries | Planned date for next exercise |
| Frequency check | Completed exercises | WARNING if >13 months between tabletop exercises, >25 months between simulations |
Planned/in-progress entries are handled gracefully — only type and scope are required until the exercise is completed.
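The frequency check for tabletop exercises can be sketched as follows. Field names like `type`, `status`, and `date` are assumptions for illustration; the actual record schema may differ:

```python
from datetime import date

# Sketch of the tabletop-frequency check: warn when more than 13 months
# elapse between completed tabletop exercises. Record fields are assumed.

def months_between(a: date, b: date) -> int:
    return (b.year - a.year) * 12 + (b.month - a.month)

def tabletop_gap_warnings(log: list[dict]) -> list[str]:
    dates = sorted(e["date"] for e in log
                   if e["type"] == "tabletop" and e.get("status") == "completed")
    return [f"{months_between(prev, curr)} months between tabletop exercises"
            for prev, curr in zip(dates, dates[1:])
            if months_between(prev, curr) > 13]
```

Because planned entries are filtered out before the `date` field is read, incomplete records never trip the gap check, matching the graceful handling described above.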
Getting Started
Activate the ISO 27001 Business Continuity & Disaster Recovery Plan skill. If you’ve completed the Organization Profile skill, load it — the agent uses your critical processes, technology stack, and locations to drive the BIA and recovery procedures.
Have this information ready:
- Critical business functions and their relative importance
- Current infrastructure architecture (cloud regions, database setup, backup schedules)
- Acceptable downtime per function (helps establish RTO/RPO)
- Key personnel and their alternates for crisis management
- Any existing BC/DR plans, even if outdated
- Supplier dependencies and their SLA commitments
The agent walks you through a BIA, creates both operational plans with detailed recovery sequences, documents A.5.29 security-during-disruption controls, and sets up an exercise log — all validated by tools that check the same things an auditor would.