Overview
Create service continuity and availability management plans per Clause 6.3 of ISO/IEC 20000-1:2011. This skill guides business impact analysis, availability target definition, continuity strategy selection, disaster recovery planning, and testing schedules to ensure services can be maintained or restored within agreed timeframes.
Continuity and availability planning answers the question auditors always ask: “What happens when this service goes down, and how fast can you recover?” Without documented RTO/RPO values, continuity strategies, and tested recovery procedures, organizations fail this clause consistently. This skill walks you through the business impact analysis first, then builds the strategies and testing schedule on top of those findings – ensuring every critical service has a documented recovery path.
For a step-by-step walkthrough, see the ISO 20000 SMS workspace tutorial.
What It Covers
- Business impact analysis – criticality classification, recovery time objective (RTO), recovery point objective (RPO), and maximum tolerable period of disruption (MTPD) per service
- Availability targets – uptime percentages, service hours, planned maintenance windows, single points of failure
- Continuity strategies – backup, redundancy, failover, alternative processing, and data replication
- Service continuity plan – invocation criteria, escalation, recovery steps, and communication plan
- Testing schedule – tabletop exercises, failover tests, full DR tests, and backup restoration verification
- Continuity-availability register – structured JSON covering BIA, strategies, and testing schedule
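An uptime percentage is easier to reason about once it is converted into a downtime budget. The sketch below is illustrative only: the function name, and the choice to subtract planned maintenance from the service hours before applying the target, are assumptions rather than anything the skill or the standard prescribes.

```python
def allowed_downtime_minutes(target_pct, service_hours, planned_maintenance_hours=0.0):
    """Unplanned-downtime budget, in minutes, for one reporting period.

    Assumption: planned maintenance windows do not count against the target,
    so they are removed from the agreed service time first.
    """
    if not 0 < target_pct <= 100:
        raise ValueError("target_pct must be in the range (0, 100]")
    agreed_hours = service_hours - planned_maintenance_hours
    if agreed_hours <= 0:
        raise ValueError("planned maintenance must be shorter than the service hours")
    return agreed_hours * 60 * (1 - target_pct / 100)


# A 24x7 service over a 30-day month has 720 service hours.
budget_no_maintenance = allowed_downtime_minutes(99.9, 720)
budget_with_maintenance = allowed_downtime_minutes(99.9, 720, planned_maintenance_hours=4)

print(round(budget_no_maintenance, 2))    # 43.2
print(round(budget_with_maintenance, 2))  # 42.96
```

A budget of roughly 43 minutes per month is far smaller than a multi-hour RTO, which is why an availability target and an RTO need to be agreed together rather than separately.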
Example Conversation
You: We need a service continuity and availability plan for our IT services. We have 5 services in our catalog and some basic backups, but nothing formally documented.
Agent: I’ll build the continuity and availability plan. I found your service catalog. Let me start with the business impact analysis – for each service, I need to understand what happens if it goes down:
- Which services would cause the most business harm if unavailable for more than a few hours?
- Do any services have regulatory or contractual uptime requirements?
- What recovery capabilities do you currently have? (backups, redundant systems, DR site)
You: ERP and email are most critical – ERP being down stops order processing and finance, email disrupts everything. We have daily backups to a secondary data center and the ERP database replicates to a standby server. No formal DR site though.
Agent: Let me propose a business impact analysis:
| Service | Criticality | RTO | RPO | Impact if Down |
|---|---|---|---|---|
| ERP System | Critical | 2 hrs | 1 hr | Order processing stops, financial close delayed |
| Email Service | Critical | 4 hrs | 1 hr | Communication disrupted across all departments |
| File Storage | High | 8 hrs | 4 hrs | Document access lost, collaboration impacted |
| VPN Service | High | 4 hrs | N/A | Remote workers unable to access systems |
| Intranet Portal | Medium | 1 biz day | 24 hrs | Internal comms delayed, not business-critical |
The ERP has the tightest RTO because the revenue impact of stopped order processing is immediate. Does this match your business expectations?
You: Yes, that looks right. What continuity strategies do you recommend?
Agent: Based on the BIA and your existing capabilities:
- ERP – You already have database replication to a standby. I recommend documenting an automated failover procedure and reducing RPO with hourly transaction log shipping instead of daily backup alone.
- Email – Active-passive failover with async replication. If your email is cloud-hosted, verify the provider’s SLA covers your 4-hour RTO.
- File Storage – Daily backup with 4-hour restore from secondary data center. Consider incremental backups every 4 hours to meet the RPO.
- VPN – Redundant VPN concentrator pair. If one fails, traffic routes to the standby automatically.
- Intranet – Daily backup, restore from backup if needed. Low priority given 1 business day RTO.
For testing, I recommend quarterly tabletop exercises, semi-annual failover tests for ERP and email, and monthly backup restoration verification. Let me build the register and validate it…
Result:
continuity_plan_validator – PASS. BIA completed for all 5 services, RTO/RPO defined for all critical services, continuity strategies documented, and testing schedule present.
Sample Output Excerpt
The skill produces a structured JSON register covering BIA, continuity strategies, and the testing schedule. Here is a representative excerpt.
{
  "bia": [
    {
      "service_id": "SVC-002",
      "service_name": "ERP System",
      "criticality": "critical",
      "rto_hours": 2,
      "rpo_hours": 1,
      "mtpd_hours": 8,
      "business_impact": "Order processing stops, financial close delayed, revenue loss of approx. 50K/hour",
      "dependencies": ["SVC-003 (File Storage)", "Database infrastructure", "Network"]
    },
    {
      "service_id": "SVC-001",
      "service_name": "Email Service",
      "criticality": "critical",
      "rto_hours": 4,
      "rpo_hours": 1,
      "mtpd_hours": 12,
      "business_impact": "Communication disrupted across all departments, customer notifications delayed",
      "dependencies": ["Network", "DNS", "Email provider SaaS"]
    }
  ],
  "continuity_strategies": [
    {
      "service_id": "SVC-002",
      "strategy": "Database replication to standby with automated failover",
      "failover_type": "automated",
      "backup_frequency": "hourly transaction log shipping",
      "recovery_procedure": "Automated failover to standby; manual verification within 30 min",
      "alternative_processing": "Manual order entry via phone queue"
    },
    {
      "service_id": "SVC-001",
      "strategy": "Active-passive failover with async replication to secondary MX",
      "failover_type": "automated",
      "backup_frequency": "hourly",
      "recovery_procedure": "DNS failover to secondary MX; mail queue drains within 1 hour"
    }
  ],
  "testing_schedule": [
    {
      "test_type": "tabletop_exercise",
      "frequency": "quarterly",
      "scope": "All critical and high services",
      "owner": "Service Continuity Manager"
    },
    {
      "test_type": "failover_test",
      "frequency": "semi-annual",
      "scope": "ERP System, Email Service",
      "owner": "Infrastructure Lead"
    },
    {
      "test_type": "backup_restoration",
      "frequency": "monthly",
      "scope": "All services with backup strategy",
      "owner": "Backup Administrator"
    }
  ]
}
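Because BIA entries and continuity strategies share a `service_id`, the register can be joined into a per-service recovery summary. The following is a minimal sketch, assuming the field names shown in the excerpt; the `summarize` helper, the `headroom_hours` figure (MTPD minus RTO), and the abridged inline register are hypothetical and not part of the skill.

```python
import json

# Abridged copy of the excerpt, kept inline so the sketch is self-contained.
REGISTER_JSON = """
{
  "bia": [
    {"service_id": "SVC-002", "service_name": "ERP System", "criticality": "critical",
     "rto_hours": 2, "rpo_hours": 1, "mtpd_hours": 8},
    {"service_id": "SVC-001", "service_name": "Email Service", "criticality": "critical",
     "rto_hours": 4, "rpo_hours": 1, "mtpd_hours": 12}
  ],
  "continuity_strategies": [
    {"service_id": "SVC-002", "failover_type": "automated"},
    {"service_id": "SVC-001", "failover_type": "automated"}
  ]
}
"""


def summarize(register):
    """Join BIA entries to strategies by service_id, tightest RTO first."""
    strategies = {s["service_id"]: s for s in register.get("continuity_strategies", [])}
    rows = []
    for entry in register.get("bia", []):
        strategy = strategies.get(entry["service_id"])
        rows.append({
            "service": entry["service_name"],
            "rto_hours": entry["rto_hours"],
            # Hypothetical metric: time left between planned recovery and the MTPD.
            "headroom_hours": entry["mtpd_hours"] - entry["rto_hours"],
            "failover_type": strategy["failover_type"] if strategy else None,
        })
    return sorted(rows, key=lambda row: row["rto_hours"])


rows = summarize(json.loads(REGISTER_JSON))
for row in rows:
    print(row)
```

Sorting by RTO puts the services with the least time to recover at the top, which is a natural order for a recovery runbook or a test plan.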
Extension Tools
continuity_plan_validator
Validates continuity-availability.json against ISO/IEC 20000-1:2011 Clause 6.3 requirements:
| Check | What It Does |
|---|---|
| BIA coverage | All services in scope must have a business impact analysis entry |
| RTO/RPO defined | Every critical and high-criticality service must have numeric RTO and RPO values |
| Criticality classification | Each BIA entry must include a criticality level (critical, high, medium, low) |
| Continuity strategies | Critical services must have a documented continuity strategy |
| Testing schedule | At least one test type must be defined with frequency, scope, and owner |
| RTO vs MTPD | Warns if the RTO equals or exceeds the maximum tolerable period of disruption (MTPD) |
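The checks in the table can be approximated in a few lines of Python. This is a hypothetical re-implementation for illustration, not the tool's actual code: the function name `validate_register`, the `in_scope_ids` parameter, and the message wording are assumptions, and only the field names come from the sample register.

```python
from numbers import Real

CRITICALITY_LEVELS = {"critical", "high", "medium", "low"}
REQUIRED_TEST_FIELDS = ("test_type", "frequency", "scope", "owner")


def _is_number(value):
    # bool is a subclass of int, so it is excluded explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def validate_register(register, in_scope_ids):
    """Return (errors, warnings) for a continuity-availability register dict."""
    errors, warnings = [], []
    bia = register.get("bia", [])
    bia_ids = {entry.get("service_id") for entry in bia}
    strategy_ids = {s.get("service_id") for s in register.get("continuity_strategies", [])}

    # BIA coverage: every in-scope service needs an entry.
    for service_id in sorted(set(in_scope_ids) - bia_ids):
        errors.append(f"{service_id}: no BIA entry")

    for entry in bia:
        service_id = entry.get("service_id", "<missing id>")
        level = entry.get("criticality")
        # Criticality classification.
        if level not in CRITICALITY_LEVELS:
            errors.append(f"{service_id}: invalid or missing criticality")
        # RTO/RPO defined for critical and high services.
        if level in {"critical", "high"}:
            for field in ("rto_hours", "rpo_hours"):
                if not _is_number(entry.get(field)):
                    errors.append(f"{service_id}: {field} must be numeric")
        # Continuity strategy for critical services.
        if level == "critical" and service_id not in strategy_ids:
            errors.append(f"{service_id}: no continuity strategy")
        # RTO vs MTPD is a warning, not a failure.
        rto, mtpd = entry.get("rto_hours"), entry.get("mtpd_hours")
        if _is_number(rto) and _is_number(mtpd) and rto >= mtpd:
            warnings.append(f"{service_id}: RTO ({rto}h) is not below MTPD ({mtpd}h)")

    # Testing schedule: at least one fully specified test.
    tests = register.get("testing_schedule", [])
    if not any(all(t.get(k) for k in REQUIRED_TEST_FIELDS) for t in tests):
        errors.append("testing_schedule: no test with type, frequency, scope, and owner")
    return errors, warnings


# Deliberately incomplete register to show each kind of finding.
example = {
    "bia": [{"service_id": "SVC-002", "criticality": "critical",
             "rto_hours": 8, "rpo_hours": 1, "mtpd_hours": 8}],
    "continuity_strategies": [],
    "testing_schedule": [],
}
errors, warnings = validate_register(example, in_scope_ids=["SVC-001", "SVC-002"])
print(errors)
print(warnings)
```

Returning errors and warnings separately mirrors the table, where only the RTO vs MTPD check is described as a warning.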
Getting Started
Start by activating the ISO 20000 Service Continuity and Availability Management skill. The agent will check for your service catalog and SLA register, then guide you through the business impact analysis for each service before building strategies and a testing schedule.
Have this information ready:
- Your service catalog and SLA register (if already created)
- Business impact estimates for each service – what happens operationally and financially if it goes down
- Current backup strategy, recovery capabilities, and any existing DR infrastructure
- Regulatory or contractual requirements around uptime and recovery
- Key personnel responsible for continuity planning and disaster recovery
The BIA results and continuity strategies documented here feed into the service reporting skill for ongoing monitoring and into management review evidence for audit readiness.