Ensuring business recovery from a business continuity event (BCE) depends heavily on properly estimating and planning for maximum tolerable downtime (MTD) for each critical process. In this post, we examine the relationship between MTDs, business impact analysis, and incident recovery capabilities for non-catastrophic events.
What is MTD?
MTD is the maximum time a critical process can be down, or hindered in some way, without irreparable harm to the business. It’s typically calculated as part of a business impact analysis (BIA).
At a high level, a BIA identifies critical processes, describes the resources necessary to maintain them, calculates the financial or PR impact on the business if a process fails, and determines each process’s MTD. Other processes that depend on the failed process’s output also suffer, so a BIA must also describe the interrelationships among all critical processes and factor them into their MTDs.
A process MTD is an adjustable value, affected by several variables.
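To make the effect of process interdependencies concrete, here is a minimal Python sketch. It rests on an assumption that is not part of any formal BIA standard: a process that feeds other processes inherits the tightest MTD among everything that depends on it, directly or indirectly. The function name, process names, and hour values are all hypothetical.

```python
from collections import defaultdict


def effective_mtds(own_mtd_hours, depends_on):
    """Return an effective MTD (in hours) for each process.

    Assumed model: if a process fails, every process that depends on its
    output fails too, so its effective MTD is the minimum of its own MTD
    and the effective MTDs of all of its dependents.
    """
    # Invert the "depends on" map so each process lists its dependents.
    dependents = defaultdict(set)
    for process, upstreams in depends_on.items():
        for upstream in upstreams:
            dependents[upstream].add(process)

    cache = {}

    def resolve(process, visiting):
        if process in cache:
            return cache[process]
        if process in visiting:
            raise ValueError(f"circular dependency involving {process!r}")
        visiting.add(process)
        result = own_mtd_hours[process]
        for dependent in dependents.get(process, ()):
            result = min(result, resolve(dependent, visiting))
        visiting.discard(process)
        cache[process] = result
        return result

    return {process: resolve(process, set()) for process in own_mtd_hours}


# Hypothetical example: billing and shipping both need order entry.
own = {"order_entry": 48, "billing": 24, "shipping": 8}
needs = {"billing": ["order_entry"], "shipping": ["order_entry"]}
print(effective_mtds(own, needs))
# {'order_entry': 8, 'billing': 24, 'shipping': 8}
```

In this example, order entry could tolerate 48 hours on its own, but because shipping depends on it and tolerates only 8 hours, order entry's effective MTD drops to 8 hours.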
Calculating MTD
Processes can be “down” because of one or more resource issues. There might be an interruption in the supply chain. A server or other supporting infrastructure component might fail. Key employees might not be available to perform a critical task. These and other scenarios are identified and planned for in a Business Continuity Plan (BCP). The recovery process when one or more failure scenarios occur is tightly coupled with the MTD.
The period from failure to MTD defines the time available for recovery. All efforts to restore process operation must be designed to fit within this constraint. Sometimes, however, it’s apparent during planning that recovery from a specific BCE before the MTD is reached isn’t possible. In these cases, the BCP team must work together to mitigate hourly or daily business impact, extending the MTD.
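The planning check described here can be sketched in a few lines of Python. This is an illustration only: the function name, the separate detection-time parameter, and the sample durations are assumptions introduced for the example, not values from a real plan.

```python
from datetime import timedelta


def recovery_slack(mtd, estimated_recovery, time_to_detect=timedelta(0)):
    """Return the time left over if recovery finishes before the MTD.

    The clock is assumed to start at the moment of failure, so any delay
    in detecting the failure eats into the recovery window. A negative
    result means the plan cannot restore the process before the MTD.
    """
    return mtd - (time_to_detect + estimated_recovery)


# Hypothetical process with a 24-hour MTD.
mtd = timedelta(hours=24)

slack = recovery_slack(mtd, timedelta(hours=18), timedelta(hours=2))
print(slack >= timedelta(0))  # True: 4 hours to spare

slack = recovery_slack(mtd, timedelta(hours=30), timedelta(hours=2))
print(slack >= timedelta(0))  # False: 8 hours past the MTD
```

A negative slack during planning is the signal that the team needs workarounds that reduce the business impact and so extend the MTD.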
BCE impact mitigation activities (workarounds) should be defined in the BCP, including:
- Secondary human resources. If key employees are not available, has the organization made arrangements to have a service provider step in to fill the gap? Are managers cross-training employees to eliminate single points of failure in skill sets? Are cross-trained employees given the opportunity to renew or upgrade their skills as technology or processes change?
- Alternative supply sources. Has the organization developed relationships with more than one supplier of critical materials or services?
- Alternative work space. In addition to a hot, warm, or cold site, has an alternate office location been identified? How will employees temporarily access servers and other infrastructure restored after a BCE that partially limits access to normal work areas?
- Tested plans and trained recovery teams. How well are recovery plans documented and tested? How often do recovery teams train? What is the depth of training? The ability of recovery teams to quickly react and perform the right tasks is an important element of pre-MTD recovery.
- Alternative production resources. Does the organization have redundant product delivery infrastructure, either on site or as part of a relationship with a third party?
- Comprehensive communication plan. How well will the organization communicate delivery interruptions? Will customers be kept informed about what is going on? Does a plan exist to work with them to help mitigate loss of goodwill?
- Meeting payroll or other critical payables. If the process failure interrupts receivables, how will the organization continue to have access to sufficient working capital? Are there other sources of cash, such as business interruption insurance?
Adjusting these and other workarounds can extend MTD deadlines, giving recovery teams sufficient time to restore a process.
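One way to see why workarounds extend the MTD is a deliberately simple linear model, sketched below in Python. The model is an assumption made for illustration: it treats the MTD as the point at which accumulated loss reaches the largest loss the business can absorb, and it treats each workaround as cutting the remaining hourly impact by a fixed fraction. Real impact curves are rarely this tidy, and all names and figures are hypothetical.

```python
def mtd_hours(tolerable_loss, hourly_impact):
    """MTD under the assumed linear model: tolerable loss / loss per hour."""
    if hourly_impact <= 0:
        raise ValueError("hourly_impact must be positive")
    return tolerable_loss / hourly_impact


def mitigated_hourly_impact(hourly_impact, reductions):
    """Apply each workaround's fractional reduction to the remaining impact."""
    remaining = hourly_impact
    for reduction in reductions:
        if not 0 <= reduction < 1:
            raise ValueError("each reduction must be in the range [0, 1)")
        remaining *= 1 - reduction
    return remaining


# Hypothetical figures: the business can absorb a loss of 120,000,
# and the outage costs 4,000 per hour with no workarounds in place.
baseline = mtd_hours(120_000, 4_000)
print(baseline)  # 30.0

# Two workarounds: an alternate supplier (assumed to cut impact by 50%)
# and a customer communication plan (assumed to cut the rest by 25%).
reduced = mitigated_hourly_impact(4_000, [0.5, 0.25])
print(mtd_hours(120_000, reduced))  # 80.0
```

Under these assumed numbers, the two workarounds stretch the MTD from 30 hours to 80 hours, which may be the difference between a recovery plan that fits and one that does not.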
The final word
MTD is a critical line in the sand, a line an organization does not want to cross. This is a brief discussion of an often complex process. Additional information about BIA and MTD can be found in the following resources:
Business Impact Analysis (Fletcher)
Risk Management and Business Continuity Planning (Krause and Tipton)






