What is the formula for calculating FFI?

The standard RCM formula for FFI is: FFI = 2 × MTBF × P(unavailability). In this formula, MTBF is the mean time between failures of the protective device (the average interval between hidden functional failures), and P(unavailability) is the maximum acceptable probability that the device will be found in a failed state at the time of a demand. For example, if a smoke detector has an MTBF of 10,000 hours and the organisation accepts a maximum unavailability of 5% (0.05), then FFI = 2 × 10,000 × 0.05 = 1,000 hours.

What is the difference between an FFI and a standard preventive maintenance interval?

A standard preventive maintenance interval is scheduled based on the expected wear or degradation rate of a component that is in active service and whose failure would be immediately evident. An FFI applies only to hidden functions: equipment that is dormant and whose failure would not be apparent unless a specific demand is placed on it. The FFI is derived from the desired probability that the equipment will work when needed, not from a wear-out or degradation mechanism. The calculation approach, the objective, and the failure mode being addressed are all different.

What types of equipment require an FFI?

Any equipment that performs a protective or standby function requires an FFI. Common examples include: fire suppression systems, smoke and gas detectors, pressure relief valves, emergency shutdown valves (ESDVs), standby pumps and generators, safety instrumented system (SIS) components, circuit breakers and protective relays, and overflow and high-level alarms. All of these can fail in a way that is not detectable during normal operation. Their failure only becomes apparent when the protective demand occurs.

Can the FFI be shortened or lengthened based on experience?

Yes. The FFI should be reviewed when actual failure data is collected. If multiple functional tests reveal no failures, the true MTBF may be longer than the estimate used in the calculation, which would support lengthening the FFI. If failures are found frequently, the MTBF estimate was too optimistic and the FFI should be shortened. The initial FFI is always an estimate based on available data; it improves as maintenance history accumulates.

Who determines the acceptable unavailability probability used in the FFI formula?

The acceptable unavailability probability is determined by the risk assessment for the specific protective function. For safety-critical systems, the target unavailability is set by regulatory requirements, the site safety case, or the functional safety standard (such as IEC 61511 for process safety instrumented systems). For non-safety protective functions, it is typically determined by the RCM analysis team based on the operational and financial consequences of the multiple failure: the scenario in which the primary failure occurs and the protective device is also in a failed state.

Failure Finding Interval (FFI): Definition


Definition: A failure finding interval (FFI) is the maximum time permitted between successive functional tests of a hidden or protective function. It specifies how frequently maintenance teams must test dormant equipment, including alarms, emergency shutdowns, and fire suppression systems, to ensure those systems will respond correctly when demanded. FFI is a core output of reliability-centered maintenance (RCM) analysis and is calculated from the desired system unavailability and the device's mean time between failures.

What Is a Failure Finding Interval (FFI)?

A failure finding interval is a scheduled test frequency designed for one specific class of asset: equipment that performs a hidden function. A hidden function is one that is not demanded under normal operating conditions, so its failure goes unnoticed until a demand occurs. The only way to detect whether such equipment is in a failed state is to deliberately test it.

Examples include a fire suppression deluge system, an emergency shutdown valve, a standby diesel generator, and a high-pressure relief valve. These assets sit dormant for extended periods. If one fails while dormant, no one knows until a fire, process excursion, or power outage actually occurs, at which point the failure can have catastrophic consequences.

The FFI answers a precise question: how often must we test this device to keep the probability that it is currently failed below an acceptable threshold?

The concept originates from RCM methodology, formalized in documents such as SAE JA1011 and popularized by John Moubray's RCM II. It is now standard practice in oil and gas, power generation, chemical processing, aviation, and any industry that relies heavily on protective layers.

Why Hidden Failures Need Their Own Maintenance Task Category

Most maintenance tasks address evident failures: degradation that produces noise, heat, vibration, or performance loss that operators or sensors will notice. For these failures, the task frequency is set by the rate at which the asset deteriorates, as governed by the P-F curve.

Hidden failures follow completely different logic. A functional failure of a protective device leaves the system looking entirely normal. No alarm sounds. No performance metric changes. No operator notices anything unusual.

The hazard is not the hidden failure itself. It is the combination of the hidden failure with a second, separate event: the demand on the protective function. This is called a multiple failure. For example, a pump seal may fail and start a fire while the fire deluge system is out of service. Neither event alone causes a catastrophe. Together, they can.

Because the failure mode is different, the maintenance logic is different. The task is not to prevent the failure: protective equipment often fails at random, with no age-related pattern that a time-based task could intercept. The task is to find the failure before the demand occurs, by testing the device at a frequency that keeps the probability of an undetected failure acceptably low.

The FFI Formula

The standard RCM formula for deriving an FFI is:

FFI = 2 × MTBF × P(unavailability)

Where:

MTBF is the mean time between failures of the protective device. This is the average interval between hidden functional failures: how often, on average, does this device fail silently?
P(unavailability) is the maximum acceptable probability that the device is currently in a failed state at any given moment. This is expressed as a decimal (for example, 0.05 for 5%).

Worked Example

A facility has a gas detection system. Based on manufacturer data and industry records, the system has a mean time between hidden failures of 8,000 hours. The site safety case requires a maximum unavailability of 5% (0.05) for this protective function.

FFI = 2 × 8,000 × 0.05

FFI = 800 hours

The gas detection system must be functionally tested at least every 800 hours to keep the probability of it being in a failed state below 5%.

If the MTBF or the acceptable unavailability changes, for example if the risk assessment tightens the target from 5% to 2%, the FFI must be recalculated.
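The worked example and the recalculation at a tighter target can be reproduced with a small helper. This is a minimal sketch: the function name `failure_finding_interval` and its argument names are illustrative and do not come from any RCM standard or library.

```python
def failure_finding_interval(mtbf_hours: float, max_unavailability: float) -> float:
    """Return the FFI in hours using FFI = 2 x MTBF x P(unavailability)."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if not 0 < max_unavailability < 1:
        raise ValueError("max_unavailability must be a fraction between 0 and 1")
    return 2 * mtbf_hours * max_unavailability


# Gas detection example from the text: MTBF 8,000 h, 5% target (about 800 h).
print(failure_finding_interval(8_000, 0.05))

# Tightening the target to 2% shortens the interval (about 320 h).
print(failure_finding_interval(8_000, 0.02))
```

Passing the unavailability as a fraction rather than a percentage mirrors the formula and avoids a factor-of-100 error.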

Where the Formula Comes From

The derivation assumes that hidden failures occur at a constant, random failure rate (exponential distribution). Under this assumption, and provided the test interval is short relative to the MTBF, the probability that the device has failed rises approximately linearly from zero immediately after the last test to a maximum of about FFI / MTBF just before the next one. The average probability over the full interval is half the probability at the end of the interval, so the average unavailability is approximately FFI / (2 × MTBF). Rearranging for FFI moves the 2 into the numerator, which gives the standard form.
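The reasoning in this paragraph can be written out as a short derivation. The symbols are chosen here for illustration: T is the test interval (the FFI), M is the MTBF, U(t) is the probability that the device has failed t hours after its last successful test, and Ū is the average unavailability over the interval.

```latex
\begin{aligned}
U(t) &= 1 - e^{-t/M} \approx \frac{t}{M} \qquad (t \ll M) \\
\bar{U} &= \frac{1}{T}\int_0^T \frac{t}{M}\,dt = \frac{T}{2M} \\
T &= 2\,M\,\bar{U}
\end{aligned}
```

The linearisation in the first line is what limits the formula to small unavailability targets.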

This is an approximation appropriate for early planning. More precise calculations using actual failure distributions (Weibull, for example) can be applied when sufficient failure data is available.
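To see how much the linear approximation gives away, the average unavailability can be integrated numerically for a chosen failure distribution. The sketch below is illustrative only: the function name, the midpoint-rule integration, and the assumption that the device is as good as new after every test are choices made for this example, not part of the RCM method.

```python
import math


def average_unavailability(interval_hours, eta_hours, beta=1.0, steps=10_000):
    """Average probability of being failed over one test interval.

    Assumes a Weibull time-to-failure with scale eta_hours and shape beta,
    and that the device is as good as new after each test (an assumption).
    beta = 1 reduces to the exponential case, where eta equals the MTBF.
    """
    dt = interval_hours / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt  # midpoint of each slice
        total += 1.0 - math.exp(-((t / eta_hours) ** beta))
    return total * dt / interval_hours


linear = 800 / (2 * 8_000)                  # approximation behind the FFI formula
exact = average_unavailability(800, 8_000)  # exponential case, no linearisation
print(f"linear: {linear:.4f}, exponential: {exact:.4f}")
# prints "linear: 0.0500, exponential: 0.0484"
```

For the worked example, the linear form slightly overstates the unavailability, so the formula errs on the conservative side.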

FFI vs Preventive Maintenance Interval: Key Differences

Attribute	FFI (Failure Finding Task)	Standard PM Interval
Applies to	Hidden functions: standby and protective equipment	Active, in-service equipment with evident failure modes
Failure visibility	Failure is not apparent during normal operation	Failure produces an immediate, observable symptom
Interval basis	Desired unavailability probability and MTBF	Degradation rate, P-F interval, or manufacturer recommendation
Task objective	Detect a failure that has already occurred but is undetected	Prevent or reduce the likelihood of the next failure
Task type	Functional test	Inspection, lubrication, calibration, component replacement
Source of interval	Risk/safety analysis and statistical formula	OEM data, engineering analysis, historical records
Failure pattern assumed	Random (no age relationship)	Age-related (wear-out or fatigue pattern)

Understanding this distinction is important when building a maintenance interval library. FFI tasks should not be treated as ordinary time-based PMs: their logic, documentation, and scheduling rationale are fundamentally different.

How FFI Fits Into RCM

In a formal RCM analysis, every function of every asset is assessed through a structured logic tree. Each function is first classified as either evident or hidden. For evident functions, the maintenance task options include condition-based, time-based, or redesign responses. For hidden functions, the first question is always: can a failure finding task be identified that will reduce the multiple failure risk to an acceptable level?

The FMEA component of the RCM study identifies each hidden failure mode and its effects. The FFI calculation then provides the test frequency needed to manage that risk. If no practical failure finding task can reduce the risk sufficiently, the RCM process escalates to redesign: adding redundancy, changing the system architecture, or modifying the operating context to eliminate the hidden failure mode.

FFI tasks are documented in the maintenance plan with a specific task description (what exactly constitutes a functional test), the required frequency, the acceptance criteria (what result confirms the device is functional), and the restoration action if the device is found failed.
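The four documented elements listed above map naturally onto a simple record. The sketch below is a hypothetical structure for illustration; the class name, field names, and the example asset are invented and do not reflect any particular maintenance management system.

```python
from dataclasses import dataclass


@dataclass
class FailureFindingTask:
    """Hypothetical record of one FFI task in a maintenance plan."""
    asset_id: str
    task_description: str     # what exactly constitutes the functional test
    interval_hours: float     # the FFI
    acceptance_criteria: str  # result that confirms the device is functional
    restoration_action: str   # what to do if the device is found failed

    def is_overdue(self, hours_since_last_test: float) -> bool:
        """True when the elapsed time exceeds the maximum permitted interval."""
        return hours_since_last_test > self.interval_hours


task = FailureFindingTask(
    asset_id="GD-101",  # invented tag for illustration
    task_description="Apply test gas and confirm the alarm trips",
    interval_hours=800,
    acceptance_criteria="Alarm activates at the configured setpoint",
    restoration_action="Raise corrective work order to repair or replace",
)
print(task.is_overdue(650))  # False
print(task.is_overdue(820))  # True
```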

Applying FFI in Practice

Common Asset Classes That Require FFI Tasks

Any system whose sole purpose is to respond to an abnormal demand condition is a candidate for FFI management. The most common examples are:

Fire and gas detection systems. Smoke detectors, heat detectors, combustible gas detectors, and flame detectors are dormant until a fire or gas release occurs. Their hidden failure rate and the consequences of unavailability during a fire drive the FFI calculation.
Fire suppression systems. Sprinkler systems, deluge systems, and gaseous suppression systems must be periodically actuated or inspected to confirm they will operate correctly under demand.
Emergency shutdown systems (ESD/ESDV). These valves and logic systems are designed to close on a process excursion. They may be dormant for months or years. Spurious trip rates must be balanced against the risk of failing to close when demanded, which determines both the FFI and the acceptable failure probability.
Standby equipment. Standby pumps, standby generators, and standby HVAC systems require regular run tests to verify that they will start and perform to specification when the primary system fails. The FFI governs how often these run tests must occur.
Pressure relief valves. Relief valves that protect vessels from overpressure are a classic hidden function. They are tested by lifting to confirm the set pressure has not drifted and that the valve will open freely when required.
Protective relays and circuit breakers. In electrical systems, protective relays detect fault conditions and command circuit breakers to open. If the relay or breaker has failed silently, a fault will not be interrupted. FFI testing involves injecting a test signal to verify relay pickup and breaker operation.

Setting the Acceptable Unavailability Target

The unavailability probability P(unavailability) used in the FFI formula is not arbitrary. It must be determined by a risk assessment that considers:

The severity of the multiple failure consequence (safety, environmental, operational)
Regulatory and industry standards that specify minimum integrity levels (for example, IEC 61511 Safety Integrity Levels for process safety systems)
The frequency at which the primary failure, or the demand on the protective function, is expected to occur
Whether other protective layers are in place that reduce the net risk

For safety-critical functions, unavailability targets are typically 10% or lower, depending on the consequence severity and the SIL (Safety Integrity Level) assigned to the function. In the IEC 61511 low-demand bands, SIL 1 corresponds to an average probability of failure on demand between 1% and 10%, SIL 2 to between 0.1% and 1%, and SIL 3 to between 0.01% and 0.1%. For less critical protective functions, higher unavailability may be acceptable. The conditional probability of failure framework used in risk-based maintenance programs provides a structured basis for these decisions.

What Happens When a Device Is Found Failed

A functional test that reveals a failed state is not a maintenance failure: it is the system working exactly as designed. The purpose of the FFI is to find hidden failures before a demand occurs. When a failure is found:

The device is restored to a functional state immediately (repair or replacement).
The failure is recorded with the date of the last successful test. This provides an upper bound on the time the device was unavailable.
The failure event is added to the historical record for the device. As this record accumulates, the actual MTBF can be estimated and compared to the value used in the FFI calculation.
If failures are being found at a rate that suggests the actual MTBF is significantly shorter than assumed, the FFI must be shortened to maintain the target unavailability.

This feedback loop of test, find, record, analyze, and adjust is what makes FFI management a living program rather than a static schedule. It is also what separates a mature risk-based maintenance program from one where intervals are set once and never reviewed.
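The analyze-and-adjust step of this loop can be sketched as follows. This is a simplified illustration: the function names are hypothetical, the MTBF is a plain point estimate (cumulative device-hours divided by failures found), and the fleet history in the example is invented.

```python
from typing import Optional


def estimate_mtbf(total_device_hours: float, failures_found: int) -> Optional[float]:
    """Point estimate: cumulative device-hours divided by failures found.

    Returns None when no failures have been found, because the data then
    supports only a lower bound on the MTBF, not a point estimate.
    """
    if failures_found == 0:
        return None
    return total_device_hours / failures_found


def revised_ffi(total_device_hours, failures_found, max_unavailability, current_ffi):
    """Recalculate the FFI from site history, keeping the current one if no failures."""
    mtbf = estimate_mtbf(total_device_hours, failures_found)
    if mtbf is None:
        return current_ffi  # extending the interval needs a separate justification
    return 2 * mtbf * max_unavailability


# Invented history: 20 detectors in service for one year (8,760 h), 40 failures found.
hours = 20 * 8_760
print(estimate_mtbf(hours, 40))           # 4380.0
print(revised_ffi(hours, 40, 0.05, 800))  # about 438 h, shorter than the original 800 h
```

With only a handful of recorded failures, a point estimate like this is very uncertain, so a real review would normally look at confidence bounds before changing an interval.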

Integrating FFI Into the Maintenance Schedule

FFI tasks are scheduled in the same maintenance management system as all other work orders. However, a few practical considerations are specific to failure finding tasks:

The task must be a genuine functional test. A visual inspection of a sprinkler head is not a functional test of the sprinkler system. The test must actually verify that the protective function will operate correctly under its required conditions. Partial tests or proxy measures that do not confirm full functionality do not satisfy the FFI requirement.

Access and safety during testing. Many functional tests involve temporarily defeating or bypassing the protective function in order to test it. This creates a window of unavailability. Good maintenance practice minimizes this window, documents it, and ensures that other protective layers are in place during the test period.

Record keeping. Regulatory audits and safety cases require evidence that FFI tasks have been carried out at the required frequency and that the results have been recorded. Work orders must capture the test procedure followed, the result (pass or fail), and any corrective action taken.

FFI and Modern Condition Monitoring

For some protective devices, continuous or periodic condition-based maintenance techniques can supplement or replace traditional functional testing. Self-diagnostic features in modern safety instrumented systems, for example, detect some failure modes continuously, which effectively reduces the rate of failures that remain undetected between tests and may support a longer FFI without increasing the unavailability.

Online partial stroke testing of emergency shutdown valves allows a portion of the valve's travel to be tested during normal operation without fully closing the process. This tests some failure modes (mechanical binding, actuator fault) while avoiding the operational disruption of a full stroke test, and it can support higher test frequencies that would otherwise be operationally impractical.
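One common simplification for estimating the effect of partial testing splits the hidden failures into a share that the partial test can reveal and a share that only the full test finds. The sketch below uses that simplification; the function name, the `partial_coverage` parameter, and all numbers in the example are assumptions for illustration, not values from the text.

```python
def unavailability_with_partial_tests(
    mtbf_hours, full_interval_hours, partial_interval_hours, partial_coverage
):
    """Approximate average unavailability when partial tests supplement full tests.

    partial_coverage is the assumed fraction of hidden failures that a partial
    test can reveal; the remainder is found only by the full test.
    """
    covered = partial_coverage * partial_interval_hours / (2 * mtbf_hours)
    uncovered = (1 - partial_coverage) * full_interval_hours / (2 * mtbf_hours)
    return covered + uncovered


# Invented example: annual full stroke test, monthly partial stroke test,
# 60% of failure modes assumed detectable by the partial test.
full_only = 8_760 / (2 * 50_000)
with_partial = unavailability_with_partial_tests(50_000, 8_760, 730, 0.6)
print(f"full test only: {full_only:.4f}, with partial tests: {with_partial:.4f}")
```

The result is only as good as the coverage figure, which has to be justified against the specific failure modes of the valve and actuator.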

Predictive maintenance technologies, including vibration analysis, electrical current signature analysis, and thermal imaging, can detect degradation in standby equipment that would not be caught by a binary pass/fail functional test. Integrating these signals alongside scheduled FFI tasks gives a fuller picture of protective system health.

The key principle is that any monitoring technique used to reduce or replace an FFI task must be demonstrably effective at detecting the specific failure modes that the FFI was designed to find. The logic is the same; only the technology changes.

Common Mistakes in FFI Management

Using a fixed schedule without a calculation. Many maintenance programs assign test intervals to protective equipment based on OEM recommendations, regulatory minimums, or habit rather than calculating from MTBF and unavailability targets. This may result in intervals that are far longer than the risk profile justifies.

Treating the FFI as a target rather than a maximum. The FFI defines the maximum allowable interval consistent with the target unavailability. Testing more frequently is always permissible and may be appropriate when operational access makes it convenient. Testing less frequently violates the safety or risk objective.

Not recording test results consistently. The value of the failure finding program depends entirely on the quality of the failure records. If failed devices are restored without being documented, the MTBF estimate is never corrected and the FFI remains based on assumptions rather than evidence.

Conflating the FFI task with the restoration task. The FFI task is the test. If the test reveals a failure, a separate corrective maintenance work order should be raised to restore the device. Mixing these two activities in the same work order makes it harder to track failure occurrences accurately.

Ignoring common cause failures. When multiple identical protective devices are installed in parallel (redundant safety loops, for example), a single cause can fail all of them simultaneously. FFI calculations that assume independent failure modes will underestimate the true unavailability of the system. Staggering the tests of redundant devices helps surface common cause failures that synchronized testing would miss.

Frequently Asked Questions

Is FFI the same as a proof test?

The terms are often used interchangeably in practice. In IEC 61511 and process safety literature, the term "proof test" is used for functional tests of safety instrumented system components. The FFI is the RCM-derived interval at which the proof test must be performed. The objective and documentation requirements are essentially the same, although functional safety standards express the target as an average probability of failure on demand and may use more detailed calculation methods.

What if no MTBF data is available for the protective device?

When MTBF data is unavailable, engineers use generic industry databases (OREDA, EXIDA, IEEE 493), manufacturer reliability data, or conservative estimates from similar device classes. The initial FFI should err on the side of more frequent testing, with the interval extended as actual site data accumulates. A sensitivity analysis, recalculating the FFI across a range of MTBF assumptions, helps quantify the uncertainty and set a conservative starting point.
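A sensitivity analysis of this kind can be as simple as recalculating the FFI for a few candidate MTBF values. The sketch below is illustrative: the function name and the candidate values are assumptions, and choosing the shortest resulting interval is just one conservative convention.

```python
def ffi_sensitivity(mtbf_candidates, max_unavailability):
    """Map each candidate MTBF (hours) to the FFI (hours) it would imply."""
    return {mtbf: 2 * mtbf * max_unavailability for mtbf in mtbf_candidates}


# Invented range: pessimistic, generic-database, and optimistic MTBF estimates.
table = ffi_sensitivity([4_000, 8_000, 16_000], 0.05)
for mtbf, ffi in table.items():
    print(f"MTBF {mtbf:>6,} h -> FFI {ffi:,.0f} h")

# A conservative starting point is the interval implied by the lowest MTBF.
conservative_start = min(table.values())
```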

Does the FFI apply to redundant systems differently?

Yes. When redundant protective channels are installed (for example, a 2-out-of-3 voting configuration), the unavailability of the overall system is lower than the unavailability of any single channel. The system unavailability formula accounts for the redundancy configuration. This means that each individual channel may be tested less frequently than a single-channel system would require, while still maintaining the same overall system unavailability target. The calculation must be done at the system level, not the individual component level.
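The effect of voting can be illustrated with a binomial calculation. This is a sketch under strong assumptions that are not stated in the text: channels are identical and fail independently, common cause failures are ignored, and a single average channel unavailability is used, which slightly understates the time-averaged system figure. The function name is hypothetical.

```python
from math import comb


def koon_unavailability(k: int, n: int, channel_unavailability: float) -> float:
    """Probability that a k-out-of-n voting system cannot act on a demand.

    The system needs at least k working channels, so it is unavailable when
    more than n - k channels are failed. Channels are assumed independent
    and identical; common cause failures are NOT modelled.
    """
    q = channel_unavailability
    return sum(
        comb(n, failed) * q**failed * (1 - q) ** (n - failed)
        for failed in range(n - k + 1, n + 1)
    )


# Each channel at 5% unavailability: single channel versus 2-out-of-3 voting.
print(f"1oo1: {koon_unavailability(1, 1, 0.05):.5f}")  # 0.05000
print(f"2oo3: {koon_unavailability(2, 3, 0.05):.5f}")  # 0.00725
```

Because common cause failures are excluded, the real system unavailability will be higher than this figure suggests.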

How does the FFI relate to asset availability?

The FFI is calculated to control the unavailability of the protective function, which is a specific type of asset availability concern. However, the testing itself introduces a brief planned unavailability period. For protective systems that must be taken offline to be tested, the time the system is out of service for testing must be factored into the overall availability calculation for the safety layer. This is one reason why modern online testing methods (partial stroke testing, self-diagnostics) are preferred for critical applications.
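A rough way to factor test downtime into the picture is to add the fraction of time spent offline for testing to the hidden-failure unavailability. The sketch below assumes the device offers no protection during the test and that the linear approximation holds; the function name and the two-hour test duration are invented for illustration.

```python
def total_unavailability(mtbf_hours, interval_hours, test_duration_hours):
    """Hidden-failure unavailability plus the share of time offline for testing."""
    hidden = interval_hours / (2 * mtbf_hours)       # undetected failures
    testing = test_duration_hours / interval_hours   # planned outage for the test
    return hidden + testing


# Gas detector from the worked example, assuming each test takes it offline for 2 h.
print(f"{total_unavailability(8_000, 800, 2):.4f}")  # 0.0525
print(f"{total_unavailability(8_000, 400, 2):.4f}")  # 0.0300
```

Shortening the interval reduces the first term but increases the second, which is why very frequent offline testing eventually stops improving availability.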

Is FFI used outside of RCM programs?

FFI calculations are used whenever maintenance intervals for protective or standby equipment need to be formally justified. Regulatory frameworks for process safety, nuclear power, aviation maintenance, and defence systems all require evidence that protective equipment test frequencies are grounded in a quantitative risk assessment. The FFI formula is the standard tool for providing that evidence, regardless of whether the overall maintenance program is formally RCM-structured.

The Bottom Line

The failure finding interval is the quantitative foundation for testing standby and protective equipment at the right frequency. It replaces guesswork with a defensible, risk-grounded calculation that balances the cost of inspection against the probability of the hidden failure going undetected and contributing to a dangerous or production-impacting event.

In regulated industries such as oil and gas, nuclear power, and aviation, FFI calculations are not optional: they are audit requirements tied to formal safety cases. For maintenance teams in less strictly regulated environments, applying FFI methodology to hidden function tests improves program quality and provides the documentation needed to justify inspection intervals to engineers, managers, and regulators alike.

Detect Hidden Failures Before a Demand Occurs

FFI management requires accurate failure data, consistent test records, and the ability to act quickly when a protective device is found failed. Tractian's condition monitoring platform continuously tracks the health of critical and standby assets, generates work orders automatically at the required FFI interval, and logs every test result in one place, giving your reliability team the data needed to validate and refine your intervals over time.
