Failure Mode: Definition
Key Takeaways
- A failure mode describes how a component fails, not why it fails or what results from the failure.
- One component can have several independent failure modes, each requiring its own maintenance response.
- Failure modes are categorized by physical mechanism: fatigue, corrosion, wear, overheating, electrical breakdown, and others.
- FMEA and RCM both use failure modes as the starting point for maintenance task selection.
- Knowing the failure mode tells you which monitoring technique or maintenance strategy to apply.
- Condition monitoring is most effective when the targeted failure modes have a detectable P-F interval.
What Is a Failure Mode?
In maintenance and reliability engineering, a failure mode is one of three linked concepts. You identify the failure mode (what failed and how), trace its failure cause (why it happened), and document its failure effect (what happened as a result). All three must be understood before you can select the right maintenance task.
Failure modes are the foundational input to FMEA (Failure Mode and Effects Analysis), Reliability-Centered Maintenance, and most structured reliability programs.
Failure Mode vs Failure Cause vs Failure Effect
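These three terms are frequently confused. They are distinct concepts that form a chain: a cause triggers a failure mode, and the failure mode produces an effect on the system. The table below defines each one and adds a fourth row showing how a CMMS failure code records the chain.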
| Concept | Definition | Example (Pump) |
|---|---|---|
| Failure Mode | The specific way in which an item fails to perform its function | Mechanical seal leaks |
| Failure Cause | The underlying physical, chemical, or human reason the failure mode occurred | Dry running caused face erosion |
| Failure Effect | The consequence of the failure mode on the system, process, or safety | Fluid loss, process contamination, pump shutdown |
| Failure Code | A structured code used in a CMMS to classify the failure mode, cause, and remedy | Problem: P07 / Cause: C14 / Remedy: R03 |
Understanding this chain is critical. Maintenance tasks that only address the failure effect (for example, shutting down and restarting the pump) do not prevent recurrence. Tasks must be designed to address the failure cause or to detect the failure mode before it fully develops.
For a deeper look at how failures are classified in work orders, see the glossary entry on failure codes.
Common Types of Failure Modes
Failure modes are grouped by the physical or functional mechanism involved. The table below covers the most common categories in industrial maintenance.
| Category | Common Failure Modes | Typical Assets Affected |
|---|---|---|
| Fatigue | Fatigue fracture, crack initiation and propagation, surface spalling | Shafts, gears, bearings, structural members |
| Corrosion | General corrosion, pitting, galvanic corrosion, stress corrosion cracking | Pipework, tanks, heat exchangers, structural steel |
| Wear | Abrasive wear, adhesive wear, erosion, fretting | Bearings, seals, pump impellers, conveyor components |
| Overheating / Thermal | Thermal deformation, insulation degradation, seized components, warped surfaces | Motors, transformers, gearboxes, brakes |
| Electrical | Insulation breakdown, short circuit, open circuit, arcing, ground fault | Motors, switchgear, cables, control panels |
| Deformation | Plastic deformation, buckling, creep, overload yielding | Pressure vessels, structural frames, shafts |
| Leakage / Seal | Seal extrusion, O-ring degradation, flange leakage, valve seat leakage | Pumps, valves, compressors, hydraulic systems |
| Contamination / Blockage | Filter clogging, foreign object ingestion, fouling, scale buildup | Heat exchangers, filters, lubrication systems, nozzles |
| Control / Instrumentation | Signal loss, sensor drift, false trip, software fault, communication failure | PLCs, transmitters, safety systems, control valves |
One component can have several independent failure modes. A centrifugal pump, for example, may be susceptible to seal leakage, bearing wear, impeller erosion, cavitation damage, and motor insulation breakdown as entirely separate failure modes. Each one may require a different maintenance approach.
How Failure Modes Are Identified and Documented
Failure modes are not assumed from design specs alone. They are identified through a combination of structured analysis and operational experience.
Structured Analysis Methods
FMEA workshops bring together engineers, operators, and maintenance planners to systematically ask: "In how many ways could this component fail to perform its function?" Each answer is a potential failure mode.
Reliability-Centered Maintenance (RCM) analysis uses a structured decision logic to work through every function and every potential failure of each asset. See the entry on Reliability-Centered Maintenance for a detailed breakdown of the RCM process.
Fault Tree Analysis works in reverse: starting from an undesired top-level event and tracing down through the failure modes and causes that could produce it.
Operational and Historical Data Sources
Historical failure analysis records, work order histories, and FRACAS (Failure Reporting, Analysis, and Corrective Action System) data are among the most valuable inputs. They reveal which failure modes have actually occurred, how frequently, and under what conditions.
Operator and technician interviews capture failure modes that have been observed but never formally recorded. Field experience often surfaces failure modes that no design engineer anticipated.
Documentation in a CMMS
Once identified, failure modes should be documented in a structured format. In a CMMS, this is typically done through failure codes linked to work orders. Each closed work order records the problem (failure mode), the cause, and the remedy applied. Over time, this creates a queryable database of failure mode frequency and consequence data.
Standardized failure code libraries prevent the data quality problems that result when technicians describe the same failure mode in dozens of different ways.
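As a rough illustration of the "queryable database" idea, the sketch below counts how often each problem code appears per asset. The work order records, asset names, and field names are hypothetical; a real CMMS export will have its own schema.

```python
from collections import Counter

# Hypothetical closed work orders; asset IDs and code values are illustrative only.
work_orders = [
    {"asset": "PUMP-01", "problem": "P07", "cause": "C14", "remedy": "R03"},
    {"asset": "PUMP-01", "problem": "P07", "cause": "C02", "remedy": "R03"},
    {"asset": "PUMP-02", "problem": "P11", "cause": "C14", "remedy": "R05"},
    {"asset": "PUMP-01", "problem": "P07", "cause": "C14", "remedy": "R03"},
]


def failure_mode_counts(orders):
    """Count how often each (asset, problem code) pair appears."""
    return Counter((wo["asset"], wo["problem"]) for wo in orders)


counts = failure_mode_counts(work_orders)

# List the most frequent failure modes first.
for (asset, problem), n in counts.most_common():
    print(f"{asset} {problem}: {n}")
```

A tally like this only works if the same failure mode is always recorded under the same code, which is the point of a standardized code library.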
Failure Modes in FMEA
FMEA is one of the most widely used formal processes for identifying and analyzing failure modes. The approach is the same regardless of FMEA type: list functions, identify failure modes, analyze effects and causes, score risk, and assign corrective actions.
FMEA Types and Their Focus
| FMEA Type | Focus | Primary Use Case |
|---|---|---|
| Design FMEA (DFMEA) | Failure modes inherent in product or component design | New product development, engineering change review |
| Process FMEA (PFMEA) | Failure modes in manufacturing or operational processes | Quality control, process improvement |
| System FMEA | Failure modes across an integrated system or subsystem | Complex equipment, production lines, safety systems |
| FMECA | Failure modes plus criticality ranking by probability and severity | Defense, aerospace, high-consequence industrial assets |
For more detail on Design FMEA, see the entry on DFMEA. For process-level analysis, see PFMEA. For criticality-weighted analysis, see FMECA.
How FMEA Uses Failure Modes
In an FMEA worksheet, each failure mode is scored across three dimensions:
- Severity (S): How serious is the effect of this failure mode on safety, operations, or quality?
- Occurrence (O): How likely is this failure mode to occur in a given timeframe?
- Detectability (D): How easily can the failure mode be detected before it causes a full failure?
The three scores are multiplied to produce a Risk Priority Number (RPN). High RPN failure modes receive corrective actions first: redesign, additional inspection, new monitoring, or modified maintenance tasks.
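The RPN calculation can be sketched in a few lines. The scores below are invented for illustration, and the 1 to 10 scale is an assumption (a common convention, but scales vary between organizations). The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureModeScore:
    name: str
    severity: int       # assumed 1-10 scale
    occurrence: int     # assumed 1-10 scale
    detectability: int  # assumed 1-10 scale; higher means harder to detect

    @property
    def rpn(self) -> int:
        """Risk Priority Number: S x O x D."""
        return self.severity * self.occurrence * self.detectability


# Illustrative scores only; real values come from the FMEA workshop.
modes = [
    FailureModeScore("Mechanical seal leaks", 7, 5, 4),
    FailureModeScore("Bearing wear", 6, 4, 3),
    FailureModeScore("Motor insulation breakdown", 9, 2, 6),
]

# Highest RPN first, so corrective actions are assigned in priority order.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(f"{m.name}: RPN {m.rpn}")
```

Because the RPN is a product, a rare but severe failure mode can rank below a frequent minor one, so many teams review high-severity items separately from the RPN ranking.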
Failure Modes in RCM
Reliability-Centered Maintenance uses failure modes as the basis for selecting the right maintenance task for each asset function. The RCM decision logic asks, for each failure mode:
- Is the failure mode evident or hidden?
- Does it have safety or environmental consequences?
- Is there a condition-based or predictive task that can detect it in time to take action?
- Is a scheduled restoration or replacement task applicable and cost-effective?
- Is redesign or redundancy the only viable option?
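The questions above can be read as a decision sequence. The sketch below is a deliberately simplified version, not the full RCM decision diagram: the ordering of the checks, the field names, and the returned task labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureModeAssessment:
    hidden: bool                   # failure is not evident in normal operation
    safety_or_environmental: bool  # failure has safety or environmental consequences
    condition_task_feasible: bool  # a condition-based task can detect it in time
    scheduled_task_feasible: bool  # scheduled restoration/replacement is applicable and cost-effective


def select_task(fm: FailureModeAssessment) -> str:
    """Simplified task selection; the check order is an assumption."""
    if fm.condition_task_feasible:
        return "condition-based task"
    if fm.scheduled_task_feasible:
        return "scheduled restoration or replacement"
    if fm.hidden:
        return "failure-finding task"
    if fm.safety_or_environmental:
        return "redesign or redundancy"
    return "run-to-failure"


# A hidden failure in a protective device with no feasible proactive task.
relief_valve_stuck = FailureModeAssessment(
    hidden=True,
    safety_or_environmental=True,
    condition_task_feasible=False,
    scheduled_task_feasible=False,
)
print(select_task(relief_valve_stuck))
```

In a real analysis each branch also involves a cost and risk judgment that a boolean flag cannot capture.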
RCM analysis often surfaces hidden failure modes in protective devices, where a component has failed but the failure has not been detected because the device has not been called upon to act. For these, a Failure Finding Interval (FFI) is established to test the device periodically.
The bathtub curve is also relevant here: it illustrates that different failure modes dominate at different stages of an asset's life, and the appropriate maintenance response changes accordingly.
Failure Mode Patterns and the P-F Interval
Not all failure modes behave the same way over time. Some develop gradually and give advance warning. Others occur suddenly with no detectable precursor.
The P-F curve describes the interval between the point at which a failure mode becomes detectable (potential failure) and the point at which it becomes a functional failure. A longer P-F interval gives maintenance teams more time to respond.
Failure modes with long, detectable P-F intervals are the best candidates for condition monitoring and predictive maintenance. Failure modes with no detectable interval require a different approach: redundancy, redesign, or run-to-failure if consequences are low.
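To make the link between P-F interval and inspection frequency concrete, the sketch below applies a commonly cited rule of thumb: inspect at intervals no longer than half the P-F interval. That fraction is an assumption here, not a value from this article, and the function names are hypothetical.

```python
def max_inspection_interval(pf_interval_days: float, fraction: float = 0.5) -> float:
    """Longest inspection interval under the assumed rule of thumb."""
    if pf_interval_days <= 0:
        raise ValueError("P-F interval must be positive")
    return pf_interval_days * fraction


def worst_case_response_time(pf_interval_days: float, inspection_interval_days: float) -> float:
    """Time left to act if the potential failure appears just after an inspection.

    In that case it is found at the next inspection, one full inspection
    interval into the P-F interval.
    """
    return pf_interval_days - inspection_interval_days


pf = 28  # illustrative P-F interval in days
interval = max_inspection_interval(pf)
print(interval, worst_case_response_time(pf, interval))
```

If the worst-case response time is shorter than the time needed to plan and carry out the repair, the inspection interval has to be shortened further.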
Six Failure Mode Patterns (Nowlan and Heap)
Research by Nowlan and Heap, carried out on commercial aircraft components and since widely applied to industrial equipment, identified six patterns of how failure probability changes with operating age:
| Pattern | Description | Best Maintenance Response |
|---|---|---|
| A - Bathtub | High early failure, stable useful life, increasing wearout | Break-in inspection, then condition monitoring |
| B - Wearout | Increasing failure probability after a useful-life period | Scheduled replacement before wearout |
| C - Gradual aging | Steadily increasing failure probability with no distinct wearout point | Condition-based monitoring |
| D - Initial low / then high | Low failure probability when new, then a rapid rise to a roughly constant level | Monitoring after initial break-in period |
| E - Random | Constant failure probability at any age | Redundancy, run-to-failure (if low consequence) |
| F - Infant mortality / then random | High early failures, then low random rate | Burn-in testing, quality control at installation |
The practical implication is significant. Age-based maintenance tasks, such as replacing a component every fixed interval, are only appropriate for failure modes that follow patterns A or B, where a clear wearout age exists. For the majority of industrial failure modes, condition-based approaches or redesign deliver better results at lower cost.
How Failure Mode Data Improves Maintenance Strategy
Failure mode data, when systematically collected and analyzed, changes the quality of maintenance decisions at every level.
Task Selection
Each failure mode has a technically feasible maintenance response. Knowing the failure mode tells you whether vibration analysis, oil sampling, thermography, ultrasound, or a time-based replacement interval is the right tool. Applying condition monitoring to a failure mode that has no detectable precursor is wasted effort. Applying a fixed replacement interval to a failure mode that follows a random pattern is equally wasteful, because replacing the component does not lower its probability of failure.
Vibration analysis, for example, is highly effective at detecting bearing fatigue, misalignment, and imbalance before they become functional failures. But it provides no early warning for a sudden electrical short circuit.
Maintenance Budget Justification
When failure modes are documented with their frequency, severity, and detection lead time, maintenance teams can quantify what each monitoring task is worth. A failure mode that causes four hours of unplanned downtime per event, occurs six times per year, and is detectable two weeks in advance with continuous monitoring makes a clear return-on-investment case for monitoring.
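The arithmetic behind that case can be sketched as follows. The four hours per event and six events per year come from the example above. The downtime cost per hour, the annual monitoring cost, and the share of events avoided are hypothetical placeholders to replace with site-specific figures.

```python
def monitoring_roi(
    hours_per_event: float,
    events_per_year: float,
    downtime_cost_per_hour: float,
    fraction_avoided: float,
    monitoring_cost_per_year: float,
) -> dict:
    """Annual downtime, avoided cost, and simple ROI for one failure mode."""
    annual_downtime_hours = hours_per_event * events_per_year
    avoided_cost = annual_downtime_hours * downtime_cost_per_hour * fraction_avoided
    net_benefit = avoided_cost - monitoring_cost_per_year
    return {
        "annual_downtime_hours": annual_downtime_hours,
        "avoided_cost": avoided_cost,
        "net_benefit": net_benefit,
        "roi": net_benefit / monitoring_cost_per_year,
    }


result = monitoring_roi(
    hours_per_event=4,                # from the example above
    events_per_year=6,                # from the example above
    downtime_cost_per_hour=5_000,     # hypothetical
    fraction_avoided=0.75,            # hypothetical: share of events caught in time
    monitoring_cost_per_year=20_000,  # hypothetical
)
print(result)
```

The two-week detection lead time does not enter the formula directly; it is what makes a high `fraction_avoided` plausible, since the team has time to plan the repair.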
Criticality Analysis
Failure mode data feeds directly into criticality analysis, where assets are prioritized by the risk their failure modes create. High-frequency, high-consequence failure modes on critical assets receive the highest maintenance attention. Low-consequence failure modes on non-critical assets may be intentionally managed through run-to-failure.
Failure Lifecycle Management
Tracking failure modes across an asset's life allows teams to identify patterns, compare performance across similar assets, and manage the full failure lifecycle from detection through resolution and prevention. Over time, this data becomes the basis for revising PM intervals, updating spare parts inventories, and informing procurement decisions when assets are replaced.
Failure Mode vs Functional Failure
A functional failure is the state in which an asset can no longer perform its required function at the required standard. A failure mode is the mechanism that leads to that state.
A pump has the function "deliver 500 liters per minute at 4 bar." Failure modes such as impeller erosion, bearing seizure, or seal leakage can each independently cause this functional failure, but through different physical mechanisms and at different rates. Each failure mode must be addressed separately.
This distinction matters in RCM, where the analyst first defines functions and functional failures, then works down to identify the failure modes responsible for each functional failure.
How Condition Monitoring Targets Failure Modes
Condition monitoring is most valuable when it is applied to specific, known failure modes with measurable P-F intervals. The monitoring technique must match the physical signature of the failure mode being targeted.
| Failure Mode | Detectable Indicator | Monitoring Technique |
|---|---|---|
| Bearing fatigue | Increased vibration at bearing frequencies | Vibration analysis |
| Gear tooth wear | Metallic particles in oil, gear mesh frequency change | Oil analysis, vibration analysis |
| Motor insulation degradation | Rising winding temperature, increased current draw | Thermal monitoring, current analysis |
| Seal leakage (early) | Ultrasonic emission at seal face | Ultrasonic testing |
| Corrosion / wall thinning | Reduced wall thickness measurement | Ultrasonic thickness testing, corrosion monitoring |
| Electrical connection overheating | Thermal hotspot at connection point | Infrared thermography |
| Cavitation | High-frequency acoustic signature, vibration spike | Acoustic monitoring, vibration analysis |
When condition monitoring is deployed against known failure modes, every alert carries context. The team knows what is failing, how fast it is likely to progress, and what corrective action is required, rather than receiving a generic alert and spending diagnostic time narrowing down the cause.
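One way to keep this mapping usable is to store it as data and query it in both directions. The sketch below encodes the table above as a dictionary; the function name is hypothetical, and the mapping would need extending for assets not covered here.

```python
# Failure mode -> monitoring techniques, taken from the table above.
MONITORING = {
    "Bearing fatigue": ["Vibration analysis"],
    "Gear tooth wear": ["Oil analysis", "Vibration analysis"],
    "Motor insulation degradation": ["Thermal monitoring", "Current analysis"],
    "Seal leakage (early)": ["Ultrasonic testing"],
    "Corrosion / wall thinning": ["Ultrasonic thickness testing", "Corrosion monitoring"],
    "Electrical connection overheating": ["Infrared thermography"],
    "Cavitation": ["Acoustic monitoring", "Vibration analysis"],
}


def modes_covered_by(technique: str) -> list[str]:
    """Return the failure modes a given technique can help detect, sorted by name."""
    return sorted(mode for mode, techniques in MONITORING.items() if technique in techniques)


covered = modes_covered_by("Vibration analysis")
print(covered)

# Failure modes left without coverage if only vibration analysis is deployed.
uncovered = sorted(set(MONITORING) - set(covered))
print(uncovered)
```

Querying in the reverse direction shows which failure modes remain unmonitored for a given sensor set.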
Frequently Asked Questions
What is a failure mode?
A failure mode is the specific way in which a component, system, or process fails to perform its required function. It describes what goes wrong physically or functionally, such as a bearing seizing, a seal leaking, or a motor overheating, without explaining why it happened or what effect it produces.
What is the difference between a failure mode and a failure cause?
A failure mode is what fails and how it fails (for example, fatigue fracture of a shaft). A failure cause is the underlying reason the failure mode occurred (for example, misalignment causing cyclic stress). A failure effect is the consequence of the failure mode on system operation (for example, conveyor stops production). FMEA analyzes all three in sequence.
What are the most common types of failure modes in industrial equipment?
The most common failure modes in industrial equipment include fatigue fracture, corrosion, erosion, wear, overheating, deformation, electrical short circuit, insulation breakdown, leakage, contamination, blockage, and control or sensor signal loss. The specific failure modes that matter most depend on the equipment type, operating environment, and criticality.
How are failure modes identified in practice?
Failure modes are identified through structured methods including FMEA workshops, review of historical work order and failure code data, field interviews with operators and technicians, equipment teardowns, and analysis of sensor and monitoring data. RCM programs use a systematic function-by-function analysis to ensure no failure mode is overlooked.
How does understanding failure modes improve maintenance strategy?
When a team knows the specific failure modes an asset is susceptible to, they can select maintenance tasks that directly address those modes. Detectable failure modes with a measurable P-F interval are suited to condition-based and predictive maintenance. Random or sudden failure modes may require redesign or redundancy rather than scheduled inspection.
What is the role of failure modes in FMEA?
In FMEA, failure modes are the starting point of the analysis. For each function, analysts list every potential failure mode, then identify the effects of each failure mode and the causes that could trigger it. Each failure mode is scored for severity, occurrence, and detectability to calculate a Risk Priority Number (RPN), which drives corrective action priorities.
Can one component have multiple failure modes?
Yes. Most industrial components have multiple failure modes. A pump, for example, may experience seal leakage, cavitation damage, impeller erosion, bearing wear, and motor insulation breakdown as separate and independent failure modes. Each failure mode must be analyzed separately because each may require a different maintenance response.
The Bottom Line
A failure mode is more than a label on a work order. It is the specific physical or functional mechanism by which an asset stops meeting its required standard. Identifying failure modes accurately, and distinguishing them from failure causes and failure effects, is the foundation of every effective reliability program.
When failure modes are systematically captured in FMEA, RCM, and CMMS failure codes, they enable maintenance teams to select the right task for the right asset at the right time. Condition monitoring adds the most value when it is matched to specific failure modes with detectable P-F intervals, turning generic sensor data into actionable early warnings.