Tracking Machine Downtime
Definition: Tracking machine downtime is the systematic process of recording, categorising, and analysing every period in which a machine is unable to perform its intended function. The goal is to identify recurring failure patterns, quantify their financial impact, and reduce both the frequency and duration of future downtime events.
Key Takeaways
- Downtime tracking converts production stops into structured data: timestamp, duration, category, and root cause for every event.
- The two primary categories are planned downtime (scheduled work) and unplanned downtime (unexpected failures), each with distinct sub-categories and cost profiles.
- Key metrics derived from downtime records include MTBF, MTTR, Availability, and OEE.
- Manual paper logs, CMMS platforms, and automated IIoT sensors represent three progressively more capable methods for capturing downtime data.
- Inconsistent category codes and insufficient operator training are the most common failure points in downtime tracking programmes.
- Real-time condition monitoring can automate downtime capture and eliminate under-reporting of short stoppages.
What Is Tracking Machine Downtime?
Tracking machine downtime means creating a timestamped record every time a machine stops producing, noting the duration, the cause, and the category of each event. Without this data, maintenance decisions rely on memory and guesswork rather than evidence.
A well-run downtime tracking programme gives plant managers a factual baseline: which machines fail most often, which failure modes consume the most production time, and whether the maintenance strategy is actually reducing losses over time. That baseline is the prerequisite for every meaningful improvement initiative, from predictive maintenance investment to shift scheduling and spare parts stocking.
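One way to picture the timestamped record described above is as a small data structure. The sketch below is illustrative only: the class name, field names, asset ID, and date are assumptions chosen for the example, not a schema from any particular CMMS.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DowntimeEvent:
    """One machine stop. All field names are illustrative assumptions."""
    asset_id: str
    start: datetime
    end: datetime
    category: str          # top level: "planned" or "unplanned"
    sub_category: str      # e.g. "mechanical failure"
    root_cause: str = ""   # free-text technician note

    @property
    def duration_minutes(self) -> float:
        # Derive the duration from the timestamps instead of typing it in by hand.
        return (self.end - self.start).total_seconds() / 60


event = DowntimeEvent(
    asset_id="PKG-LINE-1",               # hypothetical asset ID
    start=datetime(2024, 1, 8, 8, 14),   # arbitrary example date
    end=datetime(2024, 1, 8, 8, 29),
    category="unplanned",
    sub_category="mechanical failure",
    root_cause="conveyor jam",
)
print(event.duration_minutes)  # 15.0
```

Deriving the duration from the start and end timestamps, rather than storing it as a separate field, avoids records in which the two disagree.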
Planned vs Unplanned Downtime
Every downtime event belongs to one of two top-level categories. The distinction matters because planned and unplanned downtime carry different cost structures and call for different corrective actions.
Planned Downtime
Planned downtime is any scheduled period in which a machine is intentionally taken offline. Because the stop is anticipated, production schedules can be adjusted and resources can be staged in advance. Common sub-categories include:
- Preventive maintenance: time-based inspections, lubrication, filter changes, and part replacements performed on a fixed schedule.
- Changeovers: time required to reconfigure a machine or line for a different product, format, or batch size.
- Scheduled inspections: regulatory or insurance-driven inspections that require the machine to be de-energised.
Unplanned Downtime
Unplanned downtime occurs without advance notice and therefore carries a much higher cost: emergency labour, expedited parts, and lost production with no buffer time. Sub-categories include:
- Mechanical failure: bearing seizure, gear tooth breakage, belt snapping, seal failure.
- Electrical fault: motor winding failure, contactor burn-out, sensor malfunction, PLC fault.
- Operator error: incorrect setup, missed lubrication, accidental overload.
- Material shortage: an upstream raw material or component stockout that leaves the machine with nothing to run.
- Quality-related stop: machine halted because output is out of specification and re-tooling is required.
How to Track Machine Downtime
Three methods are in common use, ranging from simple manual records to fully automated sensor-driven systems. The right choice depends on the volume of events, the criticality of the asset, and the available budget.
Manual Logs
Paper or spreadsheet-based logs ask operators to record the start time, end time, and cause of each stoppage by hand. This approach has almost no implementation cost and works on any machine.
The drawbacks are significant: operators under time pressure skip entries, short stoppages go unreported, and handwriting legibility varies. Data must be transferred manually into any analysis tool, introducing transcription errors and delaying insight by days or weeks.
CMMS-Based Tracking
A CMMS (Computerised Maintenance Management System) provides a structured digital form for logging downtime events. Operators or technicians open a work order on a tablet or mobile device, select a downtime code from a pre-built list, and add notes. The CMMS timestamps the entry automatically and links it to the asset record.
This approach eliminates transcription errors, enforces consistent categorisation, and makes downtime data immediately available for reporting. The trade-off is that it still depends on someone physically logging each event.
Automated Sensors and IIoT
Industrial IoT sensors measure current draw, vibration, temperature, or run-state signals to detect when a machine stops and restarts. The sensor logs each event automatically with a precise timestamp. When integrated with a CMMS, the system creates a downtime record and triggers a work order without any operator input.
Automated capture eliminates under-reporting, catches micro-stoppages too brief to log manually, and produces a continuous record that can be analysed in real time. This method is particularly valuable for high-speed packaging, food processing, and automotive assembly lines where dozens of short stops per shift add up to significant lost capacity.
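As a rough illustration of how a stop can be inferred from a current-draw signal, the sketch below applies a simple threshold to a series of readings. The function name, the 1.0 A threshold, and the sample data are assumptions for the example; a real sensor platform would typically add filtering and debouncing.

```python
from typing import Iterable, List, Tuple


def detect_stops(
    samples: Iterable[Tuple[float, float]],
    running_threshold_amps: float = 1.0,  # assumed threshold, tune per machine
) -> List[Tuple[float, float]]:
    """Return (stop_time, restart_time) pairs from (timestamp_s, amps) samples."""
    stops = []
    stop_started = None
    for timestamp, amps in samples:
        running = amps >= running_threshold_amps
        if not running and stop_started is None:
            stop_started = timestamp              # machine has just stopped
        elif running and stop_started is not None:
            stops.append((stop_started, timestamp))  # machine has restarted
            stop_started = None
    # A stop that is still open when the samples end is not returned here.
    return stops


# Hypothetical readings: (seconds since shift start, motor current in amps)
readings = [(0, 5.2), (10, 5.1), (20, 0.2), (30, 0.1), (40, 5.0), (50, 5.3)]
stops = detect_stops(readings)
print(stops)  # [(20, 40)]
```

Each returned pair could then be turned into a downtime record, with the cause code added afterwards by a technician.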
Key Metrics Derived from Downtime Tracking
Raw downtime records become actionable only when they are rolled into standardised metrics. The four metrics below are calculated directly from downtime data and feed into wider maintenance and production performance reporting.
| Metric | Formula | What it tells you |
|---|---|---|
| MTBF (Mean Time Between Failures) | Total operating time / Number of failures | Average run time between unplanned stops; higher is better |
| MTTR (Mean Time To Repair) | Total repair time / Number of failures | Average time to restore the machine; lower is better |
| Availability | Uptime / (Uptime + Downtime) × 100 | Percentage of scheduled time the machine was actually running |
| OEE (Overall Equipment Effectiveness) | Availability × Performance × Quality | Combined production efficiency score; 85% is a commonly cited world-class benchmark |
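The four formulas in the table translate directly into small functions. The sketch below is a minimal version under stated assumptions: function names are illustrative, times are in minutes, `availability` returns a percentage as in the table, and `oee` expects its three inputs as fractions between 0 and 1. The weekly figures in the usage lines are hypothetical.

```python
def mtbf(operating_minutes: float, failures: int) -> float:
    """Mean Time Between Failures: total operating time / number of failures."""
    if failures <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return operating_minutes / failures


def mttr(repair_minutes: float, failures: int) -> float:
    """Mean Time To Repair: total repair time / number of failures."""
    if failures <= 0:
        raise ValueError("MTTR is undefined without at least one failure")
    return repair_minutes / failures


def availability(uptime_minutes: float, downtime_minutes: float) -> float:
    """Availability as a percentage: uptime / (uptime + downtime) x 100."""
    return uptime_minutes / (uptime_minutes + downtime_minutes) * 100


def oee(availability_frac: float, performance_frac: float, quality_frac: float) -> float:
    """OEE as a fraction: availability x performance x quality."""
    return availability_frac * performance_frac * quality_frac


# Hypothetical week: 2,400 scheduled minutes, 120 minutes lost across 4 failures.
uptime = 2400 - 120
print(mtbf(uptime, 4))                        # 570.0
print(mttr(120, 4))                           # 30.0
print(f"{availability(uptime, 120):.1f}%")    # 95.0%
print(f"{oee(0.95, 0.90, 0.98):.1%}")         # 83.8%
```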
Worked Example: Packaging Line Downtime Tracking
A packaging line runs a single 480-minute shift. The maintenance team tracks three unplanned downtime events during the shift using their CMMS:
- Event 1: Conveyor jam, 08:14 to 08:29, 15 minutes, category: mechanical failure.
- Event 2: Label sensor fault, 10:52 to 11:07, 15 minutes, category: electrical fault.
- Event 3: Film web break, 13:40 to 13:55, 15 minutes, category: mechanical failure.
Total unplanned downtime: 45 minutes across 3 events.
Calculations:
- Operating time: 480 - 45 = 435 minutes
- MTBF: 435 / 3 = 145 minutes between failures
- MTTR: 45 / 3 = 15 minutes average repair time
- Availability: 435 / 480 = 90.6%
The downtime category breakdown reveals that two of three events were mechanical failures on the conveyor and film web. The maintenance team schedules a targeted inspection of both components for the following maintenance window, rather than treating each event as an isolated incident.
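The same calculation can be reproduced from the raw timestamps, which is roughly what a CMMS report does behind the scenes. This is a sketch only: the tuple layout and helper name are assumptions, and it presumes every event starts and ends within the same day.

```python
from collections import Counter
from datetime import datetime

SHIFT_MINUTES = 480

# (start, end, category) for the three events in the example above
events = [
    ("08:14", "08:29", "mechanical failure"),
    ("10:52", "11:07", "electrical fault"),
    ("13:40", "13:55", "mechanical failure"),
]


def minutes_between(start: str, end: str) -> float:
    """Minutes between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


downtime = sum(minutes_between(start, end) for start, end, _ in events)
operating = SHIFT_MINUTES - downtime
mtbf = operating / len(events)
mttr = downtime / len(events)
availability = operating / SHIFT_MINUTES * 100
by_category = Counter(category for _, _, category in events)

print(downtime, operating, mtbf, mttr)   # 45.0 435.0 145.0 15.0
print(f"{availability:.1f}%")            # 90.6%
print(by_category.most_common(1))        # [('mechanical failure', 2)]
```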
Downtime Tracking vs OEE Tracking
Downtime tracking and OEE tracking are complementary, not interchangeable. The table below shows where each approach begins and ends.
| Dimension | Downtime Tracking | OEE Tracking |
|---|---|---|
| Primary question answered | Why did the machine stop and for how long? | What percentage of potential output is actually being realised? |
| Data captured | Start/end timestamps, duration, cause code, asset ID | Availability, Performance, Quality percentages and their combined score |
| Loss types covered | Availability losses only | Availability + speed losses + quality losses |
| Root cause detail | High: category codes and technician notes per event | Low: single composite score with no inherent cause attribution |
| Primary audience | Maintenance managers and reliability engineers | Plant managers and production directors |
| Best use | Diagnosing which failures to eliminate first | Benchmarking overall production efficiency against world-class standards |
How to Set Up a Downtime Tracking Programme
A structured setup reduces the two most common failure modes: inconsistent categorisation and operator reluctance to log events. Follow these five steps in sequence.
Step 1: Define Your Downtime Categories and Codes
Build a two-level code structure: a top-level category (planned / unplanned) and a sub-category code (mechanical failure, electrical fault, changeover, etc.). Limit sub-categories to 8-12 codes per asset class. Too many codes create confusion; too few hide useful detail.
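A two-level code structure of this kind might be sketched as follows. The code strings (`P-PM`, `U-ME`, and so on) are invented for illustration; the sub-category names come from the lists earlier in this article, and the 8-12 check mirrors the guideline above.

```python
from enum import Enum


class TopLevel(Enum):
    PLANNED = "planned"
    UNPLANNED = "unplanned"


# Hypothetical code strings mapped to (top-level category, sub-category).
DOWNTIME_CODES = {
    "P-PM": (TopLevel.PLANNED, "Preventive maintenance"),
    "P-CO": (TopLevel.PLANNED, "Changeover"),
    "P-IN": (TopLevel.PLANNED, "Scheduled inspection"),
    "U-ME": (TopLevel.UNPLANNED, "Mechanical failure"),
    "U-EL": (TopLevel.UNPLANNED, "Electrical fault"),
    "U-OP": (TopLevel.UNPLANNED, "Operator error"),
    "U-MS": (TopLevel.UNPLANNED, "Material shortage"),
    "U-QS": (TopLevel.UNPLANNED, "Quality-related stop"),
}


def code_count_ok(codes: dict, low: int = 8, high: int = 12) -> bool:
    """Check the code list against the 8-12 sub-category guideline."""
    return low <= len(codes) <= high


print(code_count_ok(DOWNTIME_CODES))  # True
print(DOWNTIME_CODES["U-ME"][0])      # TopLevel.UNPLANNED
```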
Step 2: Map Codes to Your Asset Hierarchy
Assign relevant codes to each asset in your CMMS. A packaging machine and a CNC machining centre have different failure modes. Letting operators see only the codes relevant to their equipment speeds up logging and reduces mis-categorisation.
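In data terms, this mapping is a lookup from asset to asset class to permitted codes. The sketch below assumes hypothetical asset IDs, class names, and code strings; in practice this configuration would live in the CMMS rather than in a script.

```python
# Hypothetical code sets per asset class.
CODES_BY_ASSET_CLASS = {
    "packaging_machine": {"P-PM", "P-CO", "U-ME", "U-EL", "U-MS"},
    "cnc_machining_centre": {"P-PM", "P-IN", "U-ME", "U-EL", "U-QS"},
}

# Hypothetical assets and the class each belongs to.
ASSETS = {
    "PKG-01": "packaging_machine",
    "CNC-07": "cnc_machining_centre",
}


def codes_for_asset(asset_id: str) -> list:
    """Codes an operator should be offered when logging against this asset."""
    return sorted(CODES_BY_ASSET_CLASS[ASSETS[asset_id]])


def is_valid_code(asset_id: str, code: str) -> bool:
    """Reject codes that do not apply to the asset's class."""
    return code in CODES_BY_ASSET_CLASS[ASSETS[asset_id]]


print(codes_for_asset("PKG-01"))          # ['P-CO', 'P-PM', 'U-EL', 'U-ME', 'U-MS']
print(is_valid_code("CNC-07", "P-CO"))    # False
```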
Step 3: Train Operators on Why It Matters
Operators who understand that downtime data drives maintenance investment and reduces their own workload are far more likely to log accurately. A 20-minute briefing showing the link between downtime records and the maintenance schedule is more effective than a policy memo.
Step 4: Connect Your CMMS to the Shop Floor
Make logging as frictionless as possible. Mobile CMMS apps that allow a technician to log a downtime event in under 60 seconds from a tablet on the machine reduce the barrier significantly. QR codes on each machine that open the correct asset record directly are a simple additional aid.
Step 5: Review Weekly and Close the Loop
A weekly downtime review meeting of 15-20 minutes, focused on the top three contributors by total minutes lost, drives continuous improvement. When operators see that their reported events result in actual maintenance actions, reporting quality improves further. This feedback loop is the difference between a data collection exercise and a genuine reliability programme.
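Ranking the top three contributors by total minutes lost is a simple aggregation. The sketch below uses a hypothetical week of events; the cause labels and durations are invented for illustration.

```python
from collections import defaultdict


def top_contributors(events, n=3):
    """Sum minutes lost per cause and return the n largest, biggest first."""
    totals = defaultdict(float)
    for cause, minutes in events:
        totals[cause] += minutes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical week of (cause, minutes lost) records
week = [
    ("conveyor jam", 15),
    ("film web break", 40),
    ("label sensor fault", 15),
    ("conveyor jam", 35),
    ("changeover overrun", 20),
    ("film web break", 25),
    ("air pressure drop", 10),
]

for cause, minutes in top_contributors(week):
    print(f"{cause}: {minutes:.0f} min")
# film web break: 65 min
# conveyor jam: 50 min
# changeover overrun: 20 min
```

Ranking by total minutes rather than event count keeps the review focused on lost production time, which is the measure the meeting is meant to reduce.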
Common Pitfalls in Downtime Tracking
Inconsistent Categorisation
When two operators classify the same failure type under different codes, trend data becomes unreliable. The fix is a code definition sheet posted at each workstation and a monthly calibration review where the maintenance manager checks for miscoded events and retrains as needed.
Operator Reluctance to Report
Operators sometimes avoid logging events because they fear blame for unplanned stops. This is a culture problem, not a system problem. Plants that treat downtime data as a tool for improvement rather than a performance scorecard for operators consistently see higher reporting rates.
Logging Only Long Stops
Stoppages under 5 minutes are frequently skipped on paper-based systems. In high-volume production, these micro-stoppages collectively account for a significant share of total lost time. Automated sensor-based capture is the only reliable solution for this category.
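Once short stops are being captured, their share of total lost time is easy to quantify. The sketch below assumes the 5-minute cut-off mentioned above and uses hypothetical stop durations.

```python
def micro_stop_share(durations_min, threshold_min=5.0):
    """Fraction of total downtime made up of stops shorter than the threshold."""
    total = sum(durations_min)
    if total == 0:
        return 0.0
    micro = sum(d for d in durations_min if d < threshold_min)
    return micro / total


# Hypothetical shift: four short stops and two long ones, in minutes
durations = [0.5, 1.0, 2.0, 1.5, 30.0, 15.0]
share = micro_stop_share(durations)
print(f"{share:.0%} of downtime came from stops under 5 minutes")
# 10% of downtime came from stops under 5 minutes
```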
No Feedback to the Front Line
Data collected and never acted upon quickly loses credibility with operators. Displaying a weekly downtime summary on a screen at the cell or line entrance closes the feedback loop and reinforces the value of accurate reporting.
How Real-Time Condition Monitoring Automates Downtime Capture
Condition monitoring sensors attached to motors, gearboxes, pumps, and other rotating assets measure vibration, temperature, and current continuously. When a machine stops, the current drop and vibration cessation are detected within seconds, and the system logs a downtime event automatically.
More importantly, condition monitoring can detect the fault signature that precedes a failure, often days or weeks in advance. When that early warning triggers a planned maintenance intervention, the resulting planned stop replaces what would have been an unplanned failure, shifting the event from the high-cost unplanned column to the lower-cost planned column in the downtime record.
This is the highest-value application of condition monitoring in a downtime tracking context: not just capturing stops faster, but preventing the worst stops from happening at all.
The Bottom Line
Tracking machine downtime is not a reporting exercise. It is the data foundation on which every reliability improvement is built. Without accurate downtime records, MTBF and MTTR are guesses, maintenance downtime budgets are unverifiable, and the case for capital investment in condition monitoring or predictive maintenance cannot be made with evidence.
The most effective programmes combine a CMMS for structured logging with sensor-based capture for automated detection, underpinned by a consistent category code structure and a weekly review cycle. Plants that close the loop between downtime data and maintenance action can expect to see measurable reductions in equipment downtime, often within the first 90 days.
The cost of downtime in manufacturing is rarely fully visible until it is tracked systematically. Once it is, the investment case for better maintenance practically makes itself.
Frequently Asked Questions
What is the difference between tracking machine downtime and tracking OEE?
Downtime tracking focuses specifically on recording and categorising the events that stop a machine from running, capturing start time, end time, duration, and root cause for each event. OEE tracking is broader: it combines Availability (which uses downtime data), Performance (speed losses), and Quality (defect losses) into a single percentage. Downtime tracking feeds the Availability component of OEE, but OEE does not capture the root-cause detail that downtime tracking provides. You need downtime tracking to understand why your OEE is low, and you need OEE to understand the full production loss picture.
How do you calculate MTBF and MTTR from downtime records?
MTBF is calculated by dividing total operating time by the number of failure events. If a machine is scheduled for a 480-minute shift and three unplanned stoppages account for 45 minutes in total, operating time is 480 - 45 = 435 minutes and MTBF is 435 / 3 = 145 minutes. MTTR is the average duration of each stoppage: divide total downtime minutes by the number of events, so MTTR is 45 / 3 = 15 minutes. Both calculations require accurate timestamps on every downtime event.
What downtime categories should every manufacturer track?
At a minimum, manufacturers should separate planned downtime from unplanned downtime. Planned categories include preventive maintenance, changeovers, and scheduled inspections. Unplanned categories include mechanical failure, electrical fault, operator error, material shortage, and quality-related stops. Each category should have a numeric or alphanumeric code so operators can log events quickly and consistently.
Can condition monitoring automate downtime capture?
Yes. Condition monitoring sensors that measure vibration, temperature, current draw, and other parameters can detect the moment an asset stops running and log the event automatically with a precise timestamp. When the sensor data is integrated with a CMMS, the system creates a downtime record, triggers a work order, and closes the record when the machine restarts. This removes the dependency on operator logging, eliminates under-reporting, and captures micro-stoppages that are too short to log manually.