8 Essential KPIs for Maintenance Management

Key Performance Indicators (KPIs) are one of the main ways of measuring maintenance results. After all, what is not measured can’t be managed.

They are very important for maintenance managers since routines, teams, processes, and equipment can only be analyzed with their help. Do you know if your equipment and facilities are ready to handle the job?

In principle, any maintenance activity that generates numbers or values can be measured. The point is to identify the KPIs that matter most, so that you do not waste time tracking ones that are not very relevant.

By focusing on the right indicators, you can increase your production efficiency. Below is a list of the main maintenance KPIs.

  • MTBF: Mean time between failures
  • MTTR: Mean time to repair
  • Availability
  • Reliability
  • Backlog
  • Machine Downtime
  • MC/ERV: Maintenance Cost as a percent of your Estimated Replacement Value
  • Distribution by types of maintenance

1st Key Performance Indicator (KPI) – MTBF: Mean Time Between Failures

MTBF, or mean time between failures, is one of the most important indicators for the maintenance sector. It measures the average operating time between failures of a repairable piece of equipment, and is a great tool for measuring machine reliability.

The most efficient way to manage this indicator is to apply it to each piece of equipment individually, since each one has its own life cycle. (A separate article explains the typical behavior of a machine in more detail.)

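For example, imagine that during a certain period the electric motor of an industrial plant operated for 140 hours until it failed, then for another 190 hours, and finally for 215 hours. In this case the calculated MTBF would be:

MTBF = total operating time / number of failures = (140 + 190 + 215) / 3 = 545 / 3 = 181.67 hours, taken as 181.6 hours in the rest of this example.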
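The MTBF arithmetic for the electric motor can be sketched in a few lines of Python. This is only an illustration: the function name `mtbf` and the variable names are assumptions, not part of any maintenance software. Note that the exact result is 181.67 hours, which the article takes as 181.6.

```python
def mtbf(operating_hours):
    """Mean time between failures: total operating time / number of failures.

    `operating_hours` holds one entry per failure: the hours the asset ran
    before that failure occurred.
    """
    if not operating_hours:
        raise ValueError("at least one operating interval is required")
    return sum(operating_hours) / len(operating_hours)


# Running times (hours) before each of the three failures in the example.
motor_runs = [140, 190, 215]
motor_mtbf = mtbf(motor_runs)
print(f"MTBF = {motor_mtbf:.2f} h")  # MTBF = 181.67 h
```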

Once the average time between failures has been identified, we can determine how often to schedule preventive maintenance activities and inspections within the MPC (Maintenance Planning and Control).

It is recommended to schedule this inspection at 70% of the mean time between failures. That is, if the electric motor has an MTBF of 181.6 hours, it should be inspected every 127.1 hours (181.6 x 0.7).

Logically, the higher the MTBF the better: the equipment is taking longer to fail, which means a lower frequency of breakdowns.
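The 70% rule above can be expressed as a small helper. This is a minimal sketch: the name `inspection_interval` and the `fraction` parameter are illustrative assumptions, and 0.7 is simply the recommendation stated in the text.

```python
def inspection_interval(mtbf_hours, fraction=0.7):
    """Hours between inspections, taken as a fraction of the MTBF."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return mtbf_hours * fraction


# Electric motor from the example: MTBF of 181.6 hours.
interval = inspection_interval(181.6)
print(f"Inspect every {interval:.1f} h")  # Inspect every 127.1 h
```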


Common mistakes when using MTBF:

  • Adding up the MTBF of all equipment to find a global average
  • Calculating MTBF for irreparable equipment
  • Resetting MTBF every month (operating time and failures need to keep accumulating).

Currently, there are some predictive maintenance solutions that connect hardware and software in order to help maintenance teams control their assets. Through real time monitoring, the hardware, often with IoT (Internet of Things) and AI (Artificial Intelligence) as allies, collects machine data, such as temperature and vibration, and detects when machine repair is needed before it truly breaks.

Maintenance teams that aim for excellence understand that adopting predictive maintenance is the best way to avoid unexpected failures and machine downtime. If you want to learn more about predictive techniques and online asset tracking, check out our Complete Guide about Predictive Maintenance.

2nd Key Performance Indicator (KPI) – MTTR: Mean Time To Repair

This indicator is closely associated with maintainability, that is, the ease with which a maintenance team restores equipment to a condition in which it can perform its function after a failure. In other words, this KPI indicates the average time to repair an asset.

MTTR = total repair time / number of repairs

Unlike MTBF, the lower the MTTR, the better, so we must work to keep it low.

Applying it to the electric motor example, suppose that during the same period the maintenance team took the following times to put the motor back into operation after each failure:

Failure 1: 9 hours

Failure 2: 15 hours

Failure 3: 12 hours

In that case the calculated MTTR will be:

MTTR = (9 + 15 + 12) / 3 = 36 / 3 = 12 hours

This way, we can measure the lost profit, that is, how much the company fails to earn when this equipment breaks. If we hypothetically assume that this machine generates $5,000 per hour, each failure would cost the company around $60,000 on average (5,000 x 12).
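The MTTR and the lost-revenue estimate for the motor can be reproduced as follows. This is a sketch under the article's hypothetical figure of $5,000 per hour; the function names `mttr` and `lost_revenue` are illustrative assumptions.

```python
def mttr(repair_hours):
    """Mean time to repair: total repair time / number of repairs."""
    if not repair_hours:
        raise ValueError("at least one repair is required")
    return sum(repair_hours) / len(repair_hours)


def lost_revenue(downtime_hours, revenue_per_hour):
    """Revenue not earned while the asset is stopped."""
    return downtime_hours * revenue_per_hour


# Repair times (hours) for the three failures in the example.
repairs = [9, 15, 12]
motor_mttr = mttr(repairs)                               # 12.0 hours
loss_per_failure = lost_revenue(motor_mttr, 5_000)       # average loss per failure
loss_in_period = lost_revenue(sum(repairs), 5_000)       # loss over all three failures

print(f"MTTR = {motor_mttr:.1f} h")
print(f"Average loss per failure = ${loss_per_failure:,.0f}")
print(f"Loss over the period = ${loss_in_period:,.0f}")
```

The per-failure figure matches the $60,000 in the text; the total for the period is simply the same hourly rate applied to all 36 hours of repair.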


Keep in mind:

  • There is no ideal or reference value for MTTR
  • Requiring maintenance teams to maintain a low MTTR can mislead them.

Much better than keeping MTTR low is avoiding breakdowns altogether. The maintenance manager must encourage the team to use detective and predictive maintenance strategies that are based on asset monitoring. Both assess the health of the machines, identifying the “symptoms” in real time so that the asset does not lose performance to the point of reaching a critical failure.

Key Performance Indicators (KPIs) 3 and 4 – Calculating the availability and reliability of assets

These two indicators are really important for Maintenance Planning and Control (MPC), as its main goal is to guarantee and increase the availability and reliability of assets, optimizing productivity. This is why we decided to put them together.

Both are determined based on MTBF and MTTR. But before calculating them, let’s understand the meaning of each one according to the Brazilian National Standard 5462:

Availability: the ability of an item to perform a certain function at a given time or during a specified period of time.

Reliability: the probability that an item performs the function specified in its design, under the stated operating conditions, over a specific interval of time.

The meanings are similar, right? Let’s look at an example. Equipment availability is the percentage of time the asset remained available over a given period. Reliability, on the other hand, is the probability that a piece of equipment will remain available throughout a given period.

Want to understand how it’s calculated? The availability formula is given by:

Availability = MTBF / (MTBF + MTTR) x 100%

That is, taking the electric motor example (MTBF = 181.6 hours and MTTR = 12 hours), the inherent availability of the equipment was 181.6 / (181.6 + 12) = 93.8%. In other words, during the period used in the example, the motor operated normally about 93.8% of the time it was in service. World-class standards consider availability above 90% to be good, so in this case the equipment is within the acceptable range.
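As a quick check of the 93.8% figure, inherent availability can be computed from the two indicators already obtained. The function name `availability` and the `WORLD_CLASS_TARGET` constant are illustrative assumptions; the 90% threshold is the one quoted in the text.

```python
WORLD_CLASS_TARGET = 0.90  # threshold quoted in the article


def availability(mtbf_hours, mttr_hours):
    """Inherent availability as a fraction: MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


motor_availability = availability(181.6, 12)
meets_target = motor_availability > WORLD_CLASS_TARGET

print(f"Availability = {motor_availability:.1%}")  # Availability = 93.8%
print(f"Above 90% target: {meets_target}")
```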

What if we wanted to calculate the probability that the motor will run without failure for the next week? In this case the reliability calculation would be:

R(t) = e^(-λ x t), where λ = 1 / MTBF is the failure rate and t is the period of interest, in the same time unit as the MTBF.

If we apply the formula to the electric motor (MTBF = 181.6 hours), we can conclude that for the next 7 days (168 hours) the reliability of this equipment, that is, the probability of it operating normally without failure, would be 39.69%. See below:

λ = 1 / 181.6 ≈ 0.0055 failures per hour
R(168) = e^(-0.0055 x 168) = e^(-0.924) ≈ 0.3969 = 39.69%
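The reliability figure can be reproduced with Python's standard `math` module. This sketch assumes the exponential model R(t) = e^(-t / MTBF), which is consistent with the 39.69% quoted above when the failure rate is rounded to 0.0055 per hour; without that rounding the result is about 39.65%. The function name `reliability` is an illustrative assumption.

```python
import math


def reliability(mtbf_hours, period_hours):
    """Probability of running `period_hours` without failure,
    assuming a constant failure rate of 1 / MTBF (exponential model)."""
    if mtbf_hours <= 0 or period_hours < 0:
        raise ValueError("MTBF must be positive and the period non-negative")
    failure_rate = 1.0 / mtbf_hours
    return math.exp(-failure_rate * period_hours)


# Electric motor over the next 7 days (168 hours).
r_exact = reliability(181.6, 168)       # about 0.3965
r_rounded = math.exp(-0.0055 * 168)     # about 0.3969, failure rate rounded as in the text

print(f"Unrounded failure rate: {r_exact:.2%}")
print(f"Rounded failure rate:   {r_rounded:.2%}")
```

Always report the result together with its period, for example "39.7% over the next 168 hours".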


Common mistakes:

  • Indicating reliability without linking it to a period of time.

Wrong example: The reliability of this centrifuge is 85.4% – what is the period?

Right example: The reliability of this centrifuge is 85.4% over the next 400 hours.

  • Using the formula above for irreparable equipment. For such items, Weibull analysis must be used.
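For reference, the two-parameter Weibull reliability function is R(t) = e^(-(t / η)^β), where β is the shape parameter and η the scale parameter. The sketch below only evaluates that function; estimating β and η from real failure data is the actual Weibull analysis and is not shown here. The parameter values used in the second call are hypothetical, chosen purely for illustration.

```python
import math


def weibull_reliability(t, shape, scale):
    """Two-parameter Weibull reliability: exp(-(t / scale) ** shape)."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be >= 0; shape and scale must be > 0")
    return math.exp(-((t / scale) ** shape))


# With shape = 1 the Weibull model reduces to the exponential formula,
# with the scale parameter playing the role of the MTBF.
same_as_exponential = weibull_reliability(168, shape=1.0, scale=181.6)

# Hypothetical wear-out behavior (shape > 1); these values are made up.
wear_out = weibull_reliability(400, shape=2.0, scale=500)

print(f"shape=1: {same_as_exponential:.4f}")
print(f"shape=2: {wear_out:.4f}")
```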

5th Key Performance Indicator (KPI) – Backlog

The backlog can be understood as the labor time required to perform all current services, which is the accumulation of activities pending completion. This indicator demonstrates the relationship between the demand for services and the ability to meet them.

We can think of the backlog as the workload originating from maintenance activities. In other words, it is the sum of the labor hours of all services that are planned, scheduled, and still pending execution by the maintenance sector.

Since it is a time indicator, its calculation must be given in minutes, hours, days, weeks, months, etc. Let’s calculate.

Backlog = total labor hours of pending services / labor hours available per unit of time (day, week, or month)

The backlog graph is also of great importance for management decisions. There are basically six types of curves. Consider the vertical axis as the backlog values and the horizontal axis as the months of the year.
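A common way to express the backlog as a time value is to divide the pending labor hours by the labor hours the team can deliver per day. The sketch below assumes that approach; the work-order hours, team size, and hours per technician are hypothetical values, and `backlog_days` is an illustrative name.

```python
def backlog_days(pending_work_hours, technicians, hours_per_technician_per_day):
    """Backlog in days: pending labor hours / labor hours available per day."""
    daily_capacity = technicians * hours_per_technician_per_day
    if daily_capacity <= 0:
        raise ValueError("daily capacity must be positive")
    return sum(pending_work_hours) / daily_capacity


# Hypothetical labor hours for planned, scheduled, and pending work orders.
pending = [16, 40, 8, 24, 32]  # 120 hours in total

days = backlog_days(pending, technicians=3, hours_per_technician_per_day=8)
print(f"Backlog = {days:.1f} days")  # Backlog = 5.0 days
```

Tracking this value month by month gives the points for the backlog graph.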

Curve A: Stable. It requires analysis to check if it is at an acceptable value for decision making.

Curve B: Decrease in demand. It can generate idle staff due to the drop in services.

Curve C: Constant upward trend. It can generate problems such as low maintenance quality.

Curve D: Sudden ascent. It can occur when there are corrective actions with a very high execution time.

Curve E: Sharp drop. In this case, external services may have been contracted, or an internal mobilization for cost reduction may have been carried out, among other causes.

Curve F: Oscillation. It is usually justified in industries that have a strong seasonality characteristic, such as those related to agriculture.


A common mistake when using this indicator is to equate the backlog with “overdue activities”. The backlog encompasses much more than that: it covers all the activities that need to be done, from urgent jobs to normal day-to-day tasks.

6th Key Performance Indicator (KPI) – Machine Downtime

Machine downtime is one of the maintenance problems that most affects the sector. It is the period of time in which an asset is out of operation, mainly due to an unexpected problem, and it is one of the biggest causes of production loss.

It happens in all industries, and its effect is not limited to the stopped asset: it also drives up production costs, causing financial losses for companies.

Thus, we can classify downtime’s negative impacts into two categories:

Tangible costs: these are generally easy to quantify. They include capacity loss, equipment production loss, manpower, inventory, and profit loss (how much the company loses as production stops).

Intangible costs: These are difficult to calculate, but can be even more significant than tangible costs. They include responsiveness, stress, and important aspects of the business.

Generally, factories lose at least 5% of their productive capacity due to downtime, and many of them lose up to 20%.

Among the main negative impacts, we can name:

  • Equipment unavailability for production;
  • Sales expectations affected by production downtime;
  • Labor expenses for quality control and maintenance;
  • Continuous time wasted on repairs.
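To put a number on downtime, one simple measure is the share of scheduled time during which the asset was stopped. The sketch below reuses the electric motor figures from earlier in this article and assumes that scheduled time equals operating time plus repair time; `downtime_share` is an illustrative name.

```python
def downtime_share(downtime_hours, scheduled_hours):
    """Fraction of scheduled time during which the asset was out of operation."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled time must be positive")
    return sum(downtime_hours) / scheduled_hours


# Electric motor example: 545 h operating and 36 h under repair.
operating = [140, 190, 215]
stopped = [9, 15, 12]
scheduled = sum(operating) + sum(stopped)  # 581 hours (assumption: no other stops)

share = downtime_share(stopped, scheduled)
print(f"Downtime = {share:.1%} of scheduled time")  # Downtime = 6.2% of scheduled time
```

This is the complement of the 93.8% availability obtained earlier, and it falls inside the 5% to 20% range mentioned above.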

7th Key Performance Indicator (KPI) – MC/ERV: Maintenance Cost as a percent of your Estimated Replacement Value

Another important financial indicator is the MC/ERV, as it is a way of analyzing the Maintenance Cost (MC) used for each piece of equipment and identifying whether it would be more advantageous to keep using the asset or purchase a new one. It is recommended to use this indicator for highly critical equipment.

The calculation is simple, but first let’s clarify the acronym ERV (Estimated Replacement Value). As the name suggests, it is the amount of capital needed to purchase new equipment. Thus, the MC/ERV formula is given by:

MC/ERV (%) = (total maintenance cost / estimated replacement value) x 100

For example, suppose that $4,000 was spent on the maintenance of an overhead crane, while a new crane would cost $190,000. The MC/ERV is therefore 4,000 / 190,000 x 100 ≈ 2.1%.

The maximum acceptable value for this indicator is 6% over a period of one year. However, this depends on an analysis of the equipment, and in some cases the limit may be as low as 2.5%. If the result is above the limit, it is more advantageous to buy new equipment than to keep the old one.

A more effective way to reduce maintenance costs is to change the “break, repair” dynamic. That is, try to reduce the number of corrective actions as much as possible and use the power of data to predict failures before they actually happen.

The cost of a predictive maintenance plan is much lower than the expense of repairing the equipment and getting it back up and running. (The bathtub curve illustrates this dynamic very well.)
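The overhead crane example and the 6% limit can be checked with a short calculation. The names `mc_erv` and `ANNUAL_LIMIT_PERCENT` are illustrative assumptions; the limit is the one quoted in the text and may need to be lower for some equipment.

```python
ANNUAL_LIMIT_PERCENT = 6.0  # maximum acceptable value quoted in the article


def mc_erv(maintenance_cost, replacement_value):
    """Maintenance cost as a percentage of the estimated replacement value."""
    if replacement_value <= 0:
        raise ValueError("replacement value must be positive")
    return maintenance_cost * 100 / replacement_value


# Overhead crane: $4,000 of maintenance against a $190,000 replacement.
crane_ratio = mc_erv(4_000, 190_000)
keep_asset = crane_ratio <= ANNUAL_LIMIT_PERCENT

print(f"MC/ERV = {crane_ratio:.2f}%")
print(f"Within the annual limit: {keep_asset}")
```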

8th Key Performance Indicator (KPI) – Distribution by types of maintenance

This indicator reveals the share of each type of maintenance being carried out. Naturally, the type of installation or equipment can cause variations around these values. In general, the maintenance manager should keep unplanned corrective maintenance at or below 20%, and it is always good to restrict it as much as possible. Other practices do not require such a strict limit. In Brazil, preventive maintenance generally ranges from 30% to 40%. Under global reliability standards, companies aim for predictive maintenance to have the largest share of the distribution.

Types of maintenance distribution
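The distribution can be derived from the labor hours logged against each type of maintenance. The sketch below uses hypothetical hours and category names purely for illustration; the 20% ceiling for unplanned corrective maintenance is the one stated above.

```python
UNPLANNED_CORRECTIVE_LIMIT = 20.0  # percent, as recommended in the article


def maintenance_distribution(hours_by_type):
    """Percentage of total labor hours spent on each type of maintenance."""
    total = sum(hours_by_type.values())
    if total <= 0:
        raise ValueError("total hours must be positive")
    return {kind: hours * 100 / total for kind, hours in hours_by_type.items()}


# Hypothetical labor hours logged in one month.
logged_hours = {
    "unplanned corrective": 90,
    "planned corrective": 60,
    "preventive": 210,
    "predictive": 240,
}

distribution = maintenance_distribution(logged_hours)
for kind, percent in distribution.items():
    print(f"{kind}: {percent:.1f}%")

within_limit = distribution["unplanned corrective"] <= UNPLANNED_CORRECTIVE_LIMIT
print(f"Unplanned corrective within limit: {within_limit}")
```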

Maintenance KPI Report: how technology can help your company

Technology, such as maintenance management software and asset condition monitoring devices, has revolutionized maintenance routines. It replaces extensive spreadsheets, helps you collect and manage the data behind these indicators, makes information more trustworthy, and gives valuable real-time insights into the reliability and availability of your equipment.


If you want to know more about how TRACTIAN is making the routine of maintenance managers more productive and easy, send a message to our team or schedule a demonstration.



About the author:


Gabriel Lameirinhas

Founder and Co-CEO of TRACTIAN. Computer Engineer from University of Sao Paulo, specialist in predictive maintenance and passionate about industrial maintenance.
