MTBF and MTTR Explained: How to Calculate and Use These KPIs
MTBF and MTTR are the two reliability KPIs every maintenance manager should track. Here are the formulas, worked examples, and how to read the trends.

Why Two Numbers Can Tell You Almost Everything About Equipment Health
Picture this: it's Monday morning, and you're explaining last week's compressor failure to the plant manager. You know it broke. You know it took most of the shift to fix. But when the plant manager asks, "Is this happening more often than it used to? And are we getting faster at responding?", you don't have a clean answer. The log is in a spreadsheet that three people edit, timestamps are inconsistent, and the last time someone totaled up repair hours was sometime in Q2.
That gap — between knowing something broke and understanding your equipment's failure pattern — is exactly what MTBF (mean time between failures) and MTTR (mean time to repair) are designed to close.
These two reliability KPIs are the foundation of any honest conversation about equipment health. MTBF tells you how reliably a machine runs between failures. MTTR tells you how efficiently your team responds when it doesn't. Together, they show you whether things are improving, holding steady, or quietly getting worse — and they give you the language to make that case with numbers instead of anecdotes.
By the end of this guide you'll know how to calculate MTBF and MTTR, have worked through an example of each, understand what the trends mean in practice, and see how both metrics connect to the rest of your maintenance KPI stack.
What MTBF Actually Measures (and What It Doesn't)
MTBF — mean time between failures — is the average operating time between one failure and the next, for a repairable asset. A higher MTBF means the machine runs longer before breaking down.
Two things MTBF does not measure:
- Time spent in repair. MTBF counts only operating (uptime) hours — the clock stops the moment a failure occurs and doesn't restart until the asset is back in service.
- Scheduled downtime. Planned preventive-maintenance windows, shutdowns, and changeovers are typically excluded from MTBF calculations, because stopping for a scheduled PM is not a failure. (Some organizations include all non-operating time; just be consistent, and document your definition before you start tracking.)
The MTBF formula
MTBF = Total Operating Time ÷ Number of Failures
That's it. "Total operating time" is the sum of all uptime hours in the measurement window. "Number of failures" is the count of unplanned breakdowns during that same window.
Worked example — MTBF
A hydraulic press ran for a total of 2,400 operating hours in a 12-month period and experienced 6 unplanned failures.
MTBF = 2,400 hours ÷ 6 failures
MTBF = 400 hours
So on average, the press ran about 400 hours between failures — roughly 10 full production weeks if you run one 40-hour shift per week. Whether 400 hours is acceptable depends on your production schedule and what the OEM specifies as the expected service life between maintenance events. The number alone means little; the trend means everything.
The MTBF trend is more valuable than any single snapshot. A rising MTBF over successive quarters is evidence that your PM program is working. A falling MTBF is an early warning that something is degrading — often before a catastrophic failure occurs.
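The formula and the press example can be reproduced in a few lines of Python. This is a minimal sketch: the function name `mtbf_hours` is made up for illustration, and it assumes operating hours and the unplanned-failure count have already been totaled for the window.

```python
def mtbf_hours(total_operating_hours: float, failure_count: int) -> float:
    """Mean time between failures: operating hours divided by unplanned failures."""
    if failure_count <= 0:
        # With no failures in the window there is nothing to average over.
        raise ValueError("MTBF is undefined for a window with no failures")
    return total_operating_hours / failure_count


# Figures from the hydraulic press example: 2,400 operating hours, 6 failures.
press_mtbf = mtbf_hours(2400, 6)
print(press_mtbf)  # 400.0
```

Raising an error on zero failures is one possible choice; a plant with very reliable assets might prefer to report "no failures in window" instead.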
What MTTR Actually Measures
MTTR — mean time to repair — is the average time it takes to restore a failed asset to full operating condition. A lower MTTR means your team finds the fault faster, has the right parts on hand, and gets the machine back up sooner.
MTTR is a measure of response and restoration efficiency, not of equipment quality. Two identical machines in two different plants can have very different MTTRs depending on how well-prepared the maintenance team is: documented repair procedures, stocked spare parts, trained technicians, and clear work-order routing all compress MTTR.
The MTTR formula
MTTR = Total Repair Time ÷ Number of Repairs
"Total repair time" is the sum of all hours spent repairing failures — from the moment the failure is detected to the moment the asset is confirmed back in service. "Number of repairs" is the count of completed repair events in the same window.
Worked example — MTTR
The same hydraulic press had 6 repair events during the year. Total logged repair time across those 6 events was 18 hours.
MTTR = 18 hours ÷ 6 repairs
MTTR = 3 hours per repair
Three hours per failure event. Now you have something to work with: the next question is whether 3 hours is better or worse than last year, and which repair events were outliers. One 10-hour repair in a set of six can mask five 1.6-hour repairs that are actually quite good.
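To see how one long repair can skew the average, it can help to compare the mean with the median. The sketch below uses a hypothetical split of the press's 18 repair hours (one 10-hour event and five 96-minute events), stored in minutes to keep the arithmetic exact. The "more than twice the median" outlier rule is an arbitrary illustration, not a standard.

```python
from statistics import mean, median

# Hypothetical per-event repair durations in minutes: one 10-hour repair
# and five 1.6-hour (96-minute) repairs, totalling 18 hours.
repair_minutes = [600, 96, 96, 96, 96, 96]

mttr_hours = mean(repair_minutes) / 60       # the MTTR as defined above
typical_hours = median(repair_minutes) / 60  # what a "normal" repair looks like

# Flag events that took more than twice the median (an arbitrary threshold).
outliers = [m for m in repair_minutes if m > 2 * median(repair_minutes)]

print(mttr_hours)     # 3.0
print(typical_hours)  # 1.6
print(outliers)       # [600]
```

Reporting the median alongside MTTR is a design choice: MTTR stays the headline KPI, while the median shows whether a single event is driving it.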
How to Calculate MTBF and MTTR Together
In practice, you calculate both KPIs from the same maintenance history log. Every failure event should record at minimum:
- Asset ID — which machine failed
- Failure start timestamp — when the failure occurred (or was detected)
- Failure end / return-to-service timestamp — when the asset was confirmed operational
- Failure type — unplanned breakdown (counts toward MTBF/MTTR) vs. planned shutdown (excluded from MTBF)
From those four fields you can derive both metrics for any asset, any time window.
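As a rough sketch of that derivation, the four fields map onto a small record type. The class and function names here are hypothetical, and the code assumes the asset is scheduled to run continuously through the window, that every event falls inside the window, and that all events belong to one asset. A plant running fixed shifts would substitute scheduled hours for the window length.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DowntimeEvent:
    asset_id: str
    start: datetime   # failure start (or detection) timestamp
    end: datetime     # return-to-service timestamp
    planned: bool     # True for PM windows and shutdowns, False for breakdowns

    @property
    def hours(self) -> float:
        return (self.end - self.start) / timedelta(hours=1)


def mtbf_mttr(events, window_start, window_end):
    """Return (MTBF, MTTR) in hours, or (None, None) if nothing failed."""
    failures = [e for e in events if not e.planned]
    if not failures:
        return None, None
    window_hours = (window_end - window_start) / timedelta(hours=1)
    # Operating time excludes both repair time and planned downtime.
    operating_hours = window_hours - sum(e.hours for e in events)
    repair_hours = sum(e.hours for e in failures)
    return operating_hours / len(failures), repair_hours / len(failures)


events = [
    DowntimeEvent("PRESS-01", datetime(2025, 1, 5, 8), datetime(2025, 1, 5, 12), planned=False),
    DowntimeEvent("PRESS-01", datetime(2025, 1, 15, 6), datetime(2025, 1, 15, 14), planned=True),
    DowntimeEvent("PRESS-01", datetime(2025, 1, 20, 10), datetime(2025, 1, 20, 12), planned=False),
]
mtbf, mttr = mtbf_mttr(events, datetime(2025, 1, 1), datetime(2025, 1, 31))
print(mtbf, mttr)  # 353.0 3.0
```

The planned PM window reduces operating time but is not counted as a failure, which matches the exclusion rule described earlier.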
Rolling the calculation across your fleet
For a fleet of assets — say, all the motors in your facility — you can calculate MTBF and MTTR per asset (to find your worst performers) or aggregate across asset type (to benchmark a category). The per-asset view is usually more actionable: it tells you which specific machine is eating your repair hours.
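The two views can be sketched side by side. The asset IDs, repair hours, and operating hours below are invented for illustration, and the code assumes operating hours per asset have already been totaled.

```python
from collections import defaultdict

# Hypothetical records: (asset_id, repair_hours) for each unplanned failure.
failures = [
    ("MTR-01", 2.0), ("MTR-02", 1.0), ("MTR-01", 4.0),
    ("MTR-03", 3.0), ("MTR-01", 3.0),
]
# Hypothetical operating hours per motor over the same window.
operating_hours = {"MTR-01": 1500.0, "MTR-02": 1600.0, "MTR-03": 1550.0}

repairs_by_asset = defaultdict(list)
for asset_id, hours in failures:
    repairs_by_asset[asset_id].append(hours)

# Per-asset view: finds the worst performer.
per_asset = {
    asset_id: {
        "mtbf": operating_hours[asset_id] / len(repairs),
        "mttr": sum(repairs) / len(repairs),
    }
    for asset_id, repairs in repairs_by_asset.items()
}

# Category view: pool hours and failures across the whole asset type.
fleet_mtbf = sum(operating_hours.values()) / len(failures)
fleet_mttr = sum(hours for _, hours in failures) / len(failures)

print(per_asset["MTR-01"])     # {'mtbf': 500.0, 'mttr': 3.0}
print(fleet_mtbf, fleet_mttr)  # 930.0 2.6
```

In this made-up data the fleet MTBF of 930 hours looks healthy, while the per-asset view shows MTR-01 failing roughly three times as often as its peers.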
How frequently should you calculate these?
Monthly tracking is practical for most SMB plants. Quarterly is a reasonable minimum if your failure rate is low enough that monthly data is too thin to be meaningful. The key is consistency: same definition of "operating time," same definition of "repair complete," every period.
Reading the Trends: What Changes in MTBF and MTTR Are Telling You
Calculating MTBF and MTTR once gives you a baseline. Tracking them over time gives you a management tool.
Rising MTBF
Good news: your assets are running longer between failures. Likely causes: a PM program that's actually being completed on schedule, corrected lubrication intervals, worn components replaced before failure, or better operator care. Research documented by the U.S. Department of Energy's Federal Energy Management Program (PNNL, 2010) found that properly applied preventive-maintenance programs can improve MTBF by 50–75% compared with purely reactive maintenance. That's not a guarantee for your specific assets, but it illustrates the order of magnitude available when PM is done consistently.
If you want to understand what drives MTBF improvement specifically, the strategies for improving MTBF guide walks through the levers in detail.
Falling MTBF
Equipment is failing more frequently. Possible causes: aging assets approaching end-of-life, overdue PMs (check your PM compliance %), changed operating conditions (higher throughput, different materials), or a failure mode that wasn't previously on the PM checklist. A falling MTBF is not an emergency by itself — but it warrants investigation before it becomes one.
Falling MTTR
Your team is getting faster. Possible causes: better fault documentation (so the next technician knows where to start), spare parts stocked closer to the point of use, checklists that guide repair sequence, or simply more experienced technicians. Falling MTTR directly reduces the cost and production impact of every failure event.
Rising MTTR
Repairs are taking longer. Possible causes: parts stockouts, undocumented procedures, technician turnover, increasingly complex faults (often a sign that upstream PM is missing), or work-order routing delays. Rising MTTR is one of the clearest signals that you need better planning infrastructure — not necessarily more people.
The combination that should concern you most
Falling MTBF + rising MTTR — failures are happening more often and taking longer to fix. This combination compounds: each event costs more production time, and the events are closer together. It's the pattern most likely to produce the kind of unplanned downtime that escalates to a plant-manager or ownership conversation. The DOE documents reactive repairs as costing 3–5× more per task than planned maintenance when all costs are counted (U.S. Department of Energy, cited via eWorkOrders, 2026) — a combination of labor overtime, emergency parts premiums, and secondary damage from running degraded equipment.
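The four trend readings, and the combination that deserves attention first, can be sketched as a small period-over-period check. The function names and the 5% dead band are assumptions chosen for illustration; pick a tolerance that suits how noisy your data is.

```python
def direction(previous: float, current: float, tolerance: float = 0.05) -> str:
    """Label a change as rising, falling, or steady.

    tolerance is an assumed dead band (5% here) so that small wobbles
    between periods are not read as trends.
    """
    change = (current - previous) / previous
    if change > tolerance:
        return "rising"
    if change < -tolerance:
        return "falling"
    return "steady"


def read_trend(prev_mtbf, curr_mtbf, prev_mttr, curr_mttr):
    """Return (MTBF trend, MTTR trend, needs_attention)."""
    mtbf_trend = direction(prev_mtbf, curr_mtbf)
    mttr_trend = direction(prev_mttr, curr_mttr)
    # Falling MTBF with rising MTTR is the compounding case described above.
    needs_attention = mtbf_trend == "falling" and mttr_trend == "rising"
    return mtbf_trend, mttr_trend, needs_attention


print(read_trend(400, 280, 3.0, 4.5))  # ('falling', 'rising', True)
print(read_trend(400, 410, 3.0, 3.0))  # ('steady', 'steady', False)
```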
Connecting MTBF and MTTR to the Rest of Your KPI Stack
MTBF and MTTR don't live in isolation. They connect directly to the other metrics on your dashboard.
PM compliance %
PM compliance % (completed PMs ÷ scheduled PMs, expressed as a percentage) is the leading indicator; MTBF is the lagging result. When PM compliance slips — say, from 92% down to 74% — you will typically see MTBF begin to fall a few weeks or months later as the deferred maintenance accumulates into failures. SMRP Best Practices (cited via eWorkOrders, 2026) set world-class PM compliance at ≥90%, with ≥95% for critical A-class assets and <80% considered not functioning effectively. See the full breakdown in PM compliance % explained.
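A minimal sketch of the compliance calculation, using the thresholds quoted above. The function names and the band labels are illustrative, and the 46-of-50 and 37-of-50 counts are hypothetical figures chosen to reproduce the 92% and 74% in the example.

```python
def pm_compliance_pct(completed: int, scheduled: int) -> float:
    """Completed PMs divided by scheduled PMs, as a percentage."""
    if scheduled <= 0:
        raise ValueError("no PMs were scheduled in this period")
    return completed * 100 / scheduled


def compliance_band(pct: float, critical: bool = False) -> str:
    """Classify against the quoted targets: 90% in general, 95% for critical assets."""
    target = 95 if critical else 90
    if pct >= target:
        return "meets target"
    if pct < 80:
        return "not functioning effectively"
    return "below target"


before = pm_compliance_pct(46, 50)
after = pm_compliance_pct(37, 50)
print(before, compliance_band(before))  # 92.0 meets target
print(after, compliance_band(after))    # 74.0 not functioning effectively
```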
Planned vs. unplanned work ratio
SMRP and Reliamag (2026) document a world-class planned-to-unplanned ratio of roughly 80/20 (leaders at 90/10). Every unplanned repair event is a data point in your MTTR calculation and a subtraction from your MTBF. Getting above 80% planned work is often as much about scheduling discipline as it is about equipment condition.
OEE (overall equipment effectiveness)
OEE — overall equipment effectiveness — multiplies availability, performance, and quality into a single score. Availability (the "A" in OEE) is directly related to failure frequency and repair duration: more failures (lower MTBF) and longer repairs (higher MTTR) both compress availability and drag OEE down. World-class OEE is ≥85% (Oxmaint, 2026). See the OEE guide for the full availability/performance/quality breakdown.
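One common way to make the link concrete is the steady-state approximation availability = MTBF ÷ (MTBF + MTTR). That formula is not taken from this guide, and OEE availability is normally measured against planned production time, which also includes stops such as changeovers. Treat the sketch below as an illustration of the direction of the effect rather than an OEE calculator. The degraded figures (280 hours, 5 hours) and the OEE factors are hypothetical.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state approximation: share of time the asset is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def oee(availability_: float, performance: float, quality: float) -> float:
    """OEE as the product of its three factors, each expressed as a fraction."""
    return availability_ * performance * quality


# Press example: MTBF 400 h, MTTR 3 h.
print(round(availability(400, 3), 4))  # 0.9926
# Hypothetical degraded case: more frequent failures and slower repairs.
print(round(availability(280, 5), 4))  # 0.9825
# Hypothetical factors of 90% each.
print(round(oee(0.9, 0.9, 0.9), 3))    # 0.729
```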
Maintenance cost as % of RAV
MC/RAV (annual maintenance cost ÷ replacement asset value × 100) benchmarks at 2.0%–3.0% for world-class operations (Factory AI, SMRP-aligned, 2026). Chronic high MTTR typically inflates MC/RAV through elevated labor hours and emergency parts spend — so a rising MC/RAV combined with rising MTTR is a signal worth investigating together.
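The ratio itself is a one-line calculation. The dollar figures below are hypothetical, and the function name is illustrative.

```python
def mc_rav_pct(annual_maintenance_cost: float, replacement_asset_value: float) -> float:
    """Annual maintenance cost as a percentage of replacement asset value."""
    if replacement_asset_value <= 0:
        raise ValueError("replacement asset value must be positive")
    return annual_maintenance_cost * 100 / replacement_asset_value


# Hypothetical plant: $250,000 annual maintenance spend on $10,000,000 of assets.
ratio = mc_rav_pct(250_000, 10_000_000)
print(ratio)                # 2.5
print(2.0 <= ratio <= 3.0)  # True: inside the quoted 2.0%-3.0% band
```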
Practical Steps to Start Tracking MTBF and MTTR at Your Plant
Here's the minimum viable implementation — no enterprise CMMS required.
Step 1: Define your asset list. You can't calculate MTBF and MTTR for assets you haven't identified. Start with your highest-criticality equipment — the machines whose failure stops production — and build outward from there. An equipment asset register is the foundation.
Step 2: Log every failure event with timestamps. This is the step most plants skip or do inconsistently. The technician closes the work order, but the repair-start and return-to-service times aren't recorded, or they're in a notebook that gets transcribed three days later. Decide on a single source of record and enforce it.
Step 3: Separate planned downtime from unplanned failures. Scheduled PM windows don't count as failures. Scheduled changeovers don't count as failures. Draw a clear line in your log so you're not accidentally deflating your MTBF by mixing planned and unplanned events.
Step 4: Calculate monthly, plot the trend. Run the two formulas — MTBF and MTTR — for each asset at the end of each month. Plot them on a simple trend chart. The shape of the trend is the insight; a single data point tells you almost nothing.
Step 5: Act on the outliers first. Sort by MTBF ascending (shortest run time between failures first) and by MTTR descending (longest repairs first). The assets near the top of both lists are where your attention pays off fastest.
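Step 5 can be sketched as two sorts and an intersection. The asset IDs and figures are invented, and the cutoff of two assets per list is arbitrary.

```python
# Hypothetical monthly results: asset_id -> (MTBF hours, MTTR hours).
assets = {
    "PRESS-01": (280.0, 5.0),
    "CONV-02": (900.0, 1.5),
    "COMP-03": (350.0, 6.5),
    "PUMP-04": (1200.0, 4.0),
}

# Lowest MTBF first, highest MTTR first.
by_mtbf = sorted(assets, key=lambda a: assets[a][0])
by_mttr = sorted(assets, key=lambda a: assets[a][1], reverse=True)

# Assets that appear near the top of both lists get attention first.
top_n = 2
priority = [a for a in by_mtbf[:top_n] if a in by_mttr[:top_n]]

print(by_mtbf)   # ['PRESS-01', 'COMP-03', 'CONV-02', 'PUMP-04']
print(by_mttr)   # ['COMP-03', 'PRESS-01', 'PUMP-04', 'CONV-02']
print(priority)  # ['PRESS-01', 'COMP-03']
```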
From Spreadsheet to System: Keeping MTBF and MTTR Accurate Over Time
The biggest challenge with MTBF and MTTR isn't the math; the formulas are simple. The challenge is data quality. Research from the University of Hawaii (Ray Panko, cited via Oxmaint, 2026) found that approximately 88% of spreadsheets contain errors. In a maintenance context, that means miskeyed timestamps, missed failure entries, or formula references that break when someone adds a row, all of which corrupt the metrics you're relying on to make decisions.
A structured work-order lifecycle — where every failure event flows through Open → In Progress → Completed → Verified stages with timestamped transitions — produces the clean data MTBF and MTTR calculations need. The timestamp discipline is built into the process rather than depending on individual memory or after-the-fact reconstruction.
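To illustrate what "timestamped transitions" can look like in data, here is a minimal sketch of the four-stage lifecycle. It is not how any particular CMMS implements it. The `WorkOrder` class is hypothetical, and measuring repair time from Open to Verified is an assumption; your definition of "repair complete" may differ.

```python
from datetime import datetime

STAGES = ["Open", "In Progress", "Completed", "Verified"]


class WorkOrder:
    def __init__(self, asset_id: str, opened_at: datetime):
        self.asset_id = asset_id
        # Every stage change is stored with the time it happened.
        self.history = [("Open", opened_at)]

    @property
    def stage(self) -> str:
        return self.history[-1][0]

    def advance(self, at: datetime) -> None:
        """Move to the next stage; stages cannot be skipped or back-dated."""
        index = STAGES.index(self.stage)
        if index == len(STAGES) - 1:
            raise ValueError("work order is already Verified")
        if at < self.history[-1][1]:
            raise ValueError("timestamps must not go backwards")
        self.history.append((STAGES[index + 1], at))

    def repair_hours(self):
        """Hours from Open to Verified, or None while the order is unfinished."""
        stamps = dict(self.history)
        if "Verified" not in stamps:
            return None
        return (stamps["Verified"] - stamps["Open"]).total_seconds() / 3600


order = WorkOrder("PRESS-01", datetime(2025, 3, 3, 8, 0))
order.advance(datetime(2025, 3, 3, 8, 30))   # In Progress
order.advance(datetime(2025, 3, 3, 10, 45))  # Completed
order.advance(datetime(2025, 3, 3, 11, 0))   # Verified
print(order.stage, order.repair_hours())     # Verified 3.0
```

Because each transition carries its own timestamp, the repair duration falls out of the record instead of being reconstructed from memory.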
Maintenance Planning Manager's Professional tier and above automatically tracks MTBF and MTTR per asset from your completed work-order history, so the calculation happens without a separate spreadsheet step. But even if you're not ready for a SaaS tool, the right habit is the same: consistent timestamps, a single source of record, and a monthly review cadence.
If you want a ready-made structure for tracking these metrics in Excel while you're building that habit, the Maintenance KPI Dashboard is a $29 download with pre-built MTBF, MTTR, PM compliance %, and OEE tracking cells — structured to match the formulas in this guide.
What Good Looks Like: Using MTBF and MTTR as a Management Conversation
The goal isn't just to calculate these numbers internally. It's to use them as a structured way to communicate upward and make resource decisions.
"Our hydraulic press MTBF dropped from 400 hours to 280 hours over the last quarter" is a precise, actionable statement. It tells plant leadership that the press now runs 30% fewer hours between failures (which works out to roughly 43% more failures per operating hour), it invites a conversation about root cause, and it makes the case for a PM adjustment or a capital repair without requiring the listener to take your word for it.
"Things have been breaking down more" is the same observation with none of the credibility.
Knowing how to calculate MTBF and MTTR is the first step. Using the trends to drive PM program adjustments (changing intervals, adding checklist steps, prioritizing spare parts) is where the reliability gains actually come from. For the full picture of how these KPIs fit into a complete maintenance planning approach, the preventive maintenance planning guide covers the broader framework.
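When you quote a change like the press's drop from 400 to 280 hours, keep in mind that the percentage drop in MTBF and the percentage rise in failure frequency are different numbers, because failure rate is the reciprocal of MTBF. A short sketch, with illustrative function names:

```python
def mtbf_drop_pct(previous: float, current: float) -> float:
    """Percentage decrease in hours between failures."""
    return (previous - current) * 100 / previous


def failure_rate_rise_pct(previous: float, current: float) -> float:
    """Percentage increase in failures per operating hour (rate = 1 / MTBF)."""
    return (previous / current - 1) * 100


print(mtbf_drop_pct(400, 280))                    # 30.0
print(round(failure_rate_rise_pct(400, 280), 1))  # 42.9
```

Either figure is fine to report upward, as long as the label matches the number.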
Start Tracking Today
You can start calculating MTBF and MTTR with nothing more than a timestamp log and the two formulas above. The formulas won't change. What will change, as your data quality improves and your trends accumulate, is the quality of the decisions you can make from them.
Download the Maintenance KPI Dashboard (Excel, $29) to get a pre-structured template with MTBF, MTTR, PM compliance %, and OEE cells ready to populate.
Or, if you're ready to move beyond manual tracking entirely, start a 14-day free trial of Maintenance Planning Manager — no per-seat fees, no credit card required to explore, and MTBF/MTTR tracking built into the work-order workflow from day one.