Hi Derek,
Fundamentally, I'd say that how much of planned production is actually produced is the ultimate measure of reliability. It measures the overall reliability of the plant's manufacturing capacity, including the operations department, etc.
Most people on this forum are really interested in machine downtime, so you're about to hear a lot about OEE. Properly calculated, OEE will allow you to separate the major reasons for losses. For instance, in my facility, changeover time is variable: it depends on which recipe we're changing from and which one we're changing to. So demand planning can change our overall productivity significantly from month to month, even if all the machines run without breakdowns or maintenance in that period! OEE helps treat changeover losses, quality losses, planned maintenance, and breakdowns fairly.
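As a rough illustration of how OEE separates loss categories, here is a minimal sketch using the common availability x performance x quality definition. The function name, the loss categories, and all the numbers are made up for illustration, and it assumes changeover and breakdown hours both count against availability; your site may agree on different rules.

```python
def oee_breakdown(planned_hours, loss_hours, ideal_rate_per_hour, total_units, good_units):
    """Return the OEE factors plus each time loss as a share of planned time.

    loss_hours is a dict of hypothetical categories (e.g. changeover, breakdown)
    that are assumed to count against availability.
    """
    run_hours = planned_hours - sum(loss_hours.values())
    availability = run_hours / planned_hours
    # Performance compares actual output with what the ideal rate would give.
    performance = total_units / (ideal_rate_per_hour * run_hours)
    quality = good_units / total_units
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
        "loss_share": {name: hours / planned_hours for name, hours in loss_hours.items()},
    }


# Illustrative month: 400 planned hours, 30 h of changeovers, 20 h of breakdowns.
result = oee_breakdown(
    planned_hours=400.0,
    loss_hours={"changeover": 30.0, "breakdown": 20.0},
    ideal_rate_per_hour=100.0,
    total_units=31500,
    good_units=30870,
)
for name, value in result.items():
    print(name, value)
```

Keeping the time losses in separate categories is what lets you see that a bad month came from the changeover mix rather than from the machines.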
OEE takes a big commitment from both operations and production planning to get the necessary data pulled in. (It's one of those KPIs that is so complex in practice that you'll spend lots of time arguing about assumptions and manipulation.)
Depending on the CMMS and how you use it, you might be able to infer or estimate production losses from work order data. This is not ideal but is at least a start with data that an operations department doesn't handle. You'll have to decide, or agree with production, how machine handover and LOTO get counted. In my case, I don't have good timing for how long LOTO and work permits take, so I just count them as part of the repair process (and the production loss).
If your workforce dependably reports labor on work orders, that's a great start. If a maintenance person is on scene for a breakdown-type work order, you might assume that is a production loss. If it gets fixed and operations doesn't restart the machine for a few hours or days, that is a different type of OEE loss. A work order on a facility functional location probably isn't a production loss, so you'll have to filter work orders however it makes sense for your operation.
A rougher measure might be the time between work order start and finish (again, depending on how you use your CMMS).
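To make the work order approach concrete, here is a minimal sketch that computes both estimates: reported labor hours on breakdown work orders, and the rougher start-to-finish elapsed time. The record layout, the field names, and the "FACILITY" location prefix are all hypothetical; a real CMMS export will look different and need its own filter.

```python
from datetime import datetime

# Hypothetical work order export; field names are assumptions, not a real CMMS schema.
work_orders = [
    {"id": "WO-1", "type": "breakdown", "location": "LINE1-FILLER", "labor_hours": 3.5,
     "start": datetime(2024, 3, 4, 8, 0), "finish": datetime(2024, 3, 4, 12, 0)},
    {"id": "WO-2", "type": "preventive", "location": "LINE1-FILLER", "labor_hours": 2.0,
     "start": datetime(2024, 3, 5, 8, 0), "finish": datetime(2024, 3, 5, 10, 0)},
    {"id": "WO-3", "type": "breakdown", "location": "FACILITY-HVAC", "labor_hours": 5.0,
     "start": datetime(2024, 3, 5, 9, 0), "finish": datetime(2024, 3, 5, 15, 0)},
    {"id": "WO-4", "type": "breakdown", "location": "LINE2-CAPPER", "labor_hours": 1.0,
     "start": datetime(2024, 3, 6, 14, 0), "finish": datetime(2024, 3, 6, 16, 30)},
]


def is_production_breakdown(wo):
    """Keep breakdown work orders on production equipment, not facility locations."""
    return wo["type"] == "breakdown" and not wo["location"].startswith("FACILITY")


def estimate_loss_hours(orders):
    """Return (labor-based estimate, start-to-finish estimate) in hours."""
    labor_total = 0.0
    elapsed_total = 0.0
    for wo in orders:
        if not is_production_breakdown(wo):
            continue
        labor_total += wo["labor_hours"]
        elapsed_total += (wo["finish"] - wo["start"]).total_seconds() / 3600.0
    return labor_total, elapsed_total


labor_estimate, elapsed_estimate = estimate_loss_hours(work_orders)
print("labor-based estimate (h):", labor_estimate)
print("start-to-finish estimate (h):", elapsed_estimate)
```

Note that summed labor hours will overstate the loss when two people work the same breakdown at once, and the elapsed time includes any waiting for parts, so neither number is more than an estimate.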
MTBF: don't do this! If you must, first read https://nomtbf.com/ so you know the limitations of this approach.
Spend your time identifying your top-5 bad actors and keep living Crow-AMSAA charts for them. This is a really informative exercise that makes you understand the Crow-AMSAA method, and with only five it's a light enough workload to complete in 2 hours. And it targets the biggest loss causes.
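For anyone new to Crow-AMSAA, here is a minimal sketch of fitting the model N(t) = lambda * t^beta to one bad actor's failure history. It uses the standard maximum-likelihood estimates for time-terminated data, which is only one of several fitting approaches (many people fit a line on a log-log plot instead). The failure times and the observation period are made up for illustration.

```python
import math


def crow_amsaa_mle(failure_times, end_time):
    """Fit N(t) = lam * t**beta to cumulative failure times.

    failure_times: cumulative operating time at each failure (same units as end_time).
    end_time: total observation time (time-terminated data is assumed).
    """
    n = len(failure_times)
    beta = n / sum(math.log(end_time / t) for t in failure_times)
    lam = n / end_time ** beta
    return lam, beta


# Hypothetical bad actor: four failures observed over 200 days of operation.
times = [10.0, 40.0, 90.0, 160.0]
T = 200.0
lam, beta = crow_amsaa_mle(times, T)

# beta < 1 suggests failures are becoming less frequent, beta > 1 more frequent.
print("lambda:", lam)
print("beta:", beta)
print("expected cumulative failures at 250 days:", lam * 250.0 ** beta)
```

With this few points the fit is very uncertain, which is one reason to keep the chart living and update it as each new failure arrives.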
The New Weibull Handbook has an example of how to quantify production losses from daily production data. It's actually why I first bought the book. This used to be on Barringer1.com but the site doesn't exist anymore. Search for "production process reliability" or something like that.
I'm wary of KPIs that are calculated as a percentage but where the denominator is highly variable. It makes the KPI value jump around a lot. If it is a monthly production value, you automatically get variation based on the changing number of days in a month. I would try to simplify a % KPI to a simple count of something (number of widgets, pounds of something). It's simply easier to understand.
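The days-in-month effect is easy to see with a quick calculation. This sketch assumes a perfectly steady, hypothetical output of 1000 units per day in 2023 and shows how much the monthly total moves anyway.

```python
import calendar

DAILY_UNITS = 1000  # hypothetical, perfectly constant daily output
YEAR = 2023

# Monthly totals vary only because months have 28 to 31 days.
monthly_totals = {
    month: DAILY_UNITS * calendar.monthrange(YEAR, month)[1]
    for month in range(1, 13)
}

# Relative swing between the longest and shortest month.
swing = max(monthly_totals.values()) / min(monthly_totals.values()) - 1

# Dividing by days in the month removes the calendar effect.
per_day = {
    month: total / calendar.monthrange(YEAR, month)[1]
    for month, total in monthly_totals.items()
}

print("monthly totals:", monthly_totals)
print("swing between shortest and longest month:", swing)
print("units per day:", per_day)
```

So a monthly figure can move by roughly 10% between February and a 31-day month with no change in how the plant actually ran.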
------------------------------
Karl Burnett
General Electric
Anderson SC
------------------------------