Hi Mike.
In addition to the points above, I would emphasize that predictive maintenance must be connected to a clear execution model. Many organizations have already invested in monitoring platforms, sensors, and automation tools, but the reliability value is only realized when those insights are converted into prioritized work orders, planned maintenance activities, stocked critical parts, and verified corrective actions.
A strong predictive maintenance program should start with known failure modes and asset criticality. For example, vibration analysis should be targeted toward rotating equipment such as pumps, fans, motors, compressors, and chillers. Ultrasound can support leak detection, electrical discharge detection, steam trap assessments, and compressed air system reliability. Oil analysis can provide early indicators of wear, contamination, lubricant breakdown, and internal component degradation. Coolant and fluid monitoring can help identify corrosion, biological growth, improper concentration, and heat-transfer performance issues. Generator fuel reliability is also important because degraded fuel, water intrusion, microbial growth, and contamination can compromise emergency power availability when it is needed most.
Another important opportunity is to integrate predictive maintenance findings directly into the CMMS. Alerts should not remain isolated in BMS, SCADA, DCIM, or vendor dashboards. They should trigger a defined workflow that includes risk ranking, ownership assignment, job planning, parts verification, execution, and feedback capture. This helps move the organization from simply observing asset conditions to actively managing asset risk.
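As a rough illustration of the workflow described above, here is a minimal Python sketch of a condition alert carried through each stage as a single work order. The class names, stage names, asset tag, and notes are all hypothetical; a real CMMS will have its own data model and API, none of which is assumed here.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Stages follow the workflow named in the post; CLOSED is an added end state.
    RISK_RANKING = 1
    OWNERSHIP_ASSIGNMENT = 2
    JOB_PLANNING = 3
    PARTS_VERIFICATION = 4
    EXECUTION = 5
    FEEDBACK_CAPTURE = 6
    CLOSED = 7


@dataclass
class WorkOrder:
    asset: str
    finding: str
    source: str  # e.g. "BMS", "SCADA", "DCIM", or a vendor dashboard
    stage: Stage = Stage.RISK_RANKING
    history: list = field(default_factory=list)

    def advance(self, note: str) -> None:
        """Record what was done at the current stage, then move to the next one."""
        if self.stage is Stage.CLOSED:
            raise ValueError("work order is already closed")
        self.history.append((self.stage.name, note))
        self.stage = Stage(self.stage.value + 1)


# Hypothetical example: a vibration alert on a chilled water pump.
wo = WorkOrder(asset="P-04", finding="rising bearing vibration", source="BMS")
for note in (
    "ranked high: pump serves a critical cooling loop",
    "assigned to mechanical lead",
    "bearing replacement planned for next maintenance window",
    "bearing kit confirmed in stock",
    "bearing replaced",
    "post-repair vibration reading recorded",
):
    wo.advance(note)
```

The point of the sketch is that an alert cannot reach a closed state without passing through every stage, and each stage leaves a record that can be reviewed later.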
There is also value in using predictive maintenance data to continuously refine FMECA and RPN scoring. As more condition data becomes available, teams can better understand which failure modes are increasing in occurrence, which failure modes are becoming harder to detect before failure, and which risks require a change in maintenance strategy. This creates a feedback loop between field observations, sensor data, incident history, asset criticality, and maintenance planning.
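To make the RPN feedback loop concrete, here is a small Python sketch. It uses the conventional definition RPN = severity x occurrence x detection and assumes each factor is scored 1 to 10, with a higher detection score meaning the failure is harder to detect. Scales vary between organizations, and the assets and scores below are invented purely for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FailureMode:
    asset: str
    mode: str
    severity: int    # assumed 1-10 scale
    occurrence: int  # assumed 1-10 scale
    detection: int   # assumed 1-10 scale; higher means harder to detect

    def __post_init__(self):
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


def rescore(register, asset, mode, **changes):
    """Return a new register with updated scores for one failure mode."""
    return [
        replace(fm, **changes) if (fm.asset, fm.mode) == (asset, mode) else fm
        for fm in register
    ]


def ranked(register):
    """Sort failure modes from highest to lowest RPN."""
    return sorted(register, key=lambda fm: fm.rpn, reverse=True)


# Hypothetical starting register.
register = [
    FailureMode("CH-01 chiller", "compressor bearing wear", 8, 3, 6),
    FailureMode("GEN-02 generator", "fuel contamination", 9, 2, 7),
    FailureMode("P-04 chilled water pump", "seal leak", 5, 4, 3),
]

# Hypothetical update: vibration trending shows bearing wear occurring more
# often (occurrence 3 -> 5) but now caught much earlier (detection 6 -> 3).
updated = rescore(register, "CH-01 chiller", "compressor bearing wear",
                  occurrence=5, detection=3)

for fm in ranked(updated):
    print(f"{fm.asset:26} {fm.mode:26} RPN={fm.rpn}")
```

In this invented example the chiller bearing drops from the top of the list once condition monitoring improves detection, and generator fuel contamination becomes the highest-ranked risk, which is the kind of shift that should prompt a review of the maintenance strategy.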
Ultimately, the next level of reliability maturity is not just technology deployment. It is building the process discipline, governance, and culture needed to turn condition data into timely decisions. Predictive maintenance should help teams move from reactive response to planned intervention, from isolated alarms to risk-based prioritization, and from historical maintenance schedules to dynamic strategies based on actual asset health.
------------------------------
Robert Gafeney
Sr. Reliability Engineering Manager
CBRE
Kathleen, GA
------------------------------
Original Message:
Sent: 05-05-2026 07:12 AM
From: Mike Doolan
Subject: Data Center Reliability
Looking for any members working on driving reliability programs into data centers. Seems to me reliability (uptime) in data centers is driven by redundancy rather than condition monitoring and predictive maintenance. There are pockets of thermography and some oil analysis, but rarely through a programmatic approach. Any views?
------------------------------
Mike Doolan
Global Technical and Reliability Director
CBRE
------------------------------