All Member Open Forum


Component failures after a shutdown period

  • 1.  Component failures after a shutdown period

    Posted 02-26-2021 01:48 PM
    Hello everyone,
    I am a new member to SMRP but have been working in the reliability field for several years starting out as a technician in the Navy, then a technician for Bridgestone, and now I am working as a reliability specialist.

    A discussion that always seems to come up after an extended shutdown is why the machines seem to fail when we fire them back up. The closest explanation I have found is a term called stress relaxation: a change in material components caused by the normal operating stress being removed and then reapplied without proper stress conditioning. I know other factors contribute to these failures, such as boundary lubrication before an oil wedge forms in mechanical components, in-rush surges in electrical components, etc., but at the material level I feel there should be documentation on how the change in stress from operating to non-operating to operating again contributes to failures.

    Does anyone know of any documentation associated with the concept I am talking about or what the terminology for the changes in component material is actually called under these circumstances? Stress relaxation is a term that I found that seems to be close but doesn't seem to line up perfectly.

    Also, from everyone's experience, what actions do you take to help mitigate these types of failures? One thought I have brought up to my manager: after a shutdown, rather than turning the machine over to production and letting them go straight to full speed, what if we started up a day earlier and rolled the machines without producing material, at a controlled ramp-up rate?

    Thoughts and ideas are appreciated.



    ------------------------------
    Thaddeus Lightner
    Bridgestone Americas Tire Organization
    Trenton
    ------------------------------


  • 2.  RE: Component failures after a shutdown period

    Posted 02-26-2021 02:25 PM
    Edited by Torbjorn Idhammar 02-26-2021 02:27 PM
    I think the reasons could be many, but in my experience the most common are:

    1.  Poor startup/shutdown procedures causing equipment damage (for example, pushing the start button 7 times in a row and burning up the AC motor).

    2.  Poor training and requirements for precision maintenance repairs and installations (balancing, alignment, bearing heating, clean lubricants, torque wrenches, etc.).

    3.  Unclear standards for centerlining both of the above, i.e., correct operational and maintenance procedures.

    In materials there are general physical conditions that affect the situation (as you mention); in my experience these can be major or minor depending on the equipment. HOWEVER, if they are major, operational procedures should cover them. In my experience (mostly heavy process) these issues are mostly related to temperature changes. Example: in a cold shutdown, steam is turned off; if we forget to heat the pipes by running a low flow for a while, we generate water hammer. Again, knowing this, it can be avoided with correct operational procedures.

    So, I don't know your plant that well, I have just visited. But I'm guessing most have to do with temperature changes in the process and/or in materials (heat up a large bearing too quickly and you crack the race). At the end of the day, I would focus on practically implementing and executing the correct O&M procedures. Spend a few weeks in the shop with the E, I, & M and Ops guys; I'm sure they can tell you most of what you need. The rest you pick up from RCAs. Again, I don't know your plant, so I may be off target, but this is from my general experience.


    ------------------------------
    Torbjorn Idhammar
    President & CEO
    IDCON, Inc.
    http://www.idcon.com
    Raleigh NC
    ------------------------------



  • 3.  RE: Component failures after a shutdown period

    Posted 02-27-2021 01:55 AM
    During an overhaul or minor maintenance (bearing inspection, seal replacement, etc.) on any critical piece of rotating equipment, follow the procedures and have QA/QC in place to inspect the components and check the quality of the job.

    During the overhaul, take the spare rotor that will be installed, which has been sitting in a box or hanging vertically in a controlled atmosphere, and check its runout, balance, NDT results, and critical dimensions before installing it in the machine. These best practices will save a lot of hassle if anything goes wrong during or after startup.

    Just before startup, complete a pre-startup checklist around the machine. After a major overhaul, perform an overspeed trip test, both mechanical and electronic (depending on what is built into the system).

    Last but not least, follow the startup procedures to dot the i's and cross the t's. Stroking the steam turbine governor through its full opening and closing positions before rolling is very important, to check that the output signal aligns with the opening and closing of the pilot arm of the E/H (electro-hydraulic) device.

    Some steam turbines have turning gears that latch and roll after lube oil startup and before steam admission. For some steam turbines, the startup is programmed in the PLC and cannot be overridden: a fixed slow-roll speed held for a set time, followed by the next speed step, again for a programmed time. If the first critical speed is below running speed, a ramp program is set in the PLC and takes the machine to minimum governor speed. All of this may take a few hours, depending on the HP and design of the machine.
    For condensing, condensing/extraction, or back-pressure turbines, gland steam should be admitted immediately after the machine is put on slow roll.




    ------------------------------
    Francis (Frankie) Castelino P.Eng
    Machinery Engineer Consultant
    Aramco
    Ras Tanura Refinery
    +966547739406
    Saudi Arabia
    ------------------------------



  • 4.  RE: Component failures after a shutdown period

    Posted 02-28-2021 10:22 AM
    Edited by Amro Kamal Kabsha 02-28-2021 10:34 AM
    Good Day,

    I read your message and I quite understand the situation you are facing. I would appreciate it if you could elaborate on these functional failures. As we know, every failure has a mechanism: it starts as a hidden failure, develops into a potential failure, and eventually, after anything from microseconds to years, becomes a functional failure.
    It seems you faced a breakdown, which indicates a functional failure. It would be very helpful if you described the functional failure(s), the number of previous occurrences, and, if it happened before, how you restored the machine to service.
    From what I understand, you are in the reliability field, so there is a good chance you have an FMEA document for this equipment. If not, I strongly advise contacting the OEM to discuss your issue(s) and ask whether an FMEA exists for such functional failure(s), or at least to have the OEM investigate the reasons behind them. It is worth a try.
    The previous replies are also very descriptive in terms of SOPs, WIs, and maintenance inspection from the QA/QC point of view.
    Finally, I wish you the best of luck, and welcome aboard.

    ------------------------------
    Amro Kamal Kabsha
    Electrical & Instrumentation Maintenance Supervisor
    Qatar Petroleum Ras Laffan
    Doha
    ------------------------------



  • 5.  RE: Component failures after a shutdown period

    Posted 03-01-2021 08:09 AM
    Thank you, everyone, for your responses so far.

    A little background on the place I currently work. We have never had much of a reliability program since commissioning of the plant. My position as a reliability specialist is a newly created position and I have been tasked with establishing the programs necessary for a successful predictive maintenance program. To say that the facility I am currently working in is in a hole is an understatement as maintenance has been driven by production for the last several years. Many of the emergency repairs that are made are just sufficient to get the machines back up and running and fall short of actual restoration of the machine. I know this makes my question earlier even murkier than it already was as I am sure that many of the failures that I am seeing are just failures that have been waiting to happen due to poor maintenance practices.

    That being said, it also brings up the question as to why these failures seem to happen during startup after a shutdown. I suppose it is possible that it is coincidence, however, in this line of work coincidences are unlikely. What stresses are occurring during startup that makes items such as bearings and conveyor belt rollers fail specifically during startup? As Francis commented earlier, for steam turbines, they go through a warm up period that is slow and controlled before ever being released for full operation/loading. Granted this is geared more towards making sure condensate isn't causing impingement damage and allowing the turbine to reach full rated temperature, there is still a procedure that must be adhered to in order to ensure proper startup and minimize the chance of failure. I guess I was wondering if anyone here has seen where doing a controlled startup of a machine has shown any success in minimizing these failures on production equipment such as conveyors and if there is any information specifically associated with the root cause of the failures specifically associated with startup?
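
    One way to test whether the clustering at startup is more than coincidence is to compare the share of failures that land in startup windows against the share of operating time those windows occupy. The following is a minimal sketch, assuming failures would otherwise be spread evenly over operating time. The function name, the counts, and the 5% startup fraction are made-up placeholders, not plant data.

```python
from math import comb


def startup_cluster_p_value(total_failures: int,
                            startup_failures: int,
                            startup_time_fraction: float) -> float:
    """One-sided binomial tail: chance of seeing at least `startup_failures`
    of `total_failures` inside startup windows if failures were spread
    evenly over operating time."""
    if not 0.0 < startup_time_fraction < 1.0:
        raise ValueError("startup_time_fraction must be between 0 and 1")
    if not 0 <= startup_failures <= total_failures:
        raise ValueError("startup_failures must be between 0 and total_failures")
    n, k, p = total_failures, startup_failures, startup_time_fraction
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k, n + 1))


# Hypothetical numbers: 10 failures logged, 6 of them within startup windows,
# and startup windows make up 5% of total operating time.
p_value = startup_cluster_p_value(10, 6, 0.05)
print(f"Chance of this clustering by coincidence: {p_value:.2e}")
```

    A very small value supports the view that startup itself is a stressor rather than a coincidence, though it does not say which mechanism is responsible.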

    Some specific examples, as requested, are pillow block bearings, conveyor belt drive rolls, and drive chains. We have had several of these over the last several years that showed no signs of impending catastrophic failure and then failed after a shutdown and startup. Some of it, as I have alluded to, is associated with increased loading from boundary lubrication conditions, cooler temperatures, and overcoming inertia at startup. I just feel there is a more definitive, scientific answer in the materials themselves, such as the rolling elements of a bearing, and study data showing how those materials respond to being operated 24 hours a day for months on end and then being allowed to unload for a week before returning to full operation.

    I hope this helps clarify and again, thank you for the responses so far.

    ------------------------------
    Thaddeus Lightner
    Bridgestone Americas Tire Organization
    Trenton
    ------------------------------



  • 6.  RE: Component failures after a shutdown period

    Posted 03-01-2021 07:57 AM
    Looking at this from the material perspective, steel is nearly perfectly elastic. This means it compresses/stretches in proportion to the loads placed on it, and relaxes to its original state when the loads are released. When you bolt something in place it gets stressed, and when you unbolt it the stress is released. When you bolt it back in place the stress goes right back to where it was, provided you used the same procedure. There is no change to the material. You have exactly what you had to begin with.

    (This all goes out the window if you stress something enough to plastically deform it, which is a permanent shape change. You'll know this is the case if something gets bent, twisted, squashed, etc. The standard bolt torque tables are designed to prevent this from happening.)
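
    The elastic behavior described above can be put in numbers with Hooke's law, stress = E x strain. The following is a minimal sketch, assuming a Young's modulus of about 200 GPa for carbon steel and, purely as an example, the 640 MPa nominal yield strength of a class 8.8 bolt. The strain values are hypothetical.

```python
STEEL_YOUNGS_MODULUS_MPA = 200_000.0  # about 200 GPa, typical for carbon steel


def elastic_stress_mpa(strain: float) -> float:
    """Hooke's law: stress is proportional to strain in the elastic range."""
    return STEEL_YOUNGS_MODULUS_MPA * strain


def stays_elastic(strain: float, yield_strength_mpa: float) -> bool:
    """True if the computed stress stays below yield, so the part returns
    to its original shape when the load is released."""
    return elastic_stress_mpa(strain) < yield_strength_mpa


# Hypothetical example: a bolt stretched by 0.2% of its length,
# checked against an assumed nominal yield strength of 640 MPa.
strain = 0.002
stress = elastic_stress_mpa(strain)
print(f"Stress: {stress:.0f} MPa, elastic: {stays_elastic(strain, 640.0)}")
```

    As long as the check stays true, releasing and reapplying the load brings the part back to the same state, which is the point being made about shutdowns.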

    Since, with proper installation procedures, there is no net change at the material level, this means other things must be happening to cause failures. I think the previous posters have done a wonderful job of addressing those.

    ------------------------------
    Dale Nicholson, PE, CMRP, CRL
    Reliability Engineer
    Evonik Corp
    Lafayette IN
    ------------------------------



  • 7.  RE: Component failures after a shutdown period

    Posted 03-01-2021 08:36 AM
    Thank you for your response Dale.

    That is more in line with what I am looking for in this thread, and I agree with what you are saying. I am still working to get programs in place to collect machine data such as temperature and vibration, but I do not yet have a baseline set of data points to indicate where the machines I am responsible for currently stand. One thought that comes to mind is creep, which I understand takes a long time to show up in steel but is influenced by stress and temperature. I suppose, if I assume the items I am seeing fail were properly designed and implemented, then the cause of the failures lies not so much in the material as in the conditions exerted upon it. A bearing operating at a higher temperature than designed will start to anneal and the lubrication will not be as effective, leading to even higher temperatures caused by friction at the roll point. Coated components, if left to roll in a contaminated or corrosive environment, will lose their properties as the coating wears. Both of these examples lead to changes in the material that make it more prone to failure, and the higher initial load at startup just happens to be where these failures tend to show themselves.

    Again, thank you for your thoughts.





    ------------------------------
    Thaddeus Lightner
    Bridgestone Americas Tire Organization
    Trenton
    ------------------------------



  • 8.  RE: Component failures after a shutdown period

    Posted 03-02-2021 12:56 PM

    Steel needs to reach several hundred degrees F before creep or annealing can begin, so you won't have any problems with those mechanisms. As you said, the failures are due to the conditions and not the materials. Things like corrosion, wear, cracking/spalling, etc. will cause problems, of course, but that's true whether the equipment has been shut down or not.

     

    A couple of potential shut-down related issues, and I apologize if someone has already covered them, are 1. static vs. dynamic friction, and 2. changes in lubricant viscosity due to temperature. Static friction is higher than dynamic friction, so it takes more force to start a machine from a complete stop than it does to keep it running. If grease is old and has more thickener than oil, or if chain oil has collected a lot of crud, it will get hard when it cools off. It takes more force to start a machine from a complete stop if the lube is cold and stiff. Being stiff will make it hard for it to lubricate the metal surfaces properly, which will also promote failure. High start-up loads and suboptimal lubrication may be contributing to your headaches.
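
    To get a feel for how much an oil stiffens when it cools, the viscosity-temperature relation used in ASTM D341 (the Walther form) can be fitted to two data-sheet points. This is a rough sketch for oils only, not grease, and the function names and the 220 cSt at 40 C and 19 cSt at 100 C values are assumed, typical-looking numbers rather than data for any specific product.

```python
from math import log10


def _walther(viscosity_cst: float) -> float:
    # Left-hand side of the Walther relation: log10(log10(nu + 0.7))
    return log10(log10(viscosity_cst + 0.7))


def fit_walther(temp1_c: float, visc1_cst: float,
                temp2_c: float, visc2_cst: float) -> tuple:
    """Fit A and B in log10(log10(nu + 0.7)) = A - B * log10(T_kelvin)
    from two known viscosity points."""
    x1, x2 = log10(temp1_c + 273.15), log10(temp2_c + 273.15)
    y1, y2 = _walther(visc1_cst), _walther(visc2_cst)
    b = (y1 - y2) / (x2 - x1)
    a = y1 + b * x1
    return a, b


def viscosity_cst(temp_c: float, a: float, b: float) -> float:
    """Estimate kinematic viscosity (cSt) at temp_c from fitted A and B."""
    y = a - b * log10(temp_c + 273.15)
    return 10.0 ** (10.0 ** y) - 0.7


# Assumed data-sheet values for an illustrative gear or chain oil.
a, b = fit_walther(40.0, 220.0, 100.0, 19.0)
cold = viscosity_cst(5.0, a, b)    # machine sitting in a cold plant
warm = viscosity_cst(60.0, a, b)   # machine at running temperature
print(f"Cold start: {cold:.0f} cSt, running: {warm:.0f} cSt")
```

    The ratio between the cold and running values gives a sense of how much extra drag, and how much slower oil delivery, the first minutes after a cold start can involve.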

     

    One of my favorite books is "Practical Plant Failure Analysis" by Neville Sachs. When you have a failure you can dissect the broken part(s), and his checklists will step you through the process of figuring out what happened. (I have a lot of fun doing this.)  Address that failure mode, across the whole plant if possible, share with us what you found, repeat the process a few times, and you'll be in pretty good shape.

    ------------------------------
    Dale Nicholson, PE, CMRP, CRL
    Reliability Engineer
    Evonik Corp
    Lafayette IN
    ------------------------------



  • 9.  RE: Component failures after a shutdown period

    Posted 03-01-2021 09:36 AM
    Some good responses already, I'll try not to repeat already good answers around startup errors, maint repair errors, infant mortality, etc.  Here are a few additional ones.

    1. Startup stresses - Stress is force per unit area on a part. Force is mass times acceleration, and either the mass or the acceleration may be constant or may change. Per Newton's first law of motion, an object at rest tends to stay at rest and an object in motion tends to stay in motion unless acted on by an outside force such as friction. Accelerating a mass (rotating equipment) is the highest-stress time for machine components; on many machines, inertia creates higher stress on components than normal running does. The reasons differ by equipment type and are too much to elaborate on here, but this is generally what is happening.

    How to mitigate? Startup stresses are normal and the machine should be designed for them, but proper training on the startup SOP and having the machine in good condition are necessary. Starting up a machine unloaded is a typical SOP for most systems. In some cases, a soft start or, on a VFD, a longer acceleration time can lower startup stresses.

    2. Thermal growth - Starting up, the machine will heat up. The heating and cooling means all the machine parts thermally expand and contract. This changes the relationships between the parts, which induces more stress on machine components.

    How to mitigate? Make sure the machine is designed and set up to handle the thermal growth and that startup procedures do not create a condition that exceeds machine design limits. Warm-up procedures that bring equipment up to temperature slowly are a must.

    3. Condensation - Humid air on cold surfaces causes condensation.  Condensation inside our equipment leads to lube deterioration.  While this effect typically does not cause failure instantly, it will eventually.

    How to mitigate?  Try to keep heat in areas.  Also keep any circulating oil systems running as much as possible.  Run dehydrators on larger circulating systems.  

    4. Static failure modes - Some failure modes occur while the machine is not running (static). Static corrosion on a rolling element bearing can only occur while the machine is down. Condensation occurs (see #3) and water can collect locally in machine components, causing significant damage. Some low-stiffness shafting may suffer from deflection (sag) on long shutdowns, so the startup may need to account for that.

    How to mitigate? Similar to #3: keep lube systems on. Some idle equipment may benefit from a slow crawl while down. Be careful that a slow roll does not run equipment below its minimum speed and cause damage.

    5. Startup vibration effects - Due to all of the above, vibration can be affected upon startup. The cause could be a mechanical change or solids buildup on a rotor during the shutdown. A fan rotor could sling off a buildup mass on startup (I see that often), which increases imbalance. Increased vibration will decrease component life and, as with condensation or static failure modes, the failure may be delayed rather than instant.

    How to mitigate? Clean and inspect key rotors upon shutdown if possible. Vibration monitoring will not prevent failures but will allow intervention before serious failure modes are initiated.
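
    For item 1 above, the effect of a longer acceleration time can be estimated from torque = inertia x angular acceleration. The following is a minimal sketch, assuming a linear speed ramp and ignoring load torque and friction. The function name and the inertia, speed, and ramp times are hypothetical values chosen only for illustration.

```python
from math import pi


def acceleration_torque_nm(inertia_kg_m2: float,
                           target_speed_rpm: float,
                           ramp_time_s: float) -> float:
    """Torque needed only to accelerate the rotating mass from rest to
    target speed over a linear ramp (load and friction torque excluded)."""
    if ramp_time_s <= 0:
        raise ValueError("ramp_time_s must be positive")
    target_speed_rad_s = target_speed_rpm * 2.0 * pi / 60.0
    angular_acceleration = target_speed_rad_s / ramp_time_s
    return inertia_kg_m2 * angular_acceleration


# Hypothetical drive: 50 kg*m^2 of rotating inertia brought up to 1750 rpm.
fast = acceleration_torque_nm(50.0, 1750.0, ramp_time_s=2.0)
slow = acceleration_torque_nm(50.0, 1750.0, ramp_time_s=20.0)
print(f"2 s ramp: {fast:.0f} N*m, 20 s ramp: {slow:.0f} N*m")
```

    Under these assumptions the accelerating torque scales inversely with ramp time, which is why extending the acceleration time on a drive eases the load on couplings, chains, and bearings.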

    Good question, and it reveals the hidden cost of shutting down equipment. Any floor operations or maintenance manager will tell you: if it is running, keep it running. Infant mortality is a very real thing.

    ------------------------------
    Randy Riddell, CMRP, PSAP, CLS
    Reliability Manager
    Essity
    Cherokee AL
    ------------------------------



  • 10.  RE: Component failures after a shutdown period

    Posted 03-06-2021 11:51 AM
    I've encountered many occasions where equipment breaks down upon starting up after shutdown maintenance. The failure can come from an inherent defect that was not detected, so no intervention was carried out. If interventions were carried out and the equipment still failed during startup, there can be many reasons, from improper installation to process changes. Therefore, I would advocate a good, detailed root cause investigation and, if required, forensic analysis of the failed components, which will reveal how to prevent such failures again. Don't limit the analysis to what is obvious; for example, was the component in good condition when it was taken out of storage? I've encountered fake components and inadequate preservation issues.

    ---------------------------------
    Tee Yeow Hum
    Reliability Manager
    Shell
    Singapore
    ---------------------------------





  • 11.  RE: Component failures after a shutdown period

    Posted 03-07-2021 09:40 PM
    Edited by Karl Burnett 03-07-2021 09:46 PM
    Thaddeus,

    You really might have seen things done the best, Navy-style. If you ever went through a shipyard period, you may remember system checkout, hot ops, and then the annoyance of fast cruise. These are essentially what you've recommended already: start up a little earlier and work the bugs out.

    One way to get better is to go through the evolution more often. As painful as it is, I've seen this done with SSBNs, and also with a 40-year-old polymer extrusion facility that shifted from 24/7 operations to 5 days a week. On the 24/7 schedule, the unit would start up once a year or less. On the 5-day-a-week schedule, it started up every Monday. We all (ops, maintenance, MRO, production scheduler) had to learn lots of new lessons.

    Scientifically, it's mostly thermal stresses and some chemistry changes, which Torbjorn and Dale mentioned. Randy's comment on condensation is very important. For bearings for instance, condensation can form in the tiny gap where the stationary rolling element meets the race. This can make a tiny rust spot. And, it may just sit there, causing a very subtle scar called false brinelling. I've seen new leaks spring in compressed air and steam systems, right after we started up after fixing the old leaks. This has happened to me in compressed air and service water also, which can only be explained by pressure cycle stress, not thermal stress.

    In any water fluid system, if flow ceases and an air/water interface forms anywhere, the corrosion rate at the waterline accelerates rapidly there. Then when you restart, the corrosion slug may break off and can move to wherever it will annoy you the most, like a pump seal.

    Preventive measures:
    a) shut down intelligently. Shut down slowly, drain it, blow it down, clean it out, and dry it out. I had a system where the operators would shut down by hitting the emergency stop. Unknown to anybody, this changed the mode of an important control valve. It took us several startup failures to figure that out... and the production engineer figured it out, not the maintenance group. Also, do leak surveys and IR surveys before shutdown.
    b) shutdown PMs. The direct ship example I can think of is turning on the winding heaters on a generator, or other systems operations during duty days that just seemed like make-work at the time, like shifting service from port to starboard periodically. In manufacturing, my group has a policy to jog equipment on a weekly basis during short shutdowns using the control system. For longer shutdowns, we shut down the VFDs and do it manually.
    c) as others have said, good startup procedures.

    ------------------------------
    Karl Burnett
    Solvay, Inc
    Anderson SC
    ------------------------------



  • 12.  RE: Component failures after a shutdown period

    Posted 03-08-2021 06:10 AM
    Thank you all for the responses so far.
    I did get to see how things were done "Navy-style" while serving on an SSN out of Washington state. As Karl started to mention, there were a lot of maintenance tasks and procedures that often felt like overkill when you first report to the ship and are tasked with timing the drips from a seal or performing the almost mind-numbing exercise of small valve maintenance. Over time, with diligence, these simple tasks minimize or prevent much larger failures and contribute to the overall operational readiness of the boat or plant, depending on your perspective. Having served in the engine room, we were always told that it was our job to push the boat as fast as we can for as long as we can, and I have carried that same viewpoint to manufacturing. From a reliability perspective, it is our responsibility to identify and hopefully eliminate the small nuisances that contribute to machine failures and keep us from running as long or as fast as we can. My big takeaway from this discussion so far is that a majority of failures stem from small, unseen lapses in maintenance or foresight, such as condensation in bearing races, where procedures have either been missed or never been identified. Reading these replies triggered one thought: during these extended shutdowns, we often shut down the air handling systems that reduce humidity in the plant. Combine the increased humidity inside the facility with equipment that is cooling down or has already cooled down, and you set up the conditions for that humidity to enter critical points of the machine.
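
    The humidity concern can be checked with a dew point estimate: condensation forms on any surface at or below the dew point of the surrounding air. The following is a minimal sketch using the Magnus approximation with commonly used coefficients. The function names and the temperatures and humidity levels are hypothetical, not measurements from the plant.

```python
from math import log

# Commonly used Magnus coefficients for saturation over water (deg C).
MAGNUS_A = 17.62
MAGNUS_B = 243.12


def dew_point_c(air_temp_c: float, relative_humidity_pct: float) -> float:
    """Approximate dew point (deg C) from air temperature and relative humidity."""
    if not 0.0 < relative_humidity_pct <= 100.0:
        raise ValueError("relative_humidity_pct must be in (0, 100]")
    gamma = (log(relative_humidity_pct / 100.0)
             + MAGNUS_A * air_temp_c / (MAGNUS_B + air_temp_c))
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


def condensation_expected(surface_temp_c: float,
                          air_temp_c: float,
                          relative_humidity_pct: float) -> bool:
    """True if a surface is at or below the dew point of the surrounding air."""
    return surface_temp_c <= dew_point_c(air_temp_c, relative_humidity_pct)


# Hypothetical case: a bearing housing cooled to 15 C in 25 C plant air,
# with air handlers off (60% RH) versus running (40% RH).
print(condensation_expected(15.0, 25.0, 60.0))
print(condensation_expected(15.0, 25.0, 40.0))
```

    Logging air temperature, humidity, and a few machine surface temperatures over one shutdown would show whether the surfaces actually drop below the dew point, which could help make the case for keeping the air handlers on.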

    I have also mentioned how there is no start up procedure currently, going straight from a cold shutdown to full production. I do think that focusing on establishing a startup procedure where either maintenance or even production starts up the machine into a slow roll before just turning it over to rated speed would help significantly with startup issues. This would not eliminate the condensation but from an implementation aspect, I feel like it is something I can push that will result in an almost immediate outcome without having to argue the financials of cost savings associated with not running the air handlers for a week vs the potential failures that doing so may be causing.
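
    A startup procedure like this can be written down as a simple stepped schedule. The following is a minimal sketch, assuming equal hold times at each speed step. The function name, the step fractions, and the 15-minute holds are placeholders to be replaced with values suited to each machine, and any minimum-speed limit for the equipment still has to be respected.

```python
from typing import List, Tuple


def stepped_ramp(rated_speed: float,
                 step_fractions: List[float],
                 hold_minutes: float) -> List[Tuple[float, float]]:
    """Return (start_minute, speed_setpoint) pairs for a stepped startup,
    holding each setpoint for hold_minutes before moving to the next."""
    if any(not 0.0 < f <= 1.0 for f in step_fractions):
        raise ValueError("each step fraction must be in (0, 1]")
    if list(step_fractions) != sorted(step_fractions):
        raise ValueError("step fractions must be in increasing order")
    return [(i * hold_minutes, rated_speed * f)
            for i, f in enumerate(step_fractions)]


# Hypothetical conveyor: unloaded startup in four steps with 15-minute holds.
schedule = stepped_ramp(100.0, [0.25, 0.5, 0.75, 1.0], 15.0)
for start_minute, speed in schedule:
    print(f"t = {start_minute:>4.0f} min: run at {speed:.0f}% of rated speed")
```

    Each hold gives maintenance a natural point to check temperatures, listen for noise, and look at the lubrication before handing the machine over to production.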

    As for inherent defects that go undetected before the shutdown period begins, that poses a pretty significant challenge. While I am working to improve the methods for detecting abnormal conditions by implementing predictive programs such as thermography, vibration analysis, and ultrasonic testing, it is improbable that any of these will find 100% of defects, leaving the undetected anomalies to pose an issue for machine reliability. With all of these programs in the initial stages of implementation, I am looking for action items that can be implemented now for little to no cost and will start providing a return on reliability. Prevention is often the cheapest and most impactful, which is why I think focusing on the procedures associated with shutdowns and startups may be a good place to start. Moving forward, I feel the actions we are taking in our preventive maintenance programs and periodic maintenance schedules would be a good step, in tandem with the procedures, toward reducing these types of failures. The predictive programs can then feed into those efforts and help pinpoint key failures as I get them up and running.

    As always, thank you for your responses. I look forward to reading them as they come in and having the opportunity to get additional perspectives.

    ------------------------------
    Thaddeus Lightner
    Bridgestone Americas Tire Organization
    Trenton
    ------------------------------