In November 2010, a major mechanical failure in a diesel generator[1] caused an engine-room fire on the Carnival Splendor, leaving the ship adrift with no power, refrigeration, or air conditioning[2]. The total cost of the shutdown included towing the ship, flying food to stranded passengers and crew, airfare, cruise ticket refunds, ship repairs, and much more, amounting to losses upwards of $56 million.

The takeaway? Downtime is expensive for both equipment manufacturers and end users, and amounts to losses greater than the cost of the repair itself. For example, if a typical high-end factory runs 24×7 and generates $4.5 billion in annual revenue, a shutdown costs more than half a million dollars per hour, or about $143 per second. The sooner a manufacturer can identify the problem and fix it, the sooner those mounting losses can be stopped.
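The arithmetic behind those figures can be checked with a short sketch. It assumes revenue accrues evenly across a 365-day year of round-the-clock operation; the function name `downtime_cost` is illustrative only.

```python
# Assumption: revenue is earned evenly over every second of a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 3600


def downtime_cost(annual_revenue, downtime_seconds):
    """Revenue lost during a shutdown lasting `downtime_seconds`."""
    return annual_revenue / SECONDS_PER_YEAR * downtime_seconds


annual_revenue = 4.5e9  # the $4.5 billion factory from the text
per_hour = downtime_cost(annual_revenue, 3600)
per_second = downtime_cost(annual_revenue, 1)

print(f"Lost per hour:   ${per_hour:,.0f}")
print(f"Lost per second: ${per_second:,.2f}")
```

Under these assumptions the loss works out to roughly $514,000 per hour and $143 per second. A real plant would adjust for planned stoppages and for the margin on the revenue it loses.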

Thanks to connected machines, manufacturers are beginning to leverage the power of “big data” to learn what causes failures and to prevent breakdowns and the runaway costs that follow.

Reactive Maintenance

There are two fundamental approaches to maintaining equipment: reactive and predictive. Reactive maintenance involves diagnosing and repairing equipment after a breakdown, whereas predictive maintenance focuses on anticipating problems before they occur. To achieve zero downtime, it’s critical to master effective reactive maintenance and also to dive deep into analytics to identify failure patterns proactively.

Deploying a professionally hosted remote monitoring system can help improve response times by issuing immediate alerts – often in the form of texts and emails – when equipment fails. Additionally, organizations that monitor an industrial site via live dashboards, cameras, and machine-performance data logging systems can give technicians more information to assess the cause of failure. This allows technicians to make the most of their visits by arriving with the appropriate tools and replacement parts.

Remote railroad track switches are a great example of efficient reactive maintenance. When cellular modems were first added to track switches, the goal was to inform maintenance crews that a switch was not operating so a crew could be deployed to address it. With additional sensors and cameras on-site, technicians could see what was broken, correctly diagnose the issue, and come prepared to fix it. This eliminated the need for multiple visits, saving significant service costs and minimizing the track’s overall downtime.

Predictive Maintenance

Once a reactive maintenance system is in place to monitor equipment and help diagnose failures when they happen, it’s time to step back and look at the bigger picture:

  • Which specific type of failure is causing the most total downtime?
  • Are there design changes that can help prevent failure modes from occurring?

Predictive maintenance techniques can help answer these questions. They analyze data from remote assets and alert crews when a breakdown is likely to occur, so that a shutdown can be avoided altogether. Here’s how it works:

Predictive maintenance systems monitor variables in a machine, process, or product in order to establish a “normal” state. A pending failure is indicated when these variables start to deviate significantly from their normal values. One of the simplest applications of this concept is monitoring motor current. For example, if a motor in a well-maintained system is observed to draw 5.2 amps and over a period of a few months that rises to 5.6 amps, the motor is working harder than necessary. It may need lubrication, new bearings, or other routine maintenance, or mechanical issues elsewhere may be increasing the load on the motor. Either way, if this goes unnoticed, it will eventually lead to a shutdown – which can be entirely avoided if the remote monitoring system detects the condition and generates an alert ahead of time.
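The motor-current check can be sketched as a comparison between a known-good baseline and the average of recent readings. This is a minimal illustration, not a production rule: the 5% alert threshold, the sample readings, and the name `drift_alert` are assumptions chosen for the example.

```python
from statistics import mean


def drift_alert(baseline_amps, recent_readings, threshold_pct=5.0):
    """Compare the recent average current to the healthy baseline.

    Returns (alert, pct_rise). The 5% default threshold is an assumed
    value; a real system would tune it per motor.
    """
    recent_avg = mean(recent_readings)
    pct_rise = (recent_avg - baseline_amps) / baseline_amps * 100.0
    return pct_rise >= threshold_pct, pct_rise


baseline = 5.2                            # amps, from the well-maintained state
recent = [5.58, 5.61, 5.60, 5.62, 5.59]   # hypothetical readings months later

alert, pct = drift_alert(baseline, recent)
print(f"Current is up {pct:.1f}% over baseline; alert={alert}")
```

Averaging several readings before comparing keeps a single noisy sample from triggering a service call.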

The previous example depends on the ability to establish a normal operating current, but what if there is no single “normal” value for a variable? For example, a variable-speed pump in an outdoor environment pumping a variety of liquids will not have a consistent current draw. It is still possible to establish a normal set of operating parameters, but a broader set of variables (ambient temperature, liquid temperature, viscosity, and pump speed) has to be measured.
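One simple way to handle a variable with no single normal value is to keep a separate baseline for each operating condition. The sketch below is an assumption-laden illustration: it uses only pump speed and liquid type (as a stand-in for viscosity), groups speeds into 500 rpm buckets, and flags readings more than three standard deviations from their bucket's mean. A real system would include the temperature variables as well.

```python
from collections import defaultdict
from statistics import mean, stdev

SPEED_STEP = 500  # rpm bucket width (assumed for illustration)


def condition_key(speed_rpm, liquid):
    """Map a reading to its operating-condition bucket."""
    return (round(speed_rpm / SPEED_STEP) * SPEED_STEP, liquid)


def build_baseline(history):
    """history: iterable of (speed_rpm, liquid, amps) from healthy operation.

    Returns {condition: (mean_amps, stdev_amps)} for every condition
    with at least two samples.
    """
    groups = defaultdict(list)
    for speed, liquid, amps in history:
        groups[condition_key(speed, liquid)].append(amps)
    return {k: (mean(v), stdev(v)) for k, v in groups.items() if len(v) >= 2}


def is_abnormal(baseline, speed_rpm, liquid, amps, k=3.0):
    """True/False if a baseline exists for this condition, else None."""
    key = condition_key(speed_rpm, liquid)
    if key not in baseline:
        return None  # condition never seen during healthy operation
    mu, sigma = baseline[key]
    return abs(amps - mu) > k * max(sigma, 1e-9)


history = [
    (1480, "water", 4.1), (1510, "water", 4.2), (1495, "water", 4.0),
    (3010, "oil", 7.9), (2990, "oil", 8.1), (3005, "oil", 8.0),
]
baseline = build_baseline(history)

# The same 8.1 A reading is normal in one condition and abnormal in another.
print(is_abnormal(baseline, 3000, "oil", 8.1))
print(is_abnormal(baseline, 1500, "water", 8.1))
```

The point of the design is that the alert depends on the operating context, not on the raw current value alone.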

Normal operation in more complex systems is established by running time-series calculations on the data, which automatically flag anomalies. Over time, these anomalies are compared to system failures to establish correlations, which become the leading indicators for that particular failure mode.
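A minimal sketch of both steps follows, assuming a rolling z-score as the time-series calculation and a simple "was there an anomaly shortly before each failure?" count as the correlation measure. The window size, threshold, and sample data are hypothetical, and production systems typically use more robust statistics.

```python
from statistics import mean, stdev


def rolling_anomalies(series, window=5, k=3.0):
    """Indices whose value lies more than k standard deviations from
    the mean of the preceding `window` samples."""
    flagged = []
    for i in range(window, len(series)):
        ref = series[i - window:i]
        mu, sigma = mean(ref), stdev(ref)
        if sigma > 0 and abs(series[i] - mu) > k * sigma:
            flagged.append(i)
    return flagged


def failures_preceded(anomaly_times, failure_times, lead_window):
    """Fraction of failures with at least one anomaly in the
    `lead_window` time units before them."""
    if not failure_times:
        return 0.0
    hits = sum(
        any(0 < f - a <= lead_window for a in anomaly_times)
        for f in failure_times
    )
    return hits / len(failure_times)


readings = [10.0, 10.1, 9.9, 10.0, 10.1, 10.0, 13.0, 10.0, 9.9, 10.1, 10.0, 10.1]
anomalies = rolling_anomalies(readings)
print("Anomalous sample indices:", anomalies)

# Hypothetical event log: anomaly and failure times in hours.
share = failures_preceded([6, 40], [10, 30, 45], lead_window=5)
print(f"Failures preceded by an anomaly: {share:.0%}")
```

A high share for one failure mode suggests the anomaly is a useful leading indicator for it; a low share suggests the wrong variable is being watched.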

In the end, identifying failure patterns increases the mean time between failures (MTBF) and reduces the mean time to repair (MTTR). The only downtime necessary in predictive maintenance is the time it takes to service a part, a window that can be strategically scheduled for when the right technician, tools, and replacement parts are available.
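Both metrics can be computed from a basic downtime log. The sketch below uses the common definitions (MTBF as operating time divided by the number of failures, MTTR as total repair time divided by the number of repairs); the log values are hypothetical.

```python
def mtbf_mttr(total_hours, repair_hours):
    """Compute (MTBF, MTTR) in hours.

    total_hours:  length of the observation period
    repair_hours: downtime of each failure in that period
    """
    failures = len(repair_hours)
    if failures == 0:
        return float("inf"), 0.0  # no failures observed
    downtime = sum(repair_hours)
    mtbf = (total_hours - downtime) / failures
    mttr = downtime / failures
    return mtbf, mttr


# Hypothetical year of operation with four breakdowns.
mtbf, mttr = mtbf_mttr(total_hours=8760, repair_hours=[6, 12, 3, 9])
print(f"MTBF: {mtbf:.1f} h, MTTR: {mttr:.1f} h")
```

Tracking the two numbers before and after a monitoring rollout gives a direct measure of whether failures are becoming rarer and repairs shorter.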

Tips for Implementing Remote Monitoring

The key to an efficient reactive and predictive maintenance system is remote connectivity. Many systems today rely on cellular modems: cell coverage is widespread and continues to grow, and data plans are typically affordable. Many other systems transmit encrypted data across secure internet connections, eliminating the need for a cellular plan altogether. Whether an OEM has the expertise to implement secure connectivity, and whether it can convince its end users to allow it, is a realistic and often sensitive challenge. Hosted solutions remove this burden from the OEM by providing guaranteed secure connectivity and end-to-end reporting solutions.

While most companies recognize the cost savings from reduced service calls alone, some don’t want to be locked into a system that may not fit their long-term connectivity needs. What they may not realize, however, is that the technology supporting remote monitoring is becoming increasingly secure and flexible, and yields more data analytics possibilities than ever before – leading to more strategic insights and dollars saved in the long run than can even be defined today.

To help transition to a remotely connected environment, choose a vendor and architecture that can scale to accommodate your particular needs. Questions to ask when choosing a vendor include:

  • What additional analytics are available beyond what’s shown?
  • Has the vendor implemented services you may need in the future such as Predictive Maintenance, Downtime Analysis, interfaces to inventory or other business systems, genealogy tracing, etc.?
  • Can your solution be customized? Don’t settle for answers like “we have a do-it-yourself toolkit for that” or “download CSV files and build reports from there.”
  • How many assets are currently being monitored by this vendor?

Finally, implement hardware that supports software integration and look for remote connectivity options that can easily interface with the equipment you already have in the field. Minimizing downtime should not require full tear-downs and re-installations.

Remote monitoring isn’t only about telling you the primary cause of equipment failure; it’s also about quantifying downtime and cost-justifying the most appropriate solution. If the total downtime for a failure mode is known, it’s easier to calculate costs in lost revenue and repairs, and make an informed decision as to how quickly a proposed design change will pay for itself.
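As a rough sketch of that calculation, the code below totals downtime per failure mode and then estimates how long a design change takes to pay for itself. Every figure in it (the event log, the $5,000 hourly loss, the repair and design-change costs, and the assumed 50% reduction in failures) is hypothetical.

```python
from collections import defaultdict


def downtime_by_mode(events):
    """events: iterable of (failure_mode, downtime_hours).

    Returns (mode, total_hours) pairs, largest total first.
    """
    totals = defaultdict(float)
    for mode, hours in events:
        totals[mode] += hours
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


def payback_years(annual_downtime_hours, cost_per_hour,
                  annual_repair_cost, change_cost, reduction=1.0):
    """Years until a design change pays for itself.

    reduction: fraction of this failure mode's cost the change is
    expected to eliminate (an estimate, not a measured value).
    """
    annual_loss = annual_downtime_hours * cost_per_hour + annual_repair_cost
    annual_savings = annual_loss * reduction
    if annual_savings <= 0:
        return float("inf")
    return change_cost / annual_savings


# Hypothetical one-year downtime log.
events = [("bearing", 6.0), ("belt", 2.0), ("bearing", 10.0), ("sensor", 1.5)]
ranked = downtime_by_mode(events)
worst_mode, worst_hours = ranked[0]

years = payback_years(worst_hours, cost_per_hour=5000,
                      annual_repair_cost=20000, change_cost=25000,
                      reduction=0.5)
print(f"Worst failure mode: {worst_mode} ({worst_hours} h/year)")
print(f"Design change pays back in {years:.1f} years")
```

Ranking failure modes by total downtime first keeps the cost justification focused on the change with the largest potential return.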

About the Author

Tom Craven is the Vice President of Product Strategy at RRAMAC Connected Systems. Prior to RRAMAC, Craven was the Operator Interface Product Manager at GE Intelligent Platforms. He has deep experience working as a sales and applications engineer and is passionate about exploring new ways to bring cloud-based analytics to equipment manufacturers.