How to Evaluate Condition Monitoring Solutions as a Plant Manager in Chemical Manufacturing
Evaluating condition monitoring technology in a chemical plant is not the same exercise as in a packaging facility, a food processing plant, or a logistics hub. The physical environment, the regulatory obligations, and the financial stakes create a distinct set of procurement requirements that most vendor pitches are not designed to address. A plant manager who applies a generic evaluation framework to a chemical manufacturing context will end up with a system that passes procurement review and fails operational requirements.
This guide is written for plant managers in chemical manufacturing who are conducting a serious evaluation: those who need to understand what the correct requirements look like, how to structure a pilot, and where the real financial value sits. It covers the three criteria specific to chemical environments that vendors rarely surface proactively, the role of Turnaround (TAR) optimization as the highest-value use case, and how to involve your reliability engineering team in ways that prevent post-deployment regret.
If you are in early-stage research comparing categories rather than vendors, start with Daily Challenges. This guide assumes you have already decided that continuous monitoring is the right direction and are now evaluating how to buy it well.
- What most plant managers get wrong when evaluating condition monitoring in chemical
- The three chemical-specific criteria vendors rarely surface
- A full evaluation framework with scoring guidance
- How to design a pilot for a chemical environment
- Workforce considerations: who needs to be in the room
- How Tractian is built for chemical plant environments
- Frequently asked questions
What Most Plant Managers Get Wrong When Evaluating Condition Monitoring in Chemical
The procurement error that costs more than the system itself
The most common evaluation mistake in chemical manufacturing is not selecting the wrong vendor. It is writing the wrong requirements document. Plant managers who copy evaluation criteria from a general industrial template or from a peer in a different sector miss the three constraints that actually determine whether a system can be installed, whether it will stay compliant, and whether it will survive the operational environment.
The result is a system that passes the RFQ, arrives on site, and then requires either a variance request to install in the process area, a costly hardware swap to meet HAZLOC requirements, or a parallel manual inspection program to satisfy the PSM audit that the platform cannot document adequately.
Get the requirements right before the vendor conversations start. Everything else follows from that.
The second common error is scoping the evaluation around alert quality when the highest-value use case in a continuous chemical plant is TAR scope optimization. Alert response is valuable, but the ability to bring condition data into a turnaround planning cycle and retire components based on actual degradation rather than calendar age has a direct, quantifiable impact on TAR capital expenditure. That impact is larger than the cumulative value of alert responses over the same period in most continuous process facilities.
The third error is running the evaluation without the reliability engineer. The plant manager owns the business case. The reliability engineer owns the technical requirements. Separating those conversations leads to a platform that the plant manager approves and the reliability engineer cannot use.
The Three Evaluation Criteria Specific to Chemical Plants
1. HAZLOC Certification
Process areas in chemical plants are frequently classified as hazardous locations: Class I, Division 1 or Zone 1 areas where flammable gases or vapors may be present during normal operations. Any sensor or wireless device installed in a classified area must carry the appropriate certification: explosion-proof (Ex d) for devices that can contain an internal explosion, or intrinsically safe (Ex ia) for devices that cannot generate sufficient energy to ignite the surrounding atmosphere.
The relevant certifications come from UL and CSA for North American installations, while ATEX (the EU directive) and IECEx (the international certification scheme) apply elsewhere. Some vendors offer hardware certified to one standard but not another. Some offer HAZLOC-certified hardware only for specific product lines or sensor types.
Ask every vendor for their HAZLOC certification documentation before any other product conversation. If a vendor cannot immediately produce the UL, CSA, ATEX, or IECEx certificate for the exact sensor model you intend to install in a classified area, that vendor cannot be deployed in your process areas regardless of how compelling the software platform is. This is a safety and compliance requirement, not a preference.
Document the HAZLOC classification of every intended installation location in your pilot design before you finalize hardware orders. Retrofitting the hardware selection after site classification review adds weeks and cost to every deployment phase.
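One way to keep the location list and the hardware list aligned is a simple cross-check run before orders are placed. The following is a minimal sketch, not a compliance tool: the asset tags are hypothetical, the mapping covers only the IEC zone system with the two protection concepts named above, and it ignores gas group and temperature class, which your process safety team still needs to confirm.

```python
from dataclasses import dataclass
from typing import Optional

# Simplified, illustrative mapping from protection concept to the IEC zones it
# is generally accepted in. Gas group and temperature class are ignored here.
ALLOWED_ZONES = {
    "Ex ia": {"Zone 0", "Zone 1", "Zone 2"},
    "Ex d": {"Zone 1", "Zone 2"},
}


@dataclass(frozen=True)
class InstallPoint:
    asset_tag: str                 # hypothetical plant tag
    area_class: str                # e.g. "Zone 1" or "Unclassified"
    sensor_rating: Optional[str]   # None means no HAZLOC certificate on file


def is_deployable(point: InstallPoint) -> bool:
    """Return True if the sensor rating is acceptable for the area classification."""
    if point.area_class == "Unclassified":
        return True  # no HAZLOC certification needed outside classified areas
    if point.sensor_rating is None:
        return False
    return point.area_class in ALLOWED_ZONES.get(point.sensor_rating, set())


# Hypothetical pilot installation list
pilot_points = [
    InstallPoint("K-201 charge gas compressor", "Zone 1", "Ex d"),
    InstallPoint("TK-400 tank vent point", "Zone 0", "Ex d"),
    InstallPoint("P-110 boiler feedwater pump", "Unclassified", None),
]
blocked = [p.asset_tag for p in pilot_points if not is_deployable(p)]
print(blocked)
```

In this hypothetical list only the Zone 0 point is blocked, because an Ex d device is not accepted there. A real check would also compare the certificate against the specific sensor model being ordered.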
2. PSM Mechanical Integrity Documentation
OSHA's Process Safety Management standard, 29 CFR 1910.119(j), requires chemical facilities that handle highly hazardous chemicals above threshold quantities to maintain documented mechanical integrity programs. The regulation specifies written procedures, inspection and testing, correction of deficiencies, and quality assurance. Continuous condition monitoring creates the inspection record trail that supports PSM compliance, but only if the platform can export that data in a format that satisfies the audit.
Evaluate the following during vendor demos:
- Can the platform export asset health trend data with timestamps and alert history in a structured format (CSV, PDF report, or API)?
- Does the export include the data fields typically required for PSM mechanical integrity records: asset ID, inspection date, parameter measured, measurement value, alert threshold, and action taken?
- Is there an audit trail for alert acknowledgments and corrective actions that could be presented to an OSHA inspector?
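During a demo, these questions can be turned into a quick check against a real export file. The sketch below assumes the vendor can produce a CSV; the column names are hypothetical stand-ins for the fields listed above, so map them to whatever the vendor's export actually uses.

```python
import csv
import io

# Hypothetical column names for the PSM record fields named in the checklist
REQUIRED_FIELDS = [
    "asset_id", "inspection_date", "parameter",
    "value", "alert_threshold", "action_taken",
]


def audit_export(csv_text):
    """Return (missing_columns, incomplete_row_line_numbers) for a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing_columns = [f for f in REQUIRED_FIELDS if f not in header]
    incomplete_rows = []
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        # A row is incomplete if any required field is absent or blank
        if any(not (row.get(f) or "").strip() for f in REQUIRED_FIELDS):
            incomplete_rows.append(line_no)
    return missing_columns, incomplete_rows


# Hypothetical two-row export; the second row has no recorded action
sample = (
    "asset_id,inspection_date,parameter,value,alert_threshold,action_taken\n"
    "P-101A,2024-03-01T08:00:00Z,velocity_rms_mm_s,4.2,7.1,none required\n"
    "P-101A,2024-03-02T08:00:00Z,velocity_rms_mm_s,7.6,7.1,\n"
)
print(audit_export(sample))
```

A row that exceeds its threshold with a blank action field, like the second row here, is the kind of gap an auditor is likely to question.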
Most plant managers underestimate this value proposition at procurement time because the immediate benefit is visible (earlier fault detection) and the compliance value is diffuse. Post-deployment, the reliability engineers who have lived through a PSM audit understand it clearly. Vendors who serve the chemical industry seriously will be able to describe how their platform supports this requirement. Vendors who cannot are not calibrated to the sector.
3. API Standards Alignment for Rotating Equipment
Chemical plants frequently operate vibration monitoring and machinery protection programs built on American Petroleum Institute standards: API 670 for machinery protection systems, API 618 for reciprocating compressors, and API 672 for packaged, integrally geared centrifugal air compressors. These standards define alarm setpoints, sensor placement, and protection system architecture for critical rotating equipment.
New condition monitoring platforms need to coexist coherently with existing wired protection systems on the most critical machines. Evaluate:
- Does the vendor understand the distinction between a continuous monitoring system (what you are evaluating) and an API 670 machinery protection system (likely already installed on your most critical compressors)?
- Is the platform designed to complement wired protection by covering the broader asset population below the critical tier, rather than to replace protection systems that exist for safety shutdown purposes?
- For machines covered by API 618 or API 672, does the platform support the specific failure modes relevant to those machine types (valve wear, rod drop, inter-stage pressure differentials for reciprocating; surge, bearing wear, and thrust for centrifugal)?
The failure mode specificity question is not academic. A platform that generates a generic "high vibration" alert on a reciprocating compressor is not useful. A platform that identifies which valve is failing and on which cylinder is useful. That specificity difference determines whether your reliability engineer can act without contracting a specialist, which is the only operational model that scales across a multi-unit chemical complex.
The Evaluation Framework
Use the following framework to score vendors against the requirements that matter in a chemical manufacturing context. This is not a comprehensive RFQ template; it is a prioritization tool designed to surface the gaps that generic evaluations miss.
Tier 1: Safety and Compliance Requirements (Mandatory Pass)
These are binary. A vendor either meets them or cannot be deployed in your environment.
| Requirement | Evaluation Question | Pass Criteria |
|---|---|---|
| HAZLOC certification | Provide UL/CSA or ATEX/IECEx certificate for hardware intended for classified areas | Certificate on file, specific to the sensor model being deployed |
| PSM documentation export | Demonstrate how the platform exports inspection records for PSM compliance | Structured export with asset ID, date, measurement, threshold, and action fields |
| Intrinsic safety or explosion-proof rating | Confirm Ex ia or Ex d rating appropriate for the area classification | Rating matches the area classification of the intended installation location |
Tier 2: Operational Requirements (High Weight)
These determine whether the platform will perform reliably in continuous chemical operations.
| Requirement | Evaluation Question | Scoring Guidance |
|---|---|---|
| Operating state discrimination | Can the platform distinguish normal operation from startup, shutdown, and standby? | Essential: platforms without state discrimination generate false positives during the startup transient period and lose reliability team confidence within 90 days |
| Fault diagnosis specificity | For each machine type in scope, can the platform identify the specific failure mode, not just a high vibration condition? | Essential for non-redundant assets; acceptable for secondary equipment where response time matters more than diagnosis specificity |
| 24/7 continuous monitoring | Is data captured and analyzed continuously, or are routes required for data collection? | Continuous is required for process-critical assets; route-based is acceptable only for non-critical utilities |
| TAR trend data usability | Can the platform generate a 12-18 month health trend for a specific asset that is usable in TAR scope planning? | Ask the vendor to demonstrate this with an existing customer data set |
| Alert actionability without analyst | Can your reliability engineer act on an alert without contracting external vibration specialists? | Test this in the pilot: present three pilot alerts to your reliability engineer without vendor coaching and measure whether they can determine the appropriate work order |
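To make the operating state discrimination requirement concrete, the sketch below shows the general idea of gating alerts by machine state. It is an illustration only, not how any particular vendor implements it: it assumes a fixed-speed machine, and the speed fractions, ramp tolerance, and vibration limit are hypothetical values.

```python
def classify_state(speed_rpm, prev_speed_rpm, rated_rpm, ramp_tol=0.02):
    """Label one sample as 'standby', 'transient', or 'running'."""
    if speed_rpm < 0.05 * rated_rpm:
        return "standby"
    ramping = abs(speed_rpm - prev_speed_rpm) > ramp_tol * rated_rpm
    below_rated = speed_rpm < 0.9 * rated_rpm
    if ramping or below_rated:
        return "transient"  # startup or shutdown in progress
    return "running"


def gated_alerts(samples, rated_rpm, vib_limit):
    """samples: list of (speed_rpm, vibration). Returns indices that should alert."""
    flagged = []
    prev_speed = samples[0][0]
    for i, (speed, vib) in enumerate(samples):
        state = classify_state(speed, prev_speed, rated_rpm)
        # Only steady running samples are compared against the limit
        if state == "running" and vib > vib_limit:
            flagged.append(i)
        prev_speed = speed
    return flagged


# Hypothetical startup sequence: high vibration while ramping, then at speed
startup = [(0, 0.1), (900, 9.0), (1700, 8.5), (1780, 3.0), (1780, 8.0)]
print(gated_alerts(startup, rated_rpm=1780, vib_limit=7.0))
```

Without the state gate, the two high readings during the ramp would also alert. Asking a vendor to show how their platform handles this same startup pattern on one of your machines is a practical demo test.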
Tier 3: Integration and Scale Requirements (Medium Weight)
These determine the total cost of ownership and scalability across your asset base.
| Requirement | Evaluation Question | Scoring Guidance |
|---|---|---|
| CMMS integration | Does the platform integrate with your existing CMMS, or does it require manual work order entry? | API integration is preferable; manual entry is a program sustainability risk at scale |
| Fleet deployment consistency | Does the vendor support consistent hardware and software standards across multiple chemical units or sites? | Important if you are evaluating this as a site-level pilot with intent to scale |
| Data ownership and export | Does your organization own the raw data, and can you export it in full if you change platforms? | Non-negotiable for long-term data integrity and PSM audit defensibility |
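The three tiers can be combined into a single comparison across vendors. The sketch below treats Tier 1 as a pass/fail gate and computes a weighted average of Tier 2 and Tier 3 scores; the 0-5 scale and the 3:2 weighting are assumptions for illustration, so substitute whatever weights your plant manager and reliability engineer agree on.

```python
# Assumed weights: Tier 2 is "high", Tier 3 is "medium"
TIER_WEIGHTS = {2: 3.0, 3: 2.0}


def score_vendor(tier1_pass, tier2_scores, tier3_scores):
    """
    tier1_pass: dict of requirement -> bool (mandatory pass items)
    tier2_scores, tier3_scores: dict of requirement -> score on a 0-5 scale
    Returns None if any Tier 1 item fails, else a weighted average from 0 to 5.
    """
    if not all(tier1_pass.values()):
        return None  # vendor cannot be deployed; no score is computed
    weighted_sum = (
        TIER_WEIGHTS[2] * sum(tier2_scores.values())
        + TIER_WEIGHTS[3] * sum(tier3_scores.values())
    )
    total_weight = (
        TIER_WEIGHTS[2] * len(tier2_scores)
        + TIER_WEIGHTS[3] * len(tier3_scores)
    )
    return weighted_sum / total_weight


# Hypothetical vendor with two Tier 2 items and one Tier 3 item scored
vendor_a = score_vendor(
    {"hazloc_certificate": True, "psm_export": True, "ex_rating": True},
    {"state_discrimination": 4, "fault_specificity": 5},
    {"cmms_integration": 3},
)
print(vendor_a)
```

Returning no score at all for a Tier 1 failure, rather than a low score, keeps a strong software demo from offsetting a missing certificate.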
How to Design a Pilot for a Chemical Environment
A well-designed pilot answers three questions: Does the hardware survive the installation environment? Are the alerts accurate and actionable? Is the data usable for TAR planning?
Asset Selection
Install on the highest-consequence single asset in the pilot unit: the charge gas compressor, the primary boiler feedwater pump, or the main agitator, depending on your process configuration. The pilot asset should be:
- Non-redundant (failure causes a unit shutdown)
- In service during the pilot period (not scheduled for TAR)
- Accessible for sensor installation by a qualified contractor
If the intended installation location is in a classified HAZLOC area, confirm sensor certification against the area classification before ordering hardware. Do not finalize the pilot asset selection until the HAZLOC assessment is complete.
Baseline Documentation
Before deployment, document the current condition of the pilot asset: existing vibration signature if you have one, operating hours since last major maintenance, any known defects or performance deviations. This baseline gives you a reference point for the platform's learning period and prevents ambiguity about whether a detected anomaly predates the installation.
Pilot Milestones
Days 1-30: Hardware installation and baseline learning period. Do not act on alerts during this period unless they represent an emergency condition. The platform is building its operating model, so the false positive rate during baseline learning is not a meaningful evaluation metric.
Days 31-60: First alert review. Evaluate whether the alerts generated during normal operations are accurate and whether your reliability engineer can interpret them without vendor assistance. If the engineer cannot, ask the vendor to explain how the alert communication will be improved before the pilot ends.
Days 61-90: TAR data review. Ask the vendor to export a 60-day health trend for the pilot asset and present it to your reliability engineer as if it were a TAR planning input. Evaluate: is the trend usable? Would it change a TAR scope decision? If yes, this is a demonstrable financial benefit. If no, understand why not before scaling.
Metrics to Capture
- Alert volume per 30-day period
- False positive rate (alerts that generated work orders and found no defect)
- Time from alert to work order
- Reliability engineer time spent per alert (pre- and post-platform comparison if available)
- HAZLOC compliance verification (documented certification match to installation location)
- TAR planning utility score (qualitative: 1-5, from reliability engineer)
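The first three metrics can be computed from a simple alert log kept during the pilot. The sketch below is one possible way to do it; the record layout is hypothetical, and it assumes the false positive rate is measured against alerts that produced a work order, which is one reading of the definition above.

```python
from datetime import datetime
from statistics import median


def pilot_metrics(alerts):
    """alerts: list of dicts with 'raised', 'work_order' (or None), 'defect_found'."""
    with_wo = [a for a in alerts if a["work_order"] is not None]
    false_pos = [a for a in with_wo if a["defect_found"] is False]
    hours = [
        (a["work_order"] - a["raised"]).total_seconds() / 3600
        for a in with_wo
    ]
    return {
        "alert_volume": len(alerts),
        # Share of work-ordered alerts where the inspection found no defect
        "false_positive_rate": len(false_pos) / len(with_wo) if with_wo else None,
        "median_hours_to_work_order": median(hours) if hours else None,
    }


# Hypothetical 30-day log: three alerts led to work orders, one did not
log = [
    {"raised": datetime(2024, 5, 1, 8), "work_order": datetime(2024, 5, 1, 12), "defect_found": True},
    {"raised": datetime(2024, 5, 3, 9), "work_order": datetime(2024, 5, 3, 11), "defect_found": False},
    {"raised": datetime(2024, 5, 10, 10), "work_order": datetime(2024, 5, 10, 16), "defect_found": True},
    {"raised": datetime(2024, 5, 12, 7), "work_order": None, "defect_found": None},
]
print(pilot_metrics(log))
```

Agree on the denominator for the false positive rate with the vendor before the pilot starts, so that the number reported at day 90 is not open to reinterpretation.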
Workforce Considerations: Who Needs to Be in the Room
The Plant Manager and the Reliability Engineer Are Different Buyers
The plant manager owns the capital budget approval, the safety record, the production targets, and the relationship with the site leadership. The reliability engineer owns the technical implementation, the existing vibration program, the API standards knowledge, and the day-to-day interpretation of platform outputs.
These are not interchangeable perspectives. A vendor evaluation that involves only the plant manager will produce a system that is compelling on the financial case and potentially unworkable on the technical implementation. A vendor evaluation that involves only the reliability engineer may over-index on technical sophistication and underweight the operational questions about TAR data usability and PSM documentation.
Both must be involved from the requirements stage, not introduced sequentially. The requirements document that determines HAZLOC certification, PSM export capability, and API standards alignment should be authored jointly.
Involving Operators in Pilot Design
Operators who work with the monitored equipment should understand what the platform does and what they are expected to do when an alert fires. This is not a training question; it is a program design question. If your alert response protocol requires an operator to verify a condition physically before a work order is created, the operator needs to know that before the first alert fires, not after.
In chemical environments with strict access controls and permit-to-work procedures, alert response protocols need to account for the time and administrative load of issuing the permits required to inspect the alerted asset. Build this into the pilot response protocol from day one.
Critical path asset identification and secondary damage prevention: Evaluate whether the platform lets you tag your highest-consequence rotating assets (the non-redundant compressors, agitators, and process pumps whose failure stops the entire process stream) and surface their health status separately from secondary assets. These bottleneck assets on the critical path carry a fundamentally different consequence of failure than assets with backup capacity, and the platform should reflect that distinction. On secondary damage: a bearing failure caught at an early stage costs a planned repair. The same failure left undetected cascades: the bearing destroys the shaft seal, contaminates the housing, and potentially damages secondary process connections. A $500 bearing becomes a $30,000 pump rebuild plus an unplanned shutdown. Evaluate fault detection sensitivity at early severity stages, not just late-stage threshold alarms.
False positive rate as an accountability criterion: In continuous chemical manufacturing, a false positive that triggers an unnecessary investigation on a PSM-covered asset is not just a production cost; it is a compliance burden. Every false positive that goes uninvestigated leaves a gap in the compliance record. Every false positive that triggers an unnecessary shutdown carries a full process restart cost. Ask vendors for their confirmed fault rate on generated alerts, that is, the share of alerts where an inspection confirmed a real defect. A monitoring system generating a high rate of false positives in a chemical process environment creates alarm fatigue, compliance exposure, and team distrust that undermine the entire reliability program. Evaluate alert precision as a first-order selection criterion.
Pencil-whipping prevention, digital accountability, and PSM documentation: Digital condition monitoring creates an inspection record that cannot be pencil-whipped and that supports PSM mechanical integrity documentation requirements at the same time. Every alert is timestamped, asset-specific, failure-mode-specific, and severity-graded. The PSM compliance record and the operational reliability record are the same data output. For plant managers who have managed manual inspection routes where technicians could mark assets as checked without conducting a real inspection, the shift to digital condition monitoring closes both the accountability gap and the documentation gap at once. Evaluate whether the platform generates exportable condition records suitable for OSHA 1910.119(j) mechanical integrity documentation.
Asset lifecycle and CapEx protection: Turnaround scope is the largest capital decision a chemical plant manager makes in any given year. Calendar-based scope assumptions, which replace components at fixed intervals regardless of condition, produce both over-specification (replacing components with remaining life) and under-specification (missing components that have degraded faster than the inspection interval assumed). Evaluate whether the platform provides 12–24 months of condition trend data exportable for turnaround scope planning. Condition-based lifecycle management reduces CapEx spend by deferring replacements with condition evidence, and builds the credibility of CapEx requests when replacement is genuinely needed.
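To show how exported trend data can feed a scope decision, the sketch below fits a straight line to a health indicator and projects when it would reach an action threshold. This is a deliberately simple illustration under stated assumptions: the readings and threshold are hypothetical, and real degradation often accelerates, so a linear projection should be treated as a screening aid rather than a prediction.

```python
def days_to_threshold(days, values, threshold):
    """
    Least-squares linear fit of values against days, then the number of days
    from the last reading until the fitted line reaches the threshold.
    Returns None when the trend is flat or improving.
    """
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Hypothetical vibration velocity readings (mm/s) taken every 30 days
days = [0, 30, 60, 90]
velocity = [2.0, 2.5, 3.0, 3.5]
remaining = days_to_threshold(days, velocity, threshold=7.0)
print(round(remaining))
```

For these hypothetical readings the projection is roughly 210 days. Comparing that figure with the time remaining until the next TAR window is the kind of evidence that supports deferring or pulling forward a component replacement.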
How Tractian Is Built for Chemical Plant Environments
Designed for continuous process environments
Tractian's hardware carries HAZLOC certifications for installation in classified process areas, addressing the safety and compliance requirement that generic industrial platforms cannot meet in chemical environments.
The platform's operating state discrimination capability means that startup transients and process state changes are recognized rather than flagged as anomalies, maintaining alert credibility during the periods when false positives are most likely to erode reliability team confidence.
For TAR planning, Tractian provides exportable health trend data across the full monitoring period, giving reliability engineers the longitudinal view needed to make component-level scope decisions based on actual degradation rather than calendar estimates.
The fault diagnosis layer is designed to surface actionable findings: the specific failure mode, the affected component, and the recommended action, rather than a generic vibration threshold breach. This is the difference between a platform that informs a decision and one that requires a specialist to interpret before a decision can be made.
See Tractian Vibration Analysis
Tractian continuously monitors equipment health in real time, detecting faults early and preventing unplanned downtime.
Frequently Asked Questions
Do all condition monitoring sensors require HAZLOC certification for chemical plants?
Not all areas within a chemical plant are classified hazardous locations. Utility areas, control rooms, warehouses, and maintenance shops are typically unclassified. Sensors installed in these areas do not require HAZLOC certification. The requirement applies specifically to areas classified as Class I Division 1/2 or Zone 0/1/2 where flammable gases, vapors, or liquids may be present. Confirm the area classification with your process safety team before specifying hardware for any installation location.
How does continuous condition monitoring satisfy OSHA PSM mechanical integrity requirements?
OSHA 1910.119(j) requires documented inspection and testing, correction of deficiencies, and quality assurance for mechanical integrity of process equipment. Continuous monitoring platforms support these requirements by providing timestamped records of asset condition, alert history, and corrective actions. The critical question is whether the platform can export this data in a format that satisfies an auditor's requirements for documentation completeness. Evaluate this with a specific export demonstration during the vendor evaluation.
Can condition monitoring platforms coexist with existing API 670 machinery protection systems?
Yes, when designed for it. API 670 protection systems are safety-critical: they provide the online protection that triggers automatic shutdown when a machine exceeds a protection setpoint. Continuous monitoring platforms serve a different function: they identify developing faults before protection setpoints are reached and support maintenance planning. The two systems target different tiers of the asset criticality hierarchy and should be evaluated as complementary, not competing. Confirm that the vendor understands this distinction and has reference customers who operate both.
What is the typical learning period before a condition monitoring platform generates reliable alerts?
Most machine learning-based platforms require 30 to 90 days to establish a reliable operating baseline, depending on the consistency of the operating regime and the stability of process conditions. During this period, alert credibility is lower and false positives are more likely. Design your pilot to include this learning period before conducting any alert quality evaluation. Platforms that claim instant accuracy without a learning period are typically using fixed threshold alarms rather than adaptive baselines, which means they generate higher false positive rates in variable operating environments.
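The difference between a fixed threshold and an adaptive baseline can be illustrated in a few lines. This is a simplified sketch, not a description of any vendor's model: it assumes the baseline is just the mean plus three standard deviations of readings from the learning period, and all values are hypothetical.

```python
from statistics import mean, stdev


def adaptive_limit(baseline_readings, k=3.0):
    """Alert limit learned from this machine's own baseline: mean + k * std dev."""
    return mean(baseline_readings) + k * stdev(baseline_readings)


def flag(readings, limit):
    """Return the indices of readings above the limit."""
    return [i for i, value in enumerate(readings) if value > limit]


# Hypothetical machine that normally runs at about 5 mm/s
learning_period = [5.0, 5.2, 4.8, 5.1, 4.9]
new_readings = [5.1, 5.3, 6.2]

fixed = flag(new_readings, limit=4.5)  # generic fixed threshold
adaptive = flag(new_readings, limit=adaptive_limit(learning_period))
print(fixed, adaptive)
```

Here the fixed threshold flags every reading, including the machine's normal level, while the learned limit flags only the reading that departs from this machine's own baseline. That is why the learning period has to finish before alert quality is judged.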
How should TAR scope optimization be measured as a pilot outcome?
Ask your reliability engineer to evaluate three things at the 90-day pilot milestone: (1) can they produce a health trend for the pilot asset that covers the full monitoring period? (2) does the trend show a degradation pattern that, if observed during a pre-TAR review, would change the scope of work for that asset? (3) could they export this data and present it in a TAR scope meeting as a credible basis for a component-level decision? If the answer to all three is yes, the platform has demonstrated TAR planning utility. That demonstration is more valuable than the number of alerts generated during the pilot.
Should the reliability engineer or the plant manager lead the vendor evaluation?
Both should participate, but with distinct roles. The reliability engineer should own the technical requirements: HAZLOC certification review, API standards alignment, fault diagnosis specificity testing, and data export format validation. The plant manager should own the business case requirements: TAR value quantification, PSM compliance documentation, total cost of ownership, and vendor financial stability. The evaluation scoring should combine both perspectives, weighted by the requirement tier (safety and compliance requirements are not weighted; they are binary).
How does predictive maintenance differ from condition monitoring in a chemical plant context?
Predictive maintenance is a maintenance strategy: replacing or repairing components based on their actual condition rather than on fixed schedules. Condition monitoring is the enabling technology: continuous measurement of asset health parameters that makes predictive maintenance decisions possible. In chemical plants, the two terms are often used interchangeably, but the distinction matters for program design. Condition monitoring gives you the data. A predictive maintenance program defines what you do with it: how you translate health trends into work orders, how you integrate findings into TAR scope, and how you measure the reduction in unplanned downtime over time. Evaluating a condition monitoring platform without designing the predictive maintenance program it will enable is a common source of underperformance post-deployment.
What is the financial exposure of getting the tools evaluation wrong in a continuous chemical plant?
The financial exposure takes three forms. First, a platform that cannot be deployed in classified areas requires hardware replacement after procurement, typically at 40-60% of the original hardware cost. Second, a platform that does not satisfy PSM documentation requirements means a parallel manual inspection program must be maintained, adding labor cost without eliminating the original compliance gap. Third, a platform that is not usable for TAR scope planning means the highest-value financial outcome of continuous monitoring is never realized. The combined exposure across these three failure modes is typically larger than the cost of the platform itself over a three-year period.