
Delivering Real-Time Manufacturing Predictive Maintenance


Machines in motion, much like data in motion, require an engine to drive their journey. While Confluent and Apache Kafka® are your data-based engine, transitioning the perspective from software and data to hardware requires a more literal engine. No matter the machine—planes, trains, or automobiles—the precision engineering on the motor is key to performance. Quality issues in engine manufacturing quickly put an end to any journey.

End customers bring the fuel, and the engine is brought to you by the manufacturer. The engine is a complex component engineered, machined, and tested to exacting standards. Even the smallest out-of-spec imperfection shows up, whether as reduced power, increased vibration, or total failure.

Confluent is used within the engine manufacturing industry to provide real-time visibility while this critical component is machined, assembled, and tested. Bringing the power of IT into the traditional manufacturing world of operational technology (OT) increases product quality, reduces waste, and limits the risk of a customer left hanging. 


Building an engine is a complex process

Precision manufacturing doesn’t just happen. Every aspect of the process builds on a deliberate design. Materials are selected for strength, weight, cost, and any number of other critical attributes. Machining transforms raw materials into the shapes and sizes needed to produce the right physical behavior. Assembly binds these parts together. And finally, each engine is started for the all-important integration test.

Each step along the way is managed by a manufacturing execution system (MES). While these systems are programmed to perform their exacting tasks repeatably, the physical world is nothing if not an entropy funnel. That’s why the best-performing manufacturers capture data throughout each process to catch and respond to quality issues in real time.

Consider a few specific steps in the process:

  • Aluminum for the cylinders and pistons is machined to a specific mating diameter

  • Packing grease is applied to prevent rust

  • Gaskets are installed to ensure good compression

Each of these steps produces two critical data points:

  • Manufacturing output is nominal (no errors detected by the machining or assembly tooling itself—such as power disruption)

  • Component testing is within tolerance

During the manufacture of a single engine there can easily be several dozen steps with corresponding opportunities for unit testing.
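As a rough illustration, the two data points produced by each step could be carried in a single record like the one below. This is only a sketch: the class name, field names, and sample values are hypothetical and not taken from any MES or Confluent schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepResult:
    """One manufacturing step for one engine (all field names are hypothetical)."""
    serial_number: str      # engine being built
    component_id: str       # e.g. "piston"
    tooling_nominal: bool   # tooling reported no errors (e.g. no power disruption)
    measured_mm: float      # measurement taken by the component test
    target_mm: float        # design value for this dimension
    tolerance_mm: float     # allowed deviation from the target

    def within_tolerance(self) -> bool:
        return abs(self.measured_mm - self.target_mm) <= self.tolerance_mm

    def passed(self) -> bool:
        # A step is good only if both data points are good.
        return self.tooling_nominal and self.within_tolerance()


# Made-up example values for a single piston machining step.
result = StepResult("ENG-0001", "piston", True, 86.012, 86.000, 0.020)
print(result.passed())
```

With several dozen steps per engine, a stream of such records gives one pass/fail signal per step rather than a single verdict at the final integration test.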

Manufacturing data use cases

Consider the two primary use cases for our manufacturing data: engine quality and tooling quality. We get insight into these from the same datasets.

  • Engine quality is determined by the instantaneous results of unit tests

  • Tooling quality is determined by the trend of test results

The ability to continuously capture and quickly act on this data is crucial to modern manufacturers and is a key driver behind data infrastructure transformation.
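The difference between the two use cases can be sketched in a few lines. The example below assumes a simple drift measure (newest readings versus oldest readings); the function names, window size, and sample diameters are illustrative assumptions, not a method described by the manufacturers discussed here.

```python
from statistics import mean


def engine_ok(measured_mm: float, target_mm: float, tolerance_mm: float) -> bool:
    """Engine quality: judge one measurement the moment it arrives."""
    return abs(measured_mm - target_mm) <= tolerance_mm


def tooling_drift(measurements: list[float], window: int = 5) -> float:
    """Tooling quality: mean of the newest readings minus mean of the oldest.

    A value far from zero suggests the tool is drifting, even if every
    individual part is still within tolerance.
    """
    if len(measurements) < 2 * window:
        raise ValueError("need at least 2 * window measurements")
    return mean(measurements[-window:]) - mean(measurements[:window])


# Hypothetical piston diameters (mm) from one tool, in production order.
readings = [86.001, 86.000, 86.002, 86.001, 86.003,
            86.006, 86.008, 86.009, 86.011, 86.012]

every_part_ok = all(engine_ok(r, 86.000, 0.020) for r in readings)
drift = tooling_drift(readings)
print(every_part_ok, round(drift, 4))
```

In this made-up data every part passes its own test, yet the readings creep upward, which is the kind of signal only the trend reveals.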

The cost of data timeliness

Manufacturing and distribution are resource-intensive processes, spanning the supply chain that sources raw materials, the capital equipment used for production, and finally distribution to end customers. Each step costs time, labor, and capital and must be precisely managed.

Errors in engine or tooling quality quickly disrupt any smooth-running system. Two common impacts are:

  • Engine quality issues identified by the customer, which may result in damage to the brand and ship/remanufacture/reinstallation costs.

  • Engine quality issues identified by the final integration test, which may result in scrapping multiple engine components and manually rebuilding the engine to meet specifications.

That’s on a per-engine basis. Now expand the impact if the issue wasn’t limited to a single engine, but rather every unit manufactured by a misconfigured tool. An hour of scrapped engines will force hundreds of units through this ad hoc recovery process. A downtime study estimated that unplanned downtime costs Fortune Global 500 companies 11% of their yearly turnover, almost $1.5 trillion in losses a year.

On the flip side, it should not be hard to imagine how catching issues with a specific component in real time can dramatically reduce the cost associated with scrap, rework, and brand damage.

The need for predictive maintenance in engine manufacturing

Tooling quality issues—if left unchecked—lead to cascading costs. The industry is filled with horror stories of companies that never fully recovered from these incidents. We touched on costs associated with identifying and correcting quality issues. And yes, detecting these in real time limits the impact. Modern manufacturing depends on doing even better than delivering quality components. 

Unplanned downtime must be minimized in order to get the most value from capital equipment. Reacting to parts that are out of spec is not enough. The insights available from testing and test result trends can be used to evolve from a reactive mindset to one that fully embraces the idea of predictive maintenance.

Predictive maintenance leverages streaming data to identify issues before they occur, enabling proactive remediation. This limits scrap, which saves cost. Maintenance windows can be streamlined and timed to coincide with other complementary activities, which increases efficiency. Finally, industry data shows that early maintenance on expensive manufacturing tooling significantly extends equipment lifespan.
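One simple way to picture "identifying issues before they occur" is to extrapolate a measurement trend and estimate how many more units a tool can produce before it leaves tolerance. The sketch below assumes a straight-line trend fitted by ordinary least squares; real deployments would likely use richer models, and the function name and sample values are hypothetical.

```python
def units_until_limit(measurements, upper_limit_mm):
    """Fit a straight line to the readings and estimate how many more units
    can be produced before the trend reaches upper_limit_mm.

    Returns None when the trend is flat or moving away from the limit.
    """
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two measurements")
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(measurements) / n
    # Ordinary least-squares slope and intercept.
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, measurements))
             / sum((x - x_mean) ** 2 for x in xs))
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    crossing = (upper_limit_mm - intercept) / slope  # unit index where the line hits the limit
    return max(0, int(crossing - (n - 1)))


# Hypothetical diameters (mm) growing by about 0.002 mm per unit.
remaining = units_until_limit([86.000, 86.002, 86.004, 86.006, 86.008], 86.021)
print(remaining)
```

An estimate like this is what lets a maintenance window be scheduled ahead of time instead of being triggered by the first scrapped part.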

Confluent Cloud delivers for predictive maintenance

Apache Kafka is purpose-built to support large-scale durable data streaming use cases. It’s in wide use across the manufacturing space as well as other major verticals. 

Confluent Cloud is a fully managed data streaming platform whose design guarantees align well with this use case. Confluent is real time, high volume, and durable. Streaming test results with Confluent delivers insights across three valuable dimensions:

  • Instant reaction to engine quality

  • Real-time insight into tooling quality

  • Visibility into engine tooling that indicates maintenance will be required

The same data stream flowing through Confluent Cloud is typically consumed by multiple different clients, each working to deliver insights across different dimensions. These fall into two common categories:

  • Instant reactions, such as when the unit test fails for a specific component

  • Training machine learning (ML) models that are used for scoring trend-based rules
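The idea of several clients reading the same stream independently can be modeled with a toy in-memory log. This is not a Kafka client and does not use any Confluent API; `TopicLog`, the group names, and the record fields are all hypothetical stand-ins used only to show that each consumer group keeps its own position in the same data.

```python
class TopicLog:
    """Toy stand-in for a topic: an append-only list with per-group offsets."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # group name -> index of the next record to read

    def produce(self, record):
        self._records.append(record)

    def poll(self, group):
        """Return records this group has not yet seen and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:]
        self._offsets[group] = len(self._records)
        return batch


topic = TopicLog()
topic.produce({"component_id": "piston", "size_mm": 86.001, "passed": True})
topic.produce({"component_id": "piston", "size_mm": 86.031, "passed": False})

# Client 1: instant reaction to failed unit tests.
alerts = [r for r in topic.poll("alerting") if not r["passed"]]

# Client 2: collects every measurement as training data.
training_rows = [r["size_mm"] for r in topic.poll("ml-training")]

print(len(alerts), training_rows)
```

Because each group tracks its own offset, the alerting client and the training client both see every record without interfering with each other.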

Implementing Confluent for predictive maintenance

Powering quality control with Confluent Cloud is a relatively straightforward pattern. All OT data emitted as part of the manufacturing process lands in Confluent Cloud in real time. Since data volume and size aren’t a limiting factor, we recommend capturing granular data from historians (where records and trends are frequently aggregated) or directly from MQTT resources. 

Confluent is fault tolerant, ensuring data integrity and system reliability, and highly scalable, handling data from millions of sensors and machines simultaneously. Stream processing can be done using Kafka Streams, ksqlDB, or Flink, for example to analyze engine parameters in real time.

Each piece of ingested data takes two paths: training and scoring. Both of these are efficient and occur in real time.

All data consumed is made available to a machine learning model in real time. We’ve seen that some companies utilize continuous training while others perform training on a more periodic basis. The data structure used to power this training is typically controlled by a sliding window that contains 90 days of data or more. The data itself is only ever processed once—on ingest—so deploying a new model doesn’t require reprocessing a large data footprint. Rather, the entire model continuously evolves based on the most recent data.
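The "processed once, on ingest" property of a sliding window can be sketched with a running total that is adjusted as readings enter and expire. The class below is a minimal illustration under two assumptions: readings arrive in timestamp order, and the "model" is just a mean. The class name and sample values are hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingMean:
    """Mean of the values seen within the trailing window (default 90 days)."""

    def __init__(self, window=timedelta(days=90)):
        self.window = window
        self._items = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp, value):
        """Ingest one reading; assumes timestamps arrive in order."""
        self._items.append((timestamp, value))
        self._total += value
        cutoff = timestamp - self.window
        # Evict readings that have aged out of the window.
        while self._items[0][0] <= cutoff:
            _, old = self._items.popleft()
            self._total -= old

    @property
    def mean(self):
        return self._total / len(self._items)


model = RollingMean()
start = datetime(2024, 1, 1)
model.add(start, 10.0)
model.add(start + timedelta(days=30), 20.0)
model.add(start + timedelta(days=100), 30.0)  # the first reading is now older than 90 days
print(model.mean)
```

Each reading is touched once when it arrives and once when it expires, so the model stays current without re-reading the full 90 days of history.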

Stream processing solutions, which can be built with ksqlDB or Apache Flink, enable this continuous training and instant scoring capability. These workflow engines natively integrate with Kafka topics, respect the data contract, and provide higher-order functions that greatly simplify the event loop. Consider the following pattern:

-- Continuous learning: raw test data feeds a rolling 90-day average
-- per component. The result is assumed to be materialized as SCORE_MODEL.
SELECT component_id,
       AVG(size_mm) AS avg_size
FROM TEST_DATA
WINDOW HOPPING (SIZE 90 DAYS, ADVANCE BY 1 HOUR)
GROUP BY component_id

-- Real-time scoring based on the model: pass if within 1%
-- of the average trend
SELECT TEST_DATA.serial_number,
       TEST_DATA.component_id,
       TEST_DATA.size_mm,
       CASE
         WHEN ABS(TEST_DATA.size_mm - SCORE_MODEL.avg_size) < SCORE_MODEL.avg_size * 0.01 THEN 'Pass'
         ELSE 'Fail'
       END AS score_boolean
FROM TEST_DATA
JOIN SCORE_MODEL ON TEST_DATA.component_id = SCORE_MODEL.component_id

High-volume scoring is completed at line speed against the latest published version of the ML model. Based on scoring and the defined quality spec, operator personnel at the plant are able to quickly react to any system-generated alerts.
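For readers who prefer to see the scoring rule outside of SQL, the same "within 1% of the average" check can be expressed as a small function. This is a sketch only: the model is assumed to be a plain lookup from component to average size, and the function name and sample readings are hypothetical.

```python
def score(reading, model, tolerance=0.01):
    """Return 'Pass' when size_mm is within `tolerance` (1% by default) of the model average."""
    avg_size = model[reading["component_id"]]
    deviation = abs(reading["size_mm"] - avg_size)
    return "Pass" if deviation < avg_size * tolerance else "Fail"


# Assumed model: latest published average size (mm) per component.
score_model = {"piston": 86.0}

readings = [
    {"serial_number": "ENG-0001", "component_id": "piston", "size_mm": 86.2},
    {"serial_number": "ENG-0002", "component_id": "piston", "size_mm": 87.5},
]

scores = [score(r, score_model) for r in readings]
print(scores)
```

A "Fail" result here is the kind of event that would drive an operator alert, while the model itself keeps updating from the incoming readings.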

Confluent offers 120+ pre-built connectors to connect with many data sources and downstream systems including analytics tools and storage solutions, making it easy for teams to share and leverage data products across the organization for additional use cases (e.g., financial reporting, product development).

Conclusion

We’ve shown how manufacturing is expected to deliver at peak performance, and results outside of that window add costs that can be fatal to some companies. Real-time visibility into manufacturing at the component and end-unit level reduces the cost associated with rework and scrap. The same test data that reveals component quality can also be used to train machine learning models and indicate when planned maintenance should be scheduled.

Together, these allow high-performing manufacturers to nimbly move from a traditional reactive operating model to one that’s truly proactive and efficient.

Keith Resar is a Field Architect at Confluent. A sales engineer at heart, he brings a business background for translating business needs and product alignment into viable customer solutions, as well as deep experience selling into infrastructure (Cloud IaaS/PaaS), managed hosting, IT outsourcing, application management, and consulting solutions.
