Striving for improved sustainability goals with advanced analytics can deliver multiple benefits for your operation.

In today’s world, virtually no industry operates without considering its impact on the environment. It’s no secret that process manufacturers have been scrutinized for contributing to greenhouse gas emissions and excessive energy consumption, so striving for sustainability can sometimes be seen as a forced hassle. In reality, however, achieving enhanced sustainability at a process manufacturing organization can result in better operational performance and efficiency, saving time and money for the manufacturer. Two birds, one stone.

Defining our Goals: United Nations SDGs

In recent years, the UN created a set of seventeen Sustainable Development Goals (SDGs) to reach by 2030 in an effort to universally protect the planet. Four of these pertain particularly to process manufacturing companies and how they can contribute to the global effort: SDGs 6, 7, 12, and 13, covered below.

These four goals have become a guideline for the industry as a whole when crafting corporate sustainability goals, and it’s evident that they have become a top priority. Beyond the obligations that require companies to invest in sustainable processes, it’s now well recognized that changing operations to better align with these goals yields many other positive impacts.

Sustainability Doesn’t Happen in Spreadsheets

There are many challenges that process manufacturing teams face today in terms of meeting these sustainability goals. The first stems from the lack of tangible and actionable direction provided to team members at the plant, with broad guidelines set at a corporate level. Subject matter experts (SMEs) often do not have the resources within traditional data technology and methods to analyze their process data and make insight-based decisions to push their operation towards KPIs for improved sustainability.

The reality is that spreadsheets don’t provide the tools needed to efficiently contextualize, cleanse, and analyze data. Many teams spend countless hours inside spreadsheets trying to organize data for insight, leaving no time for actually making the connections in the data that can lead to reductions in waste, material use, or spending.

Additionally, this method does not empower process manufacturers to make reliable predictions based on real-time and historical data. If an environmental violation happens, corrective actions can only be taken after the fact, and the opportunity to see what caused the problem can be missed.

Enter: Advanced Analytics

With advanced analytics applications, process manufacturing operations can generate compliance reports automatically, with up-to-date data from disparate sources, freeing up time to focus on environmental impact.

Beyond monitoring data as it’s happening and opening a world of insight during incident investigation, the accessibility, presentation, and correlation of data contribute to effective predictive analytics. This can give teams insight into when unproductive downtime may occur and lead to wasted resources.

Advanced analytics can also provide SMEs with a better understanding of how process changes will affect the environment by reporting on KPIs geared towards specific sustainability measures and creating models to compare process performance and operating conditions to ideal levels.

Beyond this, advanced analytics makes it easy to share insight across an entire team, leaving the days of spreadsheet-sharing in the past. Results are error-free and accessible to the whole organization, so regardless of level, everyone knows the company’s progress towards improved sustainability.

Applying Specific SDG Measures

Here are a few examples of ways that process manufacturers are utilizing advanced analytics to pursue their sustainability goals and lower their impact on the environment, while also improving operational performance.

The term “sustainability” can mean many things, such as monitoring and controlling greenhouse gas emissions, optimizing energy efficiency, implementing alternative energy sources, reducing waste, and so on. These examples highlight the flexibility of Seeq: wherever you have environmental process data and would like to optimize your environmental performance, Seeq can be used.

SDG 6: Clean Water and Sanitation

Operations can avoid over-cleaning in clean-in-place (CIP) processes, where sanitation materials can otherwise be used unnecessarily.

SDG 7: Affordable and Clean Energy

Process manufacturers are currently using advanced analytics to develop energy models and decrease total energy consumption, with minimal required capital expense.

SDG 12: Responsible Consumption and Production

Mass balance equations can be run continuously to track historical changes, providing an opportunity to find points where material is wasted.
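As a rough illustration of what such a continuous mass balance can look like in practice, the sketch below compares hypothetical inlet and outlet flow signals and flags periods where the loss exceeds an assumed tolerance; the data, grid, and threshold are illustrative only.

import numpy as np
import pandas as pd

# Hypothetical inlet and outlet mass flow signals (kg/h) on a one-hour grid
index = pd.date_range("2021-01-01", periods=24 * 7, freq="1h")
rng = np.random.default_rng(0)
inlet = pd.Series(1000 + rng.normal(0, 5, len(index)), index=index, name="inlet_flow")
outlet = inlet - rng.normal(2, 1, len(index))

# Continuous mass balance: instantaneous loss and its running total
loss = inlet - outlet
cumulative_loss = loss.cumsum()

# Flag periods where material loss exceeds an assumed tolerance of 5 kg/h
waste_periods = loss[loss > 5]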

SDG 13: Climate Action

Many organizations are using advanced analytics to increase renewable generation and adopt smart grid technologies that mitigate carbon emissions. Aggregating methane emissions data from various sources with advanced analytics can also identify or predict where methane is leaking, down to detailed micro-levels within the operation.

A Sustainable Future

It’s simple: investing in a sustainability strategy is good for business. Efficient use of raw materials, less waste, and lower energy consumption directly benefit both the environment and your bottom line. In addition, sustainable practices such as these can boost your reputation above the competitors in your industry. See how advanced analytics can work for your operation today.

Advanced data analytics is empowering process manufacturing teams across all verticals.

Enhanced access to operational and equipment data has spurred a transformation in the process manufacturing industry. Engineers can now see both historical and real-time data from their operation as it’s happening, even at remote locations, so entire teams can stay up to speed continuously and reliably. The only problem with this? Many find their team is “DRIP”: data rich, information poor.

With tremendous amounts of data, a lack of proper organization, cleansing, and contextualization brings process engineers to a standstill. Some chemical plants have 20,000 to 70,000 sensor signals, oil refineries can have 100,000, and enterprise-wide sensor counts can reach into the millions.

This volume of data can be overwhelming, but refining it effectively can lead to highly valuable insights. Too much of SMEs’ and process engineers’ time is spent sorting through spreadsheets trying to wrangle the data, rather than visualizing and analyzing the patterns and models that lead to effective insight. With advanced analytics, process manufacturers can easily see up-to-date data from disparate sources and make decisions based on the analysis to immediately improve operations.

Moving Up from “Data Janitors”

Moving data from “raw” to ready for analysis should not take up the majority of your subject matter experts’ time. Yet some organizations still report that over 70 percent of the time they spend on operational analytics is dedicated solely to cleansing data.

But your team members are not “data janitors.” Today’s technology can take care of the monotonous, time-consuming tasks of accessing, cleansing, and contextualizing data so your team can move straight to benefitting from the insights.

The Difference Between Spreadsheets and Advanced Analytics

For an entire generation, spreadsheets have been the method of choice for analyzing data in the process manufacturing industry. At the moment of analysis, the tool in use needs to let the user define the critical time periods of interest and the relevant context. Spreadsheets have put the user in control of data investigation while offering a familiar, albeit cumbersome, path to analysis.

But the downfalls of spreadsheets have become increasingly apparent:

All of these pain points combine to make it difficult to reconcile and analyze data in the broader business context that profitability and efficiency use cases need in order to improve operational performance.

With advanced analytics, the experts on the front lines of process manufacturing operations can configure their own data analytics, putting improvements to yield, quality, availability, and the bottom line within easy reach.

How It’s Done

Advanced analytics leverages innovations in big data, machine learning, and web technologies to integrate and connect to all process manufacturing data sources and drive business improvement. Some of the capabilities include:

The Impact of Advanced Analytics

Simply put, advanced analytics gives you the whole picture. It draws the relationships and correlations between data that need to be made in order to improve performance based on accurate and reliable insight. Seeq’s advanced analytics solution is specifically designed for process manufacturing data and has been empowering leading manufacturers, saving them time and money from the moment of implementation. Learn more about the application and how it eliminates spreadsheet exhaustion here.

Machine Learning (ML) has seen exponential growth during the last five years, and many analytical platforms have adopted ML technologies to provide packaged solutions to their users. So, why has Machine Learning become mainstream?

Let’s take a look at Multivariate Analysis (MVA). While its algorithms have been widely available for a long time, MVA is technically considered a subset of ML. MVA typically refers to two algorithms:

As such, MVA has become a de facto standard in batch processing and other areas of manufacturing. Some typical use cases are:

In principle, industrial datasets are no different from other supervised or unsupervised learning problems, and they can be evaluated using a wide range of algorithms. Multivariate Analysis has been preferred because it offers global and local explainability. MVA models are multivariate extensions of well understood linear regression that provide a weight (slope) for each variable. This enables critical understanding and optimization of the underlying process dynamics, which is a very important aspect in manufacturing.
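To make that last point concrete, here is a minimal sketch, on synthetic highly correlated data rather than an actual plant dataset, of how a PLS model exposes one weight per process variable, just like an ordinary linear regression:

import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Synthetic, highly correlated process variables driven by one common factor
rng = np.random.default_rng(0)
factor = rng.normal(size=(200, 1))
X = np.hstack([factor + 0.1 * rng.normal(size=(200, 1)) for _ in range(3)])
y = 2.0 * factor[:, 0] + 0.1 * rng.normal(size=200)

pls = PLSRegression(n_components=1).fit(X, y)
print(pls.coef_.ravel())   # one interpretable weight (slope) per process variable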

New Changes in Industrial Machine Learning

In the past, many ML algorithms were considered black box models, because the inner mechanics of the model were not transparent to the user. These model types had limited utility in manufacturing since they could not answer the WHY and therefore lacked credibility.

This has very much changed. Today, model explainers in ML are a very active field of research and excellent libraries have become available to analyze the underlying model mechanics of highly complex architectures.

The following shows an example of applying ML technologies to a typical MVA project. In the original publication (https://journals.sagepub.com/doi/10.1366/0003702021955358), several preprocessing steps were studied together with PLS to build a predictive model. All steps were performed manually using commercial off-the-shelf software.

Using ML pipelines, the same study can be structured as follows:

import numpy as np
import xgboost as xgb
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.linear_model import LinearRegression
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import GridSearchCV, KFold

# SNV, MSC and SavitzkyGolay are custom spectral preprocessing transformers
# (scatter correction and derivative filtering) assumed to be defined elsewhere.
pipeline = Pipeline(steps=[('preprocess', None), ('regression', None)])

preprocessing_options = [{'preprocess': (SNV(),)},
                         {'preprocess': (MSC(),)},
                         {'preprocess': (SavitzkyGolay(9, 2, 1),)},
                         {'preprocess': (make_pipeline(SNV(), SavitzkyGolay(9, 2, 1)),)}]

regression_options = [{'regression': (PLSRegression(),), 'regression__n_components': np.arange(1, 10)},
                      {'regression': (LinearRegression(),)},
                      {'regression': (xgb.XGBRegressor(objective="reg:squarederror", random_state=42),)}]

# Pair every preprocessing option with every regressor
param_grid = []
for preprocess in preprocessing_options:
    for regression in regression_options:
        param_grid.append({**preprocess, **regression})

# Assumed scoring and cross-validation setup (the original defined `score` and `kf_10` elsewhere)
kf_10 = KFold(n_splits=10, shuffle=True, random_state=42)
search = GridSearchCV(pipeline, param_grid=param_grid, scoring='explained_variance',
                      n_jobs=2, cv=kf_10, refit=False)

This small code example tests every combination of preprocessing and regression steps, then automatically selects the best model. [A combination of SNV (Standard Normal Variate), 1st derivative and XGBoost showed the highest cross-validated explained variance of 0.958].
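For completeness, a minimal sketch of how the search would then be run; the spectra and reference values below are random stand-ins, not the data from the study:

X_spectra = np.random.default_rng(0).random((60, 700))   # stand-in for 60 measured spectra
y_reference = np.random.default_rng(1).random(60)        # stand-in for laboratory reference values

search.fit(X_spectra, y_reference)   # evaluates every preprocessing/regressor combination
print(search.best_params_)           # the winning combination
print(search.best_score_)            # its cross-validated explained variance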

The transformed spectra and the model weights can be overlaid to provide insights into the model mechanics:

Conclusion

Multivariate Analysis (MVA) has been successfully applied in manufacturing and is here to stay. But there is no doubt that Machine Learning (ML) data engineering concepts will be widely applied to this domain as well. Pipelines and autotuning libraries will ultimately replace the manual work of data transformation selection, model selection, and hyperparameter tuning. New ML algorithms and deep learners, in combination with local and global explainers, will expand manufacturing intelligence and provide key insights into process dynamics.

Special Thanks

Thanks to Dr. Salvador Garcia-Munoz for providing code examples and data sets.

For more information, please contact us.

Detailed equipment and batch data models set up by pharmaceutical and biotech companies have enabled the creation of equipment-centric machine learning (ML) models, for example for batch evolution monitoring. The next step is to extend the existing equipment-centric models and create process or end-to-end models.

The challenge is that the current data models do not fully support the extension:

·        Equipment models are based on the ISA-95 structure and reflect only the physical layout of the manufacturing facilities.

·        Batch Execution Systems (BES) are integrated using ISA-88 and cover only the equipment that is controlled by the batch execution system. Often, BES systems are set up to execute single unit procedures, and subsequent processing steps are executed separately.

·        Manufacturing Execution Systems (MES) typically map the entire process and material flow but, as Level 3+ systems, are difficult to integrate into a data modelling pipeline.

·        There are also facilities that use paper-based process tracking instead of MES\BES, which makes traceability even more challenging.

Batch-to-batch traceability can quickly become very complex, especially when many different assets are involved. The following shows an example of a reactor train in a biotech facility:

It shows all the different product pathways from reactor ‘01’ to the final processing step, as an example in red: 01, 11, 22, 33, 44. At any moment in time, the other reactors are either being cleaned or used for a parallel process.

Such a process is difficult to model in a BES or MES system, and real-time visibility or historical analysis is very challenging. This is especially true if subsequent processing steps are to be included (chromatography, fill and finish, ...).

The missing link in modelling the different pathways is to integrate each transfer between reactors or pieces of equipment. OSIsoft AF offers the AF Transfer model, which is fully integrated in the AF system. An AF Transfer event can be defined with the following out-of-the-box properties:

·        Source Equipment

·        Destination Equipment

·        Start Time

·        End Time

The AF Transfer model has many of the same features that AF Event Frames offer. Transfers can be templated and, through the in- and outflow ports, defined at different granularities.

Once the transfer between equipment has been defined, batches can be traced back in real time with or without using the batch id. This is possible through the equipment and time context of the transfer model:

In this case, starting from the end reactor ‘44’, all previous steps can be retraced by going backwards in time and using the source-destination equipment relationships.

The implementation requires a data reference to configure each transfer. The configuration user interface requires the following attributes:

·        Destination Element: Attribute of the destination Element

·        Name: Name of the transfer

·        Optional: Description, Batch Id and Total

The result is that transfer logs can be matched up to the corresponding unit procedures by time and equipment context, as shown below:

As shown in this example, the end time of transfer log 'Transfer Id S7MZUDGK' matches the start time of unit procedure: "Batch Id WNJ6H99R". The entire pathway can now be reconstructed in one query.
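To illustrate the idea, here is a minimal sketch of such a backward trace over a hypothetical transfer log; the record structure mirrors the AF Transfer properties listed above, and the timestamps are made up:

# Hypothetical transfer log for the reactor train 01 -> 11 -> 22 -> 33 -> 44
transfers = [
    {"source": "01", "destination": "11", "start": "2021-03-01 02:00", "end": "2021-03-01 04:00"},
    {"source": "11", "destination": "22", "start": "2021-03-02 08:00", "end": "2021-03-02 10:00"},
    {"source": "22", "destination": "33", "start": "2021-03-03 01:00", "end": "2021-03-03 03:00"},
    {"source": "33", "destination": "44", "start": "2021-03-04 06:00", "end": "2021-03-04 08:00"},
]

def trace_back(transfers, end_equipment):
    """Retrace the pathway by walking source-destination links backwards in time."""
    path = [end_equipment]
    current = end_equipment
    # Visit transfers from latest to earliest so each hop picks the most recent inbound transfer
    for t in sorted(transfers, key=lambda t: t["end"], reverse=True):
        if t["destination"] == current:
            path.append(t["source"])
            current = t["source"]
    return list(reversed(path))

print(trace_back(transfers, "44"))   # ['01', '11', '22', '33', '44']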

Conclusion

The sequence of discrete processing events, such as unit procedures, can be modelled using the OSIsoft AF Transfer class. The resulting transfer logs allow the process to be retraced backwards in time using the source-destination relationship of the transfer model. Modelling the process flow is key to expanding equipment-centric ML models.

Please contact us for more information.

Have you ever wondered whether it is possible to predict process conditions in manufacturing? To know what is likely to happen before it actually happens in your processes? A Digital Twin might just be your answer.

There are several different definitions of Digital Twins or Clones, and many use them interchangeably with terms such as Industry 4.0 or the Industrial Internet of Things (IIoT). Fundamentally, Digital Twins are digital representations of a physical asset, process, or product, and they behave similarly to the object they represent. The concept of Digital Clones has been around for some time. Earlier models were based on engineering principles and approximations; however, they required very deep domain expertise, were time-consuming, and were limited to a few use cases.

Today, Digital Clones are virtual models built entirely from massive historical datasets, using Machine Learning (ML) to extract the underlying dynamics. The data-driven approach makes Digital Clones accessible for a wide range of applications. The potential for Digital Twins is therefore enormous and includes process enhancement and optimization, equipment life cycle management, energy reductions, and safety improvements, just to name a few.

Building Digital Clones requires:

1.      A large historical data set or data historian

2.      High data quality and sufficient data granularity

3.      Very fast data access

4.      A large GPU for the model development and real time predictions

5.      A supporting data structure to manage the development, deployment, and maintenance of ML models

The following shows the application of a Digital Twin to a batch process example. The model is built on 30-second interpolated data, using a window of past data to predict data points 5 minutes into the future:
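A rough sketch of how such a windowed dataset can be built is shown below; the signal is a synthetic stand-in and the window length is illustrative, but the 10-step horizon follows from 5 minutes of 30-second samples:

import numpy as np

def make_windows(series, past_steps, horizon_steps):
    """Build (X, y) pairs: a window of past samples predicting one sample in the future."""
    X, y = [], []
    for i in range(len(series) - past_steps - horizon_steps + 1):
        X.append(series[i:i + past_steps])
        y.append(series[i + past_steps + horizon_steps - 1])
    return np.array(X), np.array(y)

# 30-second data: 5 minutes ahead corresponds to 10 samples
signal = np.sin(np.linspace(0, 50, 2000))   # stand-in for an interpolated process signal
X, y = make_windows(signal, past_steps=60, horizon_steps=10)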

So, what’s all the hype about Digital Clones? Well, not only are they able to predict process conditions, they also provide explanatory power about what drives the process: the underlying dynamics. The following dashboard shows a replay of this analysis, including the estimates of the model weights:

Conclusion

In summary, the availability of enterprise-level data historians and deep learning libraries allows Digital Clones to be implemented at the equipment and process level throughout manufacturing. The technology enables a wide range of applications and offers insight into process dynamics that was not previously available.

Please contact us for more information.

Multivariate Analysis (MVA) is a well-established technique for analysing highly correlated process variables. It is best known in batch processing, but it is also successfully applied in discrete and continuous processing. In comparison to single-variable applications, for example statistical process control, MVA has been shown to be superior in detecting process drifts and upsets. In practice, the implementation of MVA requires two different data structures or models: an equipment or process model (AF Elements) and time series segments (Event Frames).

Event Frames are usually autogenerated from the batch execution system (BES) and reflect the logical\automation sequences for recipe execution. Both AF Elements and Event Frames are used to create MVA models and calculate statistics. Below is an example of a multivariate model that combines the autogenerated Event Frame “Unit Procedure” with process variables in the Element “Bio Reactor 0”:

This type of analysis is typically used for batch-to-batch comparison (T2 and SPE statistics) and batch evolution monitoring in the pharmaceutical, biotech, and chemical industries.
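As a rough illustration (a minimal sketch on synthetic data, not the configuration behind the model above), the T2 and SPE statistics used for such batch-to-batch comparisons can be computed from a PCA model as follows:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic batch summary data: rows = batches, columns = unfolded process variables
X = np.random.default_rng(42).normal(size=(50, 20))

Xs = StandardScaler().fit_transform(X)
pca = PCA(n_components=3).fit(Xs)
scores = pca.transform(Xs)

# Hotelling's T2: variation captured inside the model plane, scaled per component
t2 = np.sum(scores**2 / pca.explained_variance_, axis=1)

# SPE (Q statistic): squared residual distance from the model plane
residuals = Xs - pca.inverse_transform(scores)
spe = np.sum(residuals**2, axis=1)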

Challenge

One of the shortcomings of using automation phases is that they seldom line up with the time frames that are critical for the underlying process evolution (process phases). Often there is a mismatch in granularity: process phases are either longer or shorter in duration than the automation phases. Also, the start and end might be based on specific process conditions, for example temperature, batch maturity, online measurements, and others. The mismatch between automation and process phases causes misalignment in the MVA model and a broadening of the process control envelopes. The resulting models are often not optimal.

Solution

Seeq has developed a platform that excels at creating time series segments as well as time series data cleansing and conditioning. The platform provides several different approaches to define very precise start and end conditions. The following shows the definition of a new capsule based on a profile search that focuses solely on the process peak temperature:

These capsules can be utilized in other applications through an API and blended with other PI data models to create very precise multivariate models:
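As a minimal sketch of that blending step, assume the capsules have been retrieved through the API as a table of start and end timestamps (the column names, tags, and values here are hypothetical); each capsule then slices the PI signal data into the precise segments fed to the multivariate model:

import numpy as np
import pandas as pd

# Hypothetical capsule table retrieved through the API: one row per process phase of interest
capsules = pd.DataFrame({
    "start": pd.to_datetime(["2021-05-01 08:00", "2021-05-03 09:30"]),
    "end":   pd.to_datetime(["2021-05-01 14:00", "2021-05-03 15:10"]),
})

# Hypothetical process signals pulled from PI on a one-minute grid
index = pd.date_range("2021-05-01", "2021-05-04", freq="1min")
rng = np.random.default_rng(0)
signals = pd.DataFrame({"temperature": rng.normal(37.0, 1.0, len(index)),
                        "pressure": rng.normal(1.2, 0.05, len(index))}, index=index)

# Slice each signal to the precise capsule windows before building the multivariate model
segments = [signals.loc[row.start:row.end] for row in capsules.itertuples()]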

Benefits

Multivariate Analysis is a powerful method for analyzing highly correlated process data. It depends on equipment\process models and time series segments. OSIsoft PI provides data models for both, and typically the time segments are automatically populated from BES or MES systems. Seeq provides new capabilities to create highly precise time segments, called capsules, that refine the MVA analysis and create meaningful process envelopes. The integration is seamless, since both systems provide powerful APIs to their time series data and models. The resulting MVA models target specific process phases and can be used to create improved process control limits or regression analyses.

Please contact us for more information.

Most biotech and pharmaceutical companies are adopting Process Analytical Technology (PAT) in manufacturing to provide real-time operational insights that allow better control and lead to higher yields, higher purity, and\or shorter cycle times.

However, integrating PAT data into an existing data infrastructure has been challenging, because each PAT measurement, even a single one, is a spectrum consisting of a list or array of values (e.g. time, channel, value) that cannot be stored in a classical data historian.

This graph shows you several spectra forming a multi-dimensional time series:


Spectra are often stored in SQL-type databases as plain tables, separate from the other manufacturing data stored in the historian. The main problem with this approach is the loss of equipment and batch context, which hampers any subsequent analysis.

So why are spectra stored separately? Because most industrial data historians store values as simple time series of a few data types, typically bool, int, float, and string. Each time series point is a tuple of a timestamp and a single value (a scalar). For PAT and other use cases, the existing data shapes would need to be extended to accommodate vectors, matrices, and tensors:
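A minimal sketch of the difference in data shape (with made-up values and channel counts) looks like this:

import numpy as np

# A classical historian point: one timestamp paired with one scalar value
scalar_point = ("2021-06-01T12:00:00Z", 37.2)

# A PAT point: one timestamp paired with an entire spectrum (a vector of channel intensities)
n_channels = 1024
vector_point = ("2021-06-01T12:00:00Z", np.random.default_rng(0).random(n_channels))

# A sequence of spectra over time forms a matrix: rows = timestamps, columns = channels
spectra = np.stack([np.random.default_rng(i).random(n_channels) for i in range(60)])
print(spectra.shape)   # (60, 1024): a multi-dimensional time series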

There are many use cases for these data structures:

In the OSIsoft Asset Framework, extending the Historian is accomplished by deploying a new source and linking it to a time series database that supports time-based vectors, matrices, and tensors:

The Raman spectra are attributes on the unit\vessel or located on the Raman equipment. Therefore, extending the existing OSIsoft AF data model allows the measurements to be analyzed in the present batch or event frame context:

Conclusion

Classical historians have been developed for scalar time series information. This has worked for most sensor data types, but they cannot accommodate higher-dimensional time series information. The solution is to extend the existing historian with databases that allow a more flexible schema. This results in better utilization of the existing equipment and batch context and enables context-specific analysis.

Please contact us for more information.

Machine Learning (ML) will undoubtedly transform manufacturing and grow from a few select applications, such as predictive maintenance, to a wide range of use cases. The technology already exists today: libraries are widely available under open-source licenses, and on-premises IT infrastructure as well as cloud services allow these applications to scale.

So, what is holding it back?

One area that limits the wide adoption of ML models is the underlying data structure. Companies have invested heavily in their data infrastructure and the creation of metadata databases (mostly ISA-95 and ISA-88), but the productization of ML models is still lagging. There are several reasons for this:

The industrial standards ISA-95 and ISA-88 provide a framework to structure the equipment and batch models, but by design they do not support ML modeling. For example, one piece of equipment can have several ML use cases that each require a different structure, e.g. multivariate batch modeling, predictive maintenance, or forecasts for predictive control.

One approach to structuring industrial ML models is ML Relational Mapping (MLRM). It builds on the already established concept of object-relational mapping (ORM) by linking existing type systems. The approach does not require restructuring existing data models and is therefore fast to implement:

MLRM adds an additional type or class that links, for example, equipment and batch types and provides the definitions for the ML model. By separating this functionality, the approach does not clutter the existing type system and provides the flexibility to define several models for one class, or multi-class models, without the need to restructure.
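As a purely hypothetical sketch (the names and fields are illustrative, not the actual MLRM implementation), such a linking type could look like this:

from dataclasses import dataclass, field
from typing import List

@dataclass
class MLModelMapping:
    """Hypothetical MLRM type: links existing equipment/batch types to one ML model definition."""
    model_name: str                        # e.g. "Batch evolution monitoring"
    algorithm: str                         # e.g. "PLS", "XGBoost"
    equipment_types: List[str]             # references existing ISA-95 equipment types
    event_types: List[str]                 # references existing ISA-88 batch / event frame types
    input_attributes: List[str] = field(default_factory=list)
    hyperparameters: dict = field(default_factory=dict)

# Several mappings can point at the same equipment type without restructuring it
batch_monitoring = MLModelMapping(
    model_name="Batch evolution monitoring",
    algorithm="PLS",
    equipment_types=["Bio Reactor"],
    event_types=["Unit Procedure"],
    input_attributes=["Temperature", "pH", "Dissolved Oxygen"],
)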

The following shows an OSIsoft AF based UI that implements MLRM:

Summary

Machine Learning applications will grow rapidly in the manufacturing environment. The challenge will be to provide the right structure so that ML models can be built on top of existing type systems. ML Relational Mapping (MLRM) provides a flexible approach by implementing a model-specific type system that links to existing data models.
