High Frequency Data

Processes in industrial operation occur often at different time scales, some are fast (sub seconds to hours), and others are slow (hours, days, weeks, or months). In a biotechnology facility for example, there are slow moving batch processes, fast purification steps and very fast filling lines. Capturing events at different time scales and analyzing them, requires a data strategy for the acquisition, storage, and analysis.

To optimize storage space and network bandwidth, the OSIsoft PI system differentiates between high frequency data also known as snapshot values and compressed or archived data. Data are archived from the snapshot table by applying a swinging door compression algorithm. This data strategy has proven to be great balance between displaying real time data in high resolution as well as storing sufficiently enough data for historical data analysis.

The drawback of this approach is that the snapshot queue contains only a single value for each process variable, so analysis based on snapshot or event driven data is limited to single point. There are some valuable use cases such as statistical process control, alarm management or event triggers. However, Machine Learning (ML) or multivariate models (MVA) are usually based on time series vectors.

To accommodate advance modeling of high frequency data, the OSIsoft PI system requires expansion off the snapshot table to a low latency time series storage:

The requirements for the Snapshot Db are primarily driven by read speed as well as write speed. Some open-source time series databases (TSDB) that allows a million writes per seconds are available now. The read speeds are even more impressive: We measured ~ 800K read/sec for a standard OSIsoft PI system, whereas a low latency TSDB is faster by a factor of 800 - 1,000.

An additional benefit of using open source TSDB is that it allows us to add open-source ML and MVA libraries as well as, to take advantage of the very rich open-source visualization ecosphere. For example, the following shows a Grafana Dashboard of the Snapshot Db:

Summary

The OSIsoft PI system has been designed to capture real time events in a snapshot table and store compressed data in the PI Data Archive. This data architecture is optimized for short term event data and long-term data storage. Missing in this scenario are capabilities to store and analyze high frequency data, which modern low latency time series databases can provide. By adding a dedicated high frequency data store, fast processes can be monitored and analyzed in parallel to an already existing data infrastructure. This will open a large range of new uses cases that are difficult or impossible to realize with existing systems.

For information, please contact us.

© All rights reserved.