Graduate Information Science and Technology Program

    School of Information Sciences

    University of Pittsburgh

Network-Aware Data Management Group



DIP: Data-Intensive Process Monitoring





In this project we explore novel technologies for efficient summarization and sensemaking based on dynamic data from complex processes. This research is motivated by emerging advanced infrastructures that facilitate rapid operational data collection (e.g., bedside medical devices, energy monitoring hardware, and data acquisition products based on wireless sensor networks). The project includes the following tasks:

Task 1: Process Data Warehousing.

Task 2: Signature-based Analysis of Numeric Data Streams.

Task 3: Assessment of Process Dynamics based on Information Divergence.

Task 1: Process Data Warehousing

Our patented Process Data Warehouse (PDW) technology addresses the efficient utilization of large-scale numeric data streams. The PDW implements a novel approach to continuous summarization and discovery of trends in dynamic process data. The approach includes a method, a system architecture, and a set of optimization techniques for efficient warehousing of large data streams. Potential uses of the PDW technology range from data utilization in specialized data centers to Internet-scale data search and analysis engines. A PDW system can explore the most and least probable scenarios in the development of complex processes that produce large amounts of raw data. The data summaries generated by the PDW system can further be used for intelligent assistance with process monitoring and early warning. For example, using summarized information on energy consumption, the system would be able to generate early warnings when the probability of a power outage is high. Similarly, it can be used in many other application domains (e.g., medical data analysis, market data monitoring, natural disaster analysis, and structural health analysis).

Task 2: Signature-based Analysis of Numeric Data Streams

A major challenge in large-scale process monitoring is to recognize significant transitions in the process conditions and to distinguish them from random fluctuations that do not produce a notable change in the process dynamics. Such transitions should be recognized at the early stages of their development using a minimal "snapshot" of the observable process log. We consider a novel approach to detecting notable transitions based on analysis of the coherent behavior of frequency components in the process log (coherency portraits). We have found that notable transitions in the process dynamics are characterized by unique coherency portraits, which are also invariant with respect to random process fluctuations. Our experimental study shows that this approach is considerably more efficient than traditional change detection techniques.

Task 3: Assessment of Process Dynamics based on Information Divergence

Process monitoring involves tracking the system's behavior, evaluating the current state of the system, and assessing its severity. In this task we develop an approach based on steady-state analysis of a process model generated from a numeric process log. In particular, we use Markov process models to continuously produce steady-state vectors reflecting the process dynamics at different moments in time. We explore measures of information divergence to assess deviations in the sequence of steady-state vectors as the process evolves. This approach is efficient in detecting, and giving early warning of, severe deviations from the expected process behavior.
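The pipeline described above (numeric log, Markov model, steady-state vector, divergence between successive vectors) can be illustrated with a minimal sketch. This is not the group's implementation. The equal-width binning, the additive smoothing, the power-iteration solver, the choice of Jensen-Shannon divergence, and all function names are assumptions made only for illustration.

```python
import math

def discretize(values, lo, hi, n_states):
    """Map each numeric reading to one of n_states equal-width bins (assumed scheme)."""
    width = (hi - lo) / n_states
    return [min(n_states - 1, max(0, int((v - lo) / width))) for v in values]

def transition_matrix(states, n_states, smoothing=1.0):
    """Estimate a row-stochastic transition matrix from consecutive state pairs.
    Additive smoothing keeps every entry positive, so a unique steady state exists."""
    counts = [[smoothing] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

def steady_state(P, iters=1000, tol=1e-12):
    """Approximate the stationary distribution by power iteration: pi <- pi * P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (symmetric, between 0 and 1)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def divergence_series(log, window, lo, hi, n_states=4):
    """Steady-state vector per non-overlapping window, then divergence
    between each pair of consecutive vectors."""
    vectors = []
    for start in range(0, len(log) - window + 1, window):
        states = discretize(log[start:start + window], lo, hi, n_states)
        vectors.append(steady_state(transition_matrix(states, n_states)))
    return [js_divergence(a, b) for a, b in zip(vectors, vectors[1:])]

# Hypothetical log: two windows of low readings, then one window of high readings.
log = [0.1, 0.2] * 20 + [0.8, 0.9] * 10
series = divergence_series(log, window=20, lo=0.0, hi=1.0)
print(series)
```

In this toy log the first divergence value is zero because the first two windows behave identically, while the second value is clearly larger because the process has moved to a different regime. A monitoring system along these lines would raise a warning when the divergence exceeds a chosen threshold; how that threshold is set is outside the scope of this sketch.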

PhD Students:

Andrii Cherniak

Yihuang Kang

Ying-Feng Hsu

Selected References

  1. Cherniak, A., Zadorozhny, V. Signature-Based Detection of Notable Transitions in Numeric Data Stream. To appear in IEEE Transactions on Knowledge and Data Engineering, 2013.

  2. Kang, Y., Zadorozhny, V. Divergence-based Detection of Severe Process States. Under preparation.

  3. Zadorozhny, V. Process Data Warehouse. US Patent 7,933,861, issued April 26, 2011.

  4. Bickel, J., Visweswaran, S., Levin, J., Hsu, Y.-F., Kang, Y., Zadorozhny, V. Data Warehousing and Markov Modeling of Children Admitted with Respiratory Complaints. American Medical Informatics Association Symposium, San Francisco, 2010.

  5. Hsu, Y.-F. Efficient Information Processing Architecture for Early Warning Systems. Dissertation Proposal, Graduate Information Science and Technology Program, School of Information Sciences, University of Pittsburgh, 2010.
