U.S. Department of Energy

Pacific Northwest National Laboratory

Stream Adaptive Foraging for Evidence

Stream Adaptive Foraging for Evidence (SAFE) will find and characterize events of interest within high-volume streaming data by leveraging deep learning techniques, which have been shown to be effective at learning hierarchical representations from unlabeled data.

Approach

SAFE addresses the AIM objective of “re-balancing human-machine analytic effort in exploratory knowledge discovery” by automating the process of finding interesting or unusual events while maintaining a human corrective influence in the system. The automation will enhance the user’s ability to find events at the speed and scale necessary for high-volume streaming data, and will guide the user’s attention to the most valuable subsets of data at a rate commensurate with human cognitive capacity. SAFE will develop deep learning techniques that identify useful events in data streams without requiring the user to tag the data beforehand.

Benefit

SAFE will be of value to government, research, or commercial organizations that have an interest in streaming data and related domains.

  • Public services (e.g. Bonneville Power Administration) – The BPA is engaged in proactive fault detection and avoidance in a high-speed streaming data environment. This work could be instrumental in allowing complex patterns to be understood and acted upon.
  • Research organizations – This research could be applied directly to the development of systems for timely detection of health abnormalities in a streaming biometric environment.
  • Commercial entities (e.g. extrahop.com) – Extrahop looks for indications of abnormal network conditions in a high-volume streaming environment. Such entities require new solutions for detecting and characterizing unusual network behavior in real time.

This work will also demonstrate the applicability of deep learning to nontraditional domains, which will be of interest to stakeholders who process data from high-volume streaming environments, such as those in the power generation or cybersecurity infrastructure sectors. Additionally, as the project develops, it will demonstrate advances in evidence generation in these domains.
