loading page

A practical approach for applying Machine Learning in the detection and classification of network devices used in building management
  • +1
  • Maroun Touma,
  • Shalisha Witherspoon,
  • Shonda Witherspoon,
  • Isabelle Crawford-Eng
Maroun Touma
Author Profile
Shalisha Witherspoon
Author Profile
Shonda Witherspoon
Author Profile
Isabelle Crawford-Eng
University of Pennsylvania
Author Profile


With the increasing deployment of smart buildings and infrastructure, Supervisory Control and Data Acquisition (SCADA) devices and the underlying IT network have become essential elements for the proper operations of these highly complex systems. Of course, with the increase in automation and the proliferation of SCADA devices, a corresponding increase in surface area of attack on critical infrastructure has increased. Understanding device behaviors in terms of known and understood or potentially qualified activities versus unknown and potentially nefarious activities in near-real time is a key component of any security solution. In this paper, we investigate the challenges with building robust machine learning models to identify unknowns purely from network traffic both inside and outside firewalls, starting with missing or inconsistent labels across sites, feature engineering and learning, temporal dependencies and analysis, and training data quality (including small sample sizes) for both shallow and deep learning methods. To demonstrate these challenges and the capabilities we have developed, we focus on Building Automation and Control networks (BACnet) from a private commercial building system. Our results show that ”Model Zoo” built from binary classifiers based on each device or behavior combined with an ensemble classifier integrating information from all classifiers provides a reliable methodology to identify unknown devices as well as determining specific known devices when the device type is in the training set. The capability of the Model Zoo framework is shown to be directly linked to feature engineering and learning, and the dependency of the feature selection varies depending on both the binary and ensemble classifiers as well.

Peer review status:IN REVISION

01 Dec 2020Submitted to Applied AI Letters
02 Dec 2020Assigned to Editor
02 Dec 2020Submission Checks Completed
04 Dec 2020Reviewer(s) Assigned
24 Feb 2021Review(s) Completed, Editorial Evaluation Pending
24 Feb 2021Editorial Decision: Revise Minor