The proliferation of IoT devices that can be more easily compromised than desktop computers has led to an increase in IoT-based botnet attacks. To mitigate this threat, there is a need for new methods that detect attacks launched from compromised IoT devices and that differentiate between hours- and milliseconds-long IoT-based attacks. In this article, we propose a novel network-based anomaly detection method for the IoT called N-BaIoT that extracts behavior snapshots of the network and uses deep autoencoders to detect anomalous network traffic from compromised IoT devices. To evaluate our method, we infected nine commercial IoT devices in our lab with two widely known IoT-based botnets, Mirai and BASHLITE. The evaluation results demonstrated our proposed method's ability to accurately and instantly detect the attacks as they were being launched from the compromised IoT devices that were part of a botnet.
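The detection idea above (flagging traffic that a model trained on benign behavior reconstructs poorly) can be illustrated with a minimal sketch. The abstract does not state the decision rule, so the mean-plus-one-standard-deviation threshold here is an assumption, and `make_mean_reconstructor` is a hypothetical stand-in for a trained deep autoencoder, not the N-BaIoT model itself.

```python
from statistics import mean, pstdev

def mse(x, x_hat):
    """Mean squared error between a sample and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def fit_threshold(benign_errors):
    """Assumed rule: mean plus one standard deviation of benign errors."""
    return mean(benign_errors) + pstdev(benign_errors)

def make_mean_reconstructor(benign):
    """Hypothetical stand-in for an autoencoder: always outputs the benign mean."""
    centre = [mean(column) for column in zip(*benign)]
    return lambda x: centre

# Toy benign feature vectors (illustrative values only).
benign = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.0]]
reconstruct = make_mean_reconstructor(benign)
threshold = fit_threshold([mse(x, reconstruct(x)) for x in benign])

def is_anomalous(x):
    """Flag a sample whose reconstruction error exceeds the benign threshold."""
    return mse(x, reconstruct(x)) > threshold
```

In practice the stand-in would be replaced by a model trained only on benign traffic, so that malicious traffic yields a visibly larger reconstruction error.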
The Internet of Things (IoT) is a global ecosystem of information and communication technologies aimed at connecting any type of object (thing), at any time, and in any place, to each other and to the Internet. One of the major problems associated with the IoT is the heterogeneous nature of such deployments; this heterogeneity poses many challenges, particularly in the areas of security and privacy. Specifically, security testing and analysis of IoT devices is considered a very complex task, as different security testing methodologies, including software and hardware security testing approaches, are needed. In this paper, we propose an innovative security testbed framework targeted at IoT devices. The security testbed is aimed at testing all types of IoT devices, with different software/hardware configurations, by performing standard and advanced security testing. Advanced analysis processes based on machine learning algorithms are employed in the testbed in order to monitor the overall operation of the IoT device under test. The architectural design of the proposed security testbed along with a detailed description of the testbed implementation is discussed. The testbed operation is demonstrated on different IoT devices using several specific IoT testing scenarios. The results obtained demonstrate that the testbed is effective at detecting vulnerabilities and compromised IoT devices.
The increasing number of IoT devices in “smart” environments, such as homes, offices, and cities, produces seemingly endless data streams and drives many daily decisions. Consequently, there is growing interest in identifying contextual information from sensor data to facilitate the performance of various tasks, e.g., traffic management, cyber attack detection, and healthcare monitoring. The correct identification of contexts in data streams is helpful for many tasks; for example, it can assist in providing high-quality recommendations to end users and in reporting anomalous behavior based on the detection of unusual contexts. This paper presents DeepStream, a novel data stream temporal clustering algorithm that dynamically detects sequential and overlapping clusters. DeepStream is tuned to classify contextual information in real time and is capable of coping with a high-dimensional feature space. DeepStream utilizes stacked autoencoders to reduce the dimensionality of unbounded data streams and for cluster representation. This method detects contextual behavior and captures nonlinear relations of the input data, giving it an advantage over existing methods that rely on PCA. We evaluated DeepStream empirically using four sensor and IoT datasets and compared it to five state-of-the-art stream clustering algorithms. Our evaluation shows that DeepStream outperforms all of these algorithms. Our evaluation also demonstrates how DeepStream’s improved clustering performance results in improved detection of anomalous data.
IoT devices are known to be vulnerable to various cyber-attacks, such as data exfiltration and the execution of flooding attacks as part of a DDoS attack. When it comes to detecting such attacks using network traffic analysis, it has been shown that some attack scenarios are not always equally easy to detect if they involve different IoT models. That is, when targeted at some IoT models, a given attack can be detected rather accurately, while when targeted at others the same attack may result in too many false alarms. In this research, we attempt to explain this variability of IoT attack detectability and devise a risk assessment method capable of addressing a key question: how easy is it for an anomaly-based network intrusion detection system to detect a given cyber-attack involving a specific IoT model? In the process of addressing this question we (a) investigate the predictability of IoT network traffic, (b) present a novel taxonomy for IoT attack detection which also encapsulates traffic predictability aspects, (c) propose an expert-based attack detectability estimation method which uses this taxonomy to derive a detectability score (termed ‘D-Score’) for a given combination of IoT model and attack scenario, and (d) empirically evaluate our method while comparing it with a data-driven method.
Telecommunication service providers (telcos) are exposed to cyber-attacks executed by compromised IoT devices connected to their customers’ networks. Such attacks might have severe effects on the attack target, as well as on the telcos themselves. To mitigate those risks, we propose a machine learning-based method that can detect specific vulnerable IoT device models connected behind a domestic NAT, thereby identifying home networks that pose a risk to the telcos’ infrastructure and service availability. To evaluate our method, we collected a large quantity of network traffic data from various commercial IoT devices in our lab and compared several classification algorithms. We found that (a) the LGBM algorithm produces excellent detection results, and (b) our flow-based method is robust and can handle situations that existing methods used to identify devices behind a NAT are unable to fully address, e.g., encrypted, non-TCP or non-DNS traffic. To promote future research in this domain we share our novel labeled benchmark dataset.
Although home Internet of Things (IoT) devices are typically plain and task oriented, the context of their daily use may affect their traffic patterns. That is, a given IoT device will probably not generate the exact same traffic data when operated by different people in different environments and when connected to different networks with different topologies and communication components. For this reason, anomaly-based intrusion detection systems tend to suffer from a high false positive rate (FPR). To overcome this, we propose a two-step collaborative anomaly detection method which first uses an autoencoder to differentiate frequent ("benign") and infrequent (possibly "malicious") traffic flows. Clustering is then used to analyze only the infrequent flows and classify them as either known ("rare yet benign") or unknown (malicious). Our method is collaborative, in that 1) normal behaviors are characterized more robustly, as they take into account a variety of user interactions and network topologies and 2) several features are computed based on a pool of identical devices rather than just the inspected device. We evaluated our method empirically, using 21 days of real-world traffic data that emanated from eight identical IoT devices deployed on various networks, one of which was located in our controlled lab where we implemented two popular IoT-related cyber-attacks. Our collaborative anomaly detection method achieved a macro-average area under the precision-recall curve of 0.841, an F1 score of 0.929, and an FPR of only 0.014. These promising results were obtained by using labeled traffic data from our lab as the test set, while training the models on the traffic of devices deployed outside the lab, and thus demonstrate a high level of generalizability. In addition to its high generalizability and promising performance, our proposed method also offers benefits, such as privacy preservation, resource savings, and model poisoning mitigation. On top of that, as a contribution to the scientific community, our novel data set is available online.
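The two-step decision flow described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the reconstruction error is taken as already computed by some autoencoder, and `error_threshold`, `known_centroids`, and `radius` are hypothetical parameters; the abstract does not specify the clustering algorithm or how cluster membership is decided.

```python
import math

def triage_flow(flow, recon_error, error_threshold, known_centroids, radius):
    """Label one traffic flow as 'benign', 'rare yet benign', or 'malicious'.

    flow            -- feature vector of the flow (list of floats)
    recon_error     -- autoencoder reconstruction error for this flow
    error_threshold -- assumed cut-off separating frequent from infrequent flows
    known_centroids -- assumed centroids of previously seen rare-but-benign clusters
    radius          -- assumed maximum distance for membership in a known cluster
    """
    # Step 1: a low reconstruction error means the flow resembles frequent traffic.
    if recon_error <= error_threshold:
        return "benign"
    # Step 2: only infrequent flows reach the clustering stage.
    nearest = min((math.dist(flow, c) for c in known_centroids), default=math.inf)
    return "rare yet benign" if nearest <= radius else "malicious"
```

For example, `triage_flow([9.0, 9.0], 0.9, 0.5, [[1.0, 1.0]], 0.5)` takes the second branch and finds no nearby known cluster, so the flow is labeled as malicious.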
Within the complex and competitive semiconductor manufacturing industry, lot cycle time (CT) remains one of the key performance indicators. Its reduction is of strategic importance as it contributes to decreasing costs, shortening time to market, detecting faults faster, achieving throughput targets, and improving production-resource scheduling. To reduce CT, we suggest and investigate a data-driven approach that identifies key factors and predicts their impact on CT. In our novel approach, we first identify the most influential factors using conditional mutual information maximization, and then apply the selective naive Bayesian classifier (SNBC) for further selection of a minimal, most discriminative key-factor set for CT prediction. Applied to a data set representing a simulated fab, our SNBC-based approach improves the accuracy of CT prediction by nearly 40% while narrowing the list of factors from 182 to 20. It shows accuracy comparable to that of other machine learning and statistical models, such as a decision tree, a neural network, and multinomial logistic regression. Compared to them, our approach also demonstrates simplicity and interpretability, as well as speedy and efficient model training. This approach could be implemented relatively easily in the fab, promoting new insights into the process of wafer fabrication.
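The first stage, conditional mutual information maximization (CMIM), can be sketched for discrete factors as below. The abstract does not give the exact formulation used, so this follows the commonly cited greedy criterion (score each candidate by the minimum of its conditional mutual information with the target given each already selected factor); the function names and toy data are hypothetical, and the subsequent SNBC stage is not shown.

```python
from collections import Counter
from math import log2

def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum(c / n * log2(c / n) for c in counts.values())

def mutual_info(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def cond_mutual_info(x, y, z):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

def cmim_select(features, target, k):
    """Greedily pick up to k feature names using the CMIM criterion."""
    selected = []
    # Each candidate's score starts as its plain relevance I(X; Y).
    score = {name: mutual_info(col, target) for name, col in features.items()}
    while len(selected) < k and score:
        best = max(score, key=score.get)
        selected.append(best)
        del score[best]
        # Keep only the information a candidate adds beyond every chosen feature.
        for name in score:
            cmi = cond_mutual_info(features[name], target, features[best])
            score[name] = min(score[name], cmi)
    return selected

# Toy data: "b" duplicates "a", while "c" carries complementary information.
toy_features = {"a": [0, 0, 1, 1], "b": [0, 0, 1, 1], "c": [0, 1, 0, 1]}
toy_target = [0, 1, 2, 3]
picked = cmim_select(toy_features, toy_target, 2)
```

On the toy data, once one of the duplicated factors is chosen, the other contributes no further information about the target, so the second pick is the complementary factor rather than the redundant one.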
A key challenge associated with Kubernetes configuration files (KCFs) is that
they are often highly complex and error-prone, leading to security
vulnerabilities and operational setbacks. Rule-based (RB) tools for KCF
misconfiguration detection rely on static rule sets, making them inherently
limited and unable to detect newly-discovered misconfigurations. RB tools also
suffer from misdetection, since mistakes are likely when coding the detection
rules. Recent methods for detecting and remediating KCF misconfigurations are
limited in terms of their scalability and detection coverage, or due to the
fact that they have high expertise requirements and do not offer automated
remediation along with misconfiguration detection. Novel approaches that employ
LLMs in their pipeline rely on API-based, general-purpose, and mainly
commercial models. Thus, they pose security challenges, have inconsistent
classification performance, and can be costly. In this paper, we propose
GenKubeSec, a comprehensive and adaptive, LLM-based method, which, in addition
to detecting a wide variety of KCF misconfigurations, also identifies the exact
location of the misconfigurations and provides detailed reasoning about them,
along with suggested remediation. When empirically compared with three
industry-standard RB tools, GenKubeSec achieved equivalent precision (0.990)
and superior recall (0.999). When a random sample of KCFs was examined by a
Kubernetes security expert, GenKubeSec's explanations as to misconfiguration
localization, reasoning and remediation were 100% correct, informative and
useful. To facilitate further advancements in this domain, we share the unique
dataset we collected, a unified misconfiguration index we developed for label
standardization, our experimentation code, and GenKubeSec itself as an
open-source tool.
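For readers less familiar with the reported metrics, precision and recall follow directly from confusion counts, as in the short sketch below. The counts in the usage note are purely illustrative and are not taken from the evaluation.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts; a metric is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

With hypothetical counts of 99 true positives, 1 false positive, and 1 false negative, `precision_recall(99, 1, 1)` gives a precision and a recall of 0.99 each.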