The Internet of Things (IoT) is an emergent and evolving technology interconnecting the cyber and physical worlds. IoT technology finds applications in a broad spectrum of areas such as homes, health, water and sanitation, transportation, and environmental monitoring. However, the endless opportunities and benefits of IoT come with many security challenges due to the reduced computation, communication, storage, and energy capabilities of IoT smart devices. Several computationally lightweight cryptographic protocols exist for these resource-constrained IoT smart devices. However, lightweight solutions render the resource-rich ends of IoT systems (e.g., edge, fog, or cloud nodes) vulnerable, as nodes at those ends have the capacity for computationally heavier cryptographic protocols and operate in relatively more hostile environments. This asymmetric computational nature of IoT systems requires security protocols that can adapt to the resource availability of the nodes on which they operate. This survey describes the IoT structure and the computational capabilities of devices at the end, edge, fog, and cloud platforms, and classifies existing lightweight cryptographic protocols. A comparative analysis of existing lightweight cryptographic solutions, along with their advantages, drawbacks, and vulnerabilities, highlights the need for elastic cryptographic protocols capable of adapting to the asymmetric capabilities of the different nodes in IoT systems.
Secure communications in wireless sensor networks operating under adversarial conditions require providing pairwise (symmetric) keys to sensor nodes. In large-scale deployment scenarios, there is no a priori knowledge of the post-deployment network configuration, since nodes may be randomly scattered over a hostile territory. Thus, shared keys must be distributed before deployment, providing each node with a key-chain. For large sensor networks it is infeasible to store a unique key for every other node in the key-chain of a sensor node. Consequently, secure communication requires either that two nodes have a key in common in their key-chains and a wireless link between them, or that there is a path, called a key-path, between these two nodes in which each pair of neighboring nodes has a key in common. The length of the key-path is the key factor in the efficiency of the design. This paper presents novel deterministic and hybrid approaches based on combinatorial design for deciding how many and which keys to assign to each key-chain before sensor network deployment. In particular, Balanced Incomplete Block Designs (BIBD) and Generalized Quadrangles (GQ) are mapped to obtain efficient key distribution schemes. Performance and security properties of the proposed schemes are studied both analytically and computationally. Comparison to related work shows that the combinatorial approach produces better connectivity with smaller key-chain sizes.
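The BIBD-to-key-chain mapping can be illustrated with a minimal sketch. The design below is the classical (7,3,1)-BIBD (the Fano plane) generated from the difference set {0, 1, 3} mod 7; the design parameters are toy values chosen for illustration, not the configurations evaluated in the paper:

```python
# Sketch: key predistribution from a (7,3,1)-BIBD (the Fano plane),
# generated by the difference set {0, 1, 3} mod 7. Each block becomes
# the key-chain of one sensor node; the 7 points are the key identifiers.

def fano_key_chains():
    base = (0, 1, 3)
    return [frozenset((b + i) % 7 for b in base) for i in range(7)]

chains = fano_key_chains()

# In a symmetric BIBD with lambda = 1, any two blocks intersect in
# exactly one point, so every pair of nodes shares exactly one key
# and the key-path between communicating neighbors has length 1.
for i in range(len(chains)):
    for j in range(i + 1, len(chains)):
        assert len(chains[i] & chains[j]) == 1
```

The guaranteed one-key intersection is what gives the combinatorial approach its short key-paths; at realistic scales, larger symmetric designs (or the GQ-based ones) play the same role with bigger parameters.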
Cyberspace is full of uncertainty in terms of advanced and sophisticated cyber threats equipped with novel approaches to learn the system and propagate themselves, such as AI-powered threats. To counter these types of threats, a modern and intelligent Cyber Situation Awareness (SA) system needs to be developed, capable of monitoring and capturing various types of threats, analyzing them, and devising a plan to avoid further attacks. This article provides a comprehensive study of the current state of the art in cyber SA, discussing the following aspects of SA: key design principles, framework, classifications, data collection, analysis techniques, and evaluation methods. Finally, we highlight misconceptions, insights, and limitations of this study and suggest some future directions to address the limitations.
Distributed Denial-of-Service (DDoS) attacks are increasing as the demand for Internet connectivity has grown massively in recent years. Conventional shallow machine-learning-based techniques for DDoS attack classification tend to be ineffective when the volume and features of network traffic, which may carry malicious DDoS payloads, increase exponentially, because they cannot extract high-importance features automatically. To address this concern, we propose a hybrid approach named AE-MLP that combines two deep-learning-based models for effective DDoS attack detection and classification. The Autoencoder (AE) part of the proposed model provides effective feature extraction, finding the most relevant feature sets automatically without human intervention (e.g., the knowledge of cybersecurity professionals). The Multi-Layer Perceptron (MLP) part takes the compressed and reduced feature sets produced by the AE as inputs and classifies the attacks into different DDoS attack types, overcoming the performance overhead and bias associated with processing large, noisy feature sets (i.e., containing unnecessary feature values). Our experimental results, obtained through comprehensive and extensive experiments on different aspects of performance on the CICDDoS2019 dataset, demonstrate a high and robust accuracy rate and F1-score, both exceeding 98% and outperforming many similar methods. This shows that the proposed model can serve as an effective DDoS defense tool against the growing number of DDoS attacks.
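The AE-then-MLP data flow can be sketched as an untrained forward pass. All layer sizes, weights, and the synthetic traffic record below are illustrative assumptions, not the paper's trained architecture:

```python
import math
import random

random.seed(0)

def dense(x, w, b, act):
    """One fully connected layer: act(W x + b), in plain Python lists."""
    z = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [act(v) for v in z]

def rand_layer(n_in, n_out):
    w = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

relu = lambda v: max(0.0, v)

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

# AE encoder: compress 10 raw flow features into a 4-dimensional code
# (the "automatic feature extraction" stage).
enc_w, enc_b = rand_layer(10, 4)
# MLP head: classify the compressed code into 3 hypothetical attack types.
mlp_w, mlp_b = rand_layer(4, 3)

flow = [random.random() for _ in range(10)]      # one synthetic traffic record
code = dense(flow, enc_w, enc_b, relu)           # AE: reduced feature set
probs = softmax(dense(code, mlp_w, mlp_b, lambda v: v))  # MLP: class scores
```

The point of the pipeline is that the MLP never sees the raw, noisy feature vector; it classifies only the compressed code the encoder produces.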
Internet of Things (IoT) devices are being widely deployed and have been targeted and victimized by malware attacks. Accurate mathematical modelling of the malicious spread of botnets across IoT networks is of great importance: if the spread of IoT botnets can be predicted using mathematical models, the security community can take the necessary steps to deter an outbreak of botnet attacks and minimize the damage caused by malware. This paper surveys mobile malware epidemiological models to understand the mechanisms and dynamics of malware spread for IoT botnets. We describe the characteristics of IoT botnets based on the Susceptible-Infected-Recovered-Susceptible (SIRS) and Susceptible-Exposed-Infected-Recovered-Susceptible (SEIRS) epidemic models. These models extend the traditional SIR (Susceptible-Infected-Recovered) model by adding extra states and parameters specific to the epidemic spread of IoT botnets. We use mathematical modelling to simulate the complex spreading processes of IoT botnets and interpret the influence of an epidemic on distributed denial-of-service attacks. We use MATLAB and R to illustrate the use of a stochastic IoT botnet transmission model in identifying and mitigating the challenges of minimizing the impact of devastating IoT botnet epidemics.
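The deterministic SIRS dynamics underlying such models can be sketched with forward-Euler integration of dS/dt = -βSI + ωR, dI/dt = βSI - γI, dR/dt = γI - ωR. The parameter values and population below are illustrative only, not fitted to any real botnet:

```python
# Hedged sketch: forward-Euler integration of the SIRS epidemic model
# (S: susceptible devices, I: infected bots, R: recovered/patched devices
# that lose immunity at rate omega). All parameters are toy values.

def simulate_sirs(s, i, r, beta=0.0005, gamma=0.1, omega=0.05,
                  dt=0.1, steps=1000):
    history = [(s, i, r)]
    for _ in range(steps):
        ds = -beta * s * i + omega * r
        di = beta * s * i - gamma * i
        dr = gamma * i - omega * r
        s, i, r = s + ds * dt, i + di * dt, r + dr * dt
        history.append((s, i, r))
    return history

# 999 susceptible IoT devices, 1 initial bot, none recovered.
traj = simulate_sirs(999.0, 1.0, 0.0)
```

Because ds + di + dr = 0 at every step, the total population is conserved; with these toy rates the basic reproduction number βN/γ ≈ 5 exceeds 1, so a single bot seeds a large outbreak before the system settles toward an endemic level.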
The use of artificial intelligence (AI) to detect phishing emails is primarily dependent on large-scale centralized datasets, which has opened it up to a myriad of privacy, trust, and legal issues. Moreover, organizations have been loath to share emails, given the risk of leaking commercially sensitive information. Consequently, it has been difficult to obtain sufficient emails to train a global AI model efficiently. Accordingly, privacy-preserving distributed and collaborative machine learning, particularly federated learning (FL), is a desideratum. Although FL is already prevalent in the healthcare sector, questions remain regarding the effectiveness and efficacy of FL-based phishing detection within the context of multi-organization collaborations. To the best of our knowledge, the work herein was the first to investigate the use of FL in phishing email detection. This study focused on building upon deep neural network models, particularly a recurrent convolutional neural network (RNN) and bidirectional encoder representations from transformers (BERT), for phishing email detection. We analyzed the FL-entangled learning performance in various settings, including (i) balanced and asymmetric data distributions among organizations and (ii) scalability. Our results corroborate that the performance of FL in phishing email detection is comparable to centralized learning for balanced datasets and low organizational counts. Moreover, we observed a variation in performance when increasing the organizational count. For a fixed total email dataset, the global RNN-based model suffered a 1.8% accuracy decrease when the organizational count was increased from 2 to 10. In contrast, BERT accuracy increased by 0.6% when the organizational count was increased from 2 to 5. However, when we increased the overall email dataset by introducing new organizations into the FL framework, organizational-level performance improved through faster convergence.
In addition, the overall global model performance of FL suffered from highly unstable outputs when the email dataset distribution was highly asymmetric.
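The server-side aggregation step common to such FL setups (FedAvg-style weighted averaging) can be sketched as follows; the weight vectors and per-organization email counts are toy values:

```python
# Hedged sketch of FedAvg aggregation: the server averages client model
# parameters, weighted by each organization's local email count. Real
# deployments average full network weight tensors; a flat list stands
# in for them here.

def fed_avg(client_weights, client_sizes):
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_w = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for k in range(dim):
            global_w[k] += (n / total) * w[k]
    return global_w

# Three organizations with asymmetric email-dataset sizes.
weights = [[0.2, -0.1], [0.4, 0.3], [0.0, 0.1]]
sizes = [1000, 3000, 500]
global_model = fed_avg(weights, sizes)
```

The size-proportional weighting is also where asymmetry bites: one dominant organization pulls the global model toward its local optimum, which is consistent with the instability the study reports for highly skewed distributions.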
Precision health leverages information from various sources, including omics, lifestyle, environment, social media, medical records, and medical insurance claims, to enable personalized care, prevention and prediction of illness, and precise treatments. It extensively uses sensing technologies (e.g., electronic health monitoring devices), computation (e.g., machine learning), and communication (e.g., interaction between health data centers). As health data contain sensitive private information, including the identity of the patient and carer and the medical conditions of the patient, proper care is required at all times. Leakage of this private information affects personal life, leading to bullying, higher insurance premiums, and loss of employment due to the medical history. Thus, the security and privacy of, and trust in, the information are of utmost importance. Moreover, government legislation and ethics committees demand the security and privacy of healthcare data. Besides, the public, as the data source, always expects security, privacy, and trust for their data; otherwise, they may refrain from contributing their data to the precision health system. Consequently, as the public is the targeted beneficiary of the system, the effectiveness of precision health diminishes. Hence, in light of precision health data security, privacy, and ethical and regulatory requirements, finding the best methods and techniques for the utilization of health data, and thus precision health, is essential. In this regard, firstly, this paper explores the regulations and ethical guidelines around the world, together with domain-specific needs. It then presents the requirements and investigates the associated challenges. Secondly, this paper investigates secure and privacy-preserving machine learning methods suitable for the computation of precision health data, along with their usage in relevant health projects.
Finally, it illustrates the best available techniques for precision health data security and privacy with a conceptual system model that enables compliance, ethics clearance, consent management, medical innovations, and developments in the health domain.
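One primitive that recurs in this family of privacy-preserving methods is the ε-differentially-private Laplace mechanism. The sketch below applies it to a counting query (sensitivity 1); the cohort count and ε are hypothetical values for illustration:

```python
import math
import random

# Hedged sketch of the Laplace mechanism for differential privacy:
# add Laplace(sensitivity / epsilon) noise to a query answer before
# release. Noise is sampled via the inverse-CDF method.

def laplace_noise(scale, rng):
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(true_count, epsilon, rng):
    # A counting query has sensitivity 1, so the noise scale is 1/epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(42)
patients_with_condition = 128          # hypothetical cohort count
noisy = dp_count(patients_with_condition, epsilon=1.0, rng=rng)
```

Smaller ε means more noise and stronger privacy; the released count stays useful in aggregate because the noise is zero-mean.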
•Investigate and present the precision health data requirements for security, privacy, and trust.
•Present machine learning paradigms that are relevant to the healthcare domain, including privacy-by-design techniques.
•Discuss a conceptual system model for the health data's security/privacy in its different stages, including computation.
Source coding has a rich and long history. However, a recent explosion of multimedia Internet applications (such as teleconferencing and video streaming, for instance) renews interest in fast compression that also squeezes out as much redundancy as possible. In 2009 Jarek Duda invented his asymmetric numeral system (ANS). Apart from having a beautiful mathematical structure, it is very efficient and offers compression with very low coding redundancy. ANS works well for any symbol-source statistics, and it has become a preferred compression algorithm in the IT industry. However, designing an ANS instance requires a random selection of its symbol spread function. Consequently, each ANS instance offers compression with a slightly different compression ratio. This paper investigates the compression optimality of ANS. It shows that ANS is optimal for any symbol source whose probability distribution is described by natural powers of 1/2. We use Markov chains to calculate ANS state probabilities, which allows us to determine the ANS compression rate precisely. We present two algorithms for finding ANS instances with a high compression ratio. The first explores state probability approximations in order to choose ANS instances with better compression ratios. The second algorithm is probabilistic: it finds ANS instances whose compression ratios can be made as close to the best ratio as required. This is done at the expense of the number θ of internal random "coin" tosses. The algorithm complexity is O(θL³), where L is the number of ANS states. The complexity can be reduced to O(θL log₂L) if we use a fast matrix inversion. If the algorithm is implemented on a quantum computer, its complexity becomes O(θ(log₂L)³).
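The state arithmetic at the heart of ANS can be sketched compactly. The paper analyses the tabled variant (tANS) with a symbol spread function; the sketch below uses the closely related range variant (rANS), which exhibits the same push/pop state-transition idea, with toy frequencies that must sum to M:

```python
# Minimal rANS sketch: encoding pushes a symbol onto an integer state,
# decoding pops it back off; symbols come out in LIFO order. Toy
# frequencies only; real coders renormalize the state and emit bits.

M = 4
freq = {'a': 2, 'b': 1, 'c': 1}   # symbol frequencies, summing to M
cum = {'a': 0, 'b': 2, 'c': 3}    # cumulative frequencies

def encode(symbols, x=1):
    for s in symbols:
        f = freq[s]
        x = (x // f) * M + cum[s] + (x % f)     # push s onto the state
    return x

def decode(x, n):
    out = []
    for _ in range(n):
        slot = x % M
        s = next(t for t in freq if cum[t] <= slot < cum[t] + freq[t])
        x = freq[s] * (x // M) + slot - cum[s]  # pop s off the state
        out.append(s)
    return list(reversed(out))

msg = list("abacab")
state = encode(msg)
assert decode(state, len(msg)) == msg
```

Frequent symbols grow the state more slowly (x scales by roughly M/f per symbol), which is exactly where the near-entropy compression comes from; in tANS the randomly chosen symbol spread determines how closely each instance approaches that bound.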
Reliable data transmission is challenging in industrial wireless sensor networks, as channel conditions change over time. Rapid changes in channel conditions require accurate estimation of routing path performance and timely updates of the routing information. However, existing routing approaches do not fulfil this well. Addressing this problem, this paper presents combined global and local update processes for efficient route update and maintenance, and incorporates them into a hierarchical proactive routing framework. While the global process updates the routing path over a relatively long period, the local process checks for potential routing path problems over a shorter period. A theoretical model is developed to describe the processes. Through simulations, the presented approach is shown to reduce end-to-end delay by up to 30 times for large networks, while improving the packet reception ratio (PRR), in comparison with the hierarchical and proactive routing protocols ROL/NDC, DSDV, and DSDV with the IPv6 Routing Protocol for Low-Power and Lossy Networks' Trickle algorithm. Compared with the reactive routing protocols AODV and Ad Hoc On-demand Multipath Distance Vector, it provides similar PRR while reducing end-to-end delay by over 15 times.
This work investigates the accuracy and efficiency tradeoffs between centralized and collective (distributed) algorithms for (i) sampling and (ii) n-way data analysis techniques in multidimensional stream data, such as Internet chatroom communications. Its contributions are threefold. First, we use the Kolmogorov-Smirnov goodness-of-fit test to show that the statistical differences between real data obtained by collective sampling in the time dimension from multiple servers and data obtained from a single server are insignificant. Second, we show, using the real data, that collective analysis of 3-way data arrays (users × keywords × time), known as higher-order tensors, is more efficient than centralized algorithms with respect to both space and computational cost. Furthermore, we show that this gain is obtained without loss of accuracy. Third, we examine the sensitivity of the collective construction and analysis of higher-order data tensors to the choice of server selection and sampling window size. We construct 4-way tensors (users × keywords × time × servers) and analyze them to show the impact of server and window size selections on the results.
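The two-sample Kolmogorov-Smirnov comparison used in the first contribution reduces to D = sup_x |F1(x) - F2(x)| over the two empirical CDFs. A minimal sketch, with toy samples standing in for the single-server and collectively sampled streams:

```python
# Hedged sketch of the two-sample Kolmogorov-Smirnov statistic:
# D = sup over x of |F1(x) - F2(x)|, where F1 and F2 are empirical CDFs.
# A small D suggests the two sampling strategies draw from statistically
# indistinguishable distributions. Samples below are toy data.

def ks_statistic(sample1, sample2):
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    d = 0.0
    for x in s1 + s2:                             # candidate sup points
        f1 = sum(1 for v in s1 if v <= x) / n1    # empirical CDF of sample 1
        f2 = sum(1 for v in s2 if v <= x) / n2    # empirical CDF of sample 2
        d = max(d, abs(f1 - f2))
    return d

single_server = [3, 1, 4, 1, 5, 9, 2, 6]
collective    = [3, 1, 4, 2, 5, 8, 2, 7]
d = ks_statistic(single_server, collective)
```

In practice the statistic is compared against the KS critical value for the given sample sizes; identical samples give D = 0, fully separated ones give D = 1.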