Information technology systems face increasing cyber security threats, mostly from insiders. Network security mechanisms for insiders are not as strict as for others. Insiders can also easily bypass security or have legitimate access to confidential documents, so detecting and preventing insider threats is a growing challenge. The aim of this paper is to implement predictive models that use linguistic analysis to determine an employee’s risk level from computer-mediated communication, particularly emails. The email log portion of the TWOS dataset has been analyzed using supervised machine learning techniques. The dataset comprises behavior traces of 24 users observed over a 5-day span. The limited-data issue has been addressed by avoiding complex models with many parameters. We have limited their normalization and ability to overfit by using existing pivotal models. The outcomes are collated and contrasted for the following algorithms: AdaBoost, Naive Bayes (NB), Logistic Regression (LR), KNN, Linear Regression (LR), and Support Vector Machine (SVM). Among these algorithms, AdaBoost performed best, with 98.3% accuracy and 0.983 AUC for identification of malicious emails.
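The two metrics reported above, accuracy and AUC, can be illustrated with a minimal sketch. The labels and scores below are hypothetical and are not taken from the TWOS dataset; the function names `accuracy` and `roc_auc` are illustrative, and the pairwise AUC computation is one standard formulation, not necessarily the one the authors used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = malicious email) and classifier scores.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
# Assumed decision threshold of 0.5 for turning scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank malicious emails against benign ones, which is why the two are usually reported together.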
Insider threats are among the most challenging and fastest-growing security threats that government agencies, organizations, and institutions face. In such scenarios, malicious (red) activities are performed by authorized individuals within the company. As a result, identifying an insider threat among other attacks has become a taxing and difficult task. Along with other monitoring parameters, email logs play a vital role in many research areas, such as stalking insider threats involving collaborating traitors, textual analysis, and social media exploration. This paper presents a semi-supervised machine learning framework that combines pre-processing and classification techniques for an unlabeled dataset, i.e., emails. The Enron Corporation dataset has been used for experiments and the TWOS dataset for evaluation of the proposed framework. Initially, the dataset is transformed into vector form using Term Frequency–Inverse Document Frequency (TF–IDF). Thereafter, K-Means is used to cluster emails based on message content. Finally, the Decision Tree (DT) machine learning algorithm is applied to classify the malicious activities. The proposed framework has also been tested with other algorithms such as Logistic Regression (LR), Naive Bayes (NB), KNN, Support Vector Machine (SVM), Random Forest (RF), and Neural Network (NN). However, Decision Tree (DT) combined with the pre-processing steps has given the desired results, with 99.96% accuracy and 0.994 AUC for identification of malicious content.
• Insider threat to employers and companies is a complex and growing challenge.
• Research devoted to “traitor detection” has remained very restricted compared to “masquerader detection”.
• Insider threat detection performed through textual analysis, big data, and email logs is worthwhile.
• In this research, class labels are identified through a clustering algorithm.
• Malicious emails are predicted using multiple machine learning classifiers.
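The first two stages of the framework described above, TF–IDF vectorization followed by K-Means to obtain class labels, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy emails are hypothetical (not from Enron or TWOS), the weighting `tf * log(N / df)` is one common TF–IDF variant, and the helper names and fixed initial centroids are choices made here so the example is deterministic.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using tf * log(N / df) weighting."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [(counts[w] / len(toks)) * math.log(n_docs / df[w]) for w in vocab]
        vectors.append(vec)
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, n_iter=10):
    """Plain Lloyd's algorithm; centroids start at the given rows."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy emails, used only to exercise the two steps.
emails = [
    "meeting agenda for monday",
    "monday meeting agenda attached",
    "send password and account details",
    "account password details needed",
]
vocab, vectors = tfidf_vectors(emails)
cluster_labels = kmeans(vectors, init_indices=[0, 2])
print(cluster_labels)
```

In this toy run the two meeting emails fall into one cluster and the two credential emails into the other. In a framework like the one described, such cluster assignments would then serve as training labels for a supervised classifier such as a decision tree; that final stage is not shown here.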
Phishing attacks are a growing concern for individuals and organizations alike, with the potential to cause significant financial and reputational damage. Traditional methods for detecting phishing attacks, such as blacklists and signature-based techniques, have limitations that have led to the development of more advanced techniques. In recent years, machine learning and deep learning techniques have gained attention for their potential to improve the accuracy of phishing detection. Deep learning algorithms, such as CNNs and LSTMs, are designed to learn from patterns and identify anomalies in data, making them more effective in detecting sophisticated phishing attempts. To develop a comprehensive understanding of the current state of research on the use of deep learning techniques for phishing detection, a systematic literature review is necessary. This review aims to identify the various deep learning techniques used for phishing detection, their effectiveness, and areas for future research. By synthesizing the findings of relevant studies, this review identifies the strengths and limitations of different approaches and provides insights into the challenges that need to be addressed to improve the accuracy and effectiveness of phishing detection. This review aims to contribute to a coherent and evidence-based understanding of the use of deep learning techniques for phishing detection. The review identifies gaps in the literature and informs the development of future research questions and areas of focus. With the increasing sophistication of phishing attacks, the application of deep learning in this area is a critical and rapidly evolving field. This systematic literature review aims to provide insights into the current state of research and identify areas for future research to advance the field of phishing detection using deep learning.