With the significant growth of internet usage, people increasingly share their personal information online. As a result, an enormous amount of personal information and financial transactions become vulnerable to cybercriminals. Phishing is a highly effective form of cybercrime that enables criminals to deceive users and steal important data. Since the first reported phishing attack in 1990, it has evolved into a more sophisticated attack vector. At present, phishing is considered one of the most frequent forms of fraudulent activity on the Internet. Phishing attacks can lead to severe losses for their victims, including the loss of sensitive information, identity theft, and the exposure of company and government secrets. This article aims to evaluate these attacks by identifying the current state of phishing and reviewing existing phishing techniques. Previous studies have classified phishing attacks according to fundamental phishing mechanisms and countermeasures, disregarding the importance of the end-to-end lifecycle of phishing. This article proposes a new detailed anatomy of phishing which involves attack phases, attacker types, vulnerabilities, threats, targets, attack mediums, and attacking techniques. The proposed anatomy will help readers understand the lifecycle of a phishing attack, which in turn will increase awareness of these attacks and the techniques being used; it also helps in developing a holistic anti-phishing system. Furthermore, some precautionary countermeasures are investigated, and new strategies are suggested.
• Use of 7 different classification algorithms and NLP-based features.
• A big URL dataset is produced and shared (36,400 legitimate and 37,175 phishing).
• Real-time and language-independent classification algorithms.
• Feature-rich classifiers with word vectors, NLP-based and hybrid features.
• The proposed approach reaches a 97.98% accuracy rate.
Due to the rapid growth of the Internet, users are shifting their preference from traditional shopping to electronic commerce. Instead of robbing banks or shops, criminals now try to find their victims in cyberspace using specific tricks. Exploiting the anonymous structure of the Internet, attackers devise new techniques, such as phishing, which uses false websites to deceive victims and collect their sensitive information such as account IDs, usernames, passwords, etc. Determining whether a web page is legitimate or phishing is a very challenging problem, due to the semantics-based structure of the attack, which mainly exploits computer users' vulnerabilities. Although software companies launch new anti-phishing products that use blacklists, heuristics, visual and machine learning-based approaches, these products cannot prevent all phishing attacks. In this paper, a real-time anti-phishing system, which uses seven different classification algorithms and natural language processing (NLP) based features, is proposed. The system has the following properties that distinguish it from other studies in the literature: language independence, use of a large volume of phishing and legitimate data, real-time execution, detection of new websites, independence from third-party services and use of feature-rich classifiers. To measure the performance of the system, a new dataset is constructed, and the experiments are run on it. According to the experimental and comparative results from the implemented classification algorithms, the Random Forest algorithm with only NLP-based features gives the best performance, with a 97.98% accuracy rate for detection of phishing URLs.
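To make the idea of NLP-based URL features concrete, the following is a minimal sketch of word-level feature extraction from a URL string. The feature names, the `SUSPICIOUS_WORDS` list and the `url_features` function are assumptions chosen for illustration; the abstract does not list the paper's actual features, and a classifier such as Random Forest would be trained on vectors like these in a separate step.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword list; the paper's actual vocabulary is not given in the abstract.
SUSPICIOUS_WORDS = {"login", "secure", "verify", "account", "update", "signin"}

def url_features(url: str) -> dict:
    """Extract a few illustrative word-level features from a URL string."""
    host = urlparse(url).netloc
    # Split the whole URL into lowercase alphanumeric tokens.
    words = [w for w in re.split(r"[^A-Za-z0-9]+", url.lower()) if w]
    return {
        "url_length": len(url),
        "word_count": len(words),
        "longest_word": max((len(w) for w in words), default=0),
        "digit_count": sum(ch.isdigit() for ch in url),
        "host_dot_count": host.count("."),
        "suspicious_word_count": sum(w in SUSPICIOUS_WORDS for w in words),
        "has_at_symbol": "@" in url,
    }

features = url_features("http://secure-login.example.com/verify/account123")
```

Because the features are computed from the URL string alone, a sketch like this needs no page download or third-party lookup, which is in keeping with the real-time and third-party-independence properties the abstract describes.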
Nowadays, individuals and organizations are increasingly targeted by phishing attacks, so an accurate phishing detection system is required. Accordingly, many phishing detection techniques have been proposed and phishing datasets have been collected. In this paper, three datasets have been used to train and test machine learning classifiers. The datasets have been archived by PhishTank and the UCI Machine Learning Repository. The Information Gain (IG) algorithm has been used for feature reduction and selection. Six machine learning classifiers have been evaluated, namely NaiveBayes, ANN, DecisionStump, KNN, J48 and RandomForest. The classifiers have been trained and tested over the three datasets in two stages. The first stage uses all features included in each dataset, while the second stage uses only the features selected by the IG algorithm. In the first stage, the RandomForest classifier showed the best performance over Dataset-1 and Dataset-2, while J48 showed the best performance over Dataset-3. After feature selection, the RandomForest classifier outperformed the other five classifiers over Dataset-1 and Dataset-2, with accuracies of 98% and 93.66% respectively, while the ANN classifier showed the best performance over Dataset-3, with an accuracy of 88.92%. The performance of the classifiers on Dataset-3 was affected by its small number of instances and features compared to the other two datasets.
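The feature selection stage can be sketched as follows. This is a minimal illustration of Information Gain ranking for discrete features, not the authors' implementation; the function names, the toy feature columns (`has_ip`, `noise`) and the labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def select_features(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]

# Toy data (hypothetical): one perfectly informative feature, one uninformative.
labels = ["phish", "phish", "legit", "legit"]
columns = {"has_ip": [1, 1, 0, 0], "noise": [0, 1, 0, 1]}
selected = select_features(columns, labels, k=1)
```

In this toy case `has_ip` separates the two classes completely and `noise` carries no information about the label, so only `has_ip` is kept. The second experimental stage described above corresponds to training the classifiers on the selected columns only.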
Phishing is a form of identity theft that deceives Internet users into revealing their sensitive data, e.g., login information, credit/debit card details, and so on. Researchers have developed various anti-phishing methods in recent years. However, the problem still exists. Therefore, this paper presents a detailed analysis of phishing attack methods and defense techniques. The survey is presented in five parts. First, we discuss in detail the lifecycle of a phishing attack, its history, and the motivation behind it. Second, we present various distribution methods that are used to spread phishing attacks. Third, we provide a taxonomy of various phishing techniques in desktop and mobile environments. Fourth, we present numerous phishing protection mechanisms and compare them. Finally, the article presents various performance challenges faced by developers while dealing with this crucial attack. This paper also describes the consequences of phishing attacks in emerging domains like mobile and online social networks. It will help users avoid phishing attacks while using the Internet for their day-to-day activities, and will guide business administrators in designing new effective solutions for their enterprises against various types of phishing threats.
Public health responses to the COVID-19 pandemic since March 2020 have led to lockdowns and social distancing in most countries around the world, with a shift from the traditional work environment to a virtual one. Employees have been encouraged to work from home where possible to slow the spread of the virus. The massive increase in the volume of professional activities carried out online has created a new context for cybercrime, with an increase in the number of emails and phishing websites. Phishing attacks have broadened and expanded over the years of the COVID-19 pandemic. This paper presents a novel approach for detecting phishing Uniform Resource Locators (URLs) using the Gated Recurrent Unit (GRU), yielding a fast and highly accurate phishing classifier system. Comparative analysis indicates that the GRU classification system achieves better accuracy (98.30%) than other classifier systems.
In recent times, phishing has become one of the most prominent attacks faced by internet users, governments, and service-providing organizations. In a phishing attack, the attacker collects the client's sensitive data (i.e., user account login details, credit/debit card numbers, etc.) by using spoofed emails or fake websites. Phishing websites are common entry points for online social engineering attacks, including numerous frauds on websites. In such attacks, the attacker creates web pages that copy the behavior of legitimate websites and sends the URLs to the targeted victims through spam messages, texts, or social networking. To provide a thorough understanding of phishing attacks, this paper provides a literature review of Artificial Intelligence (AI) techniques for phishing attack detection: Machine Learning, Deep Learning, Hybrid Learning, and Scenario-based techniques. This paper also compares the studies that detect phishing attacks with each AI technique and examines the strengths and shortcomings of these methodologies. Furthermore, this paper provides a comprehensive set of current challenges of phishing attacks and future research directions in this domain.
The phishing attack is an emerging malicious threat on the internet in which hackers try to obtain user credentials, such as login information or Internet banking details, through pirated websites. Using that information, they get into the original website and try to modify or steal information. The problem with traditional defense systems like firewalls is that they can only stop certain types of attacks because they rely on a fixed set of rules. As a result, a client-side defense mechanism is needed that can learn potential attack vectors in order to detect and prevent not only known but also unknown types of attack. Feature selection plays a key role in machine learning by keeping only the required features and eliminating the irrelevant ones from the real-time dataset. The proposed model uses Hyperparameter Optimized Artificial Neural Networks (H-ANN) combined with a Hybrid Firefly and Grey Wolf Optimization algorithm (H-FFGWO) to detect and block phishing websites in Internet of Things (IoT) applications. In this paper, the H-FFGWO is used for feature selection from the phishing datasets ISCX-URL, OpenPhish, the UCI Machine Learning Repository, the Mendeley website dataset and PhishTank. The results showed that the proposed model had an accuracy of 98.07%, a recall of 98.04%, a precision of 98.43%, and an F1-score of 98.24%.
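The four reported metrics are related through the confusion matrix. The sketch below shows the standard definitions; the function names and the confusion-matrix counts are hypothetical, since the abstract gives only the final percentages.

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary metrics, with 'phishing' as the positive class."""
    precision = tp / (tp + fp)   # share of flagged sites that are phishing
    recall = tp / (tp + fn)      # share of phishing sites that are flagged
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)

# Harmonic mean of the precision and recall reported in the abstract.
reported_f1 = f1_score(0.9843, 0.9804)
```

Applying the harmonic mean to the reported precision (98.43%) and recall (98.04%) gives about 98.23%, which agrees with the reported F1-score of 98.24% to within rounding of the inputs.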
Phishing attacks remain a significant cybersecurity threat in the digital landscape, leading to the development of defense mechanisms. This paper presents a thorough examination of Artificial Intelligence (AI)-based ensemble methods for detecting phishing attacks delivered through websites, emails, and SMS. Through the screening of research articles published between 2019 and 2023, 37 relevant studies were identified and analyzed. Key findings highlight the prevalence of ensemble methods such as AdaBoost, Bagging, and Gradient Boosting in phishing attack detection models. AdaBoost emerged as the most used method for website phishing detection, while Stacking and AdaBoost were prominent choices for email phishing detection. The majority-voting ensemble method was frequently employed in SMS phishing detection models. The performance evaluation of these ensemble methods involves metrics such as accuracy, ROC-AUC, and F-score, underscoring their effectiveness in mitigating phishing threats. This study also underscores the availability of credible open-access datasets for the progressive development and benchmarking of phishing attack detection models. The findings of this study suggest the development of new and optimized ensemble methods for phishing attack detection.
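As an illustration of the majority-voting ensemble mentioned for SMS phishing detection, here is a minimal sketch. The three rule-based base classifiers are hypothetical stand-ins for trained models; the surveyed studies use learned classifiers whose details are not given in the abstract.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the largest number of base classifiers."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical base classifiers: each maps a message to a label.
def has_link(msg):
    return "phishing" if "http" in msg.lower() else "legitimate"

def has_urgent_words(msg):
    urgent = ("urgent", "winner", "verify")
    return "phishing" if any(w in msg.lower() for w in urgent) else "legitimate"

def has_many_digits(msg):
    return "phishing" if sum(c.isdigit() for c in msg) >= 6 else "legitimate"

def ensemble_predict(classifiers, msg):
    """Collect one vote per classifier and return the majority label."""
    return majority_vote([clf(msg) for clf in classifiers])

# An odd number of voters avoids ties between two labels.
voters = [has_link, has_urgent_words, has_many_digits]
label = ensemble_predict(voters, "URGENT: verify your account at http://example.test")
```

In this example two of the three voters flag the message, so the ensemble labels it as phishing even though the digit-count rule does not fire.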
Detecting Phishing Website with Machine Learning. Priya Darshini, Smt.V.; Srilatha, P.; Neelima, P. International Journal of Recent Technology and Engineering, 09/2019, Volume 8, Issue 3. Journal article, open access.
There are many types of attacks that disrupt networks or websites. Phishing attacks (PA) are a type of attack that targets a website, may damage it, and may cause data loss. Much research has been done to prevent such attacks. In this paper, an integrated phishing attack detection system based on an SVM classifier is implemented to detect phishing websites. Phishing is a cyber attack that can destroy a website and may infect it with a virus. Two parameters, identity and security, determine the final phishing detection rate. Phishing attacks also occur on various banking and e-commerce websites. This paper uses the UCI machine learning phishing dataset, which consists of 32 attributes. The proposed algorithm is applied to this dataset and its performance is reported.
Today's world is heading towards complete digital transformation, and alongside its advantages, this transformation involves many risks, the most important of which is phishing. This paper proposes a system that classifies an email as phishing or legitimate. Initially, samples are collected from different datasets, and the system then extracts features from all parts of the email. The proposed system uses a machine learning algorithm (the K-means algorithm) to select the valuable features, and it uses four methods to calculate the distance in the K-means algorithm. After feature selection, the paper uses an ANN as a classifier to classify emails into phishing and ham, and the proposed system tunes the parameters of the ANN to obtain high accuracy. The proposed system achieved an accuracy of 99.4%.