Efficient management of urban transportation is crucial for addressing the growing challenges posed by increasing traffic and population. In this context, the use of big data and intelligent systems has become paramount. This paper introduces a comprehensive approach to traffic flow management at a key intersection in Tashkent, built on the integration of big data analytics and predictive modeling.
The wide application of data mining technology will certainly promote the development of music therapy. In this paper, we first analyze the development of music therapy and clarify the classification of orchestration. Second, for the time series models used in data mining, the mathematical methods of the ARMA and ARIMA models are investigated. Finally, the music therapy effect of guzheng performance training is predicted based on an ARIMA time series. Based on the playing of guzheng clips from Gaojia opera, the relationship between the total number of subjects and the number of music therapy sessions was examined, as well as the correlation among the number of active behaviors, self-expression, positive emotional expressions, and negative feelings of the subject children. Attentional concentration time showed a highly significant correlation with the number of music therapy sessions (P<0.02), while the number of active behaviors and self-expression showed a significant correlation (P<0.04).
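For reference, the two models named above are usually written as follows. This is the standard textbook form, not notation taken from the paper itself; the symbols $c$, $\varphi_i$, $\theta_j$, $\varepsilon_t$ and the backshift operator $B$ are the conventional ones and are assumed here.

```latex
% ARMA(p, q): the current value depends on p past values and q past noise terms
X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}

% ARIMA(p, d, q): difference the series d times, then model the result as ARMA(p, q)
Y_t = (1 - B)^d X_t, \qquad B X_t = X_{t-1}, \qquad
Y_t = c + \sum_{i=1}^{p} \varphi_i Y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}
```

With $d = 0$ the ARIMA model reduces to ARMA, which is why the two are typically studied together.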
The need for small and medium enterprises (SMEs) to adopt data analytics has reached a critical point, given the surge of data brought about by the advancement of technology. Although data mining (DM) is widely used in the transportation sector, very few case studies examine the application of DM by SMEs, specifically in the transportation sector. From the extensive review conducted, the three most common DM models used by large enterprises in the transportation sector are identified, namely “Knowledge Discovery in Databases” (KDD), “Sample, Explore, Modify, Model and Assess” (SEMMA), and “Cross‐Industry Standard Process for Data Mining” (CRISP‐DM). The same finding holds for SMEs across various industries. Among the three models, CRISP‐DM is the most widely applied commercially. However, despite CRISP‐DM being the de facto DM model in practice, an assessment of the strengths and weaknesses of the models reveals that they have several limitations with respect to SMEs. This paper concludes that there is a critical need for a novel model that caters to the requirements of SMEs, especially in the transportation sector.
This article is categorized under:
Application Areas > Business and Industry
Application Areas > Industry Specific Applications
Given the surge of data collected as technology advances, the need to adopt analytics has never been more critical. Despite CRISP‐DM being the de facto data‐mining model in practice, there is a critical need for a novel model that caters to SMEs in the transportation sector.
Since the spread of Covid-19 in Indonesia in early March 2020, the activities of educational institutions, including conventional learning, have been disrupted. Online learning at Singaperbangsa University began following a regulation from the Ministry of Education and Culture of the Republic of Indonesia. Online learning is subject to factors that influence concentration, such as signal quality, learning atmosphere, and teaching methods, and these factors affect the level of student satisfaction with learning. This study aims to determine the level of student satisfaction with online learning using the Naïve Bayes algorithm in RapidMiner. The resulting model achieved an accuracy of 76.92%, class precision of 100.00%, class recall of 57.14%, and an AUC value of 0.881, which indicates a good model. In other words, the results obtained using the Naïve Bayes algorithm can be used as input for decisions about the level of online learning satisfaction.
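To make the reported accuracy, precision, and recall concrete, the sketch below computes all three from a confusion matrix. The abstract does not report the matrix itself; the counts used here (4 true positives, 3 false negatives, 0 false positives, 6 true negatives) are a hypothetical choice that happens to reproduce the stated percentages, and the function name is illustrative.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return (accuracy, precision, recall) for a binary confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts: NOT reported in the abstract, chosen only because
# they are consistent with the percentages it states.
tp, fn, fp, tn = 4, 3, 0, 6

accuracy, precision, recall = classification_metrics(tp, fn, fp, tn)
print(f"accuracy={accuracy:.2%} precision={precision:.2%} recall={recall:.2%}")
```

Under these assumed counts, perfect precision alongside modest recall would mean the model never mislabels a negative case as positive but misses close to half of the positive cases.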
The purpose of this paper is to evaluate several machine learning models under the CRISP-DM methodology in order to determine, through their metrics, the best model for predicting the performance of high school students in the Colombian Caribbean region in the Saber 11º test. It also proposes a new methodology for evaluating the test results by region, so that the socioeconomic particularities of each region are taken into account. The CRISP-DM methodology is taken as a basis because of its maturity: it supports the extraction of business and data knowledge and offers a guide for data preparation, modeling, and validation of the models. It is expected that the proposed methodology will be implemented by the Colombian Institute for the Promotion of Higher Education (ICFES), departmental education secretariats, and educational institutions. A variety of techniques and tools were used to develop ETL processes that produce a data set with the most relevant attributes, in order to evaluate four machine learning models built with the J48 (C4.5), LMT, PART, and Multilayer Perceptron algorithms. The best data set was obtained with the InfoGain attribute selection method, and the best learning model with the LMT decision tree algorithm. This project will therefore help the actors of the National Education System make decisions that benefit students and the quality of education in the country, especially in the Caribbean region.
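InfoGain attribute selection ranks attributes by how much knowing the attribute reduces the entropy of the class label. The following is a minimal sketch of that calculation, not the implementation used in the study; the attribute names (`internet`, `stratum`), the target `result`, and the four records are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its expected entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value, count in Counter(row[attribute] for row in rows).items():
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += (count / len(rows)) * entropy(subset)
    return base - remainder


# Hypothetical toy records, invented for illustration only.
rows = [
    {"internet": "yes", "stratum": "high", "result": "pass"},
    {"internet": "yes", "stratum": "low", "result": "pass"},
    {"internet": "no", "stratum": "high", "result": "fail"},
    {"internet": "no", "stratum": "low", "result": "fail"},
]

gains = {name: information_gain(rows, name, "result") for name in ("internet", "stratum")}
ranked = sorted(gains, key=gains.get, reverse=True)
print(ranked, gains)
```

In this toy data `internet` separates the classes perfectly while `stratum` carries no information, so an InfoGain ranking would keep the former and drop the latter.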
Abstract
As a result of the increasing challenges in the field of lightweight construction, the demand for efficient joining technologies is continuously rising. For this purpose, cold forming processes offer an environmentally friendly and fast alternative to established joining methods (e.g. welding). However, to ensure high reliability, not only the selection of an appropriate procedure but also the dimensioning of the individual joint is essential. While product designers can rely on a wide range of design principles for thermal processes, the dimensioning and evaluation of mechanical joining processes is mainly based on expert knowledge and a few experimental tests. Although a few studies have already investigated the numerical analysis of mechanical joints, an approach for the sustainable and consistent optimization of the strength and reliability of joints for varying use cases is not yet available. Motivated by this gap, this paper presents an approach for the automated transfer of information within the process chain and the data-based analysis of mechanical joints, using clinching as an example. The CRISP-DM reference model is used to structure the data mining systematically.
Telecommunication is one of the fastest growing industrial sectors, and the number of telecommunication companies keeps increasing. This creates various threats for a company that does not apply its strategy properly. Customer churn, the rate at which customers leave, is one such threat because it reduces the company's revenue. It is therefore important for developing companies to evaluate churn in order to reduce its potential. The first step is to predict which customers are likely to switch away from the company, and data mining is one approach to doing so. Classification is a data mining technique that predicts the class of records in a dataset, and various classification algorithms exist. The purpose of this study is to identify the effect of the number of dataset records on several classification algorithms. The research followed the CRISP-DM method and applied three classification algorithms, namely Logistic Regression, Naïve Bayes, and Decision Tree C4.5. The results showed that the greater the number of records in the dataset, the higher the accuracy obtained. In dataset-1, logistic regression is the better algorithm based on an accuracy of 80.09%, while naïve Bayes is superior based on an AUC value of 0.733 and an execution time of 0.00798 seconds. In dataset-2, decision tree is more suitable than logistic regression and naïve Bayes, with an accuracy of 91.9% and an AUC value of 0.846, which falls within the good classification criteria. In execution time, however, naïve Bayes is fastest, taking only 0.00403 seconds.
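The AUC values reported above can be read as the probability that a randomly chosen churner receives a higher predicted score than a randomly chosen non-churner. A minimal sketch of that pairwise definition follows; the labels and scores are hypothetical, and the study's own tooling is not specified in the abstract.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = customer churned, 0 = customer stayed.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]  # predicted churn probabilities

churn_auc = auc(labels, scores)
print(f"AUC = {churn_auc:.3f}")
```

The pairwise form is quadratic in the number of records, which is fine for illustration; rank-based formulas give the same value more cheaply on large datasets.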
• CRISP-DM methodology is used for crime data mining.
• Crime against women dataset is gathered from the National Crime Record Bureau website year-wise from 2001 to 2020.
• Crime trends, regression analysis, correlation gradient, correlation heat map and choropleth map are analysed.
• Forecasts crime under Indian Penal Code in categories with the accuracy of 72.29, 92.15, 83.30 and 84.33% respectively.
Crime against women (CAW) in India refers to violence against women, which has remained on par with previous years. India's dense population has added to the figures of crime against women. This paper studies the crime against women dataset published by the NCRB (National Crime Record Bureau) from 2001 to 2020 for all 27 states and 9 union territories. EDA (exploratory data analysis) combined with linear regression is a powerful way to understand the relationship between various factors and the incidence of crime against women. EDA is the process of analysing and summarizing the main characteristics of a data set through visualizations, descriptive statistics, and other techniques. Linear regression is a statistical method that models the relationship between a dependent variable and one or more independent variables. To accomplish this, the dataset's crime categories under the Indian Penal Code (IPC) are considered, such as rape, cruelty by husband and his relatives, kidnapping and abduction, dowry deaths, assault on women with intent to outrage her modesty, insult to modesty of women, and human trafficking. The CRISP-DM methodology allows for a consistent and structured approach to data mining, which reduces the risk of errors and improves the chances of success in predicting the crime rate. The proposed model includes several data analytics steps to pre-process the datasets and visualize the crime rate. Visualizing the data helps to uncover trends present in the crime dataset. The proposed predictive model analyses the data and predicts crime against women under four IPC categories with accuracies of 72.29, 92.15, 83.30 and 84.33% respectively.
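As an illustration of the linear regression step described above, the sketch below fits a least-squares line to yearly counts and extrapolates one year ahead. The yearly figures are hypothetical placeholders, not NCRB data, and the function name is illustrative; the paper's actual model and features are not detailed in the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly case counts, invented for illustration only.
years = [2001, 2002, 2003, 2004, 2005]
cases = [100.0, 120.0, 140.0, 160.0, 180.0]

slope, intercept = fit_line(years, cases)
forecast_2006 = slope * 2006 + intercept
print(f"trend: {slope:.1f} cases/year, forecast for 2006: {forecast_2006:.1f}")
```

Here the independent variable is the year and the dependent variable is the case count; adding further predictors would turn this into multiple linear regression.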
In any business organization, database infrastructures are subject to various structured query language (SQL) injection attacks, such as tautologies, alternative coding, stored procedures, use of the union operator, and piggybacking, among others. This article describes a data mining project developed to address the problem of identifying SQL injection attacks on databases. The project was conducted using an adaptation of the cross-industry standard process for data mining (CRISP-DM) methodology. A total of 12 Python libraries were used for cleaning, transformation, and modeling. The anomaly detection model was built using clustering with the k-nearest neighbors (kNN) algorithm. The query text of the groups with anomalies was analyzed to identify statements presenting attack traces. A web interface was implemented to display the daily summary of the attacks found. The data source was the transaction log of a PostgreSQL database server. Our results allowed the identification of different SQL injection attacks at a rate above 80%. The execution time for processing half a million transaction log entries was approximately 60 minutes using a computer with the following characteristics: Intel® Core i7 processor 7th generation, 12GB RAM and 500GB SSD.
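One common way to use nearest neighbors for anomaly detection is to score each record by its distance to its k-th nearest neighbor, so that isolated records score highest. The sketch below shows that idea only; it is an assumption about how such a model could look, not the article's pipeline, and the two numeric features per query (length in characters, number of quote characters) are hypothetical.

```python
from math import dist


def knn_anomaly_scores(points, k):
    """Score each point by the Euclidean distance to its k-th nearest neighbor."""
    if not 1 <= k < len(points):
        raise ValueError("k must be at least 1 and smaller than the number of points")
    scores = []
    for i, point in enumerate(points):
        # Distances to every other point, nearest first.
        distances = sorted(dist(point, other) for j, other in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Hypothetical features per logged query: (length in characters, quote count).
queries = [(40, 0), (42, 0), (38, 1), (41, 0), (200, 9)]

scores = knn_anomaly_scores(queries, k=2)
most_anomalous = max(range(len(queries)), key=scores.__getitem__)
print(f"most anomalous query index: {most_anomalous}")
```

The brute-force search is quadratic in the number of records, so a log of the size mentioned above would call for an indexed neighbor search rather than this loop.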
Purpose: Data-driven quality management systems, brought about by the implementation of digitisation and digital technologies, are an integral part of improving supply chain management performance. The purpose of this study is to determine a methodology to aid the implementation of digital technologies and digitisation of the supply chain to enable data-driven quality management and the reduction of waste from manufacturing processes.
Design/methodology/approach: Methodologies from both the quality management and data science disciplines were implemented together to test their effectiveness in digitalising a manufacturing process to improve supply chain management performance. The hybrid digitisation approach to process improvement (HyDAPI) methodology was developed using findings from the industrial use case.
Findings: Upon assessment of the existing methodologies, Six Sigma and CRISP-DM were found to be the most suitable process improvement and data mining methodologies, respectively. The case study revealed gaps in the implementation of both the Six Sigma and CRISP-DM methodologies in relation to digitisation of the manufacturing process.
Practical implications: Valuable practical learnings borne out of the implementation of these methodologies were used to develop the HyDAPI methodology. This methodology offers a pragmatic step-by-step approach for industrial practitioners to digitally transform their traditional manufacturing processes to enable data-driven quality management and improved supply chain management performance.
Originality/value: This study proposes the HyDAPI methodology, which utilises key elements of the Six Sigma DMAIC and CRISP-DM methodologies along with additions proposed by the author, to aid the digitisation of manufacturing processes, leading to data-driven quality management of operations within the supply chain.