► A new feature selection method is introduced using similarity and fuzzy entropy measures.
► The similarity and fuzzy entropy based feature selection method is successfully used in the classification of four medical data sets.
► Higher classification accuracy is gained with this combination compared to the original classification method.
► The technique reduces the dimension of the data by removing irrelevant features and hence also reduces computational time.
Feature selection plays an important role in classification for several reasons. First, it can simplify the model and thereby reduce computational cost; when the model is put to practical use, fewer inputs are needed, which in practice means that fewer measurements from new samples are required. Second, by removing insignificant features from the data set, one can also make the model more transparent and more comprehensible, providing a better explanation of the suggested diagnosis, which is an important requirement in medical applications. The feature selection process can also reduce noise and thereby enhance classification accuracy. In this article, a feature selection method based on fuzzy entropy measures is introduced and tested together with a similarity classifier. The model was tested with four medical data sets: dermatology, Pima-Indian diabetes, breast cancer, and Parkinson's. With all four data sets, we managed to get quite good results using fewer features than in the original data sets. With the Parkinson's and dermatology data sets, we also managed to enhance classification accuracy significantly this way. The mean classification accuracy with the Parkinson's data set was 85.03% using only two of the original 22 features, and with the dermatology data set a mean accuracy of 98.28% was achieved using 29 of the 34 original features. The results can be considered quite good.
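To illustrate the core idea, the sketch below ranks features with the De Luca–Termini fuzzy entropy computed on Łukasiewicz-style similarities to class ideal vectors: features whose similarity values are most "fuzzy" (highest entropy) are the least informative and can be dropped. The function names, the choice p = 1, and the use of class means as ideal vectors are illustrative assumptions, not the paper's exact formulation.

```python
import math

def luca_termini_entropy(mu):
    """De Luca-Termini fuzzy entropy of a list of membership values in [0, 1]."""
    h = 0.0
    for m in mu:
        if 0.0 < m < 1.0:  # the limit of x*log(x) at 0 and 1 is 0
            h -= m * math.log(m) + (1.0 - m) * math.log(1.0 - m)
    return h

def feature_entropies(X, y, p=1.0):
    """Fuzzy entropy per feature, from similarities to class ideal vectors.

    X: samples scaled feature-wise to [0, 1]; y: class labels.
    Features with the highest entropy are candidates for removal.
    """
    classes = sorted(set(y))
    # Ideal vector of a class = feature-wise mean of its samples (assumption).
    ideals = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        ideals[c] = [sum(col) / len(rows) for col in zip(*rows)]
    entropies = []
    for j in range(len(X[0])):
        # Lukasiewicz-style similarity of each sample to its own class ideal.
        sims = [(1.0 - abs(x[j] - ideals[label][j]) ** p) ** (1.0 / p)
                for x, label in zip(X, y)]
        entropies.append(luca_termini_entropy(sims))
    return entropies
```

On a toy data set where one feature separates the classes cleanly and another is noise, the noisy feature receives the higher entropy and would be removed first.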
Additive manufacturing (AM) is a promising technology for designing complex metallic pieces for different sectors with resource and time effectiveness. Titanium (Ti) is an essential critical material for AM development. AM can produce intricate and cost-effective components with Ti alloys for the transportation sector which would not be possible with conventional manufacturing (CM) technologies. This study assesses the impact of AM on the life cycle of Ti and its alloys by using a review (numerical data, case examples) and dynamics simulation modelling. This article quantifies potential environmental benefits and examines aspects related to using Ti alloys in the automotive and aerospace industries. Mass flow, energy consumption and related greenhouse gas (GHG) emissions are assessed by comparing subcategories of AM, including binder jetting (BJT), directed energy deposition (DED), electron beam-based powder bed fusion (EB-PBF), and laser-based powder bed fusion (L-PBF), with CM processes including forging, milling, machining, and die casting. The results show that the AM subcategories considered potentially reduce manufacturing-phase energy consumption and GHG emissions, except for L-PBF. The findings highlight that an inclusive consideration of all life cycle phases is needed to fully identify the potential benefits of AM for industries. Also, the scenario analysis in this study points to the opportunity of saving mass and minimizing energy consumption and GHG emissions by optimizing the structural design and manufacturing processes for Ti components.
•All life cycle phases are needed to identify the potential benefits of AM fully.
•AM processes such as BJT, Wire DED and EB-PBF consume less energy.
•L-PBF is identified to have the most SEC and GHG emissions.
•Optimizing design and maximizing build platform utilization offset high AM energy.
•Optimized lightweight Ti components are promising for less environmental impact.
This paper introduces a new defuzzification technique derived as a generalization of the formula for the calculation of the possibilistic mean originally proposed by Carlsson and Fullér in 2001 for fuzzy numbers. Unlike the possibilistic mean, the generalized formulation also allows for the defuzzification of subnormal convex fuzzy sets as well as non-convex fuzzy sets (e.g. the outputs of Mamdani- or Larsen-type fuzzy inference). The Luukka–Stoklasa–Collan transformation introduced in 2019 is applied to generalize the possibilistic mean formula. Using this transformation, an algorithm for the calculation of the possibilistic-mean-based defuzzification of a general fuzzy set with a continuous membership function on a given interval is proposed. This way the Luukka–Stoklasa Defuzzification (LSD), inspired by the possibilistic mean construction, is introduced: a defuzzification that can be calculated for fuzzy sets in general (subnormal, non-convex), not only for fuzzy numbers. As such, LSD is also applicable in fuzzy expert systems and fuzzy control settings, where the outputs of the inference systems can be expected to be represented by subnormal and non-convex fuzzy sets. Fast-computation formulas for the LSD of piece-wise linear fuzzy sets are also provided. The applicability of LSD in the ranking of fuzzy numbers, and its ability to distinguish between fuzzy numbers where other frequently used defuzzification methods cannot, is shown. Two more case studies are presented where LSD outperforms the chosen frequently used defuzzification methods: a fuzzy expert system for inventory control and a fuzzy cruise-controller problem.
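The possibilistic mean that LSD generalizes can be approximated numerically from α-cuts. The sketch below does this for a triangular fuzzy number, where the closed form (a + 4b + c)/6 is known; the discretization scheme and function name are illustrative, not the paper's algorithm for general fuzzy sets.

```python
def possibilistic_mean_triangular(a, b, c, levels=10_000):
    """Approximate M(A) = int_0^1 alpha * (a1(alpha) + a2(alpha)) d(alpha)
    for a triangular fuzzy number (a, b, c), via midpoint-rule integration
    over alpha-levels (Carlsson-Fuller possibilistic mean)."""
    total = 0.0
    d_alpha = 1.0 / levels
    for i in range(levels):
        alpha = (i + 0.5) * d_alpha       # midpoint of the alpha slice
        left = a + alpha * (b - a)        # left endpoint of the alpha-cut
        right = c - alpha * (c - b)       # right endpoint of the alpha-cut
        total += alpha * (left + right) * d_alpha
    return total

# For a triangular fuzzy number the closed form is (a + 4b + c) / 6.
print(round(possibilistic_mean_triangular(1.0, 2.0, 3.0), 4))  # → 2.0
```

The same α-level integration carries over to general continuous membership functions once the cut endpoints are computed numerically, which is the setting LSD addresses.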
This paper introduces a new type of mean applicable in various areas of science and practice: the α-weighted averaging operator (AWA). AWA has all the properties required from a linear averaging operator and some additional ones. We discuss the applications of AWA in data aggregation in various areas, including uncertainty modeling (summarization, defuzzification), multiple-criteria and multi-expert decision-making, and evaluation. We prove that, when applied to fuzzy numbers, the α-weighted average converges to the possibilistic mean of a fuzzy number as the number of elements in its support increases. As such, the α-weighted average is a more general aggregation operator than the original possibilistic mean. When fuzzy subsets of the real line represent the information to be aggregated, AWA provides new means for their defuzzification that are compatible with the possibilistic moments but applicable to discrete and subnormal fuzzy sets. We also introduce a generalized formulation of the α-weighted averaging operator (GAWA) that can be applied in multiple-criteria and multi-expert evaluation and decision-making problems. We suggest the use of GAWA in operations research theory and applications in the context of data aggregation, multiple-criteria and group evaluation, and decision-making.
•A new aggregation operator called the α-weighted averaging operator is introduced.
•The operator reflects uncertainty, time indices etc. using the α-weights.
•Basic properties of the aggregation operator are summarized.
•Its applicability in defuzzification and aggregation of evaluations is established.
•The α-weighted averaging operator generalizes the possibilistic mean of fuzzy numbers.
Histograms are an intuitively understandable tool for graphically presenting frequency data, and they are widely available for and useful in modern data analysis; this also makes comparing histograms an interesting field of research. The concepts of similarity and similarity measures have been gaining in importance, because they can replace simpler distance measures in many data-analysis applications. In this paper we concentrate on circular histograms, which are well-suited for time- or direction-stamped frequency data, and especially on the comparison of circular histograms by way of similarity. We focus on Łukasiewicz many-valued-logic-based similarities and introduce a new similarity measure, the “modulo similarity”, for circular problems. We prove that modulo similarity is a similarity measure in the strict sense. We also present a new compatibility measure, the “maximum pair assignment compatibility”, which can be used in lossless sample-based comparison of histograms. We demonstrate the usefulness of these two new concepts by numerically applying them to a comparison of circular histograms and comparatively analyzing the results against those from a bin-based Łukasiewicz many-valued-logic-based method for the comparison of histograms.
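The key property of a circular similarity, that distance wraps around the circle so the last and first bins are neighbors, can be sketched as follows. The normalization below (dividing the shortest wrap-around distance by half the circle) is an illustrative assumption and may differ from the paper's exact modulo similarity.

```python
def circular_similarity(i, j, n):
    """Illustrative Lukasiewicz-style similarity between bin positions i and j
    on a circular histogram with n bins.

    Uses the shortest arc between the bins, so e.g. hours 23 and 0 on a
    24-bin clock are nearly identical rather than maximally different.
    """
    d = min(abs(i - j), n - abs(i - j))   # shortest wrap-around distance
    return 1.0 - d / (n // 2)             # 1 at the same bin, 0 at the opposite bin
```

On a 24-hour histogram, a plain linear similarity 1 − |23 − 0|/24 would treat 23:00 and 0:00 as almost maximally dissimilar, while the circular version correctly rates them as close.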
Stock markets can be interpreted to a certain extent as prediction markets, since they can incorporate and represent the different opinions of investors who disagree on the implications of the available information on past and expected events and trade on their beliefs in order to achieve profits. Many forecast models have been developed for predicting the future state of stock markets, with the aim of using this knowledge in a trading strategy. This paper interprets the classification of the S&P500 open-to-close returns as a four-class problem. We compare four trading strategies based on a random forest classifier to a buy-and-hold strategy. The results show that predicting the classes with higher absolute returns, ‘strong positive’ and ‘strong negative’, contributed the most to the trading strategies on average. This finding can help shed light on the way in which using additional event outcomes for the classification beyond a simple upward or downward movement can potentially improve a trading strategy.
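The four-class labeling of open-to-close returns might be sketched as below before the labels are fed to a random forest classifier. The ±0.5% threshold and the names of the two middle classes are hypothetical choices for illustration; the paper confirms only the ‘strong positive’ and ‘strong negative’ classes.

```python
def label_return(r, strong=0.005):
    """Map an open-to-close return r to one of four classes.

    strong: boundary between 'weak' and 'strong' moves (0.5% is an
    assumed value, not taken from the paper).
    """
    if r >= strong:
        return "strong positive"
    if r >= 0.0:
        return "weak positive"
    if r > -strong:
        return "weak negative"
    return "strong negative"
```

With labels like these, a classifier can be trained on features available at the open, and a trading strategy can act only on the high-conviction ‘strong’ predictions.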
In this literature review, we investigate machine learning techniques that are applied to stock market prediction. A focus area of this review is the stock markets investigated in the literature, as well as the types of variables used as input in the machine learning techniques used for predicting these markets. We examined 138 journal articles published between 2000 and 2019. The main contributions of this review are: (1) an extensive examination of the data, in particular the markets and stock indices covered in the predictions, as well as the 2173 unique variables used for stock market predictions, including technical indicators, macro-economic variables, and fundamental indicators, and (2) an in-depth review of the machine learning techniques and their variants deployed for the predictions. In addition, we provide a bibliometric analysis of these journal articles, highlighting the most influential works and articles.
•A systematic review of 138 related journal articles published during 2000–2019.
•The North American market is covered most, especially the S&P500 index, followed by Asia.
•Technical indicators (e.g., return, simple moving average, RSI) are common predictors.
•Neural networks and support vector machines are frequently used algorithms.
•Growing use of deep learning methods and textual data in recent research articles.
In this paper we propose three novel feature ranking methods for supervised feature selection in the context of classification, based on possibility theory. All three methods (nonspecificity, strife, and total uncertainty) are tested on eight artificial data sets and ten medical real-world data sets and benchmarked against ReliefF, the Fisher score, the fuzzy entropy and similarity (FES) filter, the fuzzy similarity and entropy (FSAE) filter, symmetrical uncertainty, and using no feature selection. The feature ranking methods were applied following two approaches: (1) using a fixed threshold for the number of highest-ranking features selected, and (2) using a hybrid feature selection approach with a classifier (k-nearest neighbor classifier, decision tree, similarity classifier, SVM) to select the optimal number of features. The results indicate that strife and the Fisher score are the two feature ranking methods that, for both approaches, are on average ranked highest in terms of test set accuracy on the real-world data sets. Moreover, for the hybrid approach, strife most of the time uses a considerably smaller number of features than nonspecificity and total uncertainty. In terms of stability, measured with the adjusted stability measure (ASM), the Fisher score and strife were among the most stable feature ranking methods in this study. Additionally, strife’s feature subsets were diverse compared to those of the remaining feature selection methods, making it a good candidate for inclusion in a feature selection ensemble.
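The nonspecificity measure comes from possibility theory; a minimal sketch of the Higashi–Klir U-uncertainty of a possibility distribution, the building block of such a ranking, is given below. How the paper derives feature-wise possibility distributions from the data is not reproduced here, and the direct use of the U-uncertainty is an illustrative simplification.

```python
import math

def nonspecificity(r):
    """Higashi-Klir U-uncertainty of a possibility distribution r.

    Assumes r is normalized (max(r) == 1). After sorting r in descending
    order as r_1 >= ... >= r_n and setting r_{n+1} = 0:
        U(r) = sum_i (r_i - r_{i+1}) * log2(i)
    U is 0 for a crisp singleton and log2(n) for total ignorance.
    """
    r = sorted(r, reverse=True)
    r.append(0.0)
    return sum((r[i] - r[i + 1]) * math.log2(i + 1) for i in range(len(r) - 1))
```

A feature whose class-conditional possibility distributions are sharply peaked (low nonspecificity) discriminates well; a feature producing flat distributions (high nonspecificity) carries little information for the classification task.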
In the literature, researchers and practitioners can find a manifold of algorithms for performing a classification task. The similarity classifier is one of the more recently suggested classification algorithms. In this paper, we suggest a novel similarity classifier with multiple ideal vectors per class, generated with k-means clustering in combination with the jump method. Two approaches for pre-processing are presented: via simple standardization, and via principal component analysis in combination with the MAP test and Parallel Analysis. On the artificial data sets, the novel classifier with standardization and with transformation power Y = 1 for the jump method results in significantly higher mean classification accuracies than the standard classifier. The results on the artificial data sets demonstrate that, in contrast to the standard similarity classifier, the novel approach can cope with more complex data structures. For the real-world credit data sets, the novel similarity classifier with standardization and Y = 1 achieves competitive results or even outperforms the k-nearest neighbour classifier, the Naive Bayes algorithm, decision trees, random forests and the standard similarity classifier.
•Introduction of a novel similarity classifier with multiple ideal vectors.
•Ideal vectors are determined with k-means clustering and the jump method.
•The model is tested on three artificial and three real-world data sets.
•The novel classifier is often significantly better than the standard similarity classifier.
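The classification step with several ideal vectors per class can be sketched as follows: a sample is assigned the class of the most similar ideal vector. The similarity form below (one common Łukasiewicz-structure variant with parameter p) and the toy ideal vectors are illustrative assumptions; in the paper the ideal vectors come from k-means with the jump method rather than being hand-picked.

```python
def similarity(x, v, p=2.0):
    """One common Lukasiewicz-structure similarity between sample x and
    ideal vector v, both scaled feature-wise to [0, 1]."""
    d = len(x)
    return (sum(1.0 - abs(xi - vi) ** p for xi, vi in zip(x, v)) / d) ** (1.0 / p)

def classify(x, ideals):
    """ideals: {class_label: [ideal_vector, ...]} - possibly several per class,
    e.g. one per k-means cluster. Return the class of the most similar ideal."""
    best_label, best_sim = None, -1.0
    for label, vecs in ideals.items():
        for v in vecs:
            s = similarity(x, v)
            if s > best_sim:
                best_label, best_sim = label, s
    return best_label
```

With two ideal vectors for class 0 at opposite ends of the feature space, a sample near either of them is still classified correctly, which a single class mean could not achieve.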
Ranking research and development (R&D) projects as investments is a well-known problem that is made difficult by incomplete and imprecise information about future project profitability. This paper shows how the profitability results of R&D project evaluation with the fuzzy pay-off method can be ranked with four new variants of fuzzy TOPSIS, each using a different fuzzy similarity measure. An overall project ranking that incorporates the four new variants' rankings with three different ideal solutions, totaling 12 subrankings, is presented. The implementation of the created methods is illustrated with a numerical example.
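A similarity-based closeness coefficient in the spirit of TOPSIS might be sketched as follows: each project is compared to a positive and a negative ideal solution, and projects are ranked by relative closeness to the positive ideal. The toy similarity measure and the exact closeness formula are illustrative assumptions; the paper uses four specific fuzzy similarity measures and three different ideal solutions.

```python
def similarity_1d(a, b):
    """Toy similarity: 1 minus the mean absolute difference of score
    vectors scaled to [0, 1] (an assumption for illustration)."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def topsis_like_rank(alternatives, ideal_pos, ideal_neg, sim=similarity_1d):
    """Rank alternatives by a closeness coefficient built from similarities
    to the positive and negative ideal solutions (higher closeness = better)."""
    closeness = [sim(a, ideal_pos) / (sim(a, ideal_pos) + sim(a, ideal_neg))
                 for a in alternatives]
    return sorted(range(len(alternatives)), key=lambda i: closeness[i],
                  reverse=True)
```

Running one such ranking per similarity measure and per ideal-solution definition, and then aggregating the subrankings, gives an overall ordering of the projects in the spirit of the paper.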