This article examines the use of electrical demand forecasts as a means of making the metallurgical industry more efficient and discusses one approach, the use of decision trees, to solving problems that involve automated data analysis. The concept of the significance of input attributes is introduced together with a formula for calculating it. Results are presented from the practical application of this method to the problem of finding the significant factors for forecasting the electric-power needs of a metallurgical plant.
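The abstract above mentions a formula for the significance of input attributes but does not reproduce it. As a rough illustration only, the sketch below uses information gain, a standard decision-tree splitting criterion, as a stand-in for that significance measure. The toy records and the attribute names `shift` and `furnace` are hypothetical and do not come from the article.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attribute]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy records; the label is a coarse demand level.
rows = [
    {"shift": "day", "furnace": "on"},
    {"shift": "day", "furnace": "off"},
    {"shift": "night", "furnace": "on"},
    {"shift": "night", "furnace": "off"},
]
labels = ["high", "low", "high", "low"]

# Assumed proxy for "significance": information gain per attribute.
significance = {a: information_gain(rows, labels, a) for a in ("shift", "furnace")}
```

In this toy data the `furnace` attribute fully determines the demand level while `shift` carries no information, so a ranking by this proxy would keep `furnace` and discard `shift`.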
Neural networks are generally exposed to a dynamic environment where the training patterns or the input attributes (features) will likely be introduced into the current domain incrementally. This Letter considers the situation where a new set of input attributes must be considered and added into the existing neural network. The conventional method is to discard the existing network and redesign one from scratch. This approach wastes the old knowledge and the previous effort. In order to reduce computational time, improve generalization accuracy, and enhance the intelligence of the learned models, we present ILIA algorithms (namely ILIA1, ILIA2, ILIA3, ILIA4 and ILIA5) capable of Incremental Learning in terms of Input Attributes. Using the ILIA algorithms, when new input attributes are introduced into the original problem, the existing neural network can be retained and a new sub-network is constructed and trained incrementally. The new sub-network and the old one are merged later to form a new network for the changed problem. In addition, ILIA algorithms can decide whether the new incoming input attributes are relevant to the output and consistent with the existing input attributes, and suggest accepting or rejecting them. Experimental results show that the ILIA algorithms are efficient and effective for both classification and regression problems.
Common inductive learning strategies offer the tools for knowledge acquisition, but possess some inherent limitations due to the use of a fixed bias during the learning process. To overcome the limitations of such base-learning approaches, a novel research trend explores the potential of meta-learning, which aims to develop mechanisms based on a dynamic search of bias. This could improve base-learner performance on specific learning tasks by profiting from accumulated past experience. Just as a significant set of I/O data is needed for efficient base-learning, appropriate meta-data characterization is of crucial importance for useful meta-learning. In order to characterize meta-data, a collection of meta-features discriminating among different base-level tasks should first be identified. This paper focuses on the characterization of meta-data, through an analysis of meta-features that can capture the properties of specific tasks to be solved at base level. This kind of approach represents a first step toward the development of a meta-learning system capable of suggesting the proper bias for base-learning in different specific task domains.
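The abstract above does not list the meta-features it analyzes. As a hedged illustration of what characterizing a base-level task can look like, the sketch below computes a few simple descriptors (dataset size, dimensionality, number of classes, class entropy) that are commonly used for this purpose. The function name `meta_features` and the toy data are assumptions, not taken from the paper.

```python
import math
from collections import Counter

def meta_features(X, y):
    """Describe a classification task with a few simple meta-features."""
    n = len(X)
    counts = Counter(y)
    # Class entropy in bits: higher means a more balanced class distribution.
    class_entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "n_examples": n,
        "n_attributes": len(X[0]),
        "n_classes": len(counts),
        "class_entropy": class_entropy,
        "examples_per_attribute": n / len(X[0]),
    }

# Hypothetical toy task: four examples, two numeric attributes, two classes.
X = [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9], [5.9, 3.2]]
y = ["a", "a", "b", "b"]
mf = meta_features(X, y)
```

A meta-learner would collect such vectors for many tasks and relate them to the observed performance of different base-learners.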
The paper presents a generalization of the Pittsburgh approach to learning fuzzy classification rules from data. The proposed approach allows us to obtain a fuzzy rule-based system with a predefined level of compromise between its accuracy and interpretability (transparency). The application of the proposed technique to the design of a fuzzy rule-based classifier for the well-known benchmark data sets (Dermatology and Wine) available from http://archive.ics.uci.edu/ml is presented. A comparative analysis with several alternative (fuzzy) rule-based classification techniques has also been carried out.
Recent years have witnessed an exponential increase in the number of data services available on the Web. Many popular Web sites, including social networks, offer APIs for interacting with their information, and open data initiatives such as the Linked Data project promise to achieve the vision of the Web of data. Unfortunately, access to Web data is typically limited by the constraints imposed by the query interface, and by technical limitations such as network latency or the number and frequency of allowed daily service invocations. Moreover, several sources may independently publish data about the same real-world objects; in such cases, their combined use for assembling all available information about those objects requires duplicate removal, reconciliation and integration. This paper describes various data materialization problems, defining properties such as source coverage and data alignment of the materialized data, and then focuses on a specific problem, the reseeding of data access methods by using available information from previous calls in order to build a materialization of maximum size.
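The reseeding idea described above, feeding values returned by earlier calls back in as inputs to later calls under a limit on service invocations, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the in-memory `SERVICE` table stands in for a remote data service, its contents are invented, and the first-in-first-out choice of the next seed is an assumption (the paper targets a materialization of maximum size, which may call for a smarter ordering).

```python
from collections import deque

# Hypothetical stand-in for a remote service: author -> [(paper id, authors)].
SERVICE = {
    "ann": [("p1", ["ann", "bob"]), ("p2", ["ann", "cid"])],
    "bob": [("p1", ["ann", "bob"]), ("p3", ["bob", "dee"])],
    "cid": [("p2", ["ann", "cid"])],
    "dee": [("p3", ["bob", "dee"]), ("p4", ["dee"])],
}

def call_service(author):
    """Simulate one invocation of the access method."""
    return SERVICE.get(author, [])

def materialize(seed, max_calls):
    """Collect records by reseeding with values seen in earlier results."""
    materialized = {}        # paper id -> authors; keys deduplicate records
    queue = deque([seed])    # input values still to be tried
    seen = {seed}
    calls = 0
    while queue and calls < max_calls:
        author = queue.popleft()
        calls += 1
        for paper_id, authors in call_service(author):
            materialized[paper_id] = authors
            for name in authors:
                if name not in seen:   # a new value becomes a future seed
                    seen.add(name)
                    queue.append(name)
    return materialized, calls

small, used_small = materialize("ann", max_calls=2)
full, used_full = materialize("ann", max_calls=10)
```

With a budget of two calls the sketch retrieves three of the four records; with a larger budget it reaches all four after four calls, at which point no unseen input values remain.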
Student Modeling by Data Mining Peña-Ayala, Alejandro; Mizoguchi, Riichiro
New Challenges for Intelligent Information and Database Systems
Book Chapter
Peer reviewed
This work seeks to find patterns in the characteristics and behaviors of students. To this end, an approach to mining repositories of student models (SM) is presented. The source information comprises students’ personal information and an assessment of their use of a Web-based educational system (WBES). In addition, the repositories reveal a profile composed of personal attributes, cognitive skills, learning preferences, and personality traits for a sample of students. The approach mines such repositories and produces several clusters. One cluster represents volunteers who tend to abandon. Another cluster groups people who fulfill their commitments. It is concluded that educational data mining (EDM) produces findings that characterize students and that could be considered when authoring content and sequencing teaching-learning experiences.
Economics is becoming a field of special interest for the application of fuzzy logic. Here we present some works carried out in this direction, highlighting their advantages and also some of the difficulties encountered. Fuzzy inference systems are very useful for economic modelling. The rule system encodes the underlying economic theory and allows inferences and predictions to be extracted. We applied them to the modelling and prediction of wage-earning employment in Spain for the period 1977-1998, using Jang’s algorithm (ANFIS).
As a further experience in this direction, we have applied the IFN (Info-Fuzzy Network) algorithm developed by Maimon and Last to the study of the profit value of the Andalusian agrarian industry.
The search for key sectors in an economy has been, and still is, one of the most recurrent themes in Input-Output analysis, a relevant research area in economic analysis. We proposed a multidimensional approach to classify the productive sectors of the Spanish Input-Output table. We subsequently analyzed the problems that can arise in key sector analysis and industrial clustering due to the usual presence of outliers when using multidimensional data.
This paper presents data mining-based techniques for enabling data integration across deep web data sources. We target query processing across inter-dependent data sources. Thus, besides input-input and output-output matching of attributes, we also need to consider input-output matching. We develop data mining techniques for discovering the instances for querying deep web data sources from the information provided by the query interfaces themselves, as well as from the obtained output pages of the related data sources, by query probing using dynamically identified input instances. Then, using a hierarchical representation of schemas and by applying clustering techniques, we are able to generate schema matches. We show the effectiveness of our technique while integrating 24 query interfaces.
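The input-output matching mentioned above can be illustrated with a much simpler stand-in than the paper's hierarchical, clustering-based method: compare the instance values seen in one source's output attributes with the instance values accepted by another source's input attributes, and pair attributes whose value sets overlap enough. The Jaccard measure, the threshold of 0.3, and all attribute names and values below are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of instance values."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def match(outputs, inputs, threshold=0.3):
    """Pair each output attribute with its most similar input attribute."""
    matches = {}
    for out_name, out_vals in outputs.items():
        best = max(inputs, key=lambda name: jaccard(out_vals, inputs[name]))
        score = jaccard(out_vals, inputs[best])
        if score >= threshold:   # discard weak, likely spurious pairings
            matches[out_name] = (best, score)
    return matches

# Hypothetical instances: output pages of source A, query interface of source B.
outputs_a = {"gene_symbol": ["BRCA1", "TP53", "EGFR"], "chromosome": ["17", "7"]}
inputs_b = {"gene": ["TP53", "EGFR", "KRAS"], "chr": ["7", "17", "X"]}

matches = match(outputs_a, inputs_b)
```

Here `gene_symbol` is paired with `gene` and `chromosome` with `chr`, which suggests that results from source A could be used as query inputs for source B.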