With the advent of the Big Data era, data reduction methods are in high demand given their ability to simplify huge datasets and ease complex learning processes. Concretely, algorithms able to select relevant dimensions from a set of millions are of huge importance. Although effective, these techniques also suffer from the “scalability” curse when they are applied to large-scale problems.
In this paper, we propose a distributed feature weighting algorithm that precisely estimates feature importance in large datasets using RELIEF, an algorithm well known for its performance on small problems. Our solution, called BELIEF, incorporates a novel redundancy elimination measure that generates schemes similar to those based on entropy, but at a much lower time cost. Furthermore, BELIEF scales up smoothly when more instances are required to increase the precision of its estimations.
Empirical tests illustrate the estimation ability of BELIEF on many huge datasets, large in both number of features and number of instances, as well as its reduced runtime cost compared to other state-of-the-art methods.
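For readers unfamiliar with RELIEF, the weighting idea it is known for can be sketched as follows. This is a minimal single-machine illustration of the classic nearest-hit/nearest-miss update for numeric features, not the distributed BELIEF algorithm or its redundancy measure; the function name `relief_weights` and its parameters are assumptions made for this example.

```python
import random


def relief_weights(X, y, n_samples=None, seed=0):
    """Classic RELIEF-style feature weights (illustrative sketch, numeric features)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Per-feature ranges, used to scale every difference into [0, 1].
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        # Manhattan distance over the scaled features.
        return sum(diff(a, b, j) for j in range(d))

    m = n_samples or n
    picks = list(range(n)) if m >= n else rng.sample(range(n), m)

    w = [0.0] * d
    for i in picks:
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour available
        near_hit = min(hits, key=lambda k: dist(X[i], X[k]))
        near_miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that separate classes, penalise those that
            # differ between same-class neighbours.
            w[j] += (diff(X[i], X[near_miss], j) - diff(X[i], X[near_hit], j)) / len(picks)
    return w


# Hypothetical toy data: feature 0 tracks the class, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.5], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
```

The neighbour search is quadratic in the number of instances, which gives a sense of why such estimators become costly on large datasets.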
► We present a complete analysis of web usage mining on the website OrOliveSur.com.
► Clustering, association rule and subgroup discovery techniques have been applied.
► The results offer the webmaster team interesting conclusions for improving the design.
Web usage mining is the process of extracting useful information from the user history databases associated with an e-commerce website. The extraction is usually performed by data mining techniques applied to server log data or to data obtained from specific tools such as Google Analytics. This paper presents the methodology applied to an e-commerce website that sells extra virgin olive oil, www.OrOliveSur.com. We describe the phases carried out, including data collection, data preprocessing, and the extraction and analysis of knowledge. The knowledge is extracted using unsupervised and supervised data mining algorithms through descriptive tasks such as clustering, association and subgroup discovery, applying both classical and recent approaches. The results are discussed with particular attention to the interests of the website's design team, providing some guidelines for improving its usability and user satisfaction.
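As a hedged illustration of the association task mentioned above, the two standard rule-quality measures, support and confidence, can be computed over user sessions as sketched below. The session contents and page names are hypothetical toy values, not data from OrOliveSur.com, and the helper names are assumptions for this example.

```python
def support(sessions, items):
    """Fraction of sessions that contain every item in `items`."""
    items = set(items)
    return sum(items <= s for s in sessions) / len(sessions)


def confidence(sessions, antecedent, consequent):
    """Estimated P(consequent | antecedent) across the sessions."""
    base = support(sessions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule is undefined
    return support(sessions, set(antecedent) | set(consequent)) / base


# Hypothetical toy sessions: each is the set of page types one visitor viewed.
sessions = [
    {"home", "catalog", "cart"},
    {"home", "catalog"},
    {"home", "catalog", "cart", "checkout"},
    {"home", "contact"},
]

rule_support = support(sessions, {"catalog", "cart"})
rule_confidence = confidence(sessions, {"catalog"}, {"cart"})
```

A rule such as "catalog → cart" with high confidence but low support describes a reliable pattern that only a few visitors follow, which is the kind of trade-off a design team would weigh.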
Data quality is deemed a determining factor in the knowledge extraction process. Low-quality data normally imply low-quality models and decisions. Discretization, as part of data preprocessing, is considered one of the most relevant techniques for improving data quality.
In static discretization, output intervals are generated at once and maintained throughout the whole process. However, many contemporary problems demand rapid approaches capable of self-adapting their discretization schemes to an ever-changing nature. Other major issues for stream-based discretization, such as interval definition, labeling, and how the interaction between learning and discretization components is implemented, are also discussed in this paper.
In order to address all the aforementioned problems, we propose a novel, online and self-adaptive discretization solution for streaming classification which aims at reducing the negative impact of fluctuations in evolving intervals. Experiments with a long list of standard streaming datasets and discretizers have demonstrated that our proposal is significantly more accurate than the alternatives. In addition, our scheme is able to leverage class information without incurring an excessive cost, ranking among the fastest supervised options.
• We propose LOFD, an online, self-adaptive discretizer for streaming classification.
• LOFD smoothly adapts its interval limits, reducing the negative impact of shifts.
• Interval labeling and interaction problems in data streaming are analyzed.
• The discretizer-learner interaction is addressed by providing two similar solutions in LOFD.
• The model is compared to the state-of-the-art, using several real-world problems.
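To make the limitation of static discretization concrete, the sketch below fixes equal-width cut points once from an initial batch and then reuses them for every later value. Equal-width binning is chosen here only for illustration; it is not LOFD, and the function names are assumptions for this example.

```python
from bisect import bisect_right


def equal_width_cuts(values, n_bins):
    """Compute n_bins - 1 cut points once, from an initial batch of values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * k for k in range(1, n_bins)]


def discretize(value, cuts):
    """Map a value to the index of its interval under fixed cut points."""
    return bisect_right(cuts, value)


# Cut points are frozen after the first batch (hypothetical toy values).
cuts = equal_width_cuts([0.0, 10.0, 5.0], n_bins=4)

# Later values that drift beyond the original range all fall into the
# last interval, because a static scheme never revises its limits.
drifted_bins = [discretize(v, cuts) for v in (12.0, 42.0, 900.0)]
```

Once the stream drifts, every new value lands in the same edge interval and the discretized feature stops carrying information, which is the behavior a self-adaptive scheme is meant to avoid.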
Nowadays, the phenomenon of Big Data is overwhelming our capacity to extract relevant knowledge through classical machine learning techniques. Discretization (as part of data reduction) is presented as a real solution to reduce this complexity. However, standard discretizers are not designed to perform well with such amounts of data. This paper proposes a distributed discretization algorithm for Big Data analytics based on evolutionary optimization. After comparing with a distributed discretizer based on the Minimum Description Length Principle, we have found that our solution yields more accurate and simpler solutions in reasonable time.
Rhabdomyosarcoma (RMS) is the most frequent soft tissue sarcoma (STS) in children and adolescents. In Spain the annual incidence is 4.4 cases per million children < 14 years. It is an uncommon neoplasm in adults, but 40% of RMS are diagnosed in patients over 20 years of age, representing 1% of all STS in this age group. RMS can appear anywhere in the body, with some sites more frequently affected including head and neck, genitourinary system and limbs. Assessment of a patient with suspicion of RMS includes imaging studies (MRI, CT, PET-CT) and biopsy. All patients with RMS should receive chemotherapy, either at diagnosis in advanced or metastatic stages, or after initial resection in early local stages. Local control includes surgery and/or radiotherapy depending on site, stage, histology and response to chemotherapy. This guide provides recommendations for diagnosis, staging and treatment of this neoplasm.
Purpose
Early phase trials are crucial in developing innovative, effective agents for childhood malignancies. We report the activity in early phase paediatric oncology trials in Spain from its beginning to the present time and incorporate longitudinal data to evaluate the trends in trial characteristics and recruitment rates.
Methods
Members of SEHOP were contacted to obtain information about the open trials at their institutions. The study period was split into two equal periods for analysis: 2007–2013 and 2014–2020.
Results
Eighty-one trials and two molecular platforms have been initiated. The number of trials has increased over the study period for all tumour types, with a predominance of trials available for solid tumours (66%). The number of trials targeting tumours harbouring specific molecular alterations has doubled during the second period. The proportion of industry-sponsored compared to academic trials has increased over the same years. A total of 565 children and adolescents were included, with an increasing trend over the study period. For international trials, the median time between the first country study approval and the Spanish competent authority approval was 2 months (IQR 0–6.5). Fourteen out of 81 trials were sponsored by Spanish academic institutions.
Conclusions
The number of available trials and the number of participating patients have increased in Spain since 2007. Studies focused on molecular-specific targets are now being implemented. Barriers to accessing new drugs across all age ranges and cancer types remain. Additionally, opportunities to improve academic research are still required in Spain.
Introduction
Cancer and blood disorders in children are rare. The progressive improvement in survival over the last decades largely relies on the development of international academic clinical trials that gather a sufficient number of patients globally to draw solid conclusions and drive changes in clinical practice. The participation of Spain in large international academic trials has traditionally lagged behind that of other European countries, mainly due to the burden of administrative tasks to open new studies, lack of financial support and limited research infrastructure in our hospitals.
Methods
The objective of the ECLIM-SEHOP platform (Ensayos Clínicos Internacionales Multicéntricos-SEHOP) is to overcome these difficulties and position Spain among the European countries leading the advances in cancer and blood disorders, facilitate the access of our patients to novel diagnostic and therapeutic approaches and, most importantly, continue to improve survival and reduce long-term sequelae. ECLIM-SEHOP provides Spanish clinical investigators with the necessary infrastructural support to open and implement academic clinical trials and registries.
Results
In less than 3 years from its inception, the platform has provided support to 20 clinical trials and 8 observational studies, including 8 trials and 4 observational studies where the platform performs all trial-related tasks (integral support: trial setup, monitoring, etc.) with more than 150 patients recruited since 2017 to these studies. In this manuscript, we provide baseline metrics for academic clinical trial performance that permit future comparisons.
Conclusions
ECLIM-SEHOP facilitates access to state-of-the-art diagnostic and therapeutic strategies for Spanish children and adolescents diagnosed with cancer and blood disorders.
Purpose
Despite numerous advances, survival remains dismal for children and adolescents with poor prognosis cancers or those who relapse or are refractory to first line treatment. There is, therefore, a major unmet need for new drugs. Recent advances in the knowledge of molecular tumor biology open the door to therapies better adapted to individual alterations. Promising results in adult anticancer drug development have not yet been translated into clinical practice. We report the activity in early pediatric oncology trials in Spain.
Methods
All members of the Spanish Society of Pediatric Hematology Oncology (SEHOP) were contacted to obtain information about early trials open in each center.
Results
Twenty-two phase I and II trials were open as of May 2015: 15 for solid tumors (68 %) and 7 for hematological malignancies (32 %). Fourteen (64 %) were industry sponsored. Since 2010, four centers have joined the Innovative Therapies For Children With Cancer, an international consortium whose aim is developing novel therapies for pediatric cancers. A substantial number of studies have opened in these 5 years, improving the portfolio of trials for children. Results of recently closed trials show the contribution of Spanish investigators, the introduction of molecularly targeted agents and their benefits.
Conclusions
Clinical trials are the way to evaluate new drugs, avoiding the use of off-label drugs that carry significant risks. The Spanish pediatric oncology community, through the SEHOP, is committed to developing and participating in collaborative academic trials, to favor the advancement and optimization of existing therapies in pediatric cancer.
• Three commercial polymers with different thermal properties (poly(vinyl chloride) (PVC), poly(ethylene terephthalate) (PET) and polypropylene (PP)) have been studied under λ = 515 nm femtosecond laser irradiation.
• A photothermal model has allowed us to estimate the threshold frequencies of three different heat regimes observed experimentally (non-cumulative, cumulative and saturation).
• High-frequency processing performs better, with greater uniformity and less debris.
• The thermal characteristics of materials determine their behavior under femtosecond laser irradiation.
We report the response of three commercial polymers with different thermal properties (poly(vinyl chloride) (PVC), poly(ethylene terephthalate) (PET) and polypropylene (PP)) to femtosecond (450 fs) multi-pulse laser irradiation at λ = 515 nm (1.4 J/cm²) and high repetition rates (1 kHz–1 MHz), in a complete study aimed at controlling the ablation depth and minimizing collateral thermal effects. The ablation depth is tuned accurately by varying the repetition rate at constant fluence. The results are compared with a photothermal model that aims to explain the heat accumulation effect of successive pulses as a function of the repetition rate and predicts three different heat regimes (non-cumulative, cumulative and saturation). The threshold frequencies for each regime can be estimated from the model, which guides the selection of frequency values and thermal regimes. Thermal analyses are performed to characterize the materials, and they show that thermal parameters are vital for selecting optimal materials and laser processing parameters.