We introduce the C++ application and R package ranger. The software is a fast implementation of random forests for high dimensional data. Ensembles of classification, regression and survival trees are supported. We describe the implementation, provide examples, validate the package with a reference implementation, and compare runtime and memory usage with other implementations. The new software proves to scale best with the number of features, samples, trees, and features tried for splitting. Finally, we show that ranger is the fastest and most memory efficient implementation of random forests to analyze data on the scale of a genome-wide association study.
The title of an article is the main entrance for reading the full article. The aim of our work therefore is to examine differences in title content and form between original research articles and their changes over time. Using PubMed we examined title properties of 500 randomly chosen original research articles published in the general major medical journals BMJ, JAMA, Lancet, NEJM and PLOS Medicine between 2011 and 2020. Articles were manually evaluated by two independent raters. To analyze differences between journals and changes over time, we performed random effects meta-analyses and fitted logistic regression models. In all considered journals, titles rarely mentioned results, provided quantitative or semi-quantitative information, were declarative, or contained a dash or a question mark. The use of a subtitle and of methods-related items, such as mentioning of methods, clinical context or treatment, increased over time (all p < 0.05), while the use of phrasal titles decreased over time (p = 0.044). Not a single NEJM title contained a study name, while the Lancet had the highest usage of it (45%). The use of study names increased over time (per year odds ratio: 1.13 (95% CI: 1.03‒1.24), p = 0.008). Investigating title content and form was time-consuming because some criteria could only be adequately evaluated by hand. Title content changed over time and differed substantially between the five major medical journals. Authors are advised to carefully study titles of journal articles in their target journal prior to manuscript submission.
•Harrell’s C is proposed as a split criterion in random survival forests.
•Split points of continuous predictor variables differ substantially between Harrell’s C and log-rank splitting.
•The log-rank statistic has a stronger end-cut preference than Harrell’s C.
•Harrell’s C outperforms log-rank splitting in smaller scale studies.
•Harrell’s C outperforms log-rank splitting if the censoring rate is high.
Random survival forests (RSF) are a powerful method for risk prediction of right-censored outcomes in biomedical research. RSF use the log-rank split criterion to form an ensemble of survival trees. The most common approach to evaluate the prediction accuracy of an RSF model is Harrell’s concordance index for survival data (‘C index’). Conceptually, this strategy implies that the split criterion in RSF is different from the evaluation criterion of interest. This discrepancy can be overcome by using Harrell’s C for both node splitting and evaluation. We compare the two split criteria analytically and in simulation studies with respect to their preference for more unbalanced splits, termed end-cut preference (ECP). Specifically, we show that the log-rank statistic has a stronger ECP than the C index. In simulation studies and with the help of two medical data sets we demonstrate that the accuracy of RSF predictions, as measured by Harrell’s C, can be improved if the log-rank statistic is replaced by the C index for node splitting. This is especially true in situations where the censoring rate or the fraction of informative continuous predictor variables is high. Conversely, log-rank splitting is preferable in noisy scenarios. Both C-based and log-rank splitting are implemented in the R package ranger. We recommend Harrell’s C as split criterion for use in smaller scale clinical studies and the log-rank split criterion for use in large-scale ‘omics’ studies.
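As an illustration of the evaluation criterion discussed in this abstract, the following is a minimal Python sketch of Harrell's C index for right-censored data. It is not the ranger implementation: the function name harrell_c and its arguments are hypothetical, pairs with tied observed times are simply skipped (a simplifying assumption), and a higher risk score is assumed to mean shorter expected survival.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index (illustrative sketch, hypothetical name).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if right-censored
    risk_scores -- predicted risk; higher is assumed to mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Simplifying assumption: pairs with tied times are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # prediction agrees with the observed order
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1 means that every comparable pair is ranked correctly, while 0.5 corresponds to uninformative predictions. How this quantity is turned into a node split criterion is described in the paper and is not reproduced here.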
Random forests have often been claimed to uncover interaction effects. However, if and how interaction effects can be differentiated from marginal effects remains unclear. In extensive simulation studies, we investigate whether random forest variable importance measures capture or detect gene-gene interactions. We define capturing an interaction as the ability to identify a variable that acts through an interaction with another one, while detection is the ability to identify an interaction effect as such.
Of the single importance measures, the Gini importance captured interaction effects in most of the simulated scenarios; however, they were masked by marginal effects in other variables. With the permutation importance, the proportion of captured interactions was lower in all cases. Pairwise importance measures performed about equally, with a slight advantage for the joint variable importance method. However, the overall fraction of detected interactions was low. In almost all scenarios the detection fraction in a model with only marginal effects was larger than in a model with an interaction effect only.
Random forests are generally capable of capturing gene-gene interactions, but current variable importance measures are unable to detect them as interactions. In most of the cases, interactions are masked by marginal effects and interactions cannot be differentiated from marginal effects. Consequently, caution is warranted when claiming that random forests uncover interactions.
Successful publishing of an article depends on several factors, including the structure of the main text, the so-called introduction, methods, results and discussion structure (IMRAD). The first objective of our work is to provide recent results on the number of paragraphs (pars.) per section used in articles published in major medical journals. Our second objective is the investigation of other structural elements, i.e., number of tables, figures and references and the availability of supplementary material. We analyzed data from randomly selected original articles published in the years 2005, 2010 and 2015 in the journals The BMJ, The Journal of the American Medical Association, The Lancet, The New England Journal of Medicine and PLOS Medicine. Per journal and year, 30 articles were investigated. Random effects meta-analyses were performed to provide pooled estimates. The effect of time was analyzed by linear mixed models. All articles followed the IMRAD structure. The number of pars. per section increased for all journals over time, by 1.08 pars. (95% confidence interval (CI): 0.70-1.46) per two years. The largest increase was observed for the methods section (0.29 pars. per year; 95% CI: 0.19-0.39). PLOS Medicine had the highest number of pars. The number of tables did not change, but the number of figures and references increased slightly. To increase the likelihood of publication, authors should follow not only the standard IMRAD structure but also the general layout of the target journal. Supplementary material has become standard. If no journal-specific information is available, authors should use 3/10/9/8 pars. for the introduction/methods/results/discussion sections.
The problem of checking the genotype distribution obtained for some diallelic marker for compatibility with the Hardy-Weinberg equilibrium (HWE) condition arises also for loci on the X chromosome. The possible genotypes depend on the sex of the individual in this case: for females, the genotype distribution is trinomial, as in the case of an autosomal locus, whereas a binomial proportion is observed for males. As in genetic association studies with autosomal SNPs, interest is typically in establishing approximate compatibility of the observed genotype frequencies with HWE. This requires replacing traditional methods tailored for detecting lack of fit to the model with an equivalence testing procedure, derived by treating approximate compatibility with the model as the alternative hypothesis. The test constructed here is based on an upper confidence bound and a simple to interpret combined measure of distance between true and HWE conforming genotype distributions in female and male subjects. A particular focus of the paper is on the derivation of the asymptotic distribution of the test statistic under null alternatives, which is not of the usual Gaussian form. A closed sample size formula is also provided and shown to behave satisfactorily in terms of the approximation error.
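To make the reference model concrete, the following Python sketch computes the genotype proportions that conform to HWE at a diallelic X-chromosomal locus: a trinomial distribution for females and a binomial proportion for males, with the same allele frequency in both sexes. The function name hwe_expected_x and the allele labels A and B are hypothetical. The distance measure and the equivalence test of the paper are not reproduced here.

```python
def hwe_expected_x(p):
    """HWE-conforming genotype proportions at a diallelic X-linked locus.

    p -- frequency of allele A (hypothetical label), assumed to be the
         same in females and males, as it is at equilibrium.
    Returns a pair of dicts: (female proportions, male proportions).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    # Females carry two X chromosomes: trinomial with p^2, 2pq, q^2.
    females = {"AA": p * p, "AB": 2.0 * p * q, "BB": q * q}
    # Males carry one X chromosome: genotype frequency equals allele frequency.
    males = {"A": p, "B": q}
    return females, males
```

For example, hwe_expected_x(0.3) gives female proportions of about 0.09, 0.42 and 0.49 and male proportions of 0.3 and 0.7.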
Based on unique data from representative computer-based surveys among more than 3400 citizens, this paper empirically examines the determinants of several climate change beliefs and attitudes in three countries which are key players in international climate policy, namely the USA, Germany (as the largest country in the European Union), and China. Our econometric analysis implies that political orientation in the USA is by far more relevant for general climate change beliefs and beliefs in anthropogenic climate change than in Germany and China. Furthermore, US and German citizens with a conservative, but not green identification significantly less often support publicly financed climate policy, while US and German respondents with a social–green identification and Chinese respondents belonging to the Communist Party have a significantly higher willingness to pay a price premium for climate-friendly products. However, our econometric analysis overall reveals that environmental values, which are measured by a New Ecological Paradigm (NEP) scale, are the major factors for climate change beliefs and attitudes in all three countries and thus play an even more dominant role than political orientation. In addition, environmental values weaken the differences in several climate change beliefs and attitudes between a right-wing and a left-wing identification. These interaction effects between political orientation and the NEP scale are especially strong in the USA, only relevant for the support of publicly financed climate policy in Germany, and negligible in China. Our estimation results suggest alternative strategies such as specific communication campaigns in order to reduce the climate change skepticism in conservative and right-wing circles in the USA and to increase the support of climate policies among such population groups.
•Econometric analysis of determinants of climate change beliefs and attitudes
•Political orientation in the USA is by far more relevant than in Germany and China.
•Environmental values, measured by NEP scale, are the major factors.
•Interaction effects between NEP scale and political orientation, especially in the USA
Biodegradable implants reduce the likelihood of further surgery for hardware removal and reduce the risks of associated infection and allergy. The purpose of this study is to evaluate the clinical efficacy of biodegradable magnesium alloy MgYREZr (MAGNEZIX® CS) compression screw fixation and to determine its comparability with standard titanium screw fixation in the surgical treatment of hallux valgus deformity.
Eleven patients undergoing corrective surgery for hallux valgus utilising biodegradable magnesium screws and a control group of 25 patients undergoing corrective hallux valgus surgery with standard titanium screws were reviewed at a median of 19 months (range 12-30 months). PROM scores (Manchester-Oxford Foot Questionnaire (MOXFQ), Foot and Ankle Outcomes Instrument (FAOI) and the EQ-5D-3L) were recorded preoperatively and at latest follow-up.
The results between the two groups were broadly similar, with the magnesium and titanium patients showing similar patterns in the various domains of the MOXFQ, the FAOI and the EQ-5D-3L. Most patients reported a near full shoe comfort score, and EQ-5D-3L scores were significantly improved in both patient groups (with most patients reporting a full score). Foot pain and foot function improved irrespective of the scoring system, and patients in both groups demonstrated significantly improved scores following the surgery (p < 0.05). Notably, there were no significant differences when comparing the post-operative scores between the groups for any individual scoring parameter. No impairment to quality of life was recorded. There were no intra- or post-operative complications. There were no problems encountered through the use of the bioabsorbable screws.
Biodegradable magnesium-based compression screws appeared to be safe in this study and are an effective fixation device in the treatment of hallux valgus deformity with clinical outcomes similar to standard titanium screw fixation.
This paper examines the extent and the determinants of global climate change beliefs. In contrast to former studies for the U.S. and other western countries, we focus on China due to its crucial role in international climate policy in conjunction with its vulnerability to global warming. The empirical analysis is based on unique data from a survey among more than 1000 adults in five Chinese cities. In line with former studies, our results reveal that the vast majority (almost 90%) of the Chinese respondents believe in the existence of global warming. This seems to be a convenient and necessary basis for the support of costly public adaptation activities in China. Our econometric analysis reveals that perceived experiences with extreme weather events (and particularly heatwaves) alone are strongly correlated with climate change beliefs and that physical or financial damages due to these events lead to even stronger relationships. Our estimation results additionally suggest that females as well as people with lower education, of middle age, with higher household incomes, and from Chengdu or Shenyang are more skeptical toward the existence of climate change.
Tuberous sclerosis complex (TSC) is a multisystem disease with prominent neurologic manifestations such as epilepsy, cognitive impairment and autism spectrum disorder. mTOR inhibitors have successfully been used to treat TSC-related manifestations in older children and adults. However, data on their safety and efficacy in infants and young children are scarce. The objective of this study is to assess the utility and safety of mTOR inhibitor treatment in TSC patients under the age of 2 years.
A total of 17 children (median age at study inclusion 2.4 years, range 0-6; 12 males, 5 females) with TSC who received early mTOR inhibitor therapy were studied. mTOR inhibitor treatment was started at a median age of 5 months (range 0-19 months). Reasons for initiation of treatment were cardiac rhabdomyomas (6 cases), subependymal giant cell astrocytomas (SEGA, 5 cases), combination of cardiac rhabdomyomas and SEGA (1 case), refractory epilepsy (4 cases) and disabling congenital focal lymphedema (1 case). In all cases everolimus was used. Everolimus therapy was overall well tolerated. Adverse events were classified according to the Common Terminology Criteria for Adverse Events (CTCAE, Version 5.0). Grade 1-2 adverse events occurred in 12 patients and included mild transient stomatitis (2 cases), worsening of infantile acne (1 case), increases of serum cholesterol and triglycerides (4 cases), changes in serum phosphate levels (2 cases), increase of cholinesterase (2 cases), transient neutropenia (2 cases), transient anemia (1 case), transient lymphopenia (1 case) and recurrent infections (7 cases). No grade 3-4 adverse events were reported. Treatment is currently continued in 13/17 patients. Benefits were reported in 14/17 patients and included decrease of cardiac rhabdomyoma size and improvement of arrhythmia, decrease of SEGA size, reduction of seizure frequency and regression of congenital focal lymphedema. Despite everolimus therapy, two patients treated for intractable epilepsy are still experiencing seizures and another one treated for SEGA showed no volume reduction.
This retrospective multicenter study demonstrates that mTOR inhibitor treatment with everolimus is safe in TSC patients under the age of 2 years and shows beneficial effects on cardiac manifestations, SEGA size and early epilepsy.