Competing risks data are commonly encountered in randomized clinical trials or observational studies. Ignoring competing risks in survival analysis leads to biased risk estimates and improper conclusions. Often, one of the competing events is of primary interest and the remaining competing events are handled as nuisances. This approach can be inadequate when multiple competing events have important clinical interpretations and are thus of equal interest. For example, in COVID‐19 in‐patient treatment trials, the outcomes of COVID‐19 related hospitalization are either death or discharge from hospital, which have completely different clinical implications and are of equal interest, especially during the pandemic. In this paper we develop nonparametric estimation and simultaneous inferential methods for multiple cumulative incidence functions (CIFs) and the corresponding restricted mean times. Based on Monte Carlo simulations and a data analysis of a COVID‐19 in‐patient treatment clinical trial, we demonstrate that the proposed method provides global insights into the treatment effects across multiple endpoints.
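To make the quantities in the abstract above concrete, the following is a minimal sketch of a standard nonparametric (Aalen-Johansen-type) estimator of several CIFs, together with the area under one CIF up to a horizon `tau`. It is an illustration under stated assumptions, not the paper's procedure: the function names, the coding of `status` (0 = censored, k = event of cause k), and the reading of "restricted mean time" as the area under the CIF are all assumptions, and the simultaneous inference developed in the paper is not reproduced.

```python
import numpy as np

def cumulative_incidence(time, status, n_causes):
    """Aalen-Johansen-type CIF estimates for competing risks data.

    status: 0 = censored, k in 1..n_causes = event of cause k (assumed coding).
    Returns the distinct event times and an (n_causes x n_times) array of CIFs.
    """
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    event_times = np.unique(time[status > 0])
    surv = 1.0                      # event-free survival just before the current time
    running = np.zeros(n_causes)    # current CIF value for each cause
    cif = np.zeros((n_causes, event_times.size))
    for j, t in enumerate(event_times):
        at_risk = np.sum(time >= t)
        d_all = 0
        for k in range(1, n_causes + 1):
            d_k = np.sum((time == t) & (status == k))
            running[k - 1] += surv * d_k / at_risk   # cause-specific increment
            d_all += d_k
        cif[:, j] = running
        surv *= 1.0 - d_all / at_risk                # Kaplan-Meier update, all causes
    return event_times, cif

def restricted_area_under_cif(event_times, cif_k, tau):
    """Area under the right-continuous step function cif_k on [0, tau]."""
    keep = event_times <= tau
    grid = np.concatenate([event_times[keep], [tau]])
    return float(np.sum(cif_k[keep] * np.diff(grid)))

# Toy data (hypothetical): cause 1 = death, cause 2 = discharge, no censoring.
times, cif = cumulative_incidence([1, 2, 3, 4], [1, 2, 1, 2], n_causes=2)
area_cause1 = restricted_area_under_cif(times, cif[0], tau=4.0)
```

Estimating all CIFs from the same risk sets, as here, is what allows them to be read jointly: at any time point the estimated CIFs and the event-free survival sum to one.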
Water transport rate in network membranes is inversely correlated with thickness; thus superior permeance is achievable with ultrathin membranes prepared by complicated methods that circumvent nanofilm weakness and defects. Conferring ultrahigh permeance on easily prepared thicker membranes remains challenging. Here, a tetrakis(hydroxymethyl) phosphonium chloride (THPC) monomer is discovered that enables straightforward modification of polyamide composite membranes. Water permeance of the modified membrane is improved ≈6‐fold, giving rise to a permeability (permeance × thickness) one order of magnitude higher than that of state‐of‐the‐art polymer nanofiltration membranes. Meanwhile, the membrane exhibits good rejection (Na2SO4 rejection = 98%) and antibacterial properties under crossflow conditions. THPC modification not only improves membrane hydrophilicity, but also creates additional angstrom‐scale channels in polyamide membranes for unimpeded transport of water. This unique mechanism provides a paradigm shift in the facile preparation of ultrapermeable membranes with unreduced thickness for clean water and desalination.
Facile modification of polyamide composite membranes by an inexpensive phosphonium monomer featuring tetrahedral geometry is found to create additional water transport channels, improving the water purification performance without reducing film thickness. Compared with cutting‐edge ultrathin membranes, the modified membrane offers good rejection, antibacterial properties, superior water permeability, and facile preparation.
A cross‐sectional population is defined as a population of living individuals at the sampling or observational time. Cross‐sectionally sampled data with binary disease outcome are commonly analyzed in observational studies for identifying how covariates correlate with disease occurrence. It is generally understood that cross‐sectional binary outcome is not as informative as longitudinally collected time‐to‐event data, but there is insufficient understanding as to whether bias can exist in cross‐sectional data and how the bias is related to the population risk of interest. As the progression of a disease typically involves both time and disease status, we consider how the binary disease outcome from the cross‐sectional population is connected to the birth‐illness‐death process in the target population. We argue that the distribution of cross‐sectional binary outcome is different from the risk distribution in the target population and that bias would typically arise when using cross‐sectional data to draw inference for population risk. In general, the cross‐sectional risk probability is determined jointly by the population risk probability and the ratio of the duration of the diseased state to the duration of the disease‐free state. Through explicit formulas we conclude that bias can almost never be avoided with cross‐sectional data. We present age‐specific risk probability (ARP) and argue that models based on ARP offer a compromised but still biased approach to understanding the population risk. An analysis based on Alzheimer's disease data is presented to illustrate the ARP model and possible critiques of the analysis results.
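The dependence of the cross‐sectional risk probability on both the population risk and the ratio of state durations can be illustrated with a deliberately simplified model. The sketch below is an assumption‐laden illustration, not the paper's formulas: it assumes stationary births, a lifetime disease probability `p` independent of the durations, and exponentially distributed durations with means `mean_healthy` (disease‐free) and `mean_diseased` (diseased); all names are hypothetical. Under these assumptions the chance that a living individual is diseased at the sampling time is p·mean_diseased / (mean_healthy + p·mean_diseased), so the cross‐sectional odds equal p times the duration ratio.

```python
import numpy as np

def cross_sectional_probability(p, mean_healthy, mean_diseased):
    """Expected time spent diseased divided by expected lifetime (stationary births)."""
    return p * mean_diseased / (mean_healthy + p * mean_diseased)

def simulate_cross_section(p, mean_healthy, mean_diseased,
                           n=400_000, horizon=2000.0, seed=0):
    """Simulate a birth-illness-death process and sample the living at `horizon`."""
    rng = np.random.default_rng(seed)
    birth = rng.uniform(0.0, horizon, n)                  # stationary birth times
    healthy = rng.exponential(mean_healthy, n)            # disease-free duration
    gets_disease = rng.random(n) < p                      # lifetime disease indicator
    diseased = np.where(gets_disease,
                        rng.exponential(mean_diseased, n), 0.0)
    onset = birth + healthy                               # end of disease-free state
    death = onset + diseased
    alive = (birth <= horizon) & (horizon < death)        # cross-sectional population
    ill = alive & gets_disease & (onset <= horizon)       # diseased at sampling time
    return ill.sum() / alive.sum()

theory = cross_sectional_probability(0.3, 60.0, 10.0)     # 3/63, far below p = 0.3
simulated = simulate_cross_section(0.3, 60.0, 10.0)
```

With a population risk of 0.3 and a diseased state one sixth as long as the disease‐free state, the cross‐sectional probability is 3/63, which shows how strongly a short diseased duration can pull the cross‐sectional figure below the population risk.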
In biomedical practice, multiple biomarkers are often combined using a prespecified classification rule with tree structure for diagnostic decisions. The classification structure and the cutoff point at each node of a tree are usually chosen on an ad hoc basis, depending on decision makers' experience. There is a lack of analytical approaches that lead to optimal prediction performance and that guide the choice of optimal cutoff points in a prespecified classification tree. In this paper, we propose to search for and estimate the optimal decision rule through rank correlation maximization. The proposed method is flexible, theoretically sound, and computationally feasible when many biomarkers are available for classification or prediction. Using the proposed approach, for a prespecified tree‐structured classification rule, we can guide the choice of optimal cutoff points at tree nodes and estimate the optimal prediction performance of the combined biomarkers.
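The idea of choosing cutoffs in a fixed tree by maximizing a rank correlation can be sketched as follows. This is a toy illustration under assumptions, not the authors' estimator: the two‐node tree (`tree_rule`), the concordance‐type objective (`rank_correlation`, the fraction of ordered pairs in which both the outcome and the rule output are larger for the first subject), and the exhaustive grid search over observed biomarker values are all hypothetical choices made for the example.

```python
from itertools import product

import numpy as np

def tree_rule(x1, x2, c1, c2):
    """Prespecified tree: positive if x1 > c1; otherwise positive if x2 > c2."""
    return np.where(x1 > c1, True, x2 > c2)

def rank_correlation(y, rule):
    """Fraction of ordered pairs (i, j) with y_i > y_j and rule_i > rule_j.

    For binary y and rule this count equals (true positives) x (true negatives).
    """
    y = np.asarray(y, dtype=bool)
    rule = np.asarray(rule, dtype=bool)
    true_pos = np.sum(y & rule)
    true_neg = np.sum(~y & ~rule)
    return true_pos * true_neg / y.size ** 2

def best_cutoffs(x1, x2, y):
    """Grid search over observed biomarker values for the maximizing cutoffs."""
    candidates = product(np.unique(x1), np.unique(x2))
    return max(candidates,
               key=lambda c: rank_correlation(y, tree_rule(x1, x2, *c)))

# Hypothetical biomarker values; the outcome follows the tree with cutoffs 0.5.
x1 = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
x2 = np.array([0.9, 0.1, 0.8, 0.2, 0.1, 0.9, 0.2, 0.8])
y = (x1 > 0.5) | (x2 > 0.5)
c1, c2 = best_cutoffs(x1, x2, y)
```

An exhaustive grid grows quickly with the number of nodes, so it is only practical for small trees like this one.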
This paper introduces two sets of measures as exploratory tools to study physical activity patterns: active‐to‐sedentary/sedentary‐to‐active rate function (ASRF/SARF) and active/sedentary rate function (ARF/SRF). These two sets of measures are complementary to each other and can be effectively used together to understand physical activity patterns. The specific features are illustrated by an analysis of wearable device data from the National Health and Nutrition Examination Survey (NHANES). A two‐level semiparametric regression model for ARF and the associated activity magnitude is developed under a unified framework using the marked point process formulation. The inactive and active states measured by accelerometers are treated as a 0‐1 point process, and the activity magnitude measured at each active state is defined as a marked variable. The commonly encountered missing data problem due to device nonwear is referred to as “window censoring,” which is handled by a proper estimation approach that adopts techniques from recurrent event data. Large sample properties of the estimator and a comparison between two regression models as measurement frequency increases are studied. Simulation and NHANES data analysis results are presented. The statistical inference and analysis results suggest that ASRF/SARF and ARF/SRF provide useful analytical tools to practitioners for future research on wearable device data.
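As a rough illustration of how a 0‐1 activity process with nonwear gaps can be summarized, the sketch below computes an empirical discrete‐time transition proportion: among subjects observed at two consecutive epochs and active at the first, the share that are sedentary at the second. This is only a crude analogue of an active‐to‐sedentary rate, not the paper's ASRF definition or estimator; the function name, the coding (1 = active, 0 = sedentary, NaN = nonwear), and the choice to drop any epoch pair touching a nonwear window are assumptions.

```python
import numpy as np

def empirical_transition_rate(states, from_state=1, to_state=0):
    """Per-epoch proportion moving from `from_state` to `to_state`.

    states: (subjects x epochs) array with 1 = active, 0 = sedentary,
    np.nan = device not worn (assumed coding). Epoch pairs with a missing
    value on either side are excluded from the risk set.
    """
    s = np.asarray(states, dtype=float)
    now, nxt = s[:, :-1], s[:, 1:]
    observed = ~np.isnan(now) & ~np.isnan(nxt)
    at_risk = observed & (now == from_state)
    moved = at_risk & (nxt == to_state)
    n_risk = at_risk.sum(axis=0)
    # Leave NaN where nobody is at risk instead of dividing by zero.
    return np.divide(moved.sum(axis=0), n_risk,
                     out=np.full(n_risk.shape, np.nan), where=n_risk > 0)

# Hypothetical records for three subjects over four epochs.
records = [[1, 0, 1, 1],
           [1, 1, np.nan, 0],
           [0, 1, 0, 0]]
active_to_sedentary = empirical_transition_rate(records)
sedentary_to_active = empirical_transition_rate(records, from_state=0, to_state=1)
```

Swapping `from_state` and `to_state` gives the sedentary‐to‐active counterpart, which is why the two directions are naturally reported as a pair.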
Competing risks data are commonly encountered in randomized clinical trials and observational studies. This paper considers the situation where the ending statuses of competing events have different clinical interpretations and/or are of simultaneous interest. In clinical trials, often more than one competing event has a meaningful clinical interpretation, even though the treatment effects on different events could be different or even opposite to each other. In this paper, we develop estimation procedures and inferential properties for the joint use of multiple cumulative incidence functions (CIFs). Additionally, by incorporating longitudinal marker information, we develop estimation and inference procedures for weighted CIFs and related metrics. The proposed methods are applied to a COVID-19 in-patient treatment clinical trial, where the outcomes of COVID-19 hospitalization are either death or discharge from the hospital, two competing events with completely different clinical implications.
Tree‐based methods are popular nonparametric tools for studying time‐to‐event outcomes. In this article, we introduce a novel framework for survival trees and ensembles, where the trees partition the dynamic survivor population and can handle time‐dependent covariates. Using the idea of randomized tests, we develop generalized time‐dependent receiver operating characteristic (ROC) curves for evaluating the performance of survival trees. The tree‐building algorithm is guided by decision‐theoretic criteria based on ROC, specifically targeting prediction accuracy. To address the instability of a single tree, we propose a novel ensemble procedure based on averaging martingale estimating equations, which differs from existing methods that average the predicted survival or cumulative hazard functions from individual trees. Extensive simulation studies are conducted to examine the performance of the proposed methods. We apply the methods to a study on AIDS for illustration.
A time‐dependent measure, termed the rate ratio, was proposed to assess the local dependence between two types of recurrent event processes in one‐sample settings. However, the one‐sample work does not consider modeling the dependence by covariates such as subject characteristics and treatments received. The focus of this paper is to understand how and to what extent the covariates influence the dependence strength for bivariate recurrent events. We propose the covariate‐adjusted rate ratio, a measure of covariate‐adjusted dependence. We propose a semiparametric regression model for jointly modeling the frequency and dependence of bivariate recurrent events: the first level is a proportional rates model for the marginal rates and the second level is a proportional rate ratio model for the dependence structure. We develop a pseudo‐partial likelihood to estimate the parameters in the proportional rate ratio model. We establish the asymptotic properties of the estimators and evaluate the finite sample performance via simulation studies. We illustrate the proposed models and methods using a soft tissue sarcoma study that examines the effects of initial treatments on the marginal frequencies of local/distant sarcoma recurrence and the dependence structure between the two types of cancer recurrence.
Feature selection (FS) has recently attracted considerable attention in many fields. Highly overlapping classes and skewed distributions of data within classes have been found in various classification tasks. Most existing FS methods are instance-based and ignore the significant differences in characteristics between particular outliers and the main body of the class, causing confusion for classifiers. In this paper, we propose a novel supervised FS method, Intrusive Outliers-based Feature Selection (IOFS), to find out what kind of outliers lead to misclassification and to exploit the characteristics of such outliers. In order to accurately identify the intrusive outliers (IOs), we provide a density-mean center algorithm to obtain an appropriate representative of a class. A special distance threshold is given to obtain the candidates for IOs. Combined with several metrics, mathematical formulations are provided to evaluate the overlapping degree of the intrusive class pairs. Features with high overlapping degrees are assigned low rankings in the IOFS method. An extension of IOFS based on a small number of extreme IOs, called E-IOFS, is also proposed. Three theoretical proofs are provided as the essential theoretical basis of IOFS. Experiments comparing against various state-of-the-art methods on eleven benchmark datasets show that IOFS is rational and effective, especially on datasets with more highly overlapping classes. E-IOFS almost always outperforms IOFS.