•The spatial and seasonal variations of the Dust Storm Index in semi-arid regions of central Iran were investigated.•The associations of climate change and vegetation cover with sand-dust events were identified using Ridge Regression.•Surface wind speed had a significant effect on sand-dust events in summer throughout the study period.•Vegetation degradation intensified dust emissions in the spring of the second period.•The highest dust-event activity occurred in the border region of Iran and Turkmenistan.
Atmospheric conditions and the physical characteristics of the Earth's surface have an important effect on the spatiotemporal variation of sand-dust events. The main objective of the present study was to investigate the effect of these variables on the seasonal variation of such events in the semi-arid regions of the Central Iran Zone (CIZ). The Ridge Regression (RR) method was used to analyze the relationship of seasonal variations in precipitation, surface wind speed, air temperature, and the Enhanced Vegetation Index (EVI) with the Dust Storm Index (DSI) for two periods (2001–2008 and 2009–2016). The directions of dusty winds around the study area were also determined using dust roses. The results showed that the annual DSI in the study area had a weak increasing trend of 0.07/8 yr in the first period, whereas it followed a strong increasing trend of 0.22/8 yr in the second. Sand-dust storm activity in the second period was also greater than in the first, especially in the border region of Iran and Turkmenistan. According to the RR analysis, over the first period DSI had a significant positive association with surface wind speed in summer (β = +0.48; p-value < 0.05) and a significant negative association with winter precipitation (β = −0.3; p-value < 0.05). During this period, there was no significant relationship between temperature or EVI and DSI in any season (p-value > 0.05). In the second period, surface wind speed was positively correlated with DSI in spring (β = +2.04), summer (β = +2.6), and autumn (β = +2.08). A significant negative relationship between EVI and DSI was observed only in spring (β = −0.7; p-value < 0.05). Our findings also indicated that the dusty winds in the northeast, northwest, and southeast parts of the study area blew from the northwest, southeast, and west, respectively.
These findings can help mitigate the negative consequences of dust emissions and improve wind erosion management in the semi-dry lands of the CIZ.
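The ridge-regression setup the study describes (regressing DSI on standardized seasonal predictors and comparing the resulting β coefficients) can be sketched generically. The data below are synthetic stand-ins; the sample size, variable names, and coefficients are illustrative assumptions, not the study's records:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 32  # hypothetical seasonal records over an 8-year period

# Hypothetical predictors: precipitation, surface wind speed, temperature, EVI
X = rng.normal(size=(n, 4))
# Hypothetical DSI driven mainly by wind speed, dampened by precipitation
y = 2.0 * X[:, 1] - 0.5 * X[:, 0] + rng.normal(scale=0.1, size=n)

# Standardize so the ridge betas are comparable across predictors
Xs = StandardScaler().fit_transform(X)
model = Ridge(alpha=1.0).fit(Xs, y)
print(dict(zip(["precip", "wind", "temp", "EVI"], model.coef_.round(2))))
```

Standardizing the predictors before fitting is what makes the β magnitudes directly comparable, mirroring how the abstract contrasts wind-speed and precipitation effects.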
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NLZOH, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UILJ, UL, UM, UPCLJ, UPUK, ZAGLJ, ZRSKP
•A polynomial scoring function, P3-Score, achieves better scoring power (0.735) and ranking power (0.688) on the CASF-2016 test set.•Multivariate polynomial ridge regression is a promising method for improving the performance of traditional scoring functions.•The 14 constructed feature terms can be used to develop new scoring functions.
Scoring functions are of great importance for fast evaluation of protein–ligand binding affinity. To improve scoring power and ranking power, new features were constructed, and a new empirical scoring function (P3-Score) using 14 features was developed based on multivariate polynomial ridge regression with k-fold cross-validation on the training set. The scoring power and ranking power of P3-Score were compared with those of 36 classical scoring functions on the CASF-2016 test set; the results indicate that P3-Score achieves better scoring power (0.735) and ranking power (0.688) than current empirical scoring functions. Multivariate polynomial ridge regression could thus be a promising method for improving classical scoring functions while preventing overfitting. In comparison, however, most recently developed machine learning scoring functions still present better scoring performance than the classical ones.
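The technique named here (polynomial feature expansion, a ridge penalty, and k-fold cross-validation) can be sketched with off-the-shelf tools. The features, degree, and target below are hypothetical stand-ins, not the 14 CASF-2016 feature terms:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 3))  # stand-in features, not the real terms
y = X[:, 0] * X[:, 1] + X[:, 2] ** 2 + rng.normal(scale=0.05, size=200)

# Degree-2 polynomial expansion + ridge penalty, scored by 5-fold CV
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    Ridge(alpha=1.0),
)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(scores.mean().round(3))
```

The ridge penalty is what keeps the expanded polynomial basis from overfitting, which is the property the abstract credits for the method's robustness.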
Hypergraphs, an important learning tool for modeling high-order data correlations, have a wide range of applications in machine learning and computer vision. The key issue in hypergraph-based applications is constructing an informative hypergraph in which the hyperedges effectively represent the high-order data correlations. In practice, real-world data are usually sampled from a union of non-linear manifolds, and because of noise and data corruption, many samples deviate from the underlying manifolds. To construct an informative hypergraph that represents the real-world data distribution well, we propose a hypergraph model (ℓ2-Hypergraph). Our model generates each hyperedge by solving an affine-subspace ridge regression problem, where the samples with non-zero representation coefficients are used for hyperedge generation. Specifically, to be robust to sparse noise and corruption, a sparse constraint is imposed on the data errors. We conducted image clustering and classification experiments on real-world datasets; the results demonstrate that our hypergraph model is superior to existing hypergraph construction methods in both accuracy and robustness to sparse noise.
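A minimal sketch of the hyperedge-generation idea follows: each sample is represented by an ℓ2-regularized (ridge) regression on the remaining samples, and the samples carrying the largest coefficients join it in a hyperedge. The fixed top-k rule and the toy two-cluster data are simplifying assumptions; the paper's formulation also models sparse errors explicitly:

```python
import numpy as np

def l2_hyperedge(X, i, lam=0.1, k=3):
    """Represent sample i by a ridge regression on the remaining samples;
    the k samples with the largest coefficient magnitudes join sample i
    in the hyperedge (a simplified stand-in for the paper's
    non-zero-coefficient rule)."""
    others = np.delete(np.arange(len(X)), i)
    A = X[others].T                 # columns are the remaining samples
    c = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ X[i])
    top = others[np.argsort(-np.abs(c))[:k]]
    return set(top.tolist()) | {i}

rng = np.random.default_rng(2)
# Two well-separated clusters; a hyperedge should stay inside one cluster
X = np.vstack([rng.normal((1, 0), 0.05, (5, 2)),
               rng.normal((0, 1), 0.05, (5, 2))])
print(l2_hyperedge(X, 0))
```

Because samples from the same cluster reconstruct each other with large coefficients while cross-cluster samples contribute little, the hyperedge captures local high-order structure rather than single pairwise links.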
The Random Vector Functional Link Neural Network (RVFLNN) enables fast learning through random selection of the input weights, so that the learning procedure determines only the output weights. Unlike Extreme Learning Machines (ELM), the RVFLNN exploits a direct connection between the input layer and the output layer, which makes RVFLNNs a broader class of networks. Although the RVFLNN was proposed more than two decades ago (Pao, Park, and Sobajic, 1994), the nonlinear expansion of the input vector into a set of orthogonal functions has not been studied. The Orthogonal Polynomial Expanded Random Vector Functional Link Neural Network (OPE-RVFLNN) combines the advantages of expanding the input vector and randomly determining the input weights. Through a comprehensive experimental evaluation on 30 UCI regression datasets, we tested four orthogonal polynomials (Chebyshev, Hermite, Laguerre, and Legendre) and three activation functions (tansig, logsig, tribas). Rigorous non-parametric statistical hypothesis testing confirms two major conclusions reached by Zhang and Suganthan for classification (Zhang and Suganthan, 2015) and by Ren et al. for time-series prediction (Ren, Suganthan, Srikanth, and Amaratunga, 2016) in their RVFLNN papers: direct links between the input and output vectors are essential for improved network performance, and ridge regression generates significantly better network parameters than Moore–Penrose pseudoinversion. Our research shows a significant improvement in network performance when the tansig activation function and the Chebyshev orthogonal polynomial are used for regression problems. The conclusions drawn from this study may serve as guidelines for OPE-RVFLNN development and implementation for regression problems.
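A minimal OPE-RVFLNN sketch for regression, assuming inputs scaled to [-1, 1]: the input is expanded with Chebyshev polynomials, the hidden weights are drawn randomly and left untrained, and the output weights over both the direct links and the tansig hidden layer are solved by ridge regression (rather than pseudoinversion). The sizes and toy target are illustrative choices, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(3)

def chebyshev_expand(X, order=3):
    """Chebyshev polynomials T1..T_order of each feature (inputs in [-1, 1])."""
    feats = [X, 2 * X ** 2 - 1]                        # T1, T2
    for _ in range(order - 2):
        feats.append(2 * X * feats[-1] - feats[-2])    # T_{n+1} = 2x*T_n - T_{n-1}
    return np.hstack(feats)

def fit_ope_rvflnn(X, y, hidden=50, lam=1e-2):
    Z = chebyshev_expand(X)                            # orthogonal-polynomial expansion
    W = rng.normal(size=(Z.shape[1], hidden))          # random, untrained input weights
    H = np.tanh(Z @ W)                                 # tansig hidden layer
    D = np.hstack([Z, H, np.ones((len(X), 1))])        # direct input-output links + bias
    beta = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)  # ridge solve
    return W, beta

def predict(X, W, beta):
    Z = chebyshev_expand(X)
    D = np.hstack([Z, np.tanh(Z @ W), np.ones((len(X), 1))])
    return D @ beta

X = rng.uniform(-1, 1, size=(300, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
W, beta = fit_ope_rvflnn(X, y)
mse = np.mean((predict(X, W, beta) - y) ** 2)
print(round(float(mse), 4))
```

Concatenating the expanded input directly into the design matrix D is what implements the direct input-output links the paper finds essential, and the `lam` term is the ridge regularizer it favors over Moore–Penrose pseudoinversion.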
Modern neural networks are often operated in a strongly overparametrized regime: they comprise so many parameters that they can interpolate the training set even when the actual labels are replaced by purely random ones. Despite this, they achieve good prediction error on unseen data: interpolating the training set does not lead to a large generalization error. Further, overparametrization appears to be beneficial in that it simplifies the optimization landscape. Here, we study these phenomena in the context of two-layer neural networks in the neural tangent (NT) regime. We consider a simple data model with isotropic covariate vectors in d dimensions and N hidden neurons. We assume that both the sample size n and the dimension d are large and polynomially related. Our first main result is a characterization of the eigenstructure of the empirical NT kernel in the overparametrized regime Nd ≫ n. This characterization implies, as a corollary, that the minimum eigenvalue of the empirical NT kernel is bounded away from zero as soon as Nd ≫ n, and therefore the network can exactly interpolate arbitrary labels in the same regime. Our second main result is a characterization of the generalization error of NT ridge regression, including, as a special case, min-ℓ2-norm interpolation. We prove that, as soon as Nd ≫ n, the test error is well approximated by that of kernel ridge regression with respect to the infinite-width kernel. The latter is in turn well approximated by the error of polynomial ridge regression, whereby the regularization parameter is increased by a "self-induced" term related to the high-degree components of the activation function. The polynomial degree depends on the sample size and the dimension (in particular, on log n/log d).
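The kernel ridge regression that approximates the NT test error can be illustrated generically. The sketch below fits an off-the-shelf kernel ridge regressor with an explicit regularization parameter; the data, kernel choice, and target are hypothetical stand-ins, and no NT-specific structure (or the "self-induced" regularization term) is modeled:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(4)
d, n = 5, 400
X = rng.normal(size=(n, d)) / np.sqrt(d)               # roughly isotropic covariates
w = rng.normal(size=d)
y = np.maximum(X @ w, 0.0) + 0.1 * rng.normal(size=n)  # ReLU-type target + noise

# Kernel ridge regression: minimize ||y - K a||^2 + alpha * a^T K a,
# where alpha plays the role of the regularization parameter lambda
model = KernelRidge(alpha=0.1, kernel="rbf", gamma=1.0).fit(X, y)
print(round(model.score(X, y), 3))
```

Setting `alpha=0` (or very small) would approach the minimum-norm interpolator, the special case the result covers via min-ℓ2-norm interpolation.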
There is significant interest in the development and application of deep neural networks (DNNs) to neuroimaging data. A growing literature suggests that DNNs outperform their classical counterparts in a variety of neuroimaging applications, yet there are few direct comparisons of relative utility. Here, we compared the performance of three DNN architectures and a classical machine learning algorithm (kernel regression) in predicting individual phenotypes from whole-brain resting-state functional connectivity (RSFC) patterns. One of the DNNs was a generic fully-connected feedforward neural network, while the other two were recently published approaches specifically designed to exploit the structure of connectome data. Using a combined sample of almost 10,000 participants from the Human Connectome Project (HCP) and the UK Biobank, we showed that the three DNNs and kernel regression achieved similar performance across a wide range of behavioral and demographic measures. Furthermore, the generic feedforward neural network performed similarly to the two state-of-the-art connectome-specific DNNs. When predicting fluid intelligence in the UK Biobank, the performance of all algorithms improved dramatically as the sample size increased from 100 to 1000 subjects; the improvement was smaller, but still significant, from 1000 to 5000 subjects. Importantly, kernel regression was competitive across all sample sizes. Overall, our study suggests that kernel regression is as effective as DNNs for RSFC-based behavioral prediction while incurring significantly lower computational costs, and it might therefore serve as a useful baseline algorithm for future studies.
Measuring carbon abatement in China's commercial buildings (CACCB) has been recognized as a way to evaluate energy conservation work (ECW) in China's commercial building sector. This study first presents a bottom-up model for measuring CACCB values based on decomposing the extended Kaya identity via the Logarithmic Mean Divisia Index (LMDI) method. The results indicate that (1) three types of drivers (f, d, and K) contributed negatively to the carbon intensity of commercial buildings from 2000 to 2015, and their combined effects were quantified as the intensity values of CACCB; the CACCB values in the three Five-Year Plan periods were 383.41 MtCO2 (2001–2005), 591.09 MtCO2 (2006–2010), and 621.54 MtCO2 (2011–2015). (2) A comparative analysis of the contribution-rate elasticities of the drivers, assessed by the LMDI method and by ridge regression, effectively examined the robustness of the CACCB measurement model, whose performance was also evaluated. (3) The more significant CACCB effects observed in recent years can be attributed to substantial improvements in ECW. In sum, we believe that our approach fills the research gap in CACCB measurement, and our efforts provide significant guidance for developing future ECW in China's commercial building sector.
•Carbon abatement in China's commercial buildings (CACCB) in 2001–2015: 1596.04 MtCO2.•Developed a bottom-up measurement model for CACCB based on the Kaya-LMDI methods.•Proposed a robustness analysis for the measurement model based on the ridge regression.•Conducted a decadal overview for energy conservation work in the commercial building sector.•A pathway to achieving more significant carbon abatement was discussed and proposed.
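The additive LMDI decomposition the study builds on can be illustrated with a minimal two-factor Kaya-style identity C = A × I (activity × intensity); the numbers below are hypothetical, not the study's data:

```python
import numpy as np

def lmdi_effects(a0, i0, a1, i1):
    """Additive LMDI for the identity C = A * I: split the change in C into
    an activity effect and an intensity effect using the logarithmic mean."""
    c0, c1 = a0 * i0, a1 * i1
    L = (c1 - c0) / (np.log(c1) - np.log(c0))   # logarithmic mean of c0, c1
    return L * np.log(a1 / a0), L * np.log(i1 / i0)

# Hypothetical numbers: floor area grows while carbon intensity falls
act, inten = lmdi_effects(a0=100.0, i0=0.50, a1=140.0, i1=0.45)
print(round(act, 2), round(inten, 2))
# By construction the two effects sum to the total change in emissions
print(round(act + inten, 2))
```

The negative intensity effect is the analogue of the abstract's carbon-abatement contribution; the study's full model decomposes an extended Kaya identity with more factors in the same way.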
Machine learning (ML) techniques are often employed for accurate prediction of the compressive strength of concrete. Despite their higher accuracy, previous ML models failed to interpret the rationale behind their predictions, and model interpretability is essential to appeal to the interest of domain experts. Therefore, addressing the identified research gaps, this study proposes a way to predict the compressive strength of concrete using supervised ML algorithms (Decision Tree, Extra Trees, Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), the Light Gradient Boosting Machine (LGBM), and Laplacian Kernel Ridge Regression (LKRR)). In addition, SHapley Additive exPlanations (SHAP), a black-box interpretation approach, was employed to elucidate the predictions. The comparison revealed that the tree-based algorithms and LKRR provide acceptable accuracy for compressive strength predictions, with XGBoost and LKRR evincing superior performance (R = 0.98). According to the SHAP interpretation, XGBoost predictions capture complex relationships among the constituents. SHAP also provides unified measures of feature importance and of the impact of each variable on a prediction. Interestingly, the SHAP interpretations were in accordance with what is generally observed in the compressive behavior of concrete, thus validating the causality of the ML predictions.
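Laplacian Kernel Ridge Regression is available off the shelf via scikit-learn's `KernelRidge` with the laplacian pairwise kernel. The sketch below fits it to hypothetical mix-design features; the feature names, target formula, and hyperparameters are stand-ins, not the study's dataset:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
n = 500
# Hypothetical mix-design features: cement, water, aggregate, age
X = rng.uniform(size=(n, 4))
# Hypothetical strength: rises with cement and age, falls with water content
y = (30 * X[:, 0] - 15 * X[:, 1] + 10 * np.log1p(9 * X[:, 3])
     + rng.normal(scale=1.0, size=n))

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(Xtr)
model = KernelRidge(kernel="laplacian", alpha=0.1, gamma=0.5)
model.fit(scaler.transform(Xtr), ytr)
r2 = model.score(scaler.transform(Xte), yte)
print(round(r2, 3))
```

The laplacian kernel exp(-gamma * ||x - x'||_1) makes predictions depend on L1 distances between mixes, so standardizing the features first keeps no single constituent from dominating the distance.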
In this study, we proposed an ensemble learning method that simultaneously integrates a low-rank matrix completion model and a ridge regression model to predict anticancer drug response in cancer cell lines. The model was applied to two benchmark datasets, the Cancer Cell Line Encyclopedia (CCLE) and the Genomics of Drug Sensitivity in Cancer (GDSC). As previous studies suggest, the dual-layer integrated cell line-drug network model has been one of the best models to date, outperforming most state-of-the-art models; we therefore performed a head-to-head comparison between it and our model in a 10-fold cross-validation study. For the CCLE dataset, our model yields a higher Pearson correlation coefficient between predicted and observed drug responses than the dual-layer integrated cell line-drug network model for 18 out of 23 drugs. For the GDSC dataset, our model is better for 26 out of 28 drugs in the phosphatidylinositol 3-kinase (PI3K) pathway and 26 out of 30 drugs in the extracellular signal-regulated kinase (ERK) signaling pathway. Based on the prediction results, we carried out two types of case studies, which further verified the effectiveness of the proposed model for drug-response prediction. In addition, our model is more biologically interpretable than the compared method, since it explicitly outputs the genes involved in the prediction, which are enriched in functions such as transcription, the Src homology 2/3 (SH2/3) domain, the cell cycle, ATP binding, and zinc fingers.
This study proposed an ensemble learning method that simultaneously integrates a low-rank matrix completion model and a ridge regression model to predict anticancer drug response in cancer cell lines. The model was effectively applied to two benchmark datasets, and it explicitly outputs the genes involved in the prediction, which are enriched in relevant functions.