The purpose of this study is to explore the factors that affect rural household food security in the northern area of Pakistan. A random sampling technique was applied to collect data from 294 rural households through face-to-face interviews. A binary logistic regression technique was used to determine the factors that influence household food insecurity. The results revealed that age, gender, education, remittances, unemployment, inflation, assets, and disease are important factors determining household food insecurity. Moreover, gender played a dominant role in food insecurity, as female-headed households were food insecure while male-headed households were food secure. Policies should promote education, focus more on female-headed households, and encourage the inflow of remittances.
The rapidly increasing incidence of Diabetes Mellitus (DM) has shown that DM is a serious disease that endangers human life in all parts of the world. The late stage of Type-II DM (T2DM) in particular is accompanied by complex complications. Healthcare systems with various data mining algorithms can help endocrinologists detect T2DM at an early stage. In the present research, a novel and efficient binary logistic regression (BLR) based on the feature transformation of XGBoost (XGBoost-BLR) is proposed for accurately predicting the specific type of T2DM and for making the model adaptive to more than one dataset. To raise the identification rate, the datasets undergo a series of preprocessing procedures, including outlier removal, normalization, and missing value processing. We select the features that have a more significant effect on the results using the χ2 test (CST). Then, the selected features are projected into a high-dimensional feature space by XGBoost. Finally, the generated high-dimensional features are modeled by BLR. The proposed XGBoost-BLR achieved identification rates of 94% and 98% for diabetes prediction on the Pima Indians Diabetes Database (PIDD) and the Early-Stage Diabetes Risk Prediction Database (ESDRPD), respectively.
•An intelligent diagnosis system can help physicians with diabetes diagnosis in a time-saving and efficient way.
•To accomplish this aim, we improve the validity and rationality of the dataset with preprocessing methods.
•Given the clinical significance of type 2 Diabetes Mellitus, 10 features that have more significant effects on the results are selected by the Chi-Square Test.
•The model is shown to be useful for the early screening of T2DM.
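The χ2 feature-scoring step mentioned above can be illustrated with a minimal sketch. It computes the Pearson χ2 statistic for a 2×2 table of a binary feature against a binary diabetes label. The function name `chi2_2x2` and all counts are hypothetical and are not taken from the paper; the sketch also assumes that every row and column total is nonzero.

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    Rows: feature present / absent. Columns: label positive / negative.
    Assumes all row and column totals are nonzero.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of feature and label
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical counts: one feature strongly associated with the label,
# one feature nearly independent of it.
strong = chi2_2x2(40, 10, 10, 40)
weak = chi2_2x2(26, 24, 24, 26)
print(strong, weak)
```

A feature with a larger statistic is more strongly associated with the label, so ranking features by this score and keeping the top ones is one plausible way to read the selection step.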
This study aims to determine the factors that significantly affect the classification of stroke. The response variable is the type of stroke, namely non-hemorrhagic stroke and hemorrhagic stroke. The predictors were cholesterol level, blood sugar level, temperature, length of stay, pulse rate, and gender. Using logistic regression, the model achieved a classification accuracy of 74.8%; the predictors with a significant effect (alpha < 0.05) are total cholesterol and length of stay.
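As a rough illustration of how a binary logistic regression is fitted and how a classification accuracy is then obtained, the sketch below uses plain batch gradient descent on a single predictor. The data values, learning rate, and function names are hypothetical; the study's own estimation procedure and significance tests are not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the mean log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log-loss with respect to w and b
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def accuracy(xs, ys, w, b, threshold=0.5):
    """Share of cases whose thresholded predicted probability matches the label."""
    preds = [1 if sigmoid(w * x + b) >= threshold else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)


# Hypothetical standardized predictor values and binary outcome labels
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
acc = accuracy(xs, ys, w, b)
print(w, b, acc)
```

Accuracy computed on the fitting data, as here, tends to be optimistic; a held-out sample gives a fairer estimate.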
Active Learning has been a popular method to circumvent the labeling cost in machine learning methods. The majority of active learning approaches can be classified into two categories: representative-based and informative-based methods, with some hybrid methods that combine both. This work presents a naïve query strategy, namely Similarity-Based Active Learning (SBAL), which computes the sum of a row in the similarity matrix at each selection step, and a general optimization framework that can accommodate a broad range of active learning algorithms. The label complexity for different classification metrics is used as a primary criterion for comparing different algorithms. The proposed algorithm’s numerical performance is illustrated using simulated data scenarios and by applying it to the real-world COVID-19 image classification. The results demonstrate that, based on the classification metric, labeling cost and label complexity, SBAL outperforms other hybrid methods, such as Adaptive Active Learning (AAL) and Maximizing Variance for Active Learning (MVAL).
•A general optimization framework for a wide range of query strategies.
•Extended label complexity to measure the performance of active learning.
•The naïve Similarity-Based Active Learning query strategy.
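The abstract states only that SBAL computes a row sum of the similarity matrix at each selection step. The sketch below is one possible reading of that idea, not the authors' algorithm: it assumes an RBF similarity and assumes that the unlabeled point with the largest row sum is queried next. All names and data points are hypothetical.

```python
import math


def rbf_similarity(a, b, gamma=1.0):
    """RBF similarity between two points (an assumed choice of similarity)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def similarity_matrix(points, gamma=1.0):
    return [[rbf_similarity(p, q, gamma) for q in points] for p in points]


def select_query(sim, labeled):
    """Return the unlabeled index whose similarity row has the largest sum."""
    candidates = [i for i in range(len(sim)) if i not in labeled]
    return max(candidates, key=lambda i: sum(sim[i]))


# Hypothetical pool: three nearby points and one distant outlier
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.3), (5.0, 5.0)]
sim = similarity_matrix(points)
first = select_query(sim, labeled=set())
second = select_query(sim, labeled={first})
print(first, second)
```

Under these assumptions the rule favors points in dense regions of the pool, which is the usual motivation for representative-based querying; the outlier is queried last.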
To identify factors and indicators that affect chronic pain and pain relief, and to develop predictive models using machine learning.
We analyzed the data of 67,028 outpatient cases and 11,310 valid samples with pain from a large retrospective cohort. We used decision tree, random forest, AdaBoost, neural network, and logistic regression to discover significant indicators and to predict pain and treatment relief.
The random forest model had the highest accuracy, F1 value, precision, and recall rates for predicting pain relief. The main factors affecting pain and treatment relief included body mass index, blood pressure, age, body temperature, heart rate, pulse, and neutrophil/lymphocyte × platelet ratio. The logistic regression model had high sensitivity and specificity for predicting pain occurrence.
Machine learning models can be used to analyze the risk factors and predictors of chronic pain and pain relief, and to provide personalized and evidence-based pain management.
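The accuracy, precision, recall, and F1 values used to compare the models can be computed from predicted and true labels as in the sketch below. The labels are hypothetical and the function name is illustrative; nothing here reproduces the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Guard against empty denominators when a class is never predicted or present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = pain relief, 0 = no relief
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall is the same quantity as sensitivity; specificity would be computed analogously as tn / (tn + fp).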
Process-based nearshore morphodynamic models are tools commonly used by coastal engineers and planners to predict the nearshore morphology change of sandy beaches across various spatiotemporal scales. Accurate modeling of the morphological response on medium and long time scales is imperative for quantitative assessments of coastal infrastructure over a project’s intended life-span. However, most previous modeling applications have focused on single/sub-seasonal storm events and are often limited to an assessment of the subaerial beach (i.e. berm and dune). This not only leaves uncertainty concerning the quality of morphology predictions on extended (> weeks) time scales, but also the capacity of process-based models to emulate realistic nearshore sandbar dynamics and the corresponding exchange of sediment between the nearshore-beach system. To shed light on these meso-scale dynamics, CSHORE, a 1D phase-averaged, process-based nearshore morphodynamic model, was applied on an annual scale to a multi-barred, dissipative beach in Oysterville, WA, USA. Thousands of unique sediment transport and hydrodynamic parameter combinations were executed during model calibration. A large portion of these simulations displayed physically realistic sandbar dynamics, including the growth, decay, and migration of intertidal and subtidal sandbars. To explore the model mechanisms enabling realistic bar behavior, binary and multinomial logistic regression models were used to quantify the relationship between model parameter selection and the probability of various categorical bar configurations occurring in the final predicted profile. The results indicate the most sensitive parameters associated with barred morphology, in this study, and support the use of separate sediment transport parameters for low and high wave energy conditions.
The co-utilization of numerical and statistical modeling outlined in this publication is generalizable to future exploratory modeling and/or calibration routines concerned with categorical outcomes.
•CSHORE is used to simulate annual scale nearshore and intertidal sandbar evolution.
•Novel logistic regression methods inform model parameter influence on bar behavior.
•Intertidal (subtidal) bars are most influenced by low (high) wave energy processes.
•Low values of the breaker index favor barred morphology predictions.
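How a multinomial logistic regression turns model parameters into probabilities of categorical bar configurations can be sketched as follows. The category names, the two inputs, and every coefficient are hypothetical illustrations rather than fitted values from the study; the negative weights on the first input merely echo the reported tendency that low breaker index values favor barred profiles.

```python
import math


def softmax(scores):
    """Convert linear scores to probabilities; subtract the max for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def category_probabilities(x, weights, intercepts):
    """Multinomial logistic model: one linear score per category, then softmax."""
    scores = [
        b + sum(w_k * x_k for w_k, x_k in zip(w, x))
        for w, b in zip(weights, intercepts)
    ]
    return softmax(scores)


# Hypothetical setup: x = [breaker index, a transport coefficient]
categories = ["no bar", "single bar", "multiple bars"]
weights = [[0.0, 0.0], [-2.0, 1.0], [-4.0, 2.0]]  # first category is the reference
intercepts = [0.0, 0.5, 0.5]

probs_low = category_probabilities([0.2, 0.5], weights, intercepts)
probs_high = category_probabilities([1.0, 0.5], weights, intercepts)
print(dict(zip(categories, probs_low)))
print(dict(zip(categories, probs_high)))
```

With two categories the same construction reduces to binary logistic regression, which is why the two model types can be used side by side.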
Testing goodness of fit is an important step in evaluating a statistical model. For binary logistic regression models, the Hosmer–Lemeshow goodness-of-fit test is often used. For multinomial logistic regression models, however, few tests are available. We present the mlogitgof command, which implements a goodness-of-fit test for multinomial logistic regression models. This test can also be used for binary logistic regression models, where it gives results identical to the Hosmer–Lemeshow test.
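For orientation, the binary Hosmer–Lemeshow statistic can be sketched in a few lines. This is not the mlogitgof implementation: the function name, the equal-size grouping rule, and the data are assumptions, and the sketch requires every group's mean predicted probability to lie strictly between 0 and 1.

```python
def hosmer_lemeshow(y_true, p_pred, groups=10):
    """Hosmer-Lemeshow chi-square statistic over near-equal groups of sorted predictions."""
    pairs = sorted(zip(p_pred, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in the group
        expected = sum(p for p, _ in chunk)   # expected events in the group
        mean_p = expected / size
        stat += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return stat


# Hypothetical predictions with well-calibrated and poorly calibrated outcomes
p_pred = [0.2] * 5 + [0.8] * 5
calibrated = hosmer_lemeshow([0, 0, 0, 0, 1, 1, 1, 1, 1, 0], p_pred, groups=2)
miscalibrated = hosmer_lemeshow([1, 1, 1, 1, 0, 0, 0, 0, 0, 1], p_pred, groups=2)
print(calibrated, miscalibrated)
```

In the usual binary test the statistic is compared with a chi-square distribution with (groups − 2) degrees of freedom; a large value indicates poor fit.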
Preparation of a landslide susceptibility map is the first step for landslide hazard mitigation and risk assessment. The main aim of this study is to explore potential applications of two new models, two-class Kernel Logistic Regression (KLR) and Alternating Decision Tree (ADT), for landslide susceptibility mapping in the Yihuang area (China). The ADT has not been used in landslide susceptibility modeling and this paper attempts a novel application of this technique. For the purpose of comparison, the conventional Support Vector Machines (SVM) method, which has been widely used in the literature, was included and the results were assessed. At first, a landslide inventory map with 187 landslide locations for the study area was constructed from various sources. Landslide locations were then spatially randomly split in a ratio of 70/30 for building landslide models and for model validation. Then a spatial database with a total of fourteen landslide conditioning factors was prepared, including slope, aspect, altitude, topographic wetness index (TWI), stream power index (SPI), sediment transport index (STI), plan curvature, landuse, normalized difference vegetation index (NDVI), lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Using the KLR, the SVM, and the ADT, three landslide susceptibility models were constructed using the training dataset. The three resulting models were validated and compared using the receiver operating characteristic (ROC), Kappa index, and five statistical evaluation measures. In addition, pairwise comparisons of the area under the ROC curve were carried out to assess whether there are significant differences in the overall performance of the three models. The goodness-of-fits are 92.5% (the KLR model), 88.8% (the SVM model), and 95.7% (the ADT model). The prediction capabilities are 81.1%, 84.2%, and 93.3% for the KLR, the SVM, and the ADT models, respectively.
The results show that the ADT model yielded better overall performance and more accurate results than the KLR and SVM models. The KLR model is considered slightly better than the SVM model in terms of the positive prediction values. The ADT and KLR are two promising data mining techniques that might be considered for use in landslide susceptibility mapping. The results from this study may be useful for landuse planning and decision making in landslide prone areas.
•Spatial prediction of landslide hazards was carried out by using KLR, ADT, and SVM.
•All landslide models have a good prediction capability.
•ADT model has better prediction capabilities.
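The Kappa index used to compare the models measures agreement between predicted and observed classes beyond what chance alone would produce. A minimal sketch for the two-class case follows; the confusion-matrix counts are hypothetical and are not results from the study.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 confusion matrix (landslide vs. non-landslide)."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and observations
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical validation counts
kappa = cohens_kappa(tp=45, fp=5, fn=10, tn=40)
perfect = cohens_kappa(tp=50, fp=0, fn=0, tn=50)
print(kappa, perfect)
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance.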
Prediction models help healthcare professionals and patients make clinical decisions. The goal of an accurate prediction model is to provide patient risk stratification to support tailored clinical decision-making with the hope of improving patient outcomes and quality of care. Clinical prediction models use variables selected because they are thought to be associated (either negatively or positively) with the outcome of interest. Building a model requires data that are computer-interpretable and reliably recorded within the time frame of interest for the prediction. Such models are generally defined as either diagnostic (likelihood of disease or disease group classification) or prognostic (likelihood of response or risk of recurrence). We describe a set of guidelines and heuristics for clinicians to use to develop a logistic regression-based prediction model for binary outcomes that is intended to augment clinical decision-making.
The main objective of the present study was to compare the performance of a classifier that implements Logistic Regression and a classifier that employs a Naïve Bayes algorithm in landslide susceptibility assessments. The study evaluates the influence of model complexity and the size of the training data, and identifies the most accurate and reliable classifier.
The comparison of the two classifiers was based on the assessment of a database containing 116 sites located in the mountains of Epirus, Greece, where serious landslide events have been encountered. The sites are classified into two categories, non-landslide and landslide areas. Those areas were identified by analysing airborne imagery, extensive field investigation, and the examination of previous research studies. The geo-environmental conditions in those locations were analyzed with regard to their susceptibility to slide. In particular, seven variables were analyzed: engineering geological units, slope angle, slope aspect, mean annual rainfall, distance from river network, distance from tectonic features, and distance from road network.
Multicollinearity analysis and feature selection were implemented in order to estimate the conditional independence among the variables and to rank the variables according to their significance in estimating landslide susceptibility. Through these processes, nine different datasets were constructed. Further partitioning created subsets of training and validation data from the original 116 sites. Each dataset was characterized by the number of variables used and the size of the training dataset.
The comparison and validation of the outcomes of each model were achieved using statistical evaluation measures, the receiver operating characteristic, and the area under the success and predictive rate curves. The results indicated that model complexity and the size of the training dataset influence the accuracy and the predictive power of the models concerning landslide susceptibility. In particular, the most accurate model with high predictive power was the eighth model (five variables and 92 training samples), with the Naïve Bayes classifier having a slightly higher overall performance and accuracy than the Logistic Regression classifier, 87.50% and 82.61% on the validation datasets, respectively. The highest area under the curve was achieved by the Naïve Bayes classifier for both the training and validation datasets (0.875 and 0.806, respectively), while the Logistic Regression classifier achieved lower AUC values for the training and validation datasets (0.844 and 0.711, respectively). When limited data are available, it seems that more accurate and reliable results could be obtained by generative classifiers, like Naïve Bayes classifiers. Overall, landslide susceptibility assessments could serve as a useful tool for local and national authorities to evaluate strategies to prevent and mitigate the adverse impacts of landslide events.
•Logistic regression and Naïve Bayes were used in landslide susceptibility zoning.
•Model complexity and the size of training data influence the prediction accuracy.
•The reduction in model's complexity improved the generalization performance.
•The Naïve Bayes model outperforms the Logistic regression.
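The area under the ROC curve used to compare the two classifiers can be computed directly from its rank interpretation: the probability that a randomly chosen landslide site receives a higher score than a randomly chosen non-landslide site. The sketch below assumes that interpretation; the labels, the scores, and the names `scores_a` and `scores_b` are hypothetical and do not correspond to either classifier in the study.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility scores from two classifiers on the same six sites
y_true = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
scores_b = [0.9, 0.3, 0.4, 0.5, 0.6, 0.2]
auc_a = roc_auc(y_true, scores_a)
auc_b = roc_auc(y_true, scores_b)
print(auc_a, auc_b)
```

The pairwise loop is quadratic in the number of sites, which is acceptable for small inventories such as the one described here; a rank-based formula would be preferable for large datasets.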