Firmness is a key indicator of apple quality. Building a predictive model for apple firmness based on hyperspectral technology and regression algorithms enables rapid, non-destructive, and high-throughput detection of apple firmness. This paper proposes an Adaptive Window Length Savitzky-Golay Smoothing (AWL-SG smoothing) algorithm, an extension of Savitzky-Golay Smoothing (SG smoothing) that adaptively adjusts the window length according to the rate of change of the spectral data at different wavelengths. SG smoothing, AWL-SG smoothing, Standard Normal Variate (SNV), and Multiplicative Scatter Correction (MSC) algorithms were used to preprocess the original spectral data, and Partial Least Squares (PLS), Ridge Regression (Ridge), and Kernel Ridge Regression (Kernel Ridge) predictive models were constructed to analyze the impact of each preprocessing method on prediction accuracy. The models built on spectral data preprocessed by SG smoothing and AWL-SG smoothing showed significantly better predictive performance than those built on the original spectral data, with AWL-SG smoothing performing best: the Ridge model trained on AWL-SG-smoothed spectra achieved an R2 of 0.8914 on the test set. The Successive Projection Algorithm (SPA), Principal Component Analysis (PCA), and Independent Component Analysis (ICA) were then used to reduce the dimensionality of the full-band spectral data preprocessed by SG smoothing and AWL-SG smoothing, and Ridge and Kernel Ridge prediction models were constructed. Both SPA and PCA improved the predictive performance of the models, with PCA performing best; the combination AWL-SG + PCA + Ridge achieved the best predictive effect, with an R2 of 0.9146 on the test set.
•Introduces an innovative method for non-destructive, rapid detection of apple firmness using hyperspectral technology combined with advanced regression algorithms, offering a high-throughput solution for assessing apple quality.
•Demonstrates the effectiveness of the Adaptive Window Length Savitzky-Golay (AWL-SG) smoothing algorithm, which adaptively adjusts the window length for spectral data smoothing, significantly enhancing predictive model accuracy compared to traditional methods.
•Employs dimensionality reduction techniques, including the Successive Projection Algorithm (SPA), Principal Component Analysis (PCA), and Independent Component Analysis (ICA), to optimize spectral data for improved predictive performance.
•The AWL-SG + PCA + Ridge regression model combination achieved the best predictive accuracy, with a determination coefficient (R2) of 0.9146 on the test set, indicating a robust method for firmness prediction.
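The smoothing → dimensionality reduction → ridge pipeline described above can be sketched with standard tools. This is a minimal illustration, assuming scipy and scikit-learn and using synthetic spectra in place of the paper's hyperspectral measurements; the adaptive-window (AWL) variant is only indicated in a comment, since its window-selection rule is specific to the paper.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for hyperspectral reflectance: 200 samples x 256 bands,
# with "firmness" driven by a few spectral features plus noise.
n_samples, n_bands = 200, 256
spectra = rng.normal(size=(n_samples, n_bands)).cumsum(axis=1)  # smooth-ish curves
firmness = spectra[:, 50] - 0.5 * spectra[:, 180] + rng.normal(scale=0.5, size=n_samples)

# Fixed-window SG smoothing; the AWL-SG variant would instead vary
# window_length per band according to the local rate of spectral change.
smoothed = savgol_filter(spectra, window_length=11, polyorder=2, axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    smoothed, firmness, test_size=0.3, random_state=0)

# PCA for dimensionality reduction, then Ridge regression on the scores.
pca = PCA(n_components=10).fit(X_train)
model = Ridge(alpha=1.0).fit(pca.transform(X_train), y_train)
r2 = model.score(pca.transform(X_test), y_test)  # test-set R2
```

The component count, ridge penalty, and SG window here are illustrative defaults, not the paper's tuned values.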
Interpolators (estimators that achieve zero training error) have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. In this paper, we study minimum ℓ2-norm ("ridgeless") interpolation least squares regression, focusing on the high-dimensional regime in which the number of unknown parameters p is of the same order as the number of samples n. We consider two different models for the feature distribution: a linear model, where the feature vectors xi ∈ R^p are obtained by applying a linear transform to a vector of i.i.d. entries, xi = Σ^{1/2} zi (with zi ∈ R^p); and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network, xi = φ(W zi) (with zi ∈ R^d, W ∈ R^{p×d} a matrix of i.i.d. entries, and φ an activation function acting componentwise on W zi). We recover, in a precise quantitative way, several phenomena that have been observed in large-scale neural networks and kernel machines, including the "double descent" behavior of the prediction risk and the potential benefits of overparametrization.
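The minimum ℓ2-norm interpolant discussed above can be computed directly with the pseudoinverse; a small numerical sketch (synthetic Gaussian features, assuming numpy) confirms that it interpolates the training data exactly in the overparametrized regime p > n and agrees with the small-penalty limit of ridge regression.

```python
import numpy as np

rng = np.random.default_rng(1)

# Overparametrized regime: more features p than samples n.
n, p = 50, 200
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# Minimum l2-norm ("ridgeless") interpolant: beta = X^+ y, the limit of the
# ridge solution (X'X + lam I)^{-1} X' y as lam -> 0+.
beta = np.linalg.pinv(X) @ y

# Zero training error, up to numerical precision.
train_err = np.max(np.abs(X @ beta - y))

# The dual ridge form with a tiny penalty converges to the same solution.
lam = 1e-8
beta_ridge = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)
gap = np.max(np.abs(beta - beta_ridge))
```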
High-Speed Tracking with Kernelized Correlation Filters. Henriques, Joao F.; Caseiro, Rui; Martins, Pedro ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 37, Issue 3, 1 March 2015.
Journal Article · Peer reviewed · Open access
The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies: any overlapping pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the discrete Fourier transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new kernelized correlation filter (KCF) that, unlike other kernel algorithms, has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call the dual correlation filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50-video benchmark, despite running at hundreds of frames per second and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source.
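The circulant/DFT diagonalization at the heart of this approach is easy to verify numerically for the linear (correlation filter) case: ridge regression over all cyclic shifts of a base sample solves element-wise in the Fourier domain. A sketch assuming numpy and scipy, with the shift convention fixed by `scipy.linalg.circulant` (the KCF paper writes the same formula with a conjugate under its own shift convention):

```python
import numpy as np
from scipy.linalg import circulant

rng = np.random.default_rng(2)
n = 64
x = rng.normal(size=n)   # base sample (a 1-D stand-in for an image patch)
y = rng.normal(size=n)   # one regression target per cyclic shift
lam = 0.1

# Data matrix whose rows are all cyclic shifts of x: scipy's circulant(x)
# stores the shifts as columns, so the row-shifted matrix is its transpose.
X = circulant(x).T

# Direct ridge regression: w = (X'X + lam I)^{-1} X' y  -- O(n^3).
w_direct = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Fourier-domain solution: circulant matrices are diagonalized by the DFT,
# so the same system solves element-wise in O(n log n).
xf, yf = np.fft.fft(x), np.fft.fft(y)
w_fft = np.real(np.fft.ifft(xf * yf / (np.abs(xf) ** 2 + lam)))

err = np.max(np.abs(w_direct - w_fft))  # the two solutions coincide
```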
Currently, studies on vegetation are limited to absolute temperature changes, with insufficient attention directed to the intricate and complex connections between relative temperature (Tr) changes and vegetation productivity. This study analyzed the impact of Tr change on vegetation growth by estimating the effects of temperature change, CO2 levels, and precipitation on the vegetation index. The analysis used changes in temperature ordination as a sign of Tr change and employed ridge regression analysis, trend analysis, correlation analysis, and contribution methods. The results indicated that the mean trend of Tr change in China was negative, suggesting that Tr decreases outpaced Tr increases and that most regions in China have therefore become relatively colder. Regions experiencing a decrease in Tr were more favorable to vegetation growth due to stable temperatures, while regions with increasing Tr faced intensified water stress and inhibitory effects on vegetation, except in cold regions with sufficient precipitation. Overall, Tr in China had a beneficial impact on the vegetation index, with a lesser effect than CO2 and precipitation but a greater effect than temperature, highlighting the significance of Tr in promoting vegetation growth. This study expanded the understanding of the impact of global warming on vegetation by incorporating the novel idea of Tr change and quantifying its consequences for vegetation.
It is often said that testing for conditional independence, that is, testing whether two random vectors X and Y are independent given Z, is a hard statistical problem if Z is a continuous random variable (or vector). In this paper, we prove that conditional independence is indeed a particularly difficult hypothesis to test for. Valid statistical tests are required to have a size that is smaller than a pre-defined significance level, and different tests usually have power against a different class of alternatives. We prove that a valid test for conditional independence does not have power against any alternative.
Given the nonexistence of a uniformly valid conditional independence test, we argue that tests must be designed so their suitability for a particular problem may be judged easily. To address this need, we propose, in the case where X and Y are univariate, to nonlinearly regress X on Z and Y on Z, and then compute a test statistic based on the sample covariance between the residuals, which we call the generalised covariance measure (GCM). We prove that the validity of this form of test relies almost entirely on the weak requirement that the regression procedures are able to estimate the conditional means of X given Z and of Y given Z at a slow rate. We extend the methodology to handle settings where X and Y may be multivariate or even high dimensional. While our general procedure can be tailored to the setting at hand by combining it with any regression technique, we develop the theoretical guarantees for kernel ridge regression. A simulation study shows that the test based on the GCM is competitive with state-of-the-art conditional independence tests. Code is available as the R package GeneralisedCovarianceMeasure on CRAN.
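A minimal sketch of the GCM idea for univariate X and Y, assuming scikit-learn's `KernelRidge` for the two regressions and synthetic data generated under conditional independence; the kernel parameters are illustrative, not tuned, and the reference implementation is the R package named above.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(3)
n = 500

# Null setting: X and Y are conditionally independent given Z.
Z = rng.normal(size=(n, 1))
X = np.sin(Z[:, 0]) + 0.3 * rng.normal(size=n)
Y = Z[:, 0] ** 2 + 0.3 * rng.normal(size=n)

# Regress X on Z and Y on Z with kernel ridge regression, take residuals.
rx = X - KernelRidge(kernel="rbf", alpha=0.1, gamma=1.0).fit(Z, X).predict(Z)
ry = Y - KernelRidge(kernel="rbf", alpha=0.1, gamma=1.0).fit(Z, Y).predict(Z)

# Generalised covariance measure: normalised sample covariance of the
# residuals, asymptotically standard normal under conditional independence.
prod = rx * ry
T = np.sqrt(n) * prod.mean() / prod.std()

reject = abs(T) > 1.96  # two-sided test at the 5% level
```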
This paper describes the use of Kernel Ridge Regression (KRR) and the Kernel Ridge Regression Confidence Machine (KRRCM) for black box identification of a surface marine vehicle. Data for training and testing were obtained from several manoeuvres typically used for marine system identification: a 20/20 degrees Zig-Zag, a 10/10 degrees Zig-Zag, and different evolution circles were employed for the computation and validation of the model. Results show that the application of conformal prediction provides an accurate model that closely reproduces the actual behaviour of the ship, together with confidence margins within which the model response is guaranteed to lie, making it a suitable tool for system identification.
•Black box identification based on conformal predictors is used for marine vehicles.
•Classical manoeuvres for marine vehicle identification are used to collect data.
•A continuous-time model is trained and tested using data from real experiments.
•Modelling with Kernel Ridge Regression and the Kernel Ridge Regression Confidence Machine.
•A confidence margin is proposed within which the real behaviour of the vehicle should lie.
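The confidence-margin idea can be illustrated with generic split conformal prediction around a kernel ridge model. This is a hedged sketch on synthetic manoeuvre-like data, assuming scikit-learn; it is not the paper's exact KRRCM construction, and the input/response names (rudder angle, yaw rate) are illustrative.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(4)

# Synthetic stand-in for manoeuvre data: input u (e.g. rudder angle, deg)
# mapped to a saturating response r (e.g. yaw rate) plus measurement noise.
n = 400
u = rng.uniform(-20, 20, size=(n, 1))
r = np.tanh(0.1 * u[:, 0]) + 0.05 * rng.normal(size=n)

# Split into a proper training set and a calibration set.
u_tr, r_tr = u[:200], r[:200]
u_cal, r_cal = u[200:], r[200:]

model = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.05).fit(u_tr, r_tr)

# Split conformal prediction: the margin is the (1 - eps) quantile of the
# absolute calibration residuals, giving ~95% coverage for eps = 0.05.
resid = np.abs(r_cal - model.predict(u_cal))
margin = np.quantile(resid, 0.95)

# Predictive interval for a new input.
u_new = np.array([[5.0]])
pred = model.predict(u_new)[0]
interval = (pred - margin, pred + margin)
```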
The increase of the worldwide installed photovoltaic (PV) capacity and the intermittent nature of the solar resource highlight the importance of power forecasting for the grid integration of the technology. This study compares 24 machine learning models for deterministic day-ahead power forecasting based on numerical weather predictions (NWP), tested on two-year-long 15-min resolution datasets of 16 PV plants in Hungary. The effects of predictor selection and the benefits of hyperparameter tuning are also evaluated. The results show that the two most accurate models are kernel ridge regression and the multilayer perceptron, with an up to 44.6% forecast skill score over persistence. Supplementing the basic NWP data with Sun position angles and statistically processed irradiance values as inputs of the learning models results in a 13.1% decrease of the root mean square error (RMSE), which underlines the importance of predictor selection. Hyperparameter tuning is essential to exploit the full potential of the models, especially for the less robust models, which are prone to under- or overfitting without proper tuning. The overall best forecasts have a 13.9% lower RMSE compared to the baseline scenario of using linear regression. Moreover, power forecasts based on only daily average irradiance forecasts and the Sun position angles have only a 1.5% higher RMSE than the best scenario, which demonstrates the effectiveness of machine learning even for limited data availability. The results of this paper can support both researchers and practitioners in constructing the best data-driven techniques for NWP-based PV power forecasting.
•24 machine learning models tested for day-ahead photovoltaic power forecasting.
•Kernel ridge regression and the multilayer perceptron are the overall most accurate models.
•Predictor selection is even more important than model selection.
•Hyperparameter optimization is essential for the highest accuracy.
•Up to 13.9% RMSE improvement over a baseline linear regression model.
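The forecast skill score quoted above, skill = 1 − RMSE_model / RMSE_reference, is straightforward to compute. A sketch on synthetic data, assuming scikit-learn, with a linear baseline standing in for the persistence reference used in the paper; the predictors and response shape are illustrative, not the study's dataset.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)

# Synthetic stand-in for NWP predictors (e.g. irradiance, temperature,
# Sun elevation) and PV power with a mildly nonlinear response.
n = 1000
Xf = rng.uniform(0, 1, size=(n, 3))
power = Xf[:, 0] ** 1.5 * (1 - 0.2 * Xf[:, 1]) + 0.05 * rng.normal(size=n)

X_tr, X_te = Xf[:700], Xf[700:]
y_tr, y_te = power[:700], power[700:]

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))

lin = LinearRegression().fit(X_tr, y_tr)
krr = KernelRidge(kernel="rbf", alpha=1e-3, gamma=1.0).fit(X_tr, y_tr)

rmse_lin = rmse(y_te, lin.predict(X_te))
rmse_krr = rmse(y_te, krr.predict(X_te))

# Skill score of KRR relative to the reference model (here the linear
# baseline; the paper's 44.6% figure uses persistence as the reference).
skill = 1 - rmse_krr / rmse_lin
```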
Although the past twenty years have witnessed China's remarkable economic development, the cost in terms of greenhouse gas emissions and a deteriorating environment has been enormous. Numerous studies have revealed the influence of household factors on household carbon dioxide emissions (HCEs) and called for a reduction of HCEs to mitigate climate change, but few have focused on assessing the most significant household driving factors of HCEs. Using statistical data between 2005 and 2019 in Jiangsu, China, this study developed an extended stochastic impact by regression on population, affluence, and technology (STIRPAT) model to assess the most significant driving factors of HCEs. The results show that the most significant driving factors are household size, total population, unemployment, and urbanisation rate. The study found that HCEs are positively impacted by household size while negatively impacted by the unemployment rate. Based on the study's findings, the following suggestions are proposed to lower HCEs: (i) establish an optimal consumption concept to guide residents towards consuming reasonably; (ii) cultivate a low-carbon concept among residents and promote low-carbon emissions living; and (iii) pay close attention to population structure factors and formulate effective measures accordingly. The study provides insightful information on the key driving factors of HCEs, which can facilitate achieving carbon emissions neutrality.
•H1: The driving factors of household CO2 emissions (HCEs) are identified.
•H2: An extended STIRPAT model is developed for assessing the driving factors.
•H3: The household factors are ranked based on the ridge regression analysis.
•H4: Household size is the most significant positive factor of HCEs.
•H5: Unemployment has a negative impact on HCEs.
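Ridge regression is a natural estimator for the STIRPAT form ln I = a + b·ln P + c·ln A + d·ln T + e, because the log-transformed drivers are typically strongly collinear and ordinary least squares coefficients become unstable. A sketch with synthetic collinear drivers, assuming scikit-learn; the driver names and coefficient values are illustrative, not the study's data.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)

# Synthetic stand-in for 15 annual observations of highly collinear,
# log-transformed drivers (e.g. population, affluence, household size).
n = 15
t = np.linspace(0, 1, n)
drivers = np.column_stack([
    t + 0.01 * rng.normal(size=n),      # ln population (trending)
    2 * t + 0.01 * rng.normal(size=n),  # ln affluence (collinear with above)
    -t + 0.01 * rng.normal(size=n),     # ln household size
])
ln_emissions = 1.0 * drivers[:, 0] + 0.5 * drivers[:, 1] + 0.05 * rng.normal(size=n)

# Standardise, then fit ridge: the penalty shrinks the otherwise unstable
# coefficients of collinear drivers to stable, comparable values.
Xs = StandardScaler().fit_transform(drivers)
coefs = Ridge(alpha=1.0).fit(Xs, ln_emissions).coef_

# Rank drivers by the magnitude of their standardised ridge coefficients,
# mirroring the ranking step described in the highlights.
ranking = np.argsort(-np.abs(coefs))
```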