This work employed NIR spectroscopy and PLS algorithms to identify and quantify the adulteration of goat milk with cow milk, and to determine the fat and protein contents of the samples. Since cow milk can pose a health risk to allergic consumers regardless of its amount, it is relevant that PLS-DA identified cow milk additions in goat milk as low as 1.0154 g/100 g, as well as the non-adulterated goat and cow milk samples, achieving 100% correct classification. For quantification, the Successive Projections Algorithm for interval selection in PLS (iSPA-PLS) provided the best results for both adulteration and fat contents, while PLS gave better results for protein. Despite the great similarity of the two natural dairy matrices and their intrinsic variability, the predictions showed high correlation coefficients, low RMSEP and REP values, and RPD values higher than 3. The proposed methodology therefore proved to be a useful, fast and non-destructive tool for screening the quality of goat milk in terms of its adulteration with cow milk and for quantifying its fat and protein contents.
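The figures of merit named in this abstract can be computed from reference and predicted values as sketched below. The abstract does not state its formulas, so the definitions used here are assumed common chemometric conventions: RMSEP as the root mean squared prediction error, REP as RMSEP relative to the mean reference value (in percent), and RPD as the sample standard deviation of the reference values divided by RMSEP. The function name and the example values are hypothetical.

```python
import math
import statistics


def prediction_metrics(y_ref, y_pred):
    """Return RMSEP, REP (%) and RPD for a prediction set.

    Assumed definitions (not taken from the paper):
      RMSEP = sqrt(mean((y_ref - y_pred)^2))
      REP   = 100 * RMSEP / mean(y_ref)
      RPD   = sample standard deviation of y_ref / RMSEP
    """
    if len(y_ref) != len(y_pred) or len(y_ref) < 2:
        raise ValueError("need two equally long sequences with >= 2 values")
    squared_errors = [(r - p) ** 2 for r, p in zip(y_ref, y_pred)]
    rmsep = math.sqrt(sum(squared_errors) / len(squared_errors))
    rep = 100.0 * rmsep / statistics.mean(y_ref)
    rpd = statistics.stdev(y_ref) / rmsep
    return {"RMSEP": rmsep, "REP": rep, "RPD": rpd}


# Hypothetical reference and predicted values, for illustration only.
y_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.1, 3.9, 5.1]
metrics = prediction_metrics(y_ref, y_pred)
print(metrics)
```

Some authors compute RPD with the standard error of prediction (SEP) instead of RMSEP; the two agree closely when the prediction bias is small.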
•Identification and quantification of goat milk adulteration by adding cow milk.
•Fat and protein content quantifications in goat and cow milk and their mixtures.
•NIR spectroscopy coupled with different preprocessing techniques and PLS algorithms.
•PLS-DA classified correctly non-adulterated goat and cow milk, and their mixtures.
•Good quantifications for adulteration and fat by iSPA-PLS and for protein by PLS.
•FT-NIR spectroscopy was used to rapidly identify the geographical origin of rice.
•The classification models developed using LDA, PLS-DA, C-SVC, PC-NN and KNN showed high classification performance.
•Results did not show overfitting during k-fold cross-validation and optimal hyperparameter fine-tuning by GridSearchCV.
•The extremely randomized trees (Extra trees) algorithm was recommended for use because of the smaller number of featured wavelengths.
Mislabelling Khao Dawk Mali 105 rice grown outside the Thung Kula Rong Hai region is extremely profitable and difficult to detect; to prevent retail fraud, which adversely affects both the food industry and consumers, it is vital to identify geographical origin. Near-infrared spectroscopy can be used to detect the specific content of organic moieties in agricultural and food products. The present study combined FT-NIR spectroscopy with chemometrics to identify the geographical origin of Khao Dawk Mali 105 rice. Rice samples were collected from two regions, the north and the northeast of Thailand. NIR spectra were collected in the range of 12,500–4,000 cm−1 (800–2,500 nm). Machine learning algorithms, namely linear discriminant analysis (LDA), partial least squares discriminant analysis (PLS-DA), C-support vector classification (C-SVC), backpropagation neural networks (BPNN), hybrid principal component analysis-neural network (PC-NN) and K-nearest neighbors (KNN), were employed to classify the NIR data of the rice samples using either the full wavelength range or wavelengths selected by the Extremely Randomized Trees (Extra trees) algorithm. The findings indicate that the geographical origin of rice can be identified quickly, cheaply, and reliably by combining NIRS with machine learning. All models built with full and selected wavelengths showed accuracies between 65 and 100 % for identifying the geographical region of the rice. The results indicate that NIR spectroscopy can be used for the quick and non-destructive identification of the geographical origin of Khao Dawk Mali 105 rice.
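The two spectral ranges quoted in this abstract describe the same window, since a wavelength in nm equals 10^7 divided by the wavenumber in cm−1. A minimal sketch of the conversion follows; the function and variable names are hypothetical.

```python
def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm (1 cm = 1e7 nm)."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    return 1e7 / wavenumber_cm1


# Range limits as quoted in the abstract, in cm^-1.
spectral_range_cm1 = (12500.0, 4000.0)
spectral_range_nm = tuple(
    wavenumber_to_wavelength_nm(v) for v in spectral_range_cm1
)
print(spectral_range_nm)  # (800.0, 2500.0)
```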
•A new ensemble algorithm is designed for cancer diagnosis.
•Three algorithms of virtual sample generation are compared.
•Such a work is a good reference for developing new tools.
Cancer diagnosis plays a key role in facilitating treatment and improving the survival rates of patients. The combination of near-infrared (NIR) spectroscopy with data-driven algorithms offers a rapid and cost-effective approach to this task. Because of practical limitations, the number of tumor samples is usually small, and the resulting datasets exhibit class imbalance, which seriously degrades the performance of diagnostic models. To deal with class imbalance and improve sensitivity, this work investigates the feasibility of combining NIR spectroscopy with virtual sample generation (VSG) and an ensemble strategy for developing diagnostic models. Based on a preliminary experiment, several learning algorithms such as discriminant analysis (DA) and partial least squares discriminant analysis (PLS-DA) are selected for constructing prediction models. Three VSG algorithms are used in the experiments: synthetic minority oversampling technique (SMOTE), Borderline-SMOTE and adaptive synthetic sampling (ADASYN). A fixed sample subset composed of 27 cancer samples and 54 normal samples is held out as the test set. Three training sets containing 5, 10, or 25 minority-class samples and 54 majority-class samples are used for model development. The experimental results indicate that, with the PLS-DA algorithm, all VSG approaches can significantly improve the sensitivity of cancer diagnosis for all training sets regardless of the number of minority samples, with ADASYN performing best. This suggests that the integration of NIR, PLS-DA, and ADASYN is a promising tool package for developing diagnostic methods.
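To illustrate the idea behind virtual sample generation, the sketch below implements the basic SMOTE interpolation step in plain Python: each synthetic sample lies on the line segment between a minority sample and one of its nearest minority neighbours. This is a simplified illustration, not the implementation used in the study; the function name, parameters, and toy data are hypothetical, and Borderline-SMOTE and ADASYN (which additionally weight samples near the class boundary) are not covered.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy "spectra" with three variables each, for illustration only.
minority_spectra = [
    [0.10, 0.20, 0.30],
    [0.12, 0.22, 0.28],
    [0.09, 0.25, 0.33],
    [0.11, 0.19, 0.31],
    [0.13, 0.21, 0.29],
]
virtual_samples = smote(minority_spectra, n_new=49)
print(len(virtual_samples))  # 49
```

In this toy call, 49 virtual samples bring 5 minority samples up to 54, mirroring the smallest training set described above. Only the training set would be oversampled; the held-out test set stays untouched.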
Soil visible and near-infrared (Vis-NIR) spectroscopy has become an attractive technique for predicting soil properties because it is fast, cost-effective, and non-destructive. This study presents an application of diffuse reflectance spectroscopy (DRS) and chemometric techniques for evaluating concentrations of heavy metals in earth-cumulic-orthic-anthrosols soils. Forty-four soil samples from 0–30 cm depth were collected from three representative agricultural areas (Fufeng, Yangling, and Wugong transects with 16, 10, and 18 samples, respectively) and analyzed for Cr, Mn, Ni, Cu, Zn, As, Cd, Hg, and Pb by Vis-NIR spectroscopy (350–2500 nm). Average levels of Cr, Mn, Ni, Cu, Zn, As, Cd, Hg, and Pb were 17.95, 274, 12.77, 7.29, 15.81, 7.51, 0.40, 12.58, and 21.05 mg kg−1, respectively. Twenty-four preprocessing methods were used to extract sensitive bands. Partial least squares regression (PLSR) was used to obtain effective bands and predict soil heavy metal concentrations. The accuracy of the predictive models was assessed in terms of the coefficient of determination (R2), the root mean squared error (RMSE), the standard error (SE) and the ratio of performance to deviation (RPD). The results revealed excellent predictions for Hg (Rv2 = 0.99, RPD = 8.59, RMSEP = 0.12, SEP = 0.13), Cr (Rv2 = 0.97, RPD = 5.96, RMSEP = 0.10, SEP = 0.10), Ni (Rv2 = 0.93, RPD = 3.74, RMSEP = 0.13, SEP = 0.13), Pb (Rv2 = 0.97, RPD = 5.57, RMSEP = 0.10, SEP = 0.01), and Cu (Rv2 = 0.92, RPD = 3.38, RMSEP = 0.08, SEP = 0.08). Models for As (Rv2 = 0.87, RPD = 2.58), Mn (Rv2 = 0.80, RPD = 2.09), and Cd (RPD = 2.77) had Rv2 < 0.9 and RPD < 3.0 and were therefore not considered excellent predictions. For Zn, although Rv2 = 0.91 and RPD = 3.13, the offset deviated too much for the model to be considered excellent.
Therefore, a combination of spectroscopic and chemometric techniques can be applied as a practical, rapid, low-cost and quantitative approach for evaluating soil physical and chemical properties in Shaanxi, China.
•Vis-NIR spectroscopy was combined with chemometric techniques for the determination of soil structural quality.
•Application of spectroscopy for evaluating concentrations of heavy metals in earth-cumulic-orthic-anthrosols soils.
•Twenty-four preprocessing methods were tested to improve predictions.
•PLSR models were built for the quantification of soil heavy metal contents.
Moisture and vanillin content significantly influence the quality of vanilla. The conventional chemical methods currently used to assess these parameters are time-consuming, expensive, and environmentally unfriendly because of the chemical solutions they use, and they involve complex sample preparation. Portable Near-Infrared (NIR) spectroscopy emerges as a promising alternative, characterized by smaller dimensions and lower costs. This study investigates the performance of two portable NIR spectrometers with distinct wavelength ranges (740–1070 nm and 1350–2550 nm) in conjunction with Random Forest (RF) and Partial Least Squares (PLS) regression for predicting moisture and vanillin content. The preprocessing techniques compared were min-max normalization, 1st derivative, standard normal variate (SNV), multiplicative scatter correction (MSC), 1st derivative + SNV, and 1st derivative + MSC. In the wavelength range of 1350–2550 nm, RF coupled with the 1st derivative produced the best moisture content prediction model, with an R2 of 0.971, and RF paired with 1st derivative + SNV yielded the best vanillin content prediction, with an R2 of 0.983. This work highlights that the integration of portable NIR and RF allows for rapid and non-destructive detection of moisture and vanillin content. This methodology provides a novel regression method for predicting vanilla qualities.
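Three of the preprocessing steps listed in this abstract can be sketched for a single spectrum as follows. This is an illustration under stated assumptions rather than the study's pipeline: the derivative is a plain first difference (Savitzky-Golay derivatives are a common alternative), "1st derivative + SNV" is taken to mean the derivative followed by SNV, and MSC is omitted because it needs a reference spectrum from the whole data set. Function names and the toy spectrum are hypothetical.

```python
import statistics


def min_max(spectrum):
    """Rescale a spectrum to the range [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    return [(v - lo) / (hi - lo) for v in spectrum]


def snv(spectrum):
    """Standard normal variate: centre and scale each spectrum by itself."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(v - mean) / sd for v in spectrum]


def first_derivative(spectrum):
    """First difference between neighbouring points (one point shorter)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Toy absorbance values, for illustration only.
spectrum = [0.2, 0.4, 0.8, 0.6]
derivative_then_snv = snv(first_derivative(spectrum))
print(derivative_then_snv)
```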
•Portable NIR spectrometers were used for moisture and vanillin content assessment in vanilla.
•Comparison between two portable spectrometers at different wavelength ranges.
•Predictive performance of Partial Least Squares and Random Forest was evaluated.
•Random Forest outperformed conventional approaches in outcome reliability.
•Resins from Boswellia exhibit species-specific variations in AKBA and KBA content.
•NIR spectra of Boswellia resin confirm species-dependent nature of resin samples.
•Principal component analysis shows clear separation of Boswellia species.
•Prediction of boswellic acid levels in solid resin was achieved with NIRS and PLSR.
The bioactive compounds acetyl-11-keto-β-boswellic acid (AKBA) and 11-keto-β-boswellic acid (KBA), found in the resin of the Boswellia tree, exhibit anti-inflammatory properties, making Boswellia resin an intriguing natural medicinal product. However, the content of boswellic acids varies across Boswellia species, and both proper knowledge of this species dependence and alternatives to the resource- and time-intensive HPLC analysis are lacking. Here we present a comprehensive investigation into the boswellic acid content of seven Boswellia species from ten countries and introduce a novel, non-destructive near-infrared spectroscopy method for predicting boswellic acid concentrations in solid resin samples. The HPLC-UV reference analysis revealed AKBA concentrations of up to 7.27 % (w/w) and KBA concentrations of up to 1.28 % (w/w). Principal Component Analysis of the HPLC and NIR spectroscopy data revealed species-specific variations, allowing differentiation based on boswellic acid content, characteristic chromatograms and NIR spectra. Using the HPLC-UV quantification as reference, we developed a Partial Least Squares regression model based on NIR spectra of the resin samples. This model demonstrated highly satisfactory predictive capabilities for AKBA content, achieving a root mean square error of prediction of 0.74 % (w/w) and an R2val of 0.79 in independent test set validation. Although the model was less effective for predicting KBA content, it still offered valuable estimates. The spectroscopic method introduced in this study provides a cost-effective and solvent-free approach for predicting boswellic acid content, with potential for application in non-laboratory settings through the use of miniaturized NIR spectrometers. Consequently, this method aligns well with the principles of green chemistry and addresses the growing demand for alternative analytical techniques.
The quantification of pollutants such as pharmaceuticals in wastewater is an issue of special concern. The methods typically used to quantify these products are time- and reagent-consuming. This paper describes the development and validation of a Fourier transform near-infrared (FT-NIR) spectroscopy methodology for the quantification of pharmaceuticals in wastewaters. For this purpose, 276 samples obtained from an activated sludge wastewater treatment process were analysed in the range of 200 cm−1 to 14,000 cm−1 and further treated by chemometric techniques to develop and validate the quantification models. The results were found adequate for the prediction of ibuprofen, sulfamethoxazole, 17β-estradiol and carbamazepine, with coefficients of determination (R2) around 0.95 and residual prediction deviation (RPD) values above four for the overall (training and validation) data points. These results are very promising and confirm that this technology can be seen as an alternative for the quantification of pharmaceuticals in wastewater.
•NIR was evaluated as a rapid method to measure pharmaceuticals in aqueous solutions.
•A new chemometric approach was used to calibrate the data.
•The models were able to predict IBU, CRB, E2, EE2 and SMX with high R2 coefficients.
•Results showed that this technique can be used to quantify pharmaceuticals in water.
The emission spectrum of micron-scale uranium particulates at high temperatures in the ultraviolet, visible, and near-infrared spectral regions is investigated using a heterogeneous shock tube. Temperatures from 3000 to 9000 K are characterized in an inert argon environment and with incremental amounts of added oxygen. Atomic line spectra do not emerge above the continuum emission spectrum until between 4500 and 5000 K in pure argon, and between 6100 and 6600 K in 1% oxygen. For 5% oxygen, however, the threshold for atomic emission drops below 3800 K. Uranium monoxide molecular emission in the strongest visible band at 595.4 nm is not observed at any condition. Uncertainties in particle temperature determination in high-temperature shock tube environments are discussed, along with limitations to such measurements arising from experimental factors such as the powder loading method and the expected detection limits of uranium species under relevant conditions.
•Six preprocessing methods were used to preprocess the spectrum.
•Different methods were used to select the feature bands of SOM spectrum.
•Optimal band combination algorithm has great potential in selecting feature bands.
•The feature band selection method improved the prediction accuracy of SOM content.
Soil organic matter (SOM) is a key index for evaluating soil fertility and plays a vital role in the terrestrial carbon cycle. Visible and near-infrared (Vis-NIR) spectroscopy is an effective method for determining soil properties and is often used to predict SOM content. However, effective prediction of SOM content by Vis-NIR spectroscopy depends on selecting appropriate preprocessing methods and effective data mining techniques. Therefore, in this study, six commonly used spectral preprocessing methods and several characteristic band selection methods were applied to the spectra to predict SOM content. This study aims to determine a stable spectral preprocessing method and to explore the predictive performance of different characteristic band selection methods. The results showed that: (i) The first derivative (FD) is the most stable spectral preprocessing method and can effectively enhance the spectral characteristic information and the predictive performance of the model. (ii) Predictions of SOM content based on characteristic band selection methods are generally better than those based on full-spectrum data. (iii) FD-preprocessed spectra combined with the successive projections algorithm (SPA) give the most precise partial least squares regression model of SOM content. (iv) Although the model based on the optimal band combination algorithm performs slightly worse than that based on SPA, it shows stable prediction performance and provides a feasible method for SOM content prediction. In summary, characteristic band selection combined with FD can significantly improve the prediction accuracy of SOM content.
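The core projection step of the successive projections algorithm can be sketched as follows: starting from one band, each round projects the remaining bands onto the subspace orthogonal to the last selected band and picks the band with the largest remaining norm, so that the chosen bands carry little collinear information. This is a simplified illustration and not the procedure used in the study. In practice the spectra are usually mean-centred first, and the starting band and the number of bands are chosen by validation; here both are fixed, hypothetical parameters, and the data matrix is a toy example.

```python
def dot(u, v):
    """Dot product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def spa_select(X, n_select, start=0):
    """Select n_select column indices of X (rows = samples) by successive projections."""
    n_cols = len(X[0])
    if not 0 < n_select <= n_cols:
        raise ValueError("n_select must be between 1 and the number of bands")
    # Work on copies of the columns so that X itself is left unchanged.
    columns = [[row[j] for row in X] for j in range(n_cols)]
    selected = [start]
    while len(selected) < n_select:
        ref = columns[selected[-1]]
        ref_norm2 = dot(ref, ref)
        best_j, best_norm2 = None, -1.0
        for j in range(n_cols):
            if j in selected:
                continue
            if ref_norm2 > 0.0:
                # Remove the component of band j that lies along the last selected band.
                coef = dot(columns[j], ref) / ref_norm2
                columns[j] = [a - coef * b for a, b in zip(columns[j], ref)]
            norm2 = dot(columns[j], columns[j])
            if norm2 > best_norm2:
                best_j, best_norm2 = j, norm2
        selected.append(best_j)
    return selected


# Toy matrix: 3 samples x 4 bands; band 1 is collinear with band 0.
X = [
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.0, 0.1],
]
selected_bands = spa_select(X, n_select=3, start=0)
print(selected_bands)  # [0, 2, 3]
```

In the toy matrix, band 1 is never selected because it carries no information beyond band 0, which is the behaviour that makes this kind of selection useful ahead of a regression model.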