The two-parameter Weibull function has been widely applied to evaluate wind energy potential. In this paper, six numerical methods commonly used for estimating the Weibull parameters are reviewed: the moment, empirical, graphical, maximum likelihood, modified maximum likelihood and energy pattern factor methods. Their performance is compared through Monte Carlo simulation and analysis of actual wind speed data according to criteria such as the Kolmogorov–Smirnov test, parameter error, root mean square error, and wind energy error. The results show that, in the simulation tests on random variables, the graphical method performs worst in estimating the Weibull parameters, followed by the empirical and energy pattern factor methods, when the sample size is small. All six methods improve as the sample size grows, and the graphical method then even outperforms the empirical and energy pattern factor methods. The maximum likelihood, modified maximum likelihood and moment methods perform consistently well throughout the simulation tests. From the analysis of actual data, it is found that if the wind speed distribution matches the Weibull function well, all six methods are applicable; if not, the maximum likelihood method performs best, followed by the modified maximum likelihood and moment methods, based on double checks of potential energy and the cumulative distribution function.
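Two of the reviewed estimators can be sketched briefly. The snippet below fits a two-parameter Weibull by maximum likelihood and by the moment method; the shape k=2 and scale c=6 are assumed illustration values, not figures from the paper.

```python
import numpy as np
from scipy import stats, special
from scipy.optimize import brentq

rng = np.random.default_rng(0)
# Simulated wind speeds from a Weibull with shape k=2, scale c=6 (assumed values)
v = stats.weibull_min.rvs(2.0, scale=6.0, size=5000, random_state=rng)

# Maximum likelihood estimate; location fixed at 0 for the 2-parameter form
k_mle, _, c_mle = stats.weibull_min.fit(v, floc=0)

# Moment method: solve for k from the coefficient of variation, using
# mean = c*Gamma(1+1/k) and var = c^2*(Gamma(1+2/k) - Gamma(1+1/k)^2)
cv = v.std() / v.mean()
k_mom = brentq(
    lambda k: np.sqrt(special.gamma(1 + 2 / k) / special.gamma(1 + 1 / k) ** 2 - 1) - cv,
    0.5, 20.0)
c_mom = v.mean() / special.gamma(1 + 1 / k_mom)
```

With 5000 observations both estimators recover the true parameters closely, consistent with the paper's finding that they behave well at larger sample sizes.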
Wind energy, which is intermittent by nature, can have a significant impact on power grid security, power system operation, and market economics, especially in areas with a high level of wind power penetration. Wind speed forecasting has been a vital part of wind farm planning and of the operational planning of power grids aimed at reducing greenhouse gas emissions. Improving the accuracy of wind speed forecasting algorithms has significant technological and economic impacts on these activities, and considerable research effort has recently been directed at this aim. However, no single forecasting algorithm is best for every wind farm, because wind speed patterns can differ greatly between wind farms and are usually influenced by many location-specific factors that are difficult to control. In this paper, we propose a new hybrid wind speed forecasting method based on a back-propagation (BP) neural network and on eliminating seasonal effects from actual wind speed datasets through seasonal exponential adjustment. The method can forecast the daily average wind speed one year ahead with lower mean absolute errors than forecasts obtained without adjustment, as demonstrated by a case study on a wind speed dataset collected from the Minqin area in China from 2001 to 2006.
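The deseasonalization step can be illustrated with a simplified seasonal-index adjustment; this is a loose stand-in for the paper's seasonal exponential adjustment, applied here to synthetic daily wind speeds rather than the Minqin dataset.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic daily-average wind speed with an annual cycle (assumed data)
days = np.arange(3 * 365)
speed = 6 + 2 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 0.5, days.size)

# Seasonal index per day-of-year: ratio of that day's mean to the overall mean
doy = days % 365
index = np.array([speed[doy == d].mean() for d in range(365)]) / speed.mean()

# Dividing by the index removes the seasonal cycle before model fitting;
# a forecaster (e.g. a BP network) would be trained on this adjusted series
deseasonalized = speed / index[doy]
```

After adjustment the series varies far less around its mean, which is the property the hybrid method exploits before the neural network stage.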
Detrital zircons from Holocene beach sand and igneous zircons from the Cretaceous syenite forming Cape Sines (Western Iberian margin) were dated using laser ablation inductively coupled plasma mass spectrometry. The U–Pb ages obtained were used for comparison with previous radiometric data from Carboniferous greywacke, Pliocene–Pleistocene sand and Cretaceous syenite forming the sea cliff at Cape Sines and the contiguous coast. New U–Pb dating of igneous morphologically simple and complex zircons from the syenite of the Sines pluton suggests that the history of zircon crystallization was more extensive (ca 87 to 74 Ma) than indicated by previous geochronology studies (ca 76 to 74 Ma). The U–Pb ages obtained in Holocene sand span a wide interval, ranging from the Cretaceous to the Archean, with a predominance of Cretaceous (37%), Palaeozoic (35%) and Neoproterozoic (19%) detrital-zircon ages. The paucity of round to sub-rounded grains seems to indicate a short transportation history for most of the Cretaceous zircons (ca 95 to 73 Ma), which are more abundant in the beach sand sampled south of Cape Sines. Comparative analysis using the Kolmogorov–Smirnov statistical method, analysing sub-populations separately, suggests that the zircon populations of the Carboniferous and Cretaceous rocks forming the sea cliff were reproduced faithfully in Quaternary sand, indicating sediment recycling. The similarity of the pre-Cretaceous ages (>ca 280 Ma) of detrital zircons found in Holocene sand, as compared with Carboniferous greywacke and Pliocene–Pleistocene sand, supports the hypothesis that detritus was reworked into the beach from older sedimentary rocks exposed along the sea cliff.
The larger percentage of Cretaceous zircons (<ca 95 Ma) found in Holocene sand, as compared with Pliocene–Pleistocene sand (a secondary recycled source), suggests that the Sines pluton was one of the primary sources that became progressively more exposed to erosion during Quaternary uplift. This work highlights the application of the Kolmogorov–Smirnov method in comparing zircon age populations to identify provenance and sediment recycling in modern and ancient detrital sedimentary sequences.
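The provenance comparison rests on the two-sample Kolmogorov–Smirnov test. A minimal sketch, using hypothetical age populations rather than the paper's measured U–Pb data, shows how matching and non-matching sources separate:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical detrital-zircon age samples in Ma (illustration only)
cliff_ages = rng.normal(300, 20, 80)   # e.g. a Carboniferous source rock
beach_same = rng.normal(300, 20, 60)   # grains recycled into Holocene sand
beach_diff = rng.normal(85, 6, 60)     # Cretaceous pluton-derived grains

# Same source: small KS statistic; different source: statistic near 1
stat_same, p_same = stats.ks_2samp(cliff_ages, beach_same)
stat_diff, p_diff = stats.ks_2samp(cliff_ages, beach_diff)
```

Analysing sub-populations separately, as the paper does, amounts to running this comparison on each age interval of interest.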
Numerical data that are normally distributed can be analyzed with parametric tests, that is, tests based on the parameters that define a normal distribution curve. If the distribution is uncertain, the data can be plotted as a normal probability plot and visually inspected, or tested for normality using one of a number of goodness-of-fit tests, such as the Kolmogorov-Smirnov test. The widely used Student's t-test has three variants. The one-sample t-test is used to assess whether a sample mean (as an estimate of the population mean) differs significantly from a given population mean. The means of two independent samples may be compared for a statistically significant difference by the unpaired or independent samples t-test. If the data sets are related in some way, their means may be compared by the paired or dependent samples t-test. The t-test should not be used to compare the means of more than two groups. Although it is possible to compare groups in pairs when there are more than two groups, doing so inflates the probability of a Type I error. The one-way analysis of variance (ANOVA) is employed to compare the means of three or more independent data sets that are normally distributed. Multiple measurements from the same set of subjects cannot be treated as separate, unrelated data sets; comparison of means in such a situation requires repeated measures ANOVA. Note that while a multiple group comparison test such as ANOVA can point to a significant difference, it does not identify exactly between which two groups the difference lies. To do this, multiple group comparison needs to be followed up by an appropriate post hoc test, such as Tukey's honestly significant difference test following ANOVA. If the assumptions for parametric tests are not met, there are nonparametric alternatives for comparing data sets.
These include the Mann-Whitney U-test as the nonparametric counterpart of the unpaired Student's t-test, the Wilcoxon signed-rank test as the counterpart of the paired Student's t-test, the Kruskal-Wallis test as the nonparametric equivalent of ANOVA, and the Friedman test as the counterpart of repeated measures ANOVA.
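The pairing of parametric tests with their nonparametric counterparts maps directly onto standard library calls; the snippet below runs each pair side by side on small illustrative samples (the data are assumed, not from any study).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(0.0, 1, 30)
b = rng.normal(0.2, 1, 30)
c = rng.normal(0.4, 1, 30)

# Parametric test on the left, nonparametric counterpart on the right
t_stat, t_p = stats.ttest_ind(a, b)               # unpaired t-test
u_stat, u_p = stats.mannwhitneyu(a, b)            # Mann-Whitney U
w_stat, w_p = stats.wilcoxon(a, b)                # Wilcoxon signed-rank (paired)
f_stat, f_p = stats.f_oneway(a, b, c)             # one-way ANOVA
h_stat, h_p = stats.kruskal(a, b, c)              # Kruskal-Wallis
fr_stat, fr_p = stats.friedmanchisquare(a, b, c)  # Friedman (repeated measures)
```

A post hoc step such as Tukey's test (`scipy.stats.tukey_hsd` in recent SciPy versions) would follow a significant ANOVA result to locate the differing pair.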
Computed with the brute-force algorithm, the two-dimensional two-sample Kolmogorov–Smirnov test can be prohibitively expensive. Thus a fast algorithm for computing the two-sample Kolmogorov–Smirnov test statistic is proposed to alleviate this problem. The newly proposed algorithm is O(n) times more efficient than the brute-force algorithm, where n is the sum of the two sample sizes. The proposed algorithm is parallelizable and can be generalized to higher-dimensional spaces.
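For context, the brute-force baseline being accelerated can be sketched as follows. This is a quadratic-time version of the quadrant-based two-dimensional statistic (in the spirit of Fasano and Franceschini), not the paper's fast algorithm:

```python
import numpy as np

def ks2d_2samp_brute(x, y):
    """Brute-force 2-D two-sample KS statistic: maximize, over every data
    point and each of the four quadrants around it, the difference between
    the two samples' quadrant fractions. O(n^2) in the total sample size."""
    d = 0.0
    for p in np.vstack([x, y]):
        for sx, sy in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
            fx = np.mean((sx * (x[:, 0] - p[0]) > 0) & (sy * (x[:, 1] - p[1]) > 0))
            fy = np.mean((sx * (y[:, 0] - p[0]) > 0) & (sy * (y[:, 1] - p[1]) > 0))
            d = max(d, abs(fx - fy))
    return d

rng = np.random.default_rng(4)
same = ks2d_2samp_brute(rng.normal(0, 1, (100, 2)), rng.normal(0, 1, (100, 2)))
shifted = ks2d_2samp_brute(rng.normal(0, 1, (100, 2)), rng.normal(2, 1, (100, 2)))
```

The inner loop over all points is exactly the cost the proposed algorithm reduces by a factor of O(n).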
The two-sample Kolmogorov-Smirnov and Anderson-Darling tests assess the hypothesis that two given samples come from the same population. In this paper, the power of each test was measured using a variety of alternative distributions and varying sample sizes. Recommendations for the more powerful test under each common distribution are given, depending on whether the distribution has heavy or light tails.
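A power comparison of this kind is a straightforward Monte Carlo exercise: draw one sample from the null distribution and one from the alternative, and record each test's rejection rate. The alternative below (Student's t with 2 degrees of freedom, a heavy-tailed case) and all settings are illustrative assumptions, not the paper's design.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def power(pvalue_fn, alt_sampler, n=50, trials=100, alpha=0.05):
    """Monte Carlo rejection rate when the second sample follows the alternative."""
    hits = 0
    for _ in range(trials):
        a = rng.normal(0, 1, n)
        b = alt_sampler(n)
        hits += pvalue_fn(a, b) < alpha
    return hits / trials

def ks_p(a, b):
    return stats.ks_2samp(a, b).pvalue

def ad_p(a, b):
    # anderson_ksamp reports a (capped) approximate significance level
    return stats.anderson_ksamp([a, b]).significance_level

heavy = lambda n: rng.standard_t(2, n)  # heavy-tailed alternative
p_ks = power(ks_p, heavy)
p_ad = power(ad_p, heavy)
```

Sweeping `alt_sampler` over light- and heavy-tailed alternatives reproduces the kind of comparison the paper tabulates.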
We propose a sequential nonparametric test for detecting a change in distribution, based on windowed Kolmogorov-Smirnov statistics. The approach is simple, robust, highly computationally efficient, easy to calibrate, and requires no parametric assumptions about the underlying null and alternative distributions. We show that both the false-alarm rate and the power of our procedure are amenable to rigorous analysis, and that the method outperforms existing sequential testing procedures in practice. We then apply the method to the problem of detecting radiological anomalies, using data collected from measurements of the background gamma-radiation spectrum on a large university campus. In this context, the proposed method leads to substantial improvements in time-to-detection for the kind of radiological anomalies of interest in law-enforcement and border-security applications. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
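The core idea of a windowed KS detector can be sketched with a two-window comparison; this simplified version (fixed windows, a plain p-value threshold) only illustrates the mechanism and omits the paper's calibration and false-alarm analysis.

```python
import numpy as np
from scipy import stats

def detect_change(stream, window=100, alpha=1e-6):
    """Sliding two-window KS detector (simplified sketch): compare the
    window just before the current point with the most recent window and
    flag a change when the two-sample KS p-value drops below alpha."""
    for t in range(2 * window, len(stream) + 1):
        ref = stream[t - 2 * window: t - window]
        recent = stream[t - window: t]
        if stats.ks_2samp(ref, recent).pvalue < alpha:
            return t  # index at which the change is declared
    return None

rng = np.random.default_rng(6)
# Synthetic stream: mean shifts from 0 to 1.5 at index 500 (assumed data)
stream = np.concatenate([rng.normal(0.0, 1, 500), rng.normal(1.5, 1, 300)])
t_detect = detect_change(stream)
```

Recomputing the full KS statistic per step is what makes efficiency and calibration nontrivial in the sequential setting the paper addresses.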
In recent years, there has been an increasing amount of streaming information coming from time series. Learning from data appearing in real time is quite a challenge, due in part to the speed at which new data arrives. Hidden data changes that are not previously known to learning algorithms are referred to in the literature as data or concept drift. In classical machine learning, a classifier analyzes new data using past training instances of the data stream. However, the accuracy of the classifier deteriorates under data drift, which occurs in non-stationary data. In such situations, the classifier must detect a significant change in the data and adapt its predictions over time. The motivation of this paper is to present a method for drift detection that does not require knowledge of instance labels. Labels are sometimes unavailable or periodically missing, making it difficult to apply methods that require them.
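One common label-free strategy, offered here as a generic sketch rather than the paper's specific method, is to compare the feature distributions of incoming batches against a reference sample with per-feature KS tests:

```python
import numpy as np
from scipy import stats

def drift_score(reference, batch):
    """Label-free drift check: two-sample KS test per feature between a
    reference sample and a new batch; return the smallest p-value with a
    Bonferroni correction, so small scores signal drift."""
    k = reference.shape[1]
    pvals = [stats.ks_2samp(reference[:, j], batch[:, j]).pvalue
             for j in range(k)]
    return min(pvals) * k

rng = np.random.default_rng(7)
ref = rng.normal(0, 1, (500, 3))
no_drift = rng.normal(0, 1, (200, 3))
drifted = np.column_stack([rng.normal(0.0, 1, 200),
                           rng.normal(0.8, 1, 200),  # one feature shifts
                           rng.normal(0.0, 1, 200)])

score_drift = drift_score(ref, drifted)
score_null = drift_score(ref, no_drift)
```

Because no labels enter the computation, such a monitor keeps working when ground truth arrives late or not at all, which is exactly the setting the paper targets.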
With advances in cancer treatments and improved patient survival, more patients may go through multiple lines of treatment. It is of clinical importance to choose a sequence of effective treatments (e.g., lines of treatment) for individual patients with the goal of optimizing their long-term clinical outcome (e.g., survival). Several important issues arise in cancer studies. First, cancer clinical trials are usually conducted separately for each line of treatment, so for a treatment sequence we may have first-line and second-line treatment data from two different studies. Second, there is typically a treatment initiation period, varying from patient to patient, between progression of disease and the start of the second-line treatment, for administrative reasons. Additionally, the choice of the second-line treatment for patients with progression of disease may depend on their characteristics. We address all these issues and develop semiparametric methods under the potential outcome framework for estimating the overall survival probability of a treatment sequence and for comparing different treatment sequences. We establish the large-sample properties of the proposed inferential procedures. Simulation studies and an application to a colorectal clinical trial are provided.