Benford's Law Miller, Steven J
2015-06-09
eBook
Benford's law states that the leading digits of many data sets are not uniformly distributed from one through nine, but rather exhibit a profound bias. This bias is evident in everything from ...electricity bills and street addresses to stock prices, population numbers, mortality rates, and the lengths of rivers. This work demonstrates the many useful techniques that arise from the law, showing how truly multidisciplinary it is, and encouraging collaboration.
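The bias described above has a simple closed form: under Benford's law the leading digit d occurs with probability log10(1 + 1/d). The sketch below tabulates these probabilities and compares them with the leading digits of the first 1000 powers of 2, a sequence chosen here purely for illustration (it is not taken from the book). The helper names `benford_probability` and `leading_digit` are hypothetical.

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the leading digit equals d (1..9) under Benford's law."""
    return math.log10(1 + 1 / d)


def leading_digit(n: int) -> int:
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


# Illustrative data set (an assumption of this sketch): 2**0 .. 2**999.
counts = {d: 0 for d in range(1, 10)}
for k in range(1000):
    counts[leading_digit(2 ** k)] += 1

benford = {d: benford_probability(d) for d in range(1, 10)}
observed = {d: counts[d] / 1000 for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: Benford {benford[d]:.3f}, powers of 2 {observed[d]:.3f}")
```

A uniform distribution would give each digit a share of about 0.111; the table printed here shows how far the Benford probabilities depart from that.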
In this work we review the specific and differential pore properties of more than two hundred diverse inorganic porous materials, both lab-made and of natural origin. The relevant datasets include ...pore diameters D, pore surface area S, pore volume V, pore length L and pore anisotropy B = L/D. Pore anisotropy B for isotropic pores corresponds to pore number N per unit mass, i.e. to pore density, therefore N ∼ B. Pore density N is related to pore volume via a perfect Zipfian power law N ≈ 1/V. A similar power law L ∼ 1/A holds between pore lengths and pore cross section A. These power laws extend over 10–20 orders of magnitude and are universal for all the examined specific and differential properties. The first digits of all pore properties follow typical Benford's Law distributions. How closely each property conforms to Benford's Law depends on the spread of its distribution; the final distribution results from the transfer and aggregation of first-digit frequencies from the local distributions in all decadic orders of magnitude.
Pore numbers are related to pore volume via Zipfian power law N ≈ 1/V, while distribution of first digits of pore properties obeys Benford's Law which results from the transfer and aggregation of first digit frequencies.
•Review of the pore properties (D, V, S, N, L) for >200 diverse porous materials.
•Pore density is related to pore volume via Zipfian power law N ≈ 1/V.
•Distribution of first digits of pore properties obeys Benford's Law.
•Benford distributions result from the transfer and aggregation of first digit frequencies.
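The link between a wide spread over orders of magnitude and Benford-like first digits can be illustrated with synthetic data. The sketch below assumes hypothetical "pore volumes" drawn log-uniformly over 12 decades (not the paper's datasets), derives densities through N = 1/V, and compares the first-digit frequencies of both with Benford's Law. All names and the sampling choice are assumptions made for illustration.

```python
import math
import random


def leading_digit(x: float) -> int:
    """Leading nonzero decimal digit of a positive float."""
    mantissa = x / 10 ** math.floor(math.log10(x))
    # Guard against floating-point rounding at decade boundaries.
    if mantissa < 1:
        mantissa *= 10
    elif mantissa >= 10:
        mantissa /= 10
    return int(mantissa)


def digit_frequencies(values):
    """Relative frequencies of leading digits 1..9."""
    counts = [0] * 10
    for v in values:
        counts[leading_digit(v)] += 1
    n = len(values)
    return [c / n for c in counts[1:]]


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical pore volumes: log-uniform over 12 decades (1e-6 .. 1e6).
volumes = [10 ** rng.uniform(-6, 6) for _ in range(100_000)]
# Densities from the power law N = 1/V.
densities = [1 / v for v in volumes]

benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
freq_v = digit_frequencies(volumes)
freq_n = digit_frequencies(densities)

for d in range(1, 10):
    print(f"{d}: Benford {benford[d - 1]:.3f}, "
          f"V {freq_v[d - 1]:.3f}, N {freq_n[d - 1]:.3f}")
```

Narrowing the sampled range to a fraction of a single decade is a quick way to see the conformity degrade, in line with the abstract's remark that closeness to Benford depends on the spread of the distribution.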
Abstract
Background
I use Benford’s law to assess whether coronavirus disease 2019 (COVID-19) deaths are misreported in the USA.
Methods
I use three statistics to determine whether the ...reported deaths for US states are consistent with Benford’s law, where the probability of smaller digits is greater than the probability of larger digits.
Results
My findings indicate that COVID-19 deaths are under-reported in the USA, although both the evidence for under-reporting and its extent depend on the statistic used to assess conformity with Benford’s law.
Conclusions
Benford’s law is a useful diagnostic tool for verifying data and can be used before a more detailed audit or resource intensive investigation.
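The abstract does not name its three statistics, so the sketch below uses three measures that are common in Benford conformity checks and should be read as assumptions: Pearson's chi-square statistic, the mean absolute deviation, and the Euclidean distance between observed and expected digit proportions. The sample counts are made up for illustration and are not real COVID-19 data.

```python
import math

# Benford probabilities for leading digits 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit_counts(values):
    """Count leading digits 1..9 of positive integers (zeros are skipped)."""
    counts = [0] * 9
    for v in values:
        if v > 0:
            counts[int(str(v)[0]) - 1] += 1
    return counts


def chi_square(counts):
    """Pearson chi-square statistic against Benford (8 degrees of freedom)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, BENFORD))


def mean_absolute_deviation(counts):
    """Average absolute gap between observed and Benford proportions."""
    n = sum(counts)
    return sum(abs(c / n - p) for c, p in zip(counts, BENFORD)) / 9


def euclidean_distance(counts):
    """Euclidean distance between observed and Benford proportions."""
    n = sum(counts)
    return math.sqrt(sum((c / n - p) ** 2 for c, p in zip(counts, BENFORD)))


# Made-up counts, for illustration only.
sample = [12, 135, 18, 240, 1020, 31, 19, 167, 2, 114, 45, 1, 29, 13, 1500, 370]
counts = first_digit_counts(sample)

# 15.507 is the 5% critical value of chi-square with 8 degrees of freedom.
print("chi-square:", round(chi_square(counts), 3), "(5% critical value 15.507)")
print("MAD:", round(mean_absolute_deviation(counts), 4))
print("Euclidean distance:", round(euclidean_distance(counts), 4))
```

Because the three measures weight deviations differently and respond differently to sample size, they can disagree on the same data, which is consistent with the abstract's caveat that the conclusion depends on the statistic chosen.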
This paper proposes a data science approach based on Benford's Law to analyse tourist flows, tourism being a relevant economic sector in Sicily. In particular, we are interested in detecting ...irregular patterns in the numerical data that may represent manipulations, inaccuracies or biases in the self-reported data from tourism organisations. The analysis is carried out using monthly data for arrivals and overnight stays in hotels, B&Bs, and complementary accommodations in the nine provinces of the island from January 2016 to December 2019.
We perform the analysis by employing several statistical tests and through a visual inspection of the difference between the empirical distributions and the theoretical Benford's.
Conformity to Benford's distribution is mostly confirmed for the total number of overnight stays and for the data considered on a yearly basis. In contrast, we find evident deviations from Benford's Law in the empirical distribution of data broken down by tourist nationality and accommodation type. Some comments on possible reasons for such deviations are also advanced, although a detailed exploration of them deserves a dedicated study.
•We discuss data validity through compliance with Benford's Law.
•We employ a wide set of statistical tests for testing the compliance.
•We consider the empirical case of touristic flows in Sicily for a quadrennium.
•We discuss deviations and regularities of the data from Benford's Law.
Financial fraud by listed companies can lead to anomalies in the distribution of financial data, which can be detected by Benford's Law. This study takes financial data of Chinese listed companies to ...construct two types of Benford factors for detecting financial fraud. The empirical results show that as the deviation of the financial data distribution from Benford's law increases, the probability of financial fraud increases significantly. Furthermore, compared with using traditional financial indicators alone, adding the Benford factors effectively reduces the Type I or Type II error of the logistic regression model. Finally, we show that the identification indicators selected in this study contribute to the detection of financial fraud with the help of digit distribution laws.
•We derive Benford's law with a Benford term together with an additional err term from the Laplace transform.
•The Benford term originates from the structure of the number system, as a result of the ...way that we write numbers.
•The err term leads to deviation from Benford's law with the inverse Laplace transform of probability density functions.
•The whole family of completely monotonic distributions can all satisfy Benford's law within a small bound.
•Distributions with violently oscillating inverse Laplace spectrum generally break Benford's law.
The digits 1 through 9 occur unevenly as the leftmost nonzero digit of numbers from real-world sources, following an empirical law known as Benford's law or the first digit law. It remains obscure why a variety of data sets generated from quite different dynamics obey this particular law. We study Benford's law by applying the Laplace transform, and find that the logarithmic Laplace spectrum of the digital indicator function can be approximately taken as a constant. This particular constant, being exactly the Benford term, explains the prevalence of Benford's law. The slight variation from the Benford term leads to deviations from Benford's law for distributions which oscillate violently in the inverse Laplace space. We prove that the whole family of completely monotonic distributions can satisfy Benford's law within a small bound. Our study suggests that Benford's law originates from the way that we write numbers, and thus should be regarded as basic mathematical knowledge.
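The claim about completely monotonic distributions can be checked numerically for one simple case. The exponential density is completely monotonic, and its first-digit probabilities follow directly from its cumulative distribution function by summing over decades. The sketch below is only a numerical check under that assumption, not the paper's Laplace-transform derivation; the function name and the truncation limits `k_min` and `k_max` are choices made here for illustration.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def exponential_first_digit_probs(rate=1.0, k_min=-20, k_max=5):
    """First-digit probabilities of an Exponential(rate) random variable.

    P(D = d) is the sum over decades k of
    P(d * 10**k <= X < (d + 1) * 10**k), truncated to k_min..k_max.
    """
    probs = []
    for d in range(1, 10):
        total = 0.0
        for k in range(k_min, k_max + 1):
            lo = rate * d * 10.0 ** k
            hi = rate * (d + 1) * 10.0 ** k
            # exp(-lo) - exp(-hi), written with expm1 to keep precision
            # when both arguments are close to zero.
            total += math.expm1(-lo) - math.expm1(-hi)
        probs.append(total)
    return probs


probs = exponential_first_digit_probs(rate=1.0)
max_dev = max(abs(p - b) for p, b in zip(probs, BENFORD))

for d, (p, b) in enumerate(zip(probs, BENFORD), start=1):
    print(f"{d}: exponential {p:.4f}, Benford {b:.4f}")
print("largest deviation:", round(max_dev, 4))
```

The deviation is small but nonzero, and it changes with `rate`, which matches the abstract's wording that such distributions satisfy the law within a bound rather than exactly.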
The authenticity of China’s economic data has long been questioned. We use a new statistical method, Benford’s test, to evaluate the data quality of the Chinese Industrial Census (CIC). We show that the ...method is effective at uncovering data irregularities. Based on industrial output predicted from variables that are less manipulable, such as employment and electricity, we further demonstrate that firms of different ownership types differ in the direction of data manipulation. We find no conclusive evidence of data manipulation by state-owned enterprises (SOEs), whereas private firms tend to under-report performance.
•Use Benford’s Law to test firm-level data quality in China.
•Deviations from Benford’s Law correlate with abnormal industrial outputs.
•Private firms significantly under-report industrial outputs.
•No conclusive evidence on state-owned enterprises (SOEs) data manipulation.