Computational Application of Benford’s First Digit Law to Financial Fraud Detection
Sukanto Bhattacharya, School of Business/Information Technology, Bond University
Kuldeep Kumar, School of Information Technology, Bond University
Newcomb’s discovery
In 1881, the American mathematician Simon Newcomb discovered to his surprise that the first few pages of a logarithmic table, corresponding to the lower significant digits (typically those below 5), were noticeably dirtier than the later pages corresponding to the higher significant digits (typically those above 5)
Newcomb attributed this to greater usage of the first pages than the later ones, which in turn led him to reason that the probability of a user accessing any given page at any given time was skewed in favour of the earlier pages corresponding to the lower significant digits! This was in direct contrast with the intuitive expectation, according to which the probability of randomly picking any first digit between 1 and 9 should be the uniform value of 1/9, or roughly 11.11%
Benford steals the thunder
In 1938, almost half a century after Newcomb’s sensational discovery, another American, the physicist Frank Benford, was going through a large collection of numerical data from disparate sources when he stumbled upon a similar finding
Besides exploring its mathematical intricacies further, Benford compiled a large volume of data to empirically support the principle and went on to publish his results in a number of papers. Thus the ‘principle’ came to be known as “Benford’s Law”
The mathematical structure of Benford’s law
It is specifically a logarithmic probability distribution on the first significant digit of real numbers, given as follows:

P(D_1 = d) = \int_d^{d+1} \log_{10}(e) \, x^{-1} \, dx = \log_{10}(1 + d^{-1}), \quad d = 1, 2, \ldots, 9
In the above form, it is also known as the first digit law. However, the law can be generalized to include any number of leading digits, such that in its general form it is stated as follows:

P(D_1 = d_1, D_2 = d_2, \ldots, D_n = d_n) = \log_{10}\left[1 + \left(\sum_{i=1}^{n} d_i \cdot 10^{\,n-i}\right)^{-1}\right], \quad \text{for all } n \in \mathbb{N}
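As a worked illustration (a numerical example of our own, not one from Benford’s papers), the probability that a number begins with the digit sequence 3, 1, 4 follows directly from the general form:

P(D_1 = 3, D_2 = 1, D_3 = 4) = \log_{10}\left[1 + \left(3 \cdot 10^2 + 1 \cdot 10^1 + 4 \cdot 10^0\right)^{-1}\right] = \log_{10}\left(1 + \tfrac{1}{314}\right) \approx 0.0014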
Mathematics (contd.)
An alternative form of the general law may be stated as follows:

P(\text{mantissa} \le t/10) = \log_{10} t, \quad \text{for all } t \in [1, 10)
The mantissa (base 10) of a positive real number x is the real number r in [1/10, 1) with x = r · 10^n for some integer exponent n ∈ ℤ
Formally, the logarithmic probability measure P is defined on the measurable space (ℝ⁺, M), where ℝ⁺ is the set of all positive real numbers and M is the mantissa (base 10) sigma algebra, which in turn is the sub-sigma algebra of the Borel sets generated by the single-valued function x → mantissa(x)
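As a minimal runnable sketch (our own illustration; the function name `mantissa` is not from any standard library), the generating function x → mantissa(x) may be computed as follows:

```python
import math

def mantissa(x: float) -> float:
    """Base-10 mantissa: the real number r in [1/10, 1) such that
    x = r * 10**n for some integer exponent n."""
    if x <= 0:
        raise ValueError("the mantissa is defined for positive reals only")
    n = math.floor(math.log10(x)) + 1
    return x / 10 ** n

# Examples: mantissa(314.0) -> 0.314, mantissa(0.05) -> 0.5, mantissa(1000.0) -> 0.1
```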
Invariance properties of Benford’s distribution
Benford’s distribution is characterized by the important statistical properties of scale invariance and base invariance
Scale Invariance: A probability measure P on the mantissa space (ℝ⁺, M) is said to be scale invariant if P(sS) = P(S) for every S ∈ M and s > 0. This property ensures that Benford’s distribution is particularly robust, even with chaotic data subject to Feigenbaum scaling
Base Invariance: A probability measure P on the mantissa space (ℝ⁺, M) is said to be base invariant if P(S^{1/n}) = P(S) for every S ∈ M and every n ∈ ℕ. Benford’s distribution is the unique logarithmic probability measure on the mantissa space (ℝ⁺, M) that displays base invariance
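Scale invariance is easy to check empirically; the following is a hypothetical simulation of our own (not part of the formal result): values whose mantissas follow Benford’s law keep the same first-digit frequencies after multiplication by an arbitrary positive scale factor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Benford-distributed values: x = 10**u with u uniform on [0, 1),
# i.e. the mantissa is log-uniform
x = 10 ** rng.random(100_000)

def first_digit(values):
    """Leading significant digit of each positive value."""
    return (values / 10 ** np.floor(np.log10(values))).astype(int)

for s in (1.0, 2.5, 0.043):  # arbitrary positive scale factors
    freq = np.bincount(first_digit(s * x), minlength=10)[1:] / x.size
    print(s, np.round(freq, 3))  # every row approximates log10(1 + 1/d)
```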
Benford’s law as a signature of Nature
It has been mathematically proved that, in a form analogous to the central limit theorem, the Benford distribution is characterized as the unique upper limit of the significant-digit frequencies of a sequence of conformably generated random variables
According to Benford himself, while we count arithmetically as 1, 2, 3, 4, …, Nature counts geometrically as e^0, e^x, e^{2x}, … Thus Benford’s distribution is observable in most naturally occurring numbers but not in artificially manipulated or concocted data
Accounting data is one type of data that is expected to closely follow the Benford distribution. Therefore, theoretically, the more an observed set of accounting data deviates from the pattern predicted by Benford, the greater the chance that the data is not authentic
Getting the numbers right
The steady-state Benford first-digit frequencies:

D1          1      2      3      4      5      6      7      8      9
P(D1 = d)   0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046
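These tabulated values follow directly from the first digit law; a short sketch of our own reproduces them:

```python
import math

# P(D1 = d) = log10(1 + 1/d), d = 1, ..., 9
for d in range(1, 10):
    print(d, round(math.log10(1 + 1 / d), 3))
# 0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046
```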
Dow illustrates Benford's Law
To illustrate Benford's Law, Dr. Mark J. Nigrini offered this example: "If we think of the Dow Jones stock average as 1,000, our first digit would be 1”
"To get to a Dow Jones average with a first digit of 2, the average must increase to 2 000 and getting from 1,000 2,000, 1 000 to 2,000 2 000 is a 100 percent increase” increase
"Let's say that the Dow goes up at a rate of about 20 percent a year. That means that it would take five years to get from 1 to 2 as a first digit”
"But suppose we start with a first digit 5. It only requires a 20 percent increase to get from 5,000 to 6,000, and that is achieved in one year”
"When the Dow reaches 9,000, it takes onlyy an 11 p percent increase and jjust seven months to reach the 10,000 mark, which starts with the number 1. At that point you start over with the first digit a 1, once again. Once again, you must double the number -- 10,000 -- to 20,000 before reaching 2 as the first digit”
"As As you can see see, the number 1 predominates at every step of the progression progression, as it does in logarithmic sequences"
A suggested Monte Carlo approach
We have voiced slight reservations about direct comparison of observed first-digit frequencies with the expected Benford frequencies, as the Benford frequencies are necessarily steady-state frequencies and may not therefore be truly reflected in the sample. As samples are always of finite size, it is not appropriate to draw any conclusion from such a direct comparison, since the sample frequencies won’t be steady-state frequencies
We have shown (Kumar and Bhattacharya, 2002) that if we draw digits randomly, using the inverse transformation technique, from within random number ranges derived from a cumulative probability distribution function based on the Benford frequencies, then the problem boils down to running a goodness-of-fit test to identify any significant difference between the observed and simulated first-digit frequencies. This test may be conducted using a known sampling distribution, for example Pearson’s χ² distribution
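A minimal sketch of the suggested procedure (our own illustration; the sample size and seed are arbitrary, and in a real audit `observed` would be tallied from the leading digits of the accounting figures under examination rather than simulated):

```python
import numpy as np
from scipy.stats import chisquare

DIGITS = np.arange(1, 10)
BENFORD_P = np.log10(1 + 1 / DIGITS)      # P(D1 = d) = log10(1 + 1/d)

def simulate_first_digits(n, rng):
    """Inverse-transform sampling: map uniform draws onto random number
    ranges derived from the cumulative Benford distribution."""
    cdf = np.cumsum(BENFORD_P)
    idx = np.searchsorted(cdf, rng.random(n))
    return DIGITS[np.minimum(idx, 8)]     # guard against float round-off

def digit_counts(sample):
    """Frequency of each first digit 1..9 in the sample."""
    return np.bincount(sample, minlength=10)[1:]

rng = np.random.default_rng(2002)
n = 5_000                                 # arbitrary illustrative size

observed = digit_counts(simulate_first_digits(n, rng))   # placeholder data
simulated = digit_counts(simulate_first_digits(n, rng))  # Monte Carlo digits

# Pearson chi-square goodness-of-fit test (8 degrees of freedom)
stat, p = chisquare(f_obs=observed, f_exp=simulated)
print(f"chi2 = {stat:.2f}, p = {p:.3f}  (reject H0 if p < alpha)")
```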
The final test
Test for a significant difference between the first-digit frequencies observed in the sample and those generated by the Monte Carlo simulation, using a goodness-of-fit test based on the Pearsonian χ² distribution. The null and alternative hypotheses are as follows:
H0: The observed first-digit frequencies approximate a Benford distribution
H1: The observed first-digit frequencies do not approximate a Benford distribution
The above statistical test will not reveal whether or not a fraud has actually been committed. All it does is establish, at a desired level of confidence, whether the accounting data has been manipulated (if H0 is rejected)
A Neutrosophic Extension
However, given that H1 is accepted and H0 is rejected, this could imply any of the following events:
I. There is no manipulation: a Type I error has occurred, i.e. H0 was rejected when true.
II. There is manipulation, and such manipulation is definitely fraudulent.
III. There is manipulation, and such manipulation may or may not be fraudulent.
IV. There is manipulation, and such manipulation is definitely not fraudulent.
A Neutrosophic Extension (continued)
Neutrosophic probabilities are a generalization of classical and fuzzy probabilities and cover those events that involve some degree of indeterminacy
Neutrosophy provides a better approach to quantifying uncertainty than classical or even fuzzy probability theory. Neutrosophic probability theory uses a subset-approximation for truth-value as well as indeterminacy and falsity values
Also, this approach makes a distinction between a “relative true event” and an “absolute true event”, the former being true in only some probability sub-spaces while the latter is true in all probability sub-spaces. Similarly, events that are false in only some probability sub-spaces are classified as “relative false events”, while events that are false in all probability sub-spaces are classified as “absolute false events”. Again, events that are hard to classify as either ‘true’ or ‘false’ in some probability sub-spaces are classified as “relative indeterminate events”, while events that bear this characteristic over all probability sub-spaces are classified as “absolute indeterminate events”.
A Neutrosophic Extension (continued)
While in classical probability n_sup ≤ 1, in neutrosophic probability one has n_sup ≤ 3⁺, where n_sup is the upper bound of the probability space. In cases where the truth and falsity components are complementary, i.e. there is no indeterminacy, the components sum to unity and neutrosophic probability reduces to classical probability, as in the tossing of a fair coin or the drawing of a card from a well-shuffled deck
Coming back to our original problem of financial fraud detection, let E be the event whereby a Type I error has occurred and F be the event whereby a fraud is actually detected. Then the conditional neutrosophic probability NP(F | E^c) is defined over a probability space consisting of a triple of sets (T, I, U). Here T, I and U are probability sub-spaces wherein the event F is t% true, i% indeterminate and u% untrue respectively, given that no Type I error occurred
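As a toy representation (entirely our own sketch; the class and field names are hypothetical and the numbers below are made up), the (T, I, U) components can be carried as a triple whose components, unlike classical probabilities, need not sum to one:

```python
from typing import NamedTuple

class NeutrosophicProb(NamedTuple):
    """Truth, indeterminacy and falsity components of an event;
    t + i + u may exceed 1, being bounded above by 3 rather than 1."""
    t: float  # degree of truth
    i: float  # degree of indeterminacy
    u: float  # degree of falsity (untruth)

    def is_classical(self) -> bool:
        """True when there is no indeterminacy and truth and falsity
        are complementary, i.e. the triple reduces to a classical
        probability (as with a fair coin toss)."""
        return self.i == 0 and abs(self.t + self.u - 1.0) < 1e-9

# Hypothetical NP(F | E^c): fraud 60% true, 25% indeterminate, 15% untrue
np_fraud = NeutrosophicProb(t=0.60, i=0.25, u=0.15)
print(np_fraud, np_fraud.is_classical())
```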
Statistical sampling issues
A statistical sampling method particularly useful to the investigative accountant is monetary unit sampling, which takes into account the materiality of the various items by giving proportionately greater weightage to items with higher monetary values
The monetary unit sampling technique treats each monetary unit in the account balances under examination as a separate member of the population. Items with larger monetary values have a greater probability of selection, as they are automatically given a weightage proportional to the number of monetary units they contain
The monetary unit sampling method is particularly suitable for forensic accounting purposes where the investigator suspects material overstatement of accounts on a selective basis in an otherwise robust accounting system
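A minimal sketch of the selection mechanism (our own illustration of the standard fixed-interval variant; the ledger balances and sample size below are made up):

```python
import random

def monetary_unit_sample(balances, sample_size, rng):
    """Systematic PPS selection: treat every monetary unit (e.g. each
    dollar) as one member of the population and pick units at a fixed
    sampling interval, so larger balances are hit more often."""
    total = sum(balances)
    interval = total / sample_size
    start = rng.uniform(0, interval)

    selected, running, item = [], 0.0, 0
    for k in range(sample_size):
        hit = start + k * interval          # the k-th sampled monetary unit
        while running + balances[item] <= hit:
            running += balances[item]       # skip past items before the hit
            item += 1
        if item not in selected:            # one item may absorb several hits
            selected.append(item)
    return selected                         # indices of the selected items

balances = [1200.0, 75.0, 9800.0, 430.0, 2500.0]   # hypothetical ledger
print(monetary_unit_sample(balances, sample_size=3, rng=random.Random(7)))
```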
Direction of future research
We are still trying to come to terms with the deep statistical and topological properties of this strange law of anomalous numbers
We have already attempted to add a neutrosophic dimension to the problem of determining the conditional probability that a financial fraud has actually been committed, given that no Type I error occurred while rejecting the null hypothesis (Bhattacharya, 2002)
The possibilities of coming up with a neuro-fuzzy multinomial fraud classification system are presently being explored. This is intended as the first step towards building a comprehensive fraud classification and detection tool-kit incorporating the statistical features of Benford’s law along with sophisticated audit-sampling methodologies
High Five
An open workgroup has recently been formed for further collaborative research on the application of Benford’s law in fraud detection. The group presently involves the following researchers:
1. Florentin Smarandache, Department of Mathematics, University of New Mexico, U.S.A.
2. Jean Dezert, ONERA (National Aerospace Research Establishment), France
3. Kuldeep Kumar, School of IT, Bond University, Australia
4. Sukanto Bhattacharya, School of Business/IT, Bond University, Australia
5. Mohammad Khoshnevisan, School of Accounting & Finance, Griffith University, Australia