Available online at www.sciencedirect.com
Science and Justice 48 (2008) 118 – 125
A logical framework to ballpoint ink dating interpretation
Céline Weyermann ⁎, Beatrice Schiffer, Pierre Margot
Institut de Police Scientifique, Lausanne University, Batochime, CH-1015 Lausanne, Switzerland
Received 7 June 2007; accepted 20 October 2007
Abstract
Since its beginnings, the forensic field of questioned documents has been concerned with the dating of inks. Ink ageing processes follow complex paths. Disagreements about the feasibility of current methods have been voiced worldwide within the scientific and legal communities. This controversy was the starting point of the present work, whose aim was to evaluate how such evidence is interpreted. Subjective statistical data were assigned from earlier works and illustrate the kind of data necessary to date ballpoint ink, and how to use them for this purpose. This work also suggests that the court and scientific requirements for standards of reliability are not yet fulfilled by current ink dating methods for regulatory use in expert testimony.
© 2007 Forensic Science Society. Published by Elsevier Ireland Ltd. All rights reserved.
Keywords: Questioned documents; Dating; Ink; Phenoxyethanol; Validation; Bayes
1. Introduction
On a daily basis, courts are confronted with trials in which technical and scientific aspects play a major role. Until recently in the United States, the admissibility of evidence or expert opinion was evaluated principally according to the Frye standard (Frye v. United States, 54 App. D.C. 46, 293 F. 1013, 1923), which was formulated in 1923. Scientific expert evidence was accepted if the validity of the underlying scientific processes was generally accepted within the pertinent scientific community. In 1975, this standard was replaced by the Federal Rules of Evidence (FRE), which attributed the responsibility of evaluating the validity of scientific evidence to the judge [1]. The debates were revived in 1993 with the new reliability standards enunciated by the United States Supreme Court in “Daubert v. Merrell Dow Pharmaceuticals, Inc” (509 U.S. 579–601, 1993). The Supreme Court reiterated the necessity of the
⁎ Corresponding author. Tel.: +41 216924649; fax: +41 216924605. E-mail address: celine.weyermann@unil.ch (C. Weyermann).
competence and qualification of the expert and stated that the methods and procedures should be reliable according to the following Daubert criteria:
– verification of the theory or technique through tests,
– peer review and publications,
– known error levels,
– general acceptance within relevant scientific community.
Daubert states that the reliability of the evidence should be decided based solely on principles and methodology, not on the conclusions of the expertise. However, contradictory decisions among trial courts still occur, because judges and jurors have great difficulty understanding scientific testimony and distinguishing science from “pseudoscience” [2]. The Daubert criteria and their applications have been discussed at length by many scientists and lawyers [3–6] and it is not the aim of this article to debate those issues. These requirements for standards of reliability apply to all scientific evidence, including ballpoint ink dating. However, no ink dating method yet fulfils all the Daubert criteria. In fact, the testability of the theory has not been addressed. Many publications exist on the subject, but they lack the necessary information [7–10] or are contradictory
1355-0306/$ - see front matter © 2007 Forensic Science Society. Published by Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.scijus.2007.10.009
C. Weyermann et al. / Science and Justice 48 (2008) 118–125
[11–16]. Furthermore, error rates were not established and general acceptance was not achieved among questioned document examiners. For these reasons, ink dating evidence was refused in court on at least two occasions (Regina v. Michael Gurmann in Ontario, Canada, 1993; Learning Curve Toys, L.P. v. Playwood Toys, Inc. 2000 U.S. Dist. Lexis 5130). If the drawbacks mentioned above are overcome, ballpoint ink dating has the potential to assist the court in its tasks. The objective of this paper is to propose a statistical model based on Bayesian probabilities to extract the pertinent information from ink dating evidence. In the first part, ink dating principles and methods, as well as Bayesian inference, are explained. Modelling for ink dating interpretation is then presented and illustrated by a case example. Finally, the advantages and drawbacks of this model compared with the current interpretation and reporting of ink dating evidence are discussed.
2. Ink dating
Two fundamental ink dating approaches can be distinguished: the static approach [17], focused on production dates, and the dynamic approach [18], centred on the ageing processes of inks. This paper concentrates on the latter. Dynamic ink dating methods rely on quantitative measurements of physical (e.g. motions) or chemical (e.g. reactions) changes of the ink on paper as a function of time. The ageing processes measured must be reproducible under a given set of conditions to ensure a correct determination of the date of entry on a document. Ballpoint pens are the most common writing instruments and their ink contains the following major components: solvents (50%), dyes and pigments (25%) and resins (25%). Once ink is applied on paper, the ageing processes start: the solvents migrate into the paper and evaporate, the dyes fade, and the resins polymerise. To study ink ageing, one can quantitatively analyse dye degradation (relative amount of dye to its degradation products) [19] or solvent drying (decrease of concentration in the ink) [20]. Resins are more difficult to analyse because of their relatively high molecular weight. The ageing processes are strongly influenced, both physically and chemically, by the storage conditions (e.g. exposure to light and temperature), the initial composition of the ink and the type of paper substrate [16]. In real casework, no information about these factors is generally available. Ultimately, in order to use ink dating methods in casework, predictable variations have to be larger than measuring errors and blind testing should be carried out for validation purposes.
Two types of dynamic dating methods can be distinguished: relative and absolute dating. The first compares ink entries on an identical substrate having the same composition and history, and determines which one is older. These requirements are rarely met in typical casework situations. Therefore, efforts were invested in developing absolute dating methods.
The first step is to determine ageing curves (measuring the changes as a function of time) while taking into account the factors influencing the processes. Two absolute dating methods currently used in case reports were developed from the observation that ink becomes harder to extract as it gets older. Both measure changes in the extractability of the ink caused by the hardening of the resins as a function of time. They are based on sequential extractions of dyes [21] or solvents [11], respectively, and make use of artificial ageing. A two-step extraction of the ink, i.e. first in a weak and then in a strong solvent, followed by quantitative analysis of the dyes (by thin-layer chromatography) or solvents (by gas chromatography), allows for the calculation of a percent extraction (P): the amount extracted in the weak solvent (M1) is divided by the total amount extracted by both the first (weak) and the second (strong) solvents (M1 + M2):

$$P(\%) = 100 \cdot \frac{M_1}{M_1 + M_2} \qquad (1)$$

Artificial ageing of the ink entry by exposure to heat is then used to calculate a portion of the ageing curve (D). A second sample is heated moderately and then analysed in the same way to obtain the percent extraction (PT). D characterizes the distance between the percent extraction values:

$$D(\%) = P - P_T \qquad (2)$$
Ageing slopes are generally decreasing as a function of ink age and level off after some time (Fig. 1). The value of D shows if the ageing is still going on (fresh inks) or has already levelled off (old inks). A third ink dating method used in casework is based on sequential thermodesorption of a single ink entry using a low and a high temperature [22]. The desorbed solvents are quantified by gas chromatography/mass spectrometry. The calculated ratio (V) of the amount of solvent desorbed at low temperature (T1) divided by the total amount of desorbed solvent (T1 + T2) decreases as a function of the age of the ink entry:

$$V(\%) = 100 \cdot \frac{T_1}{T_1 + T_2} \qquad (3)$$
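The three ageing parameters defined in Eqs. (1) to (3) are simple ratios, and a minimal Python sketch may make the arithmetic concrete. The function names and the numerical amounts below are illustrative assumptions, not values taken from the cited methods.

```python
def percent_extraction(m1: float, m2: float) -> float:
    """Eq. (1): share (in %) of the ink extracted by the weak solvent."""
    total = m1 + m2
    if total <= 0:
        raise ValueError("total extracted amount must be positive")
    return 100.0 * m1 / total


def ageing_distance(p: float, p_heated: float) -> float:
    """Eq. (2): D = P - PT, the drop in percent extraction after heating."""
    return p - p_heated


def desorption_ratio(t1: float, t2: float) -> float:
    """Eq. (3): share (in %) of the solvent desorbed at the low temperature."""
    total = t1 + t2
    if total <= 0:
        raise ValueError("total desorbed amount must be positive")
    return 100.0 * t1 / total


# Hypothetical amounts in arbitrary units, chosen only for illustration.
p = percent_extraction(6.0, 4.0)       # untreated sample
p_t = percent_extraction(4.5, 5.5)     # artificially aged (heated) sample
d = ageing_distance(p, p_t)
v = desorption_ratio(2.0, 8.0)
print(f"P = {p:.1f}%, PT = {p_t:.1f}%, D = {d:.1f}%, V = {v:.1f}%")
```

Because each parameter is a ratio of amounts measured on the same sample, multiplying both amounts by a common factor leaves the result unchanged.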
As mentioned above ageing strongly depends on ink composition. Therefore, current dating methods generally rely on ink databases to identify the type of ink present on the questioned document (through dyes, solvents or resins qualitative
Fig. 1. Example of ageing curve for solvent drying: ballpoint solvent ethoxyethoxyethanol was deposited on a piece of paper and loss of weight was measured with a microbalance [16]. The slope is large during the first drying phase (constant rate) and continually diminishes after that (falling rate period). Measurement errors increase as solvent quantities diminish.
analyses). One more difficulty is to always extract exactly the same amount of ink from the paper. The presented methods have the advantage of being mass or size independent. It means that the quantity of ink extracted has no influence on the results. However studies about the influence of storage conditions and paper type on the dating results remain partial. It is now generally admitted that methods based on dye analysis are not reliable [11–13,15] and interest has shifted to dating methods based on solvent drying [10,22]. Hence, the present work focuses on the interpretation of ink dating results provided by analysis of solvent quantities in ballpoint inks. 3. Interpreting evidence The evaluation of scientific evidence is probably the most critical step in the forensic process. Generally, the forensic scientists assess a comparison between evidential and control materials. Using a given method, they will be able to conclude that two samples are not distinguishable. Concluding that they effectively have the same source is more difficult, as several objects/persons may produce indistinguishable traces (evidence). Many forensic studies focus on the inference of sources, by classifying potential common sources for an evidence item and attempting the individualization of a singular source for two items [23]. There has been an ongoing debate to find the best way to interpret and communicate forensic results [24–27]. Of late Bayes' theorem has gained considerable importance in the interpretation of forensic data [28,3,29,30,23]. The Bayesian approach offers the advantages of considering alternative hypotheses and incorporating new evidence to update prior knowledge about the given hypotheses (known as prior odds) [31]. The formulation of the alternative hypotheses depends on the circumstances of the case, the observations and the background information (I ) available to the scientist [32]. 
When the date on a document is contested, one has to look at the probability of the evidence (E) given the ink entry has been made at a time t1 (hypothesis of the prosecution, Hp) compared to the probability of this same evidence given the ink entry has been written at a different time t2 (hypothesis of defense, Hd). Then the prior odds of the hypotheses of the prosecution Hp and the defense Hd existing prior to the observation of the evidence E:

$$\text{prior odds} = \frac{P(H_p \mid I)}{P(H_d \mid I)} \qquad (4)$$

are multiplied by a factor LR to obtain the posterior odds that account for the new evidence E:

$$\text{posterior odds} = \frac{P(H_p \mid E, I)}{P(H_d \mid E, I)} = LR \cdot \text{prior odds} \qquad (5)$$

The likelihood ratio (LR) is an indication of the strength of the evidence in supporting one of the hypotheses in Bayesian logic. It is defined by the probability of E given Hp is true divided by the probability of E given Hd is true:

$$LR = \frac{P(E \mid H_p, I)}{P(E \mid H_d, I)} \qquad (6)$$
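The odds form of Bayes' theorem in Eqs. (4) to (6) can be sketched in a few lines of Python. The function names and the probabilities used in the example are assumptions made for illustration only; in a real case the prior odds are for the court to assign.

```python
def likelihood_ratio(p_e_given_hp: float, p_e_given_hd: float) -> float:
    """Eq. (6): LR = P(E | Hp, I) / P(E | Hd, I)."""
    if p_e_given_hd <= 0:
        raise ValueError("P(E | Hd, I) must be positive to form a ratio")
    return p_e_given_hp / p_e_given_hd


def posterior_odds(prior_odds: float, lr: float) -> float:
    """Eq. (5): posterior odds = LR * prior odds."""
    return lr * prior_odds


def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of Hp into the probability of Hp."""
    return odds / (1.0 + odds)


# Illustrative values: the evidence is four times more probable under Hp
# than under Hd, and the prior odds are 1 to 2 against Hp.
lr = likelihood_ratio(0.8, 0.2)
post = posterior_odds(prior_odds=0.5, lr=lr)
print(f"LR = {lr:.2f}, posterior odds = {post:.2f}, "
      f"P(Hp | E, I) = {odds_to_probability(post):.3f}")
```

Keeping the likelihood ratio separate from the prior odds mirrors the division of roles in the text: the scientist reports the LR, while the prior and posterior odds belong to the judge or jurors.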
When applying the Bayesian framework one has to be aware of the following points. It is necessary to consider two alternative hypotheses to introduce a balanced evaluation of the evidence. The resulting LR value strongly depends on the formulation of the alternatives. If the prior odds change then the interpretation must be revised [32]. Robertson and Vignaux wrote [3, page 12] that an ideal piece of evidence would be something that always occurs when what we are trying to prove is true and never occurs otherwise. In a Bayesian approach, this would mean that the evidence (i.e. analytical results) is always observed when the hypothesis is true, and never when the hypothesis is false. In practice, no evidence is ever as easy to interpret, and time is not a discrete variable that can take two values t1 and t2, but a continuous scale. Therefore, scientists must deal with continuous data and with the additional problem that an observed difference may be due to time (which may be the question), but may also be due to the evidence coming from two different sources. Furthermore, special care must be exercised in checking the reproducibility and reliability of the data.
4. Proposed modelling for ink dating interpretation
Very little thought has been directed towards the interpretation of ink dating evidence. Efforts have essentially been put into the development of methodologies that give precise answers about the age of the ink entry. That is probably the main reason for the controversy about ink dating. In fact, the statement of Evett [27] that exactness is illusory supports the belief that uncertainty is part of the nature of scientific inference. For example, Aginsky [10] proposed threshold values that can be used to differentiate fresh and old inks of a given type (Eqs. (1) and (2)). Three cases were differentiated for the ageing characteristic D representing a portion of solvent ageing curve:
– D > ca. 15% suggests that the questioned writing is fresh, i.e.
it is less than eight months old. If such a result has been obtained for a questioned document dated more than a year before the analysis, the examiner can state with confidence that this document has been backdated.
– D < ca. 10% suggests that the questioned writing is old, i.e. that its age is larger than ca. two months, on condition that the document has been stored under normal environmental conditions.
– If D lies between ca. 10% and ca. 15%, additional samples should be taken to ascertain statistically whether the mean is closer to 10% or 15%; in this case the conclusion on whether the ink in question is fresh or old is made with a certain degree of confidence.
Bügler [22] proposed similar general rules for the dating of ballpoint inks (Eq. (3)). Paper type is stated to have no influence and the method is size independent. However, half of the investigated ballpoint inks were not suitable for age determination by this method. The following rules apply for the ageing characteristic V representing the solvent percent extraction:
– V > 20% leads to the conclusion that the ink under investigation is not older than 3 months.
– V > 15% indicates that the ink is not older than 9 months.
– V > 10% indicates that the ink is not older than 15 months.
– V < 10%: no conclusion is drawn, and it is stated that the method is not applicable in this case.
Fig. 2. Frequency probabilities (histogram scale with 0.05 steps) for the initial concentration [μg/cm] of the solvent phenoxyethanol determined by gas chromatography/mass spectrometry in the inks on paper from 31 blue ballpoint pens [16].
These propositions are based on frequency probabilities related to the observation and experience of the scientists independently of the forensic context. For example, in their experience, when they obtained a value for ink dating evidence above a given threshold, the ink was ‘always’ relatively fresh and they never observed a value above the threshold for an older ink. That does not mean it is not possible. Transposed into the Bayesian framework, the prior odds of the case (probability that ink is fresh divided by probability that ink is old) will be updated with the new evidence and an alternative proposition must be considered. This can be done by comparing the probability of observing the evidence (i.e. analytical results) if the ink entry was ‘fresh’ and the probability of observing the evidence if the ink was ‘old’. It is essential to actually evaluate the evidence in the context of the case, because its value will depend on the formulation of the two hypotheses. For example, the work of Bügler allows the assignment of subjective probabilities for an observed value over 10%. The probability of evidence if the ink entry is younger than 15 months is 0.5, since the method is suitable for about 50% of the ballpoint pens. The probability of evidence if the ink entry is older than 15 months tends to zero, since it was never observed until now. In this case, the evidence provides very strong support for the hypothesis that the ink is younger than 15 months. However, these threshold values do not allow interpretation of observed values under 10% (majority of the cases).
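For comparison with the Bayesian treatment, the threshold rules attributed above to Aginsky and Bügler can be written as two small lookup functions. This is only a sketch: the function names are hypothetical, the approximate ("ca.") thresholds are treated as sharp cut-offs, and values falling exactly on a threshold are handled by assumption because the rules as quoted do not specify them.

```python
from typing import Optional


def aginsky_category(d_percent: float) -> str:
    """Classify the ageing characteristic D (in %) with the quoted thresholds.

    Assumption: the approximate limits of 10% and 15% are taken as exact.
    """
    if d_percent > 15.0:
        return "fresh"          # less than about eight months old
    if d_percent < 10.0:
        return "old"            # older than about two months
    return "inconclusive"       # more samples are needed


def bugler_max_age_months(v_percent: float) -> Optional[int]:
    """Upper bound on the ink age in months from V (in %), or None.

    None means that no conclusion is drawn. Assumption: V equal to 10%
    is treated like V below 10%.
    """
    for threshold, max_age in ((20.0, 3), (15.0, 9), (10.0, 15)):
        if v_percent > threshold:
            return max_age
    return None


for v in (25.0, 17.0, 12.0, 7.0):
    print(v, bugler_max_age_months(v))
```

Such rules return a categorical answer and say nothing for values below the lowest threshold, which is the limitation that motivates assigning probabilities to every possible outcome.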
Since the comprehensive statistical data required to evaluate probative values of ink dating evidence are not available from the works published by Aginsky or Bügler, subjective probabilities were assigned using results from another study [16]. The aim of this exercise was to give an example of how such values might be used in a forensic case if a comprehensive
amount of data was available, and what type of information it could provide. The Bayesian approach is ideally suited for this kind of purpose, since data are not a necessary part to the evaluation of evidence [33]. Subjective probabilities, also called Bayesian probabilities, have been defined as a degree of belief regarding the truth of a statement or event by De Finetti [30]. They are based on knowledge, experience and information, and as a consequence, may vary amongst individuals and depend on background information, but the overall assessment should be broadly the same! As mentioned above, the concentration of solvents in ballpoint pen ink diminishes with time (drying). Phenoxyethanol was chosen because this solvent is found in a high percentage of ballpoint ink formulations. Frequency probabilities were calculated for fresh ink (time t = 0) from the initial concentrations of phenoxyethanol in ink entries from 31 blue ballpoint pens (Fig. 2). The remaining probabilities are subjective and were assigned by considering referenced solvent drying knowledge from gas chromatography/mass spectrometry quantitative analysis. The quantity of phenoxyethanol in an ink stroke decreased very strongly in the first hours after deposition on paper, it then diminished at a slower rate during the next days. After two weeks, the quantities generally found were below 0.1 μg/cm. In the data covering the time beyond a few weeks, measuring errors increased when analyzing the low solvent quantities found in the ink entries. These are precisely the pertinent results to the dating of ink lines, because the time needed for a judge to order an expertise is usually months after the initial question is raised. After four years, only few ballpoint pen entries still contained detectable quantities of phenoxyethanol [20]. The resulting subjective probabilities are summarized in Table 1. 
Table 1
Subjective conditional probabilities proposed for the drying of a solvent from ballpoint ink entries

Subjective conditional probabilities for solvent drying data
Age of entry   c1            c2                c3                  c4        S
(condition)    0.1–1 μg/cm   0.01–0.09 μg/cm   0.001–0.009 μg/cm   0 μg/cm   All possibilities
t = 0          0.91          0.03              0.01                0.05      1
1 day          0.80          0.14              0.01                0.05      1
1 week         0.47          0.47              0.01                0.05      1
1 month        0.05          0.85              0.04                0.06      1
2 months       0.03          0.80              0.11                0.06      1
4 months       0.01          0.80              0.12                0.07      1
6 months       0.01          0.45              0.45                0.09      1
1 year         0.01          0.20              0.50                0.29      1
2 years        0.01          0.05              0.45                0.49      1
3 years        0.01          0.01              0.20                0.78      1
4 years        0.01          0.01              0.10                0.88      1
5 years        0.01          0.01              0.05                0.93      1
10 years       0.01          0.01              0.01                0.97      1

It was assumed that the ink entries were not kept in tightly sealed containers (drying actually occurred) and that temperature did not exceed a certain value (no accelerated ageing). Paper type and exact storage conditions were disregarded when choosing these values. They were divided in four concentration ranges assumed to cover all possible cases. The limit of detection was set at 0.001 μg/cm.

They take into account any type of ballpoint pen ink entries up
to ten years old. It was assumed that the ink entries had not been kept in tightly sealed containers (drying actually occurred) and that temperature did not exceed a certain value (no accelerated ageing). Paper type and exact storage conditions were disregarded when choosing these subjective values. The concentrations were divided into four ranges assumed to cover all possible cases S: c1 = 0.1 to 1 μg/cm; c2 = 0.01 to 0.09 μg/cm; c3 = 0.001 to 0.009 μg/cm and c4 = 0 μg/cm, the set of all possibilities being S = {c1, c2, c3, c4}. The limit of detection was set at 0.001 μg/cm. Logically, one would expect a fresh entry to contain high concentrations of solvent, whereas an older entry would tend to imply low concentrations. A subjective probability close to 0 means that, the forensic scientist believed that, under the given conditions, the concentration ci was very rarely encountered. Thus, if p(c1|t ≥ 4 months) = 0.01 then c1(≥4 months) was believed to be encountered 1 time out of 100. In other words, it was believed that the concentration range c1 was rarely found in an ink older than four months. On the other hand, if p(c2|t = 4 months) = 0.8, then it was believed that c2 (4 months) was encountered 8 out of 10 times. This means that the concentration range c2 was believed to be found in 80% of the 4 months old ink entries. The added probability of S covers all four sets of concentrations, and for a particular time/age condition is equal to 1:
Table 2 The likelihood ratio (LR) is an indication of the strength of the evidence in supporting two hypotheses: hypothesis Hp (entry made 2 months, 1 year or 3 years ago) as compared to hypothesis Hd (entry made at a given anterior time)
$$p(c_1, c_2, c_3, c_4 \mid t) = 1 \qquad (7)$$
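The normalisation condition of Eq. (7) can be checked directly against the values of Table 1. The sketch below transcribes the table into a Python dictionary; the names TABLE_1 and row_sums are illustrative and do not come from the original work.

```python
import math

# Subjective conditional probabilities P(c1..c4 | age), transcribed from Table 1.
TABLE_1 = {
    "t = 0":    (0.91, 0.03, 0.01, 0.05),
    "1 day":    (0.80, 0.14, 0.01, 0.05),
    "1 week":   (0.47, 0.47, 0.01, 0.05),
    "1 month":  (0.05, 0.85, 0.04, 0.06),
    "2 months": (0.03, 0.80, 0.11, 0.06),
    "4 months": (0.01, 0.80, 0.12, 0.07),
    "6 months": (0.01, 0.45, 0.45, 0.09),
    "1 year":   (0.01, 0.20, 0.50, 0.29),
    "2 years":  (0.01, 0.05, 0.45, 0.49),
    "3 years":  (0.01, 0.01, 0.20, 0.78),
    "4 years":  (0.01, 0.01, 0.10, 0.88),
    "5 years":  (0.01, 0.01, 0.05, 0.93),
    "10 years": (0.01, 0.01, 0.01, 0.97),
}


def row_sums(table):
    """Return {age: p(c1|t) + p(c2|t) + p(c3|t) + p(c4|t)} for every age."""
    return {age: sum(probs) for age, probs in table.items()}


for age, total in row_sums(TABLE_1).items():
    status = "ok" if math.isclose(total, 1.0, abs_tol=1e-9) else "CHECK"
    print(f"{age:>9}: {total:.2f} {status}")
```

A check of this kind is worth repeating whenever the subjective probabilities are revised, since an age whose probabilities do not add up to 1 would bias every likelihood ratio derived from it.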
Having subjective conditional probabilities available, it was then possible to compare the probability of observing the evidence ci (concentration in μg·cm⁻¹) given the ink entry considered was made at time t1 (hypothesis of the prosecution Hp: suspected age of an entry, generally younger than the purported age on the document), to the probability of this same evidence given the ink entry was written at a prior time t2 (hypothesis of the defense Hd: claimed age of entry or age written on document). The likelihood ratio for these two propositions is:

$$LR = \frac{P(c_i \mid H_p)}{P(c_i \mid H_d)} \qquad (8)$$

Explicit mention of the background information (I) was omitted for ease of notation. The likelihood ratio is an indication of the strength of the evidence in supporting one or the other of the two hypotheses: hypothesis Hp (entry made at time t1) as compared to hypothesis Hd (entry made prior to that time). Values below 1 support the defense hypothesis Hd, while values above 1 support the prosecution hypothesis Hp. An LR value of 1 means that it is just as probable to observe the evidence if Hp is true as if Hd is true. In these cases the evidence does not alter prior odds (prior odds equal posterior odds). Results in Table 2 represent typical forensic cases. Three prosecution hypotheses Hp are evaluated (ink entry is two months, one year or three years old) against several defense hypotheses Hd (ink entry is up to about nine years older). The larger the value of LR, the more probable is the observation of the analytical results (ci) if Hp is true rather than Hd. On the contrary, the smaller the value of LR, the more probable is the observation of the analytical results (ci) if Hd is true rather than Hp. The LRs became larger when
LR = P(ci|Hp) / P(ci|Hd)

Age of entry   Evidence E
               c1            c2                c3                  c4
               0.1–1 μg/cm   0.01–0.09 μg/cm   0.001–0.009 μg/cm   0 μg/cm

Hp = 2 months
Hd
4 months       0.33          1.00              1.09                1.17
6 months       3.00          1.78              0.24                0.67
1 year         3.00          4.00              0.22                0.21
2 years        3.00          16.00             0.24                0.12
5 years        3.00          80.00             2.20                0.06
10 years       3.00          80.00             11.00               0.06

Hp = 1 year
Hd
2 years        1.00          4.00              1.11                0.59
3 years        1.00          20.00             2.50                0.37
4 years        1.00          20.00             5.00                0.33
5 years        1.00          20.00             10.00               0.31
10 years       1.00          20.00             50.00               0.30

Hp = 3 years
Hd
4 years        1.00          1.00              2.00                0.89
5 years        1.00          1.00              4.00                0.84
10 years       1.00          1.00              20.00               0.80

Values above 1 support the prosecution hypothesis Hp, while values below 1 support the defense hypothesis Hd.
the time differences compared by the hypotheses were larger (e.g. up to 50 for Δ = 9 years, and down to 1.11 for Δ = 1 year). When the LR is close to 1, then the evidence does not bring useful information to alter prior odds in any way. Subjective probabilities may be helpful in rendering a judgment, but are only indicative of the real age of an ink. The forensic scientist must take into account real casework conditions, such as uncertainties about storage conditions, type of paper and ink composition. The subjective conditional probabilities should be continually updated with new experience and information. On the other hand, the alternative propositions and odds must also be carefully chosen. If both hypotheses are believed to be impossible, then another set of hypotheses should be investigated. 5. Hypothetical casework example In June 2006, the tax office requested that Mr X submit a receipt for costs supposed to have taken place in June 2005. One week later, the receipt was received and suspicion arose that it had been backdated and actually written only one week ago (when the evidence receipt was requested). The alleged age of the document by the tax office was therefore one week (hypothesis of the prosecution Hp), while Mr X claimed that
the document was written one year ago (hypothesis of the defense Hd). An investigation was ordered and the scientist immediately performed a quantitative analysis of phenoxyethanol on the ballpoint ink from the document. For the given set of hypotheses, the pre-assessment phase shows that four possible LRs are to be expected, depending on the concentration ci found in the ink on the document (see subjective probabilities in Table 1):
1) If a single value of 0.4 μg/cm was obtained, then the following subjective probabilities could be assigned: p(0.4 μg/cm|Hp) = 0.47 and p(0.4 μg/cm|Hd) = 0.01. The likelihood ratio value of 47 means that it is 47 times more likely to observe the evidence if the ink entry is one week old (Hp) than if it is one year old (Hd). Moreover, the scientist must take into account the possibilities of contamination through other documents (in a notebook or file folder) and storage in a semi-hermetic container (plastic covers) [20]. These storage conditions may stop or quench the ageing, and a higher concentration of phenoxyethanol than expected may be found in older ink entries.
2) If a single value of 0.04 μg/cm was obtained, then the following subjective probabilities could be assigned: p(0.04 μg/cm|Hp) = 0.47 and p(0.04 μg/cm|Hd) = 0.2. The likelihood ratio value of 2.35 means that it is 2.35 times more likely to observe the evidence if the ink entry is one week old (Hp) than if it is one year old (Hd).
3) If a single value of 0.004 μg/cm was obtained, then the following subjective probabilities could be assigned: p(0.004 μg/cm|Hp) = 0.01 and p(0.004 μg/cm|Hd) = 0.5. The likelihood ratio value of 0.02 means that it is 50 times more likely to observe the evidence if the ink entry is one year old (Hd) than if it is one week old (Hp). Phenoxyethanol concentrations vary greatly between ballpoint inks (Fig. 2) and a non-negligible number of ballpoint inks age very quickly on paper [14,22,20].
This means that a lower concentration of phenoxyethanol than expected may be found in fresh ink entries.
4) If a single value of 0 μg/cm was obtained, then the following subjective probabilities could be assigned: p(0 μg/cm|Hp) = 0.05 and p(0 μg/cm|Hd) = 0.29. The likelihood ratio value of 0.17 means that it is about 6 times more likely to observe the evidence if the ink entry is one year old (Hd) than if it is one week old (Hp), with the same caveat as in point 3) above.
In most instances, often for administrative reasons, the case will not be sent to the laboratory, or only much later. If the analysis is then performed two months later, the hypothesis of the prosecution would change from a ‘one week’ to a ‘two months’ old document. If a value of 0.04 μg/cm of ink is obtained, the following subjective probabilities could then be assigned: p(0.04 μg/cm|new Hp) = 0.8 and p(0.04 μg/cm|Hd) = 0.2. The likelihood ratio value of 4 means that it is 4 times more likely to observe the evidence if the ink entry is two months old (Hp) than if it is one year old (Hd). This case shows the importance of the proposed hypotheses. In fact, the circumstances are provisional and if they change then the interpretation must be revised accordingly. It is then the task of the judge or jurors to use the LRs to update the prior odds of the hypotheses based on background information (I). In the example given above, the tax office asked for a missing receipt to verify that the declared expenses had actually been incurred. If Mr X had claimed to have lost the receipt and nevertheless sent it a few days later to the tax office, there might be strong doubts as to the authenticity of the receipt. Thus prior odds would be in favour of the prosecution hypothesis. On the other hand, if the tax office had received the receipt at the same time as the tax form, its suspicions about the authenticity of the document would have been lower. Thus prior odds would likely be in favour of the defence hypothesis. This shows that ink dating evidence is only a part of the whole evidential process and cannot be considered separately: it is highly circumstantial.
6. Discussion
Ballpoint ink dating acceptance by the scientific community and court requirements can be considered by dividing the analytical and interpretation processes according to the following two questions:
6.1. Is the ink dating method reliable?
The important aspects of scientific reliability and validation for an analytical method were summarized early by Horwitz [34]: reproducibility (between-laboratory precision), repeatability (within-laboratory precision), systematic error or bias (accuracy), selectivity and limits of reliable measurements. At the moment, error rates and repeatability are rarely mentioned in the literature and do not appear on published ageing curve representations. Method developments were logically carried out on samples prepared and stored in controlled conditions. However, in order to validate them, blind testing on controlled and realistic samples is needed before they are used in real casework.
Error rates must be available, especially when using single values for interpretation. Provided that enough ink is available, the analysis could be repeated several times in order to obtain a mean value and standard deviation; however, this is rarely possible. Between-laboratory precision further proves to be one of the most difficult aspects encountered with dating methods and, to date, any given ink dating method has only ever been performed by a single laboratory. In the process of using ink dating for forensic purposes, experts should first direct their efforts towards the proper development and validation of methods.
6.2. Is the interpretation of ink dating evidence pertinent?
Interpretation of ink dating evidence deals with elucidating the meaning of the analytical results and answering the questions raised by courts. The general expectation is that a clear answer about the date of the document is possible, an answer easily extracted from the analytical results. For this
particular reason, scientists have endeavoured to develop ink dating methods giving precise answers. Two main drawbacks of this habitual approach can be noted. First, it can be hazardous to consider such answers exclusively, since the interpretation of evidence is highly circumstantial and is open to challenge in the current situation. Secondly, it can be argued that the usefulness of ink evidence might be very limited if one considers only current results.

The proposed Bayesian model demonstrates an appropriate use of ink dating evidence in court with the knowledge currently available. First, it allows for a balanced assessment of the value of ink dating evidence by taking into account two alternative propositions. Then, all ink dating evidence (and not only a value above or below a given threshold) can be considered through the determination of a likelihood ratio that allows prior odds to be converted into posterior odds. In addition, the Bayesian approach enables the combination of objective (i.e. based on data) and subjective probabilities to assist the evaluation of evidence when data are lacking [33]. A further advantage of subjective probabilities is the pre-assessment of the case, where expectations can be turned into probability distributions for the expected weight of evidence, given the circumstances [32].

It is argued that there is still some way to go before the two questions raised here can be answered positively. Data are available, but proper validation still needs further research. As data become more controlled, the Bayesian interpretation proposed here helps determine the value of the available evidence.

7. Conclusion

The research, controversies and data obtained while testing ink ageing have led us to consider how the available analytical and experimental data could best be presented as evidence to the court.
It was soon found that the available data still suffer from an apparent lack of proper validation, even if ageing curves can be demonstrated under controlled laboratory conditions, and that a large body of subjective, circumstantial and experimental knowledge might be available to estimate a reliable age range for ballpoint pen ink lines. Current research results and available data are difficult to interpret when faced with judicial hypotheses, and a Bayesian framework is proposed here to evaluate ink dating methods and evidence. Subjective conditional probabilities were used to illustrate which data are necessary to date ballpoint ink on a document, and how to use them properly for the purposes of developing ink dating methods, pre-assessing the expected weight of the dating evidence and, finally, correctly interpreting the value of the evidence given the prior odds of the case. Subjective probabilities based on preliminary tests for an ageing parameter (e.g. concentration of phenoxyethanol) provide an insight into the potential of dating methods (i.e. the values of LR that can be expected). Method development should preferably aim for LRs that are much greater or much smaller than 1, rather than LRs close to 1. The adequacy of the ageing parameters for dating purposes, and consequently the feasibility of an ink dating method, can be checked using Bayes' theorem early in the process of setting up the necessary resources.
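The role of the LR described above can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the model used in this work: the subjective probabilities for an ageing parameter E are represented by two normal distributions, one under the prosecution proposition Hp and one under the defence proposition Hd, and every number (means, standard deviations, measured value, prior odds) is hypothetical. The function names are likewise illustrative.

```python
from statistics import NormalDist


def likelihood_ratio(e, dist_hp, dist_hd):
    """LR = f(E | Hp) / f(E | Hd) for a measured ageing-parameter value e."""
    return dist_hp.pdf(e) / dist_hd.pdf(e)


def posterior_odds(prior_odds, lr):
    """Odds form of Bayes' theorem: posterior odds = LR x prior odds."""
    return lr * prior_odds


# Hypothetical subjective distributions of the ageing parameter under
# (Hp, Hd), in arbitrary units. These are assumptions, not measured data.
well_separated = (NormalDist(mu=2.0, sigma=1.0), NormalDist(mu=6.0, sigma=1.0))
overlapping = (NormalDist(mu=3.9, sigma=1.0), NormalDist(mu=4.1, sigma=1.0))

measured_value = 2.5  # hypothetical analytical result
prior = 0.5           # hypothetical prior odds Hp : Hd, assigned by the court

for label, (hp, hd) in (("well separated", well_separated),
                        ("overlapping", overlapping)):
    lr = likelihood_ratio(measured_value, hp, hd)
    print(f"{label}: LR = {lr:.3g}, posterior odds = {posterior_odds(prior, lr):.3g}")
```

With these hypothetical inputs, the well-separated pair of distributions gives an LR far from 1, whereas the overlapping pair gives an LR close to 1 that barely changes the prior odds. This is the kind of early check on an ageing parameter that the paragraph above refers to.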
In the view of the authors, the ink dating methods currently used in real casework are not yet reliable. To conform to reliability criteria such as those required by the Daubert decision, it is necessary to determine the error levels and to verify the methods through blind testing. Effort should then focus on collecting the data needed to obtain probabilities that can be used in determining likelihood ratios. The inference of possible sources is an essential criterion to meet, not only for the legal requirements currently discussed in US courts, but also for scientific standards. At least two alternative propositions should be addressed, based on the background information about the case. If no information about storage conditions, paper type and ink composition is available, the scientist must take these additional uncertainties into account. External factors are known to have an essential effect on age determination, and their effects were demonstrated through specific research by one of the authors [16]. Finally, it is imperative to publish the resulting methods comprehensively in order to submit the full process to peer review, to allow inter-laboratory validation and to obtain general acceptance within the scientific community.

It is emphasized that the Bayesian framework presented here allows for a balanced interpretation of the ink evidence and even proposes a verbal scale to translate scientific findings into words for general understanding [25], words that illustrate the strength and the limits of the evidence given its circumstantial nature. Ageing follows complex pathways that need further research to fulfil scientific requirements. As in many forensic cases, it also depends on a number of unknown variables (for example, the environmental conditions to which the evidence was exposed prior to the analysis), and we submit that it should be handled in the same way.
Acknowledgements

The authors would like to thank Magali Bernard from the Institut de Police Scientifique, Law and Criminal Justice Faculty, University of Lausanne, who gave useful advice on the statistical aspects of this work.

References

[1] H.H. Kaufmann, The expert witness. Neither Frye nor Daubert solved the problem: what can be done? Science & Justice 41 (1) (2001) 7–20.
[2] M.J. Saks, The aftermath of Daubert: an evolving jurisprudence of expert evidence, Jurimetrics Journal 40 (2000) 229–241.
[3] B. Robertson, G.A. Vignaux, Interpreting Evidence: Evaluating Forensic Science in the Courtroom, John Wiley & Sons Ltd., Chichester, United Kingdom, 1995.
[4] A. Schwartz, A “dogma of empiricism” revisited: Daubert v. Merrell Dow Pharmaceuticals, Inc. and the need to resurrect the philosophical insight of Frye v. United States, Harvard Journal of Law & Technology 10 (2) (1997) 149–237.
[5] D. Goodstein, “How Science Works”, chapter in the Reference Manual on Scientific Evidence, 2nd edition, LRP Publication, US, 2000.
[6] D.M. Risinger, M.J. Saks, The Daubert/Kumho implications of observer effects in forensic science: hidden problems of expectation and suggestion, California Law Review 90 (1) (2002) 1–56.
[7] A.A. Cantu, R.S. Prough, On the relative ageing of ink: the solvent extraction technique, Journal of Forensic Sciences 32 (5) (1987) 1151–1174.
[8] R.L. Brunelle, H. Lee, Determining the relative age of ballpoint ink using a single-solvent extraction, mass-independent approach, Journal of Forensic Sciences 34 (5) (1989) 1166–1182.
[9] V.N. Aginsky, Determination of the age of ballpoint pen ink by gas and densitometric thin-layer chromatography, Journal of Chromatography A 678 (1994) 119–125.
[10] V.N. Aginsky, Dating and characterizing writing, stamp pad and jet printer inks by gas chromatography/mass spectrometry, International Journal of Forensic Document Examiners 2 (2) (1996) 103–116.
[11] V.N. Aginsky, Measuring ink extractability as a function of age: why the relative ageing approach is unreliable and why it is more correct to measure ink volatile components than dyes, International Journal of Forensic Document Examiners 4 (3) (1998) 214–230.
[12] T. Andermann, R. Neri, Solvent extraction techniques: possibilities for dating ball point pen inks, Journal of Forensic Sciences 4 (3) (1998) 231–239.
[13] T. Hicks Champod, A. Khanmy, P. Margot, Ink ageing: perspectives on standardization, Advances in Forensic Sciences 3, Forensic Criminalistics 1, in: B. Jacob, W. Bonte, W. Huckenbeck, P. Pieper (Eds.), Proceedings of the 13th Meeting of the International Association of Forensic Sciences, Düsseldorf, Germany 1993, Verlag Dr. Köster, Berlin, 1995, pp. 304–309.
[14] S. Lociciro, W. Mazzella, L. Dujourdy, E. Lock, P. Margot, Dynamic of the ageing of ballpoint pen ink, Science & Justice 44 (3) (2004) 165–171.
[15] K. Jahns, Altersbestimmung von Schreibmitteln durch chemische Analyseverfahren, Mannheimer Hefte für Schriftvergleichung, Peter E. Baier, Universität Mannheim, Schmidt/Römhild, 29. Jahrgang, 3/04.
[16] C. Weyermann, Mass Spectrometric Investigation of the Ageing Processes of Ballpoint Ink for the Examination of Questioned Documents, PhD thesis, Justus-Liebig-Universität Giessen, Germany, 2005. http://geb.uni-giessen.de/geb/volltexte/2006/3044/ (last accessed 6 June 2007).
[17] A.A. Cantu, A sketch of analytical methods for document dating part I. The static approach: determining age independent analytical profiles, International Journal of Forensic Document Examiners 1 (1) (1995) 40–51.
[18] A.A. Cantu, A sketch of analytical methods for document dating part II. The dynamic approach: determining age dependent analytical profiles, International Journal of Forensic Document Examiners 2 (3) (1996) 192–208.
[19] C. Weyermann, D. Kirsch, C. Costa Vera, B. Spengler, Forensic investigation of ageing and degradation of ballpoint dyes on paper by LDI- and MALDI-MS analysis, Journal of the American Society for Mass Spectrometry 17 (2006) 297–306.
[20] C. Weyermann, D. Kirsch, C. Costa Vera, B. Spengler, A study of the drying of ballpoint pen ink on paper by GC/MS, Forensic Science International, 2007, available online 9 August 2006.
[21] R.L. Brunelle, E.J. Speckin, Technical report with case studies on the accelerated ageing of ball-point inks, International Journal of Forensic Document Examiners 4 (3) (1998) 240–254.
[22] J. Bügler, Dating of ballpoint pen inks by thermal desorption and gas chromatography-mass spectrometry, Meeting of ASQDE, August 2005, Montreal, Canada.
[23] K. Inman, N. Rudin, Principles and practice of criminalistics: the profession of forensic science, A Volume in the Protocols in Forensic Science Series, CRC Press, Boca Raton, Florida, 2001.
[24] D.A. Rudram, Interpretation of scientific evidence, Science & Justice 36 (3) (1996) 133–138.
[25] F. Taroni, C.G.G. Aitken, Correspondence: interpretation of scientific evidence, Science & Justice 36 (4) (1996) 290–292.
[26] R. Davis, O. Facey, P. Hamer, D. Rudram, Correspondence: interpretation of scientific evidence, Science & Justice 37 (1) (1997) 64–65.
[27] I.W. Evett, Expert evidence and forensic misconceptions of the nature of exact science, Science & Justice 36 (2) (1996) 118–122.
[28] I.W. Evett, Bayesian inference and forensic science: problems and perspectives, The Statistician 36 (1987) 99–105.
[29] R. Cook, I.W. Evett, G. Jackson, P.J. Jones, J.A. Lambert, A model for case assessment and interpretation, Science & Justice 38 (3) (1998) 151–156.
[30] F. Taroni, C.G.G. Aitken, P. Garbolino, De Finetti's subjectivism, the assessment of probabilities and the evaluation of evidence: a commentary for forensic scientists, Science & Justice 41 (3) (2001) 145–150.
[31] C.G.G. Aitken, F. Taroni, Statistics and the evaluation of evidence for forensic scientists, Statistics in Practice, 2nd edition, Wiley, England, 2004.
[32] R. Cook, I.W. Evett, G. Jackson, P.J. Jones, J.A. Lambert, A hierarchy of propositions: deciding which level to address in casework, Science & Justice 38 (4) (1998) 231–239.
[33] F. Taroni, C.G.G. Aitken, Correspondence: interpretation of scientific evidence, Science & Justice 37 (1) (1997) 64–65.
[34] W. Horwitz, Evaluation of analytical methods for regulation of foods and drugs, Analytical Chemistry 52 (1) (1982) 67A–76A.