Errors in forensic DNA testing are still pervasive: false matches and wonky statistics

About twenty years ago I spent a good deal of my time testifying for the defense in criminal cases involving DNA evidence. These were trials in which the prosecution claimed that the defendant’s DNA profile had been found to match crime-scene samples (usually from blood or semen), and in which the prosecution presented “match probabilities”: the supposed chance that a randomly-selected and innocent person would have had DNA that also matched the evidence. (If these probabilities are very low, say one in several million, juries tend to conclude that the suspect is guilty.)

While I favored the responsible use of DNA testing, the prosecution at that time was not being responsible, ergo my involvement. What I testified about, as an unpaid expert witness (I decided that saying I was paid for a case—and the prosecution always asks when you’re on the stand—might make the jury think that I was making money as a “professional witness”) were two issues: match probabilities and lab error rates. Here’s a brief synopsis:

Match probabilities. If the suspect’s own DNA matches that from the crime scene, you can then calculate the chance that a randomly-selected person would match the sample as well. This corresponds to the chance that an innocent person would have been implicated by the DNA evidence. Absent lab errors (see below), these calculations involve population genetics, which was my area of expertise. If, for example, the suspect matches the crime-scene sample at three tested genes, how do you calculate the probability of a random match?

That depends on who you consider to be a “random and innocent” person. Is it the population of Hispanics in America if the suspect is a Hispanic? Probably not, because we don’t know the ethnicity of the perpetrator. It’s thus best to use a series of databases and take the most conservative (highest) probability. Moreover, you can’t just multiply the probabilities for each gene together if gene forms are associated with each other in different groups, as they tend to be.  One of my beefs was that the prosecution would use a database corresponding to the ethnicity of the defendant, and then just multiply the probabilities for each tested gene together; or they would use a variety of databases and select the lowest probability. Neither of these is kosher given the way human populations are genetically structured.
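A minimal sketch of the "use several databases and take the most conservative figure" idea described above. The database names and per-locus genotype frequencies are invented for illustration, and the multiplication step is the naive product rule, which holds only if the tested genes are independent in the population; that is the very assumption questioned above.

```python
from math import prod

# Hypothetical per-locus genotype frequencies for ONE three-gene profile,
# looked up in three hypothetical reference databases. All numbers are made up.
GENOTYPE_FREQS = {
    "database_A": [0.10, 0.05, 0.20],
    "database_B": [0.08, 0.12, 0.25],
    "database_C": [0.15, 0.04, 0.10],
}


def product_rule(freqs):
    """Naive random-match probability: multiply the per-locus frequencies.

    Only valid if the loci are statistically independent in the population.
    """
    return prod(freqs)


def most_conservative(freqs_by_db):
    """Return (database, probability) for the HIGHEST match probability,
    i.e. the figure most favorable to the defendant."""
    per_db = {name: product_rule(f) for name, f in freqs_by_db.items()}
    best = max(per_db, key=per_db.get)
    return best, per_db[best]


db, p = most_conservative(GENOTYPE_FREQS)
print(f"most conservative database: {db}, about 1 in {round(1 / p):,}")
```

With these invented numbers the most conservative figure comes from the second database, at roughly 1 in 417. Picking the lowest probability instead (the third database here) would make the same match look about four times rarer.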

Nowadays, when we can do almost full DNA sequences rather than just matches at a few sites in three genes or so, this problem has been ameliorated. If a match is not perfect, then the suspect is exculpated. But one big problem remains, and it is one about which I testified at length:

Lab error rates.  Labs aren’t perfect, and sometimes two samples whose DNA doesn’t match can be found to match if tubes get mixed up or if there is contamination. (Since genes are amplified thousands of times before sequencing, a small bit of contamination can be magnified.)

This happens more often than you think. When I was testifying (I stopped doing that after the Simpson trial), blind testing of labs gave an error rate of around 2%. That is, about one time in fifty, two non-matching samples sent to a lab to test its prowess would be seen to match based on lab error.

With error rates like this, match probabilities from population genetics become virtually useless. That’s because the overall chance of a false match becomes about equal to the lab error rate. If the rate of an innocent suspect matching a crime sample includes both the random match probability based on the frequency of DNA profiles PLUS the chance that a match would occur from lab error, then the largest probability, lab error, dominates. If the former probability is, for example, one in a million (0.000001) and the latter one in fifty (0.02), then the total “random match” probability is the sum of these, or 0.020001. That’s about 2%. The default match probability is thus not one in a billion or one in a million, even if many genes are used, but simply the probability that a random person will match the crime sample because of lab error. The probability of lab error is invariably higher than the population-genetic probability.
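The arithmetic can be checked with a short sketch. It assumes that a coincidental match and a lab error are independent events; the function name is made up for illustration. The exact combined probability is then marginally smaller than the simple sum, but both are about 2%.

```python
def false_match_probability(random_match_p, lab_error_p):
    """Chance that an innocent person 'matches' by coincidence OR by lab error.

    Assumes the two events are independent (an illustrative assumption).
    """
    return 1 - (1 - random_match_p) * (1 - lab_error_p)


lab_error = 0.02  # one in fifty, the blind-test error rate quoted above

# Shrinking the population-genetic probability barely moves the total:
for rmp in (1e-6, 1e-9):
    exact = false_match_probability(rmp, lab_error)
    approx = rmp + lab_error  # the simple sum used in the text
    print(f"random match {rmp:.0e}: exact {exact:.8f}, sum {approx:.8f}")
```

Going from one in a million to one in a billion changes the combined figure only in the sixth decimal place, which is why the lab error rate is the number that matters.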

This is all common sense, but prosecutors hated my testimony, because it made their evidence look a lot less incriminating than they claimed it was. So they tried to get around it, saying that the population-genetics calculation and the error rate calculations were “apples and oranges” and couldn’t be combined. (As I discovered from my courtroom experience, the prosecution is often less interested in presenting an honest case than in securing a conviction.) They also used irrelevant arguments that might appeal to non-scientists, often saying that my testimony was unreliable because it involved humans but my research was on fruit flies. (Both species, of course, have genes!)

One way I suggested to ameliorate the lab error rate was to label the samples blindly and to have the DNA tested in at least two or three labs independently. If all of them matched, the chances of error causing this would be reduced. (For three labs it would have been 0.02 X 0.02 X 0.02, or 8 in a million—comparable to some population-genetic calculations). But at that time the prosecution didn’t do this, and I don’t know if they do it now.
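A quick sketch of the multi-lab calculation, assuming each lab errs independently at the same 2% rate (independence would fail if, say, the sample were contaminated before being split among labs). The function name is hypothetical.

```python
def all_labs_wrong(lab_error_p, n_labs):
    """Probability that every one of n independent labs reports a false match."""
    return lab_error_p ** n_labs


for n in (1, 2, 3):
    p = all_labs_wrong(0.02, n)
    print(f"{n} lab(s): {p:.6f} (about 1 in {round(1 / p):,})")
```

For three labs this reproduces the 8-in-a-million figure above, or about 1 in 125,000.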

These considerations may seem simple to you, but juries are composed of a sample of voters, most of whom don’t even know what DNA is. To try to educate them about error rates, population genetics, and probabilities was a daunting task, and I often spent several days on the stand. Even then the jury was often baffled, as I suspect it was in the Simpson case.

While the population-genetic calculations have been improved by more extensive DNA analysis, the problem of lab errors remains, as shown in this new article from the New York Times. It’s by Greg Hampikian, a professor of biology at Boise State University, one of whose concerns is forensic DNA (he has a joint appointment in Criminal Justice).

Outside testing of forensic DNA labs has shown that there’s still a very large probability of lab error, and that error comes from two sources. (The article cites “an alarming new study of crime laboratories published this summer”, but I can’t find it and it isn’t cited.) I quote:

Researchers from the National Institute of Standards and Technology gave the same DNA mixture to about 105 American crime laboratories and three Canadian labs and asked them to compare it with DNA from three suspects from a mock bank robbery.

The first two suspects’ DNA was part of the mixture, and most labs correctly matched their DNA to the evidence. However, 74 labs wrongly said the sample included DNA evidence from the third suspect, an “innocent person” who should have been cleared of the hypothetical felony.

The test results are troubling, especially since errors also occur in actual casework.

In other words, an innocent person was deemed a match nearly 70% of the time (74 of 108 labs) due to lab error. This involves two types of mistakes: switching of tubes, and the newer possibility that DNA tests are now so sensitive that DNA from completely innocent people, present in crime-scene samples at low concentrations, is still detectable and thus judged “culpable”.

Tube swaps are easy to understand. But some laboratory errors are far more difficult to detect. For example, it’s hard to interpret DNA mixtures from three or more people. As DNA testing has become more sensitive, most laboratories are now able to produce profiles from anyone who may have lightly touched an object. The result is that DNA mixtures have become more common, making up about 15 percent of all evidence samples.

Moreover, there’s still the problem of different labs calculating different match probabilities, probably because they use different population-genetic calculations (my emphasis):

One shocking result from the new N.I.S.T. study is that labs analyzing the same evidence calculated vastly different statistics. Among the 108 crime labs in the study, the match statistics varied over 100 trillion-fold. That’s like the difference between soda change and the United States’ gross domestic product. These statistics are important because they are used by juries to consider whether a DNA match is just coincidence.

One would think that the data in the new paper (and again, I can’t find it) would make the prosecution think twice about how it presents data. But even the authors of that paper larded it with disclaimers, and the journal took four years to get the paper out, meaning that its results didn’t affect criminal cases over that period. Here’s Hampikian’s angry but justifiable complaint:

While this lapse in publication is troubling, more disturbing is that the authors try to mute the impact of their own excellent work. Neither the paper’s title nor the abstract mention the shocking findings. And the paper contains an amazing number of disclaimers.

In fact, the conclusion begins with a stark disclaimer apparently intended to block courtroom use:

The results described in this article provide only a brief snapshot of DNA mixture interpretation as practiced by participating laboratories in 2005 and 2013. Any overall performance assessment is limited to participating laboratories addressing specific questions with provided data based on their knowledge at the time. Given the adversarial nature of the legal system, and the possibility that some might attempt to misuse this article in legal arguments, we wish to emphasize that variation observed in DNA mixture interpretation cannot support any broad claims about “poor performance” across all laboratories involving all DNA mixtures examined in the past.

People serving time behind bars based on shoddy DNA methods may disagree. It is uncomfortable to read the study’s authors praising labs for their careful work when they get things right, but offering sophomoric excuses for them when they get things wrong. Scientists in crime labs need clear feedback to change entrenched, error-prone methods, and they should be strongly encouraged to re-examine old cases where such methods were used.

That disclaimer is absolutely unconscionable. ANY participating lab must be blind tested, and the results of that testing presented in the courtroom. There is no other way to ensure a fair presentation of evidence.

I’ve been out of this game for some time, so I wasn’t aware of this and had assumed that the lab error issue had been corrected. It hasn’t.

And those errors are important. When DNA testing exculpates a suspect, it’s likely not due to lab error (though it could be). But when the testing implicates a suspect, one must be very scrupulous to ensure that the match isn’t an error and, if it isn’t, that the match statistics are presented fairly. I agree with the old dictum that it’s better (and, for DNA evidence, also less likely!) to let a hundred guilty people walk free than to jail one innocent person.

Hampikian suggests some fixes for correcting errors, but they aren’t perfect. Blind testing of samples and use of multiple labs remain two essential ways to ensure that the innocent don’t get jailed.


  1. Draden
    Posted September 23, 2018 at 9:51 am | Permalink

    Nice article. Link is in the NYT online article.

  2. ThyroidPlanet
    Posted September 23, 2018 at 10:09 am | Permalink


  3. Randall Schenck
    Posted September 23, 2018 at 10:10 am | Permalink

    One has to wonder why the legal system cannot at least follow the suggestions from Hampikian. Maybe the same reason the Supreme Court legal system we are watching today is no better than it was 30 years ago.

  4. mikeyc
    Posted September 23, 2018 at 10:45 am | Permalink

    I went to Gradual School with Greg. Great guy. We worked in the same lab at the University of Connecticut. A note; his last name is actually spelled “Hampikian”.

    • keith
      Posted September 23, 2018 at 8:37 pm | Permalink

      I don’t know if that was intentional or a typo, but I’m definitely going to start referring to graduate school as gradual school.

  5. W.Benson
    Posted September 23, 2018 at 10:57 am | Permalink

    It is time for the Genetics Society of America and the American Society of Human Genetics to get involved by recommending proper protocols, procedures and interpretations of DNA evidence, in the name of justice.

    • Posted September 23, 2018 at 12:01 pm | Permalink

      The National Academy of Sciences could do it, too: in fact, their task is to help advise the government on scientific issues.

  6. DrBrydon
    Posted September 23, 2018 at 11:05 am | Permalink

    On a related note, did you see the story this week or last about the fellow who sent his dog’s DNA in to a consumer DNA testing company (in Canada)?

    The detailed analysis claimed 12 percent came from the Abenaki tribe and 8 percent from the Mohawk people.

  7. Ken Kukec
    Posted September 23, 2018 at 11:07 am | Permalink

    I’ve done one full-fledged DNA trial, in a capital murder case in Broward County, FL, about 20 years ago. Preparing for trial, I read everything on DNA I could get my hands on, consulted with experts, and developed something of a “bathtub” expertise on the topic. (You know how when you get out of the tub and pull the plug and all the water runs down the drain? Like that.)

    The police had found a shirt with a stain that the prosecution contended came from the murder victim’s blood in an abandoned house where my client had been squatting. (My client was a PhD and former college professor who had gotten involved with a hooker during a trip to Vegas, developed a wicked-bad crack habit, and eventually wound up on the streets — so let that be a lesson for you academics, dear perfesser: stay off The Pipe. 🙂 )

    The blood on the shirt was somewhat degraded, so the lab could match just a few loci, and the odds of a random match were calculated (IIRC) at 1 in 12,000, or something similar. I was able to demonstrate at trial that the match rates were actually higher for people of Bahamian ancestry and that, although the victim wasn’t Bahamian, many people living in the neighborhood where the bloody shirt was found were. (I also had a back-up argument that other people had had access to the abandoned house, so the bloody shirt might not have been left there by my client.)

    I cross-examined the DNA expert in that case, and I gotta say, going for “fruit fly”-style questions would’ve been far down among my priorities. But, then, rare is the prosecutor who’s a skilled cross-examiner. Not really their fault, though; they don’t get nearly as much practice, inasmuch as most of us defense counsel decline to put on voluminous defense cases.

    • mikeyc
      Posted September 23, 2018 at 11:35 am | Permalink

      Come on, Ken, don’t leave us hanging….was your client acquitted?

      • Ken Kukec
        Posted September 23, 2018 at 2:46 pm | Permalink

        Yeah, we walked him. Credit due my co-counsel. I signed on for the DNA and other expert witnesses (and would’ve handled the death-penalty phase, had the client been convicted), but my partner took care of the rest of the case. Did a great job, too, dismantling the prosecution’s circumstantial case and defusing a cryptic statement the defendant had made at the time of his arrest that the prosecution characterized as an admission.

        Last I heard from the client, he was clean & sober and working as a substance-abuse counselor. Not all cases have such a happy ending.

    • Randall Schenck
      Posted September 23, 2018 at 11:43 am | Permalink

      I was on the jury of one rape/murder case 18 or 19 years ago and although there was some DNA information given at trial I don’t think it was much and was not really part of the decision process by the jury. Identity of the person was not in question so it was an easy case as I recall.

  8. Kevin
    Posted September 23, 2018 at 11:42 am | Permalink

    We Brits have had quite a serious problem in this direction recently:

    These are blood and urine tests mostly relating to drink/drug driving, but a few concerning rape/murder, family law (violence/paternity?).

    This was due to malicious tampering with samples/controls by employees.

    The issues concerned here are also more complicated than being simply technical.

    Shortly before this scandal blew up, the government dismantled the Forensic Science Service and removed the role of Forensic Science Regulator:

    This is in part an ideological/political issue: many government MPs (e.g. Owen Paterson and the crime prevention minister at the time, James Brokenshire) favour shifting services to the private sector. It may at times be cheaper, but there is often a lack of transparency and accountability, and profit incentivises corner cutting.

    Owen Paterson was previously the Secretary for Northern Ireland: this means that he may be responsible for Police and security services in Northern Ireland.
    He has since passed onto the payroll of Randox, the Northern Ireland biological testing company that was supplying forensic testing services to the Manchester Police at the time of the current scandal (several of their employees have been charged pending criminal prosecution).

    Paterson earns an estimated £100,000 per year from Randox:
    This was stated as for 8hrs per month, working out at some £500 per hour

    It is interesting to note that an ex-Minister with responsibility for Police is employed by a private company supplying services to the Police, and that this company is now under investigation concerning criminal tampering with test procedures going back to their start of service (2012).

    Methods used for cross-checking/calibrating of results etc include the use of reference sample standards.
    Ironically, it is these very reference samples that were tampered with.

    I also worked for Randox for eighteen months prior to 2013 and left on principle after an employee made perjurious statements on oath and in writing (affidavit) in High Court in favour of the company and against the public interest. They have expensive lawyers.
    In Britain there is a Public Disclosure Act, but in this case it proved almost useless.

    Other problems which I have seen which may contribute to unreliability of laboratory data include:
    unreliable software: most modern machines have an onboard microprocessor/PC and the software is often unstable, badly written, and contains known but unresolvable bugs
    unreliable hardware: the machine (robotics, barcode reader, sampling) does not always behave consistently.
    Some machines are simply too old and were never very reliable throughout their whole history, but the manufacturer will continue selling them.
    company culture: tight deadlines and budgets, and management decisions, mean that products are knowingly sold even though they are defective. If there is a bad working environment, employees may become vindictive.

    • Mike
      Posted September 24, 2018 at 8:09 am | Permalink

      Corruption in Politics. The Tories are obsessed with Privatisation and not one of their privatised Services is fit for purpose. So much for business efficiency; it’s a bloody Myth.

  9. Bruce Lilly
    Posted September 23, 2018 at 11:50 am | Permalink

    NIST DNA study appears to be
    That’s from a Google search, starting with NIST DNA and accepting a suggestion for NIST DNA mixture interpretation. The link above was the top result.

    Note that the link to the New York Times (text, not image) is to a Wikipedia page for Heritage Day (South Africa).

    • keith
      Posted September 23, 2018 at 8:40 pm | Permalink

      That’s the correct study; the article at the NYT has the hyperlink. Possibly it wasn’t linked at the time Jerry read it.

  10. rickflick
    Posted September 23, 2018 at 12:01 pm | Permalink

    I had been under the impression that DNA testing was nearly foolproof. 99% accurate. The truth is rather scary.

    • DrBrydon
      Posted September 23, 2018 at 12:29 pm | Permalink

      Yes, TV crime dramas are particularly bad at how they depict DNA evidence. Even fingerprints are treated as a yes/no match question.

  11. Posted September 23, 2018 at 12:04 pm | Permalink

    In the usual formulation of the dictum, known as Blackstone’s formulation, the ratio of guilty to innocent is 10 to one. Increase Mather, a famous Puritan minister in the early days of the Massachusetts colony, criticized the use of “spectral evidence” during the Salem witch trials by saying:

    “It were better that Ten Suspected Witches should escape, than that one Innocent Person should be Condemned.”

    (His overall record on the witch trials was mixed, though, as he believed in witches, and thought they should be brought to trial.)

  12. Kim McKellar
    Posted September 23, 2018 at 12:29 pm | Permalink

    I frequently appeared on-air (Fox News Channel out of NYC and CNN out of DC) as a criminal defense attorney. My father was a practicing physician at the time, with a work history as a lab tech and an M.S. in Microbiology. We’d had many discussions about the way DNA testing was being used in courts. I can remember bringing up the point of the fallibility of DNA testing to identify and convict. As soon as the camera cut to a commercial, the show’s host berated the booker for failing to prepare a guest. I was never asked back on that particular show. Shortly after, the show was cancelled. The DNA mantra has been that it’s an exact, infallible science for identification.

  13. Posted September 23, 2018 at 1:04 pm | Permalink

    Were you a forensics?

    • Michael Fisher
      Posted September 23, 2018 at 3:42 pm | Permalink

      No he wasn’t Flo – purely an evolutionist giving some of his time to the court system – see Jerry’s “research Interests” top right of this page

  14. Ken Kukec
    Posted September 23, 2018 at 3:01 pm | Permalink

    When I was testifying (I stopped doing that after the Simpson trial) …

    Was there something specific (besides the obvious) about the Simpson trial that put you off this kind of work?

    I thought the scientific end of the DNA evidence was the one aspect of the case competently handled (by Barry Scheck and Peter Neufeld). There was plenty of shame to go around for all the other participants, particularly for the miserable case put on by the LA district attorney’s office.

  15. Trent McBride
    Posted September 23, 2018 at 4:16 pm | Permalink

    I’m a pathologist, and the degree of regulation in medical labs is significant (routine semi-random inspections, truly random inspections in cases of reported complaints, mandatory proficiency testing of known samples). Can anyone enlighten me on the degree to which forensic and police labs are subject to scrutiny by third-party audit and/or are tested on “known unknowns” as in those described in the study?

    I think these types of labs should be more strictly scrutinized and it is not even close, but I suspect they are not. While medical lab errors get a lot of focus, the truth is they happen not infrequently, but they happen in a context where they can be disbelieved by Bayesian reasoning, easily rechecked from additional samples, or are just ultimately clinically insignificant. But errors of legal consequence don’t as often fit this profile.

    So, anybody in these fields want to weigh in?

  16. Posted September 23, 2018 at 4:31 pm | Permalink

    Kudos to Jerry for doing this public service. I hope the juries actually managed to understand at least the basic points. Explaining these things to juries is probably harder than explaining to U. Chicago undergrads.

  17. keith
    Posted September 23, 2018 at 8:55 pm | Permalink

    Thank you, Jerry, for your service to justice and to public education.

    Radley Balko is an important journalist on the criminal justice/forensic science beat currently at the Washington Post. He’s worth reading.
