Should Police Detectives Have Total Access to Public Genetic Databases?

A crime scene photograph.

(© prathaan/Fotolia)

This past April, an alleged serial rapist and murderer, who had remained unidentified for over 40 years, was located by comparing a crime scene DNA profile to a public genetic genealogy database designed to identify biological relatives and reconstruct family trees. The so-called "Golden State Killer" had not placed his own profile in the database.

Instead, a number of his distant genetic cousins had, resulting in partial matches between themselves and the forensic profile. Investigators then traced the shared heritage of the relatives to great-great-great-grandparents and, using these connections as well as other public records, narrowed their search to just a handful of individuals, one of whom was found to be an exact genetic match to the crime scene sample.

Forensic use of genetic genealogy data is possible thanks to widening public participation in direct-to-consumer recreational genetic testing. The Federal Bureau of Investigation maintains a national forensic genetic database (which currently contains over 16 million unique profiles, over-representing individuals of non-European ancestry); each profile holds genetic information from only 13 to 20 variable gene regions, just enough to identify a suspect. However, since this database and related forensic databases were established, the nature of genetic profiling has significantly changed: direct-to-consumer genetic tests routinely use whole genome scans involving simultaneous analysis of hundreds of thousands of variants.

With such comprehensive genetic information, it becomes possible to discern more distant genetic relatives. Thus, even though public DNA collections are smaller than most law enforcement databases, the potential to connect a crime scene sample to biological relatives is enhanced. The successful use of one genealogy database (GEDMatch) in the Golden State Killer (GSK) case demonstrates the power of the approach, so much so that the genetic profiles of over 100 similar cold cases are now being run through the same resource. Indeed, in the two months since the GSK case was first reported, five other cold cases have been solved using similar methods.
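
To put a rough number on what "distant" means here, a back-of-the-envelope calculation may help. The sketch below is not drawn from the GSK investigation. It assumes full cousins descended from a single shared ancestral couple, and it uses the standard rule of thumb that each generation halves the expected genetic contribution of an ancestor. The function name expected_shared_fraction is purely illustrative.

```python
from fractions import Fraction


def expected_shared_fraction(cousin_degree: int) -> Fraction:
    """Expected fraction of autosomal DNA shared by full cousins.

    cousin_degree 1 = first cousins (shared grandparents),
    cousin_degree 4 = fourth cousins (shared great-great-great-grandparents).
    This is an average under simplifying assumptions, not a measured value.
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be >= 1")
    # Each cousin sits (degree + 1) generations below the shared couple,
    # so a path through one shared ancestor crosses 2 * (degree + 1) meioses.
    meioses = 2 * (cousin_degree + 1)
    # The couple consists of two shared ancestors, each contributing one path.
    return 2 * Fraction(1, 2) ** meioses


for degree in range(1, 5):
    frac = expected_shared_fraction(degree)
    print(f"cousin degree {degree}: {frac} (about {float(frac):.3%})")
```

Under these assumptions, fourth cousins, who share great-great-great-grandparents, are expected to have 1/512 of their DNA in common, or about 0.2 percent. That figure is only an average, and actual sharing varies widely from pair to pair. It may help explain why a scan of hundreds of thousands of variants can surface relationships that a profile of 13 to 20 regions is not designed to detect.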

Autonomy in the Genomic Age

While few would disagree with the importance of finally bringing to justice those who commit serious violent offenses, this new forensic genetic application has sparked broad discussion of privacy-related and ethical concerns. Before, the main genetic databases accessible to the police were those containing the profiles of accused or convicted criminals, but now the DNA of many more "innocent bystanders," across multiple generations, is in play.

The genetic services that provide a venue for data sharing typically warn participants that their information can be used for purposes beyond those they intend, but there is no legal prohibition on the use of crowd-sourced public collections for forensic investigation. Some services, such as GEDMatch, now explicitly welcome possible law enforcement use.

The implication is that consumers must choose for themselves whether they are willing to bring their genetic information into the public sphere. Many have no problem doing so, seeing value in law enforcement access to such data. But the decisions of individuals to contribute their own genetic information inadvertently expose many others across their family tree who may not be aware of or interested in their genetic relationships going public.

As one well-known statistical geneticist who predicted forensic uses of public genetic data noted: "You are a beacon who illuminates 300 people around you." By the same token, 300 people, most of whom you do not know and have probably never met, can illuminate your genetic information; indeed a recent analysis has suggested that most in the U.S. are identifiable in this way. There is nothing that you can do about it, no way to opt out. Thus, police interaction with such databases must be addressed as a public policy issue, not left to the informed consent of individual consumers.

When Consent Will Not Suffice

For those concerned by the broader implications of such practices, the simplest solution might be to discourage open access sharing of detailed genetic information. But let's say that we are willing to continue to allow those with an interest in genealogy to make their data readily searchable. What safeguards should we implement to ensure that the family members who don't want to opt in, or who don't have the ability to make that choice, remain unharmed? Their autonomy counts, too.

We might consider regulation similar to the kind that limits law enforcement use of forensic genetic databases of convicted and arrested individuals. For example, in California, familial searches can only be performed using the database of convicted individuals in cases of serious crimes with public safety implications where all other investigatory methods have been exhausted, and where single-source high-quality DNA is available for analysis. Further, California policy separates the genealogical investigative team from local detectives, so as to minimize the impact of incidental findings (such as unexpected non-paternity).

No such regulations currently govern law enforcement searches of public genealogical databases, and we know relatively little about the specifics of the GSK investigation. We do not know the methods used to infer genetic relationships, or their likelihood of mistakenly suggesting a relationship where none exists. Nor do we know the level of genetic identity considered relevant for subsequent follow-up. It is also unclear how law enforcement investigators combined the genetic information they received with other public records data. Together, these unknowns leave room for an unknown degree of investigation into an unknown number of individuals.

Why This Matters

What has been revealed is that the GSK search resulted in the identification of 10 to 20 potential distant genetic relatives, which led to the investigation of 25 different family trees, 24 of which did not contain the alleged serial rapist and murderer. While some sources describe a pool of 100 possible male suspects identified from this exercise, others imply that the total number of relatives encompassed by the investigation was far larger. One account, for example, suggests that there were roughly 1,000 family members in just the one branch of the genealogy that included the alleged perpetrator. Importantly, the individual apprehended was not the first, or even second, but the third person subjected to enhanced police scrutiny: reports describe at least two false leads, including one where a warrant was issued to obtain a DNA sample.

These details, many of which only came to light after intense press coverage, raise a host of concerns about the methods employed and the degree to which they exposed otherwise innocent individuals to harms associated with unjustified privacy intrusions. Only with greater transparency and oversight will we be able to ensure that the interests of people curious about their family tree do not unfairly impinge on those of their mostly law-abiding near and distant genetic relatives.

Stephanie Fullerton and Rori Rohlfs

Stephanie Fullerton is Associate Professor of Bioethics and Humanities at the University of Washington in Seattle. She received a DPhil in Human Population Genetics from the University of Oxford and later re-trained in Ethical, Legal, and Social Implications (ELSI) research with a fellowship from the National Institutes of Health. Dr. Fullerton’s work focuses on the ethical and societal implications of genomic research and its equitable and safe translation for clinical and public health benefit.

With her background in statistical genetics and molecular evolution, Rori Rohlfs is currently an Assistant Professor of Biology at San Francisco State University. In addition to researching the evolution of gene expression, and women’s contribution to science, some of Rori’s work focuses on forensic genetics: estimating error rates of familial searching, and investigating how well statistical frameworks used in forensic genetics describe human genetic variation.

Dr. Jha discusses Covid vaccine passports, how supply and demand of the vaccines is about to shift, the AstraZeneca situation, what's new with kids, herd immunity, and more.

Photo of sticker by Marisol Benitez on Unsplash; Jha photo by Brown University.
Making Sense of Science features interviews with leading medical and scientific experts about the latest developments and the big ethical and societal questions they raise. This monthly podcast is hosted by journalist Kira Peikoff, founding editor of the award-winning science outlet

Listen to the whole episode: "Why Dr. Ashish Jha Expects a Good Summer"

Dr. Ashish Jha, dean of public health at Brown University, discusses the latest developments around the Covid-19 vaccines, including supply and demand, herd immunity, kids, vaccine passports, and why he expects the summer to look very good.

Kira Peikoff
Kira Peikoff is a journalist whose work has appeared in The New York Times, Newsweek, Nautilus, Popular Mechanics, The New York Academy of Sciences, and other outlets. She is also the author of four suspense novels that explore controversial issues arising from scientific innovation: Living Proof, No Time to Die, Die Again Tomorrow, and Mother Knows Best. Peikoff holds a B.A. in Journalism from New York University and an M.S. in Bioethics from Columbia University. She lives in New Jersey with her husband and son.

The Cocoanut Grove fire in Boston in 1942 tragically claimed 490 lives, but was the catalyst for several important medical advances.

Boston Public Library

On the evening of November 28, 1942, more than 1,000 revelers from the Boston College-Holy Cross football game jammed into the Cocoanut Grove, Boston's oldest nightclub. When a spark from faulty wiring accidentally ignited an artificial palm tree, the packed nightspot, which was only designed to accommodate about 500 people, was quickly engulfed in flames. In the ensuing panic, hundreds of people were trapped inside, with most exit doors locked. Bodies piled up by the only open entrance, jamming the exits, and 490 people ultimately died in the worst fire in the country in forty years.

"People couldn't get out," says Dr. Kenneth Marshall, a retired plastic surgeon in Boston and president of the Cocoanut Grove Memorial Committee. "It was a tragedy of mammoth proportions."

Within half an hour of the start of the blaze, the Red Cross mobilized more than five hundred volunteers in what one newspaper called a "Rehearsal for Possible Blitz." The mayor of Boston imposed martial law. More than 300 victims, many of whom subsequently died, were taken to Boston City Hospital in one hour, averaging one victim every eleven seconds, while Massachusetts General Hospital admitted 114 victims in two hours. In the hospitals, 220 victims clung precariously to life, in agonizing pain from massive burns, their bodies ravaged by infection.

Linda Marsa
Linda Marsa is a contributing editor at Discover, a former Los Angeles Times reporter and author of Fevered: Why a Hotter Planet Will Harm Our Health and How We Can Save Ourselves (Rodale, 2013), which the New York Times called “gripping to read.” Her work has been anthologized in The Best American Science Writing, and she has written for numerous publications, including Newsweek, U.S. News & World Report, Nautilus, Men’s Journal, Playboy, Pacific Standard and Aeon.