With DNA registries, it's the end of anonymity as we know it.




Already, according to a new study, 60% of Americans of Northern European descent, the main group that uses genetic genealogy sites, can be identified through these databases, whether or not they have joined one themselves.

The genetic genealogy industry is booming. In recent years, more than 15 million people have offered up their DNA (a cheek swab, some saliva in a test tube) to services such as 23andMe and Ancestry.com to get answers about their heritage. In exchange for a genetic fingerprint, individuals may find a biological parent, long-lost cousins, or even a connection to Oprah or Alexander the Great.

But as these DNA registries grow, it is increasingly difficult for individuals to remain anonymous. According to a study published Thursday in the journal Science, 60% of Americans of Northern European descent, the main group that uses these sites, can be identified through these databases, whether or not they have joined one themselves.

According to the researchers, within the next two or three years, 90% of Americans of European descent will be identifiable. The science-fiction future, in which everyone is known whether they want it or not, is close.

"It's not a distant future, it's a near future," said Yaniv Erlich, lead author of the study. Erlich, a former researcher in the genetics of privacy at Columbia University, is scientific leader of MyHeritage, a website on genetic ancestry.


The science involves a search for third cousins. To identify a person from a DNA sample, an investigator uploads a previously analyzed genetic profile to a database. The goal is to find someone who shares enough DNA with the sample to be a third cousin or closer relative. Most of us have at least 800 people somewhere in the world who fall into this category. As long as one of them is in a database, an experienced investigator can use other publicly available information to build a family tree and determine the identity of the person behind the sample.
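The arithmetic behind that claim can be sketched roughly. The snippet below is a simplified illustration, not the study's actual model: it assumes each of a person's 800 relatives independently has the same chance of appearing in a database. The function names and the `coverage` parameter (the fraction of the relevant population that has a profile on file) are hypothetical, introduced only for this example.

```python
def match_probability(relatives: int, coverage: float) -> float:
    """Chance that at least one of `relatives` people is in the database.

    Assumption (not from the study): each relative is in the database
    independently with probability `coverage`.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    # P(at least one match) = 1 - P(none of the relatives is in the database)
    return 1.0 - (1.0 - coverage) ** relatives


def coverage_needed(relatives: int, target: float) -> float:
    """Database coverage at which match_probability reaches `target`."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be at least 0 and below 1")
    return 1.0 - (1.0 - target) ** (1.0 / relatives)


if __name__ == "__main__":
    # Hypothetical coverage levels, chosen only to show the trend.
    for coverage in (0.0005, 0.001, 0.002, 0.005):
        p = match_probability(800, coverage)
        print(f"coverage {coverage:.2%}: P(at least one relative) = {p:.1%}")
```

Under these assumptions, a database holding only a fraction of a percent of the population is already enough to make a match more likely than not, which is why the numbers climb so quickly as registries grow.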

This technique has been used in recent months to identify more than 15 suspects in murder and sexual assault cases. The breakthroughs began in April with an arrest in the case of the Golden State Killer, who terrorized California with rapes and murders in the 1970s and 1980s. Other successes soon followed. A truck driver in Washington state was accused of killing a Canadian couple in 1987; a Pennsylvania DJ was accused of killing a teacher in 1992.

Revealing results

Observing these developments, Erlich wondered what the chances were of identifying any particular person through the DNA of their cousins in one of these databases.

His analysis is based not on the largest genealogy databases, such as 23andMe and Ancestry, but on two smaller ones: GEDmatch, which has about 1 million profiles, and MyHeritage, which had about 1.5 million at the time of the study. That is because, for legal and logistical reasons, it is difficult to use the largest sites to identify anyone other than customers who send in saliva.

But the smaller sites, designed to help genealogists maximize their chances of finding relatives, are more flexible. GEDmatch allows law enforcement officials to search its database in homicide and sexual assault cases. MyHeritage does not, but it does allow uploads from external labs. With both, it is difficult to know exactly what is being uploaded: grandmother's saliva, blood from a crime scene, a sample from a medical study, or something else.

To determine the chances of correctly identifying an individual from a given DNA sample, Erlich and his colleagues, from Columbia University, the Hebrew University of Jerusalem and the New York Genome Center, analyzed 30 DNA kits randomly selected from the GEDmatch database.

Their results were revealing. The team found that a DNA sample from an American of Northern European descent could be successfully traced to a third cousin or closer relative of its owner in 60% of cases. A comparable analysis on the MyHeritage site yielded similar results. (The analysis focuses on Americans of Northern European descent, since 75% of the users of GEDmatch and other genealogy sites belong to this demographic.)

Some experts raised questions about the study's methodology. Its sample size was small, and it did not take into account that it often takes more than one match to identify a suspect.

CeCe Moore is a genetic genealogist at Parabon, a forensic consulting firm. She said in an email that she feared the Science paper could mask how difficult it is to determine a person's identity; it takes a highly skilled expert to build a family tree from the initial genetic clues.

Nevertheless, she added, the study's conclusions are not news to her and her colleagues. In recent months, Moore has been involved in a dozen homicide and sexual assault cases in which GEDmatch was used to identify suspects. Of the 100 crime scene profiles her company had uploaded to GEDmatch in May, half were obviously solvable, she said, and 20 were "promising."

"I think it's a solid and compelling document," said Graham Coop, a population genetics researcher at the University of California, Davis. In May, in a blog post, Coop explained how fortunate the investigators were in the Golden State Killer case. It leads to a statistical conclusion similar to that of Erlich: the company is not far from being able to identify 90% of people thanks to the DNA of their cousins ​​in genealogical databases.

"It's that moment of, wow, oh, that opens up a lot of possibilities, some good and some more questionable," he said.

In an alarming result, the Science study revealed that a supposedly "anonymized" genetic profile from a set of medical data could be uploaded to GEDmatch and positively identified. This shows that an individual's private health data may not be so private after all.

Erlich urged genetic genealogy companies to consider attaching some sort of cryptographic signature to the genetic profiles they analyze. This would help ensure that those who upload a DNA profile are who they claim to be, and would prevent anyone from abusing the data, for example to find out who took part in a given event.
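The article does not describe how such a signature would work, so the following is only a rough sketch of the general idea. It uses an HMAC from Python's standard library as a stand-in: the testing company tags each profile file with a secret key, and the tag no longer verifies if the file is altered or was never issued by that company. A real deployment would more plausibly use public-key signatures, so that a database could verify a file without holding the company's secret. The function names, the key, and the sample profile text are all hypothetical.

```python
import hashlib
import hmac


def sign_profile(profile_bytes: bytes, provider_key: bytes) -> str:
    """Return a hex tag binding the profile contents to the provider's key.

    Assumption: `provider_key` is a secret held by the testing company.
    """
    return hmac.new(provider_key, profile_bytes, hashlib.sha256).hexdigest()


def verify_profile(profile_bytes: bytes, signature_hex: str, provider_key: bytes) -> bool:
    """Check that the profile is exactly what the provider signed."""
    expected = sign_profile(profile_bytes, provider_key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature_hex)


if __name__ == "__main__":
    key = b"hypothetical-provider-secret"
    profile = b"rs0000001\tAA\nrs0000002\tCT\n"  # made-up profile contents
    tag = sign_profile(profile, key)
    print(verify_profile(profile, tag, key))                    # untouched file
    print(verify_profile(profile + b"rs0000003\tGG\n", tag, key))  # altered file
```

Under this kind of scheme, a database could refuse any upload whose tag does not verify, which would block profiles that did not come straight from a recognized testing company.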

Possibilities and limits

Daniel MacArthur, a genomics researcher at Massachusetts General Hospital, said he supported the cryptographic signature idea but that it does not go far enough. "We live in a world where people are very interested in getting and sharing their genetic data to find out more about themselves," he said. "It's a natural human instinct. But legislative protection is necessary to prevent it from being used for harmful purposes."

The widespread use of genetic genealogy to identify violent criminals this summer has led to speculation about how else it could be used. Invading patients' privacy, tracking down the identity of infiltrators, or searches by law enforcement or immigration agents may strike some as more morally ambiguous than finding a killer.

Ethicists said that greater awareness of the technology's possibilities and limitations was needed, since many people do not realize that a public DNA profile holds not just one person's information but family secrets that connect to hundreds of other people. A brother or sister shares half of your genetic profile. A first cousin shares an eighth. A second cousin, 1/32.
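These sharing fractions come from counting generations. Under the standard textbook pedigree rule, each generation between two relatives and their shared ancestors halves the expected overlap, which gives 1/8 for first cousins, 1/32 for second cousins and 1/128 for third cousins. The short sketch below works through that rule. The function name is hypothetical, and the values are averages: actual sharing between any two relatives varies.

```python
from fractions import Fraction

# Full siblings: expected to share half of their DNA.
SIBLING_SHARING = Fraction(1, 2)


def cousin_sharing(degree: int) -> Fraction:
    """Expected fraction of DNA shared by full cousins of the given degree.

    degree=1 means first cousins, degree=2 second cousins, and so on.
    Each of the two shared ancestors contributes (1/2) ** (2 * degree + 2),
    so the total is (1/2) ** (2 * degree + 1).
    """
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    return Fraction(1, 2 ** (2 * degree + 1))


if __name__ == "__main__":
    print("sibling:", SIBLING_SHARING)
    for degree, name in ((1, "first"), (2, "second"), (3, "third")):
        print(f"{name} cousin:", cousin_sharing(degree))
```

The steep drop-off is why a third-cousin match is only a starting point: the shared fraction is small, so an investigator still needs family-tree work to narrow hundreds of candidates down to one person.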

"By making it real and by making people understand how interconnected our genes are and experienced investigators could use them – with a fairly high success rate – to find second and third cousins ​​or even closer relatives." , emphasizes the power of society. this new technology and really brings out the reality, "said Benjamin Berkman, a bioethics researcher at the National Institutes of Health.
