How Accurate Are Online DNA Tests?
The age of consumer genomics has arrived. Nowadays you can send a vial of your spit in the mail and pay to see how your unique genetic code relates to all manner of human activity—from sports to certain diets to skin cream to a preference for fine wines, even to dating. The most widespread and popular companies in this market analyze ancestry, and the biggest of these are 23andMe and AncestryDNA, both with more than five million users in their databases. These numbers dwarf the numbers of human genomes in scientific databases. Genetic genealogy is big business, and has gone mainstream. But how accurate are these tests—truly?
First, a bit of genetics 101. DNA is the code in your cells. It is the richest but also most complex treasure trove of information that we’ve ever attempted to understand. Three billion individual letters of DNA, roughly, organized into 23 pairs of chromosomes—although one of those pairs is not a pair half the time (men are XY, women are XX). The DNA is arranged in around 20,000 genes (even though debate remains about what the definition of a gene actually is). And rather than genes, almost all of your DNA—97 percent—is a smorgasbord of control regions, scaffolding and huge chunks of repeated sections. Some of it is just garbage, left over from billions of years of evolution.
Modern genetics has unveiled a picture of immense complexity, one that we don’t fully understand—although we are certainly a long way from Mendel and his pea experiments, which first identified the units of inheritance we know as genes. Throughout the course of the 20th century we gained a firm grasp of the basics of biological inheritance: how genes are passed from one generation to another and how they encode the proteins that all life is built of, or by. In the 1980s we identified genes that had mutated, making faulty proteins, which could cause terrible diseases such as cystic fibrosis or muscular dystrophy, for example.
By 2003, the Human Genome Project had delivered the human DNA sequence in its entirety. One of the most important by-products of that endeavor was the advent of technology that allowed us to read DNA at unprecedented speed and at ever-decreasing cost. We can now pump out the genomes of hundreds of thousands of people for peanuts, and with that data comes greater and greater insight into the profound questions of inheritance, evolution and disease. There’s effectively infinite variation in human genomes, and scrutinizing our DNA helps us to understand what makes us human as a species and as individuals.
With the plummeting costs of gene sequencing came commercial interests. All of a sudden any company could set up shop, and in exchange for some cash and a vial of saliva, could extract your DNA from the cells in your mouth and sequence your genome. Alongside the behemoths 23andMe and AncestryDNA, dozens of companies have done just that.
Two potential issues arise when we ask how accurate their results are. The first is somewhat trivial: Has the sequencing been done well? In critiquing this business, it seems fair to assume the data generated is accurate. But there have been some bizarre cases of failure, such as the company that failed to notice that a DNA sample came not from a human but from a dog. One recent analysis found that 40 percent of variants associated with specific diseases from “direct to consumer” (DTC) genetic tests were false positives when the raw data was reanalyzed.
Assuming the tests are done accurately, some discrepancies can still arise from differences in the companies’ DNA databases. Almost no DTC genetic test sequences your entire genome; instead, these tests look at positions in your DNA that are known to be of interest. When I was tested by 23andMe, they proclaimed I do not carry a version of a gene that is associated strongly with red hair. Another ancestry company said I did. This merely reflects the fact that the two companies were looking at different variants of the gene that codes for ginger hair.
If we assume the data generated is accurate, then the second question that arises is on the interpretation. And this is where it gets murky. Many of the positions of interest in your DNA are determined by experiments known as Genome Wide Association Studies, or GWAS (pronounced gee-woz). Take a bunch of people, as many as possible, that have a shared characteristic. This could be a disease, like cystic fibrosis (CF) or a normal trait, say, red hair. When you sequence all their genes, you look out for individual places in their DNA that are more similar within the test group than in another population. For CF, you would see a big spike in chromosome 7 because the majority of cases of CF are caused by a mutation in one gene. For redheads, you’d see 16 or 17 spikes very close to one another, because there are multiple variants in the same gene that all bestow ginger locks. But for complex traits like taste or ones relating to diet or exercise, dozens of variants will emerge, and all of them only offer a probability of a predisposition toward a certain behavior as a result of your DNA, as measured in a population. This even applies to something as seemingly straightforward as eye color: A gene variant that is associated with blue eyes is still only a probability that you will have blue eyes, and it is perfectly possible to have two blue-eyed genes and not have blue eyes.
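The "spike" idea can be made concrete with a toy calculation. What follows is only a minimal sketch, not how any particular company or study does it: for each position, it compares how often a variant shows up in the group with the trait versus a comparison group, using a standard chi-square statistic for a 2x2 table. The position names and all the counts are invented for illustration, and a real study would test hundreds of thousands of positions and correct for that.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so there is nothing to compare
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical allele counts, made up for this example only.
# position -> (variant in trait group, other in trait group,
#              variant in comparison group, other in comparison group)
toy_counts = {
    "pos_A": (52, 148, 50, 150),
    "pos_B": (120, 80, 45, 155),
    "pos_C": (30, 170, 33, 167),
}

# A larger statistic means the variant is more unevenly split between groups.
scores = {pos: chi_square_2x2(*counts) for pos, counts in toy_counts.items()}
top_position = max(scores, key=scores.get)

for pos, score in scores.items():
    print(f"{pos}: chi-square = {score:.2f}")
print("Strongest association:", top_position)
```

In this made-up data only pos_B stands out, which is the single-spike pattern described for CF. Note what the statistic does and does not say: it measures a difference between two groups of people, not a certainty about any one person.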
Genetics is a probabilistic science, and there are no genes “for” anything in particular. I have severe reservations about the utility of genetic tests that indicate one individual’s propensity for certain conditions outside of a clinical setting; if you don’t have a PhD in genetics, these results can be misleading or even troubling. Even if, as I do, you carry a version of a gene which increases the probability of developing Alzheimer’s disease, most people with this variant do not develop the disorder, which is also profoundly influenced by many lifestyle choices and some blind luck. There is little a geneticist can tell you with this information that will outweigh standard lifestyle advice: Don’t smoke, eat a balanced diet, exercise regularly and wear sunscreen.
When it comes to ancestry, DNA is very good at determining close family relations such as siblings or parents, and dozens of stories are emerging that reunite or identify lost close family members (or indeed criminals). For deeper family roots, these tests do not really tell you where your ancestors came from. They say where DNA like yours can be found on Earth today. By inference, we are to assume that significant proportions of our deep family came from those places. But to say that you are 20 percent Irish, 4 percent Native American or 12 percent Scandinavian is fun and trivial, and has very little scientific meaning. We all have thousands of ancestors, and our family trees become matted webs as we go back in time, which means that before long, our ancestors become everyone’s ancestors. Humankind is fascinatingly closely related, and DNA will tell you little about your culture, history and identity.
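The arithmetic behind those matted webs is simple doubling: two parents, four grandparents, eight great-grandparents, and so on. As a rough sketch, the snippet below counts the ancestor "slots" in a family tree at each generation and finds the point where they outnumber a given population. The one-billion cap is an arbitrary round number chosen for illustration, not an estimate of any historical population, and the function names are made up for this example.

```python
def ancestor_slots(generations_back):
    """Number of positions in a family tree this many generations back."""
    return 2 ** generations_back


def first_generation_exceeding(population_cap):
    """First generation at which tree positions outnumber population_cap."""
    generation = 0
    while ancestor_slots(generation) <= population_cap:
        generation += 1
    return generation


# Arbitrary round number for illustration, not a historical estimate.
ILLUSTRATIVE_CAP = 1_000_000_000

print(ancestor_slots(10))                            # 1024
print(first_generation_exceeding(ILLUSTRATIVE_CAP))  # 30
```

Once the slots outnumber the people available to fill them, the same individuals must appear in the tree many times over, and in many other people's trees too.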