A Practical Guide to Using Autosomal DNA for Genealogy
Overview
Can you use autosomal DNA (abbreviated as atDNA) to break through brick walls? Absolutely. You can also validate or disprove questionable parts of your pedigree. In fact, if you use atDNA correctly -- and few people do -- you can reach an equivalent of the genealogical proof standard.
Introduction
There's plenty of information online about atDNA but little of it was contributed by people with experience that is both significant (thousands of hours) and relevant (working IBDs). This guide, in contrast, is founded on exactly that kind of experience.
First, I'll cover some of the basics of genetic genealogy. Then, I'll explain IBDs and teach you how to solve them. Three points before proceeding:
1. This article does not cover all the intricacies of using atDNA. Rather, it provides a blueprint for building skills.
2. Each and every point made below is important, so you'll need to review this article many times, over the next several months and years, while working IBDs. Sorry but there is no alternative path to competence. No amount of reading or attending conferences will compensate for the hard work described below.
3. Because this article is revised and updated, link to it or bookmark it rather than copying it.
Background information: What you need to understand before starting
1. atDNA is different from yDNA or mtDNA. Understanding the difference is essential.[1]
2. FamilyTreeDNA.com (FTDNA) offers all 3 kinds of DNA testing; their atDNA brand is called "Family Finder." 23andMe also does all 3 kinds, but only a small percentage of people use it for genealogical purposes. Ancestry does only atDNA testing. More recently, there has been a profusion of newer companies that do only atDNA testing. Among them, the only serious contender that I'm aware of is MyHeritage.com.
3. Previously I have advised testing on FTDNA. While it is certainly better than Ancestry for serious autosomal researchers, changes to the website in July/August 2021 make it much harder to use efficiently and effectively. It's possible these glitches will be fixed. In the meantime, I recommend that beginners use MyHeritage.com.
4. Some people promote Ancestry.com. I don't know if they are novices, recycling press releases, or receiving compensation. But Ancestry does not provide users with the basic data and tools to accurately use atDNA for genealogical purposes. Only the hints provided by ThruLines point toward possible lineages (not necessarily individuals) that might be in your pedigree. My use of the word "possible" is important. Ancestry uses a very liberal process for determining people who might share the same common ancestor. A liberal process means a high error rate. This error rate is further increased by the profusion of errors in Ancestry trees. It's no substitute for traditional genealogy in combination with working IBDs. Additionally, a match outside of ThruLines is 100% meaningless in terms of pointing you to a common ancestor -- unless you have unknown close relatives. Ancestry tries to make much of the fact that many people have tested there, but unless you are an adoptee or the like, it doesn't matter.[2] It's far better to have thousands of usable matches (FTDNA, Gedmatch, MyHeritage) than hints toward potential ancestors -- often wildly conjectured -- via Ancestry ThruLines.
5. If you've already tested at Ancestry, you can compensate somewhat for its deficiencies by uploading your information elsewhere.
FTDNA: If you don't pay the small transfer fee to FTDNA, you will get incomplete matches that will skew your analysis. Once you pay the transfer fee, all tools are free -- but those tools have become more difficult to use.
MyHeritage has an excellent auto-clustering tool that simulates IBDs. This will provide you with a limited set of IBDs to work. No fee required.
Gedmatch: This company now sets the standard for autosomal research and also has a cluster tool -- but you have to subscribe to use it.
6. Should you test at all? Only those with detail-oriented, analytical minds who are willing to dedicate many hundreds or thousands of hours are likely to have much success. HOWEVER, even if you don't have the skills, time, or desire to do atDNA properly, some of your experienced matches may break through brick walls for you. I've done this for quite a few matches, but no one can help unless you provide a pedigree that's as complete, accurate, and accessible as possible. So when considering whether to test, ask yourself if you are willing to collaborate and share genealogical information. If you aren't, there's no point in testing.
7. Who should you test? Always test the oldest generation available in any genealogical line. If your mother is alive but your father and all grandparents are deceased, then start by testing your mother and work her pedigree first.[3] Later, you can test yourself and use that to work your paternal pedigree.
8. atDNA is inherited in a random, haphazard fashion. Each company has an algorithm for estimating how closely you might be related to a match. Unless you are a pretty close match to someone, those estimates can be considerably off. I pay no attention to them.
9. Some people state that atDNA can only be used for genealogical purposes 4-5 generations back. That's wildly incorrect. It's possible to prove some 8th, 10th, even 12th cousins. (Caveat: The more distant the cousin, the farther back in time you're working, so the limitation is finding/generating enough accurate, complete pedigrees.)
10. You MUST understand that every person inherits 22 pairs of chromosomes, each pair made up of one strand from the mother and one strand from the father. There are segments on each strand that are defined by numeric markers. For example, my father and his maternal first cousin share many segments, including Chromosome 3 from 72447102 to 98877826 - point A to point B. (It's a useful convention to round each position to the nearest million, i.e. 72-99.) I have mapped the above segment back to someone upstream of Abington Felps b. bef 1707 and Rachel McElroy b. 1711. However, the segment immediately before that on the same strand traces back to an ancestor in a completely different part of my father's maternal pedigree. And the segment after it traces back to yet another unrelated branch in my father's maternal pedigree. In contrast, a segment with similar numeric markers -- but on the opposite strand -- will trace to a set of ancestors in my father's paternal pedigree.
To reiterate: every strand of every chromosome is a patchwork of segments inherited from ancestors on your maternal or paternal side. (The inheritance pattern of the 23rd or X chromosome is different. Explore it only after you have a fair amount of working experience.)
11. Some segments are long enough to be usable for genealogical purposes. At one point, FTDNA determined that a segment should be at least 7.69 cM. This is not a rigid rule, but until you have considerable experience working IBDs, don't bother with smaller segments as many/most of them will be false matches.
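Points 10 and 11 can be made concrete with a small Python sketch of how a shared segment might be recorded, labeled, and screened by length. The `Segment` class, its field names, and the cM values in the usage lines are assumptions invented for illustration; they are not the export format of any testing company. Only the 7.69 cM threshold and the Chromosome 3 positions come from the text above.

```python
from dataclasses import dataclass

MIN_CM = 7.69  # threshold quoted above; a guideline, not a rigid rule


@dataclass(frozen=True)
class Segment:
    """One shared segment (hypothetical record layout)."""
    match_name: str
    chromosome: int
    start: int   # start position in base pairs
    end: int     # end position in base pairs
    cm: float    # segment length in centimorgans


def short_label(seg: Segment) -> str:
    """Shorthand label with positions rounded to the nearest million."""
    return f"Chr {seg.chromosome}: {round(seg.start / 1e6)}-{round(seg.end / 1e6)}"


def usable(segments: list[Segment], min_cm: float = MIN_CM) -> list[Segment]:
    """Keep only segments long enough to be worth working as a beginner."""
    return [s for s in segments if s.cm >= min_cm]


segments = [
    # Positions are from the example above; the cM value is invented.
    Segment("first cousin", 3, 72447102, 98877826, cm=20.0),
    # Entirely invented short segment, likely a false match.
    Segment("distant match", 3, 10000000, 12000000, cm=3.1),
]

print(short_label(segments[0]))                   # Chr 3: 72-99
print([s.match_name for s in usable(segments)])   # ['first cousin']
```

Note that a record like this says nothing about which strand the segment sits on; that has to be worked out separately, as described in the sections that follow.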
Important Sidebar
The next section will explain how to use atDNA properly. But first, I want to explain how not to do it. Beginners start by looking at the surnames of matches. You find someone (Mike) who shares a surname (Hopkins) in your pedigree. You get the idea that Joseph Hopkins of Stafford Co, Va. m. unknown was the source of DNA shared with Mike. Next, you check ICWs (in common withs), thinking that everyone who is ICW with Mike shares atDNA from Joseph Hopkins or his unknown wife.
The beginner's process is riddled with errors like this. Unfortunately, most people are clueless about how to approach atDNA and come to the wrong conclusions. The next section will describe the correct way to use autosomal DNA -- working IBDs.
Before proceeding let me emphasize something: ICWs are a pool of people, many/most of whom do NOT share the same common ancestor. Only the subset that are IBD (see below) might share atDNA from Joseph Hopkins or his wife. The rest are likely to match you in random parts of your pedigree and match Mike in random parts of his pedigree that do NOT overlap with your pedigree.
Practical application: Working IBDs
THIS IS THE MOST IMPORTANT RULE IN USING atDNA FOR GENEALOGICAL PURPOSES: Anyone who shares more or less the same (significant) segment on the same strand of the same chromosome has the same common ancestor (CA). This rule is called IBD or identity by descent. An IBD may involve two people or dozens of people. Each IBD is a puzzle to be solved. Your understanding of the rules of atDNA may be insufficient. Your ability to follow the rules may fail. Your analysis of an IBD may be faulty. Your genealogical research may be insufficient. However, the IBD itself is never wrong. The correct solution to each and every IBD is a set of CAs somewhere in your pedigree -- and in the pedigree of everyone else that shares the same segment/IBD.
Implementing this rule involves several steps: First, identify an IBD or cluster you want to solve. Second, determine all the people who share that IBD. Third, collect as many of their pedigrees as you can and work them as far back as you can. (Make sure you have the correct pedigree of the person.) Last, analyze those pedigrees for overlap. Most people will do just about anything to avoid developing/analyzing pedigrees of matches. Those people are not working IBDs. You should be spending 90% of your time on this task.
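The last step, analyzing pedigrees for overlap, might be sketched in Python as follows. This is a rough illustration under stated assumptions: each pedigree is reduced to a set of ancestor-couple labels, and the function name `shared_couples`, the match names, and "Couple X" are all invented. Real pedigrees need careful normalization of names, dates, and places, which is not attempted here.

```python
from collections import defaultdict


def shared_couples(pedigrees: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each ancestor couple to the people whose pedigrees contain it.

    Only couples that appear in two or more pedigrees are returned.
    """
    found_in = defaultdict(list)
    for person, couples in pedigrees.items():
        for couple in couples:
            found_in[couple].append(person)
    return {c: sorted(p) for c, p in found_in.items() if len(p) >= 2}


# Hypothetical pedigrees for three people who share one IBD.
pedigrees = {
    "me": {"Tilden-Hucksteppe", "Reynolds-Farrar"},
    "Match A": {"Tilden-Hucksteppe", "Couple X"},
    "Match B": {"Tilden-Hucksteppe", "Reynolds-Farrar"},
}

for couple, people in sorted(shared_couples(pedigrees).items()):
    print(couple, people)
```

An overlap found this way is only a candidate. The more complete each pedigree is, the more overlaps will surface, which is why developing the pedigrees of matches takes most of the time.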
IMPORTANT TIPS:
1. Let's say you match Jack Bryant and you too have the surname "Bryant" in your pedigree, so you want to work the single IBD you share with him to see if you share the same Bryant lineage. The chromosome browser tool on FTDNA will provide basic information, such as that you match Jack Bryant on Chromosome 8 from 1 to 7.5 for about 13 cM. But how do you figure out who else matches on the same segment and the same strand of Chromosome 8?
This can be done with the tools on FTDNA, but it's awkward. First, click the ICW button for Jack Bryant. You may get 1-5 pages of people. Then put all of those people through the Chromosome Browser tool in batches of 7, looking for people that share pretty much the same segment with Jack Bryant. (This converts a pool of ICWs into the subset that are IBD.) Instead of using the FTDNA tools, you can save time -- and get more accurate results -- if you upload your kit to dnagedcom.com/ and run a tool called the "Autosomal DNA Segment Analyzer" (ADSA, not Jworks or Kworks). This tool will give you a visual image of all IBDs on a single chromosome.[4] Update: unfortunately, this service is no longer free.
Important Note: Some people are unable to wrap their mind around the fact that each chromosome is actually a pair of strands: one strand from the mother and the other from the father. When you look at an Excel download of matches provided by FTDNA, the maternal and paternal segments are lumped together based on the starting point of the segment. A real IBD never combines segments/matches from maternal and paternal strands. Mixing the two sides will lead to all kinds of confusion.
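The batch comparison described in tip 1, turning a pool of ICWs into the subset that share "pretty much the same segment," amounts to an overlap test. The Python sketch below is a minimal illustration, not how FTDNA or ADSA work internally. The function names, the tuple layout, the 0.8 cutoff, and Matches A through C are assumptions; only the Jack Bryant segment (Chromosome 8, 1 to 7.5) comes from the text.

```python
def overlap_fraction(a_start, a_end, b_start, b_end):
    """Fraction of segment A that is also covered by segment B (0.0 to 1.0)."""
    shared = min(a_end, b_end) - max(a_start, b_start)
    return max(shared, 0) / (a_end - a_start)


def ibd_candidates(reference, icw_segments, min_fraction=0.8):
    """Names of ICW matches whose segment covers most of the reference segment.

    reference and each item of icw_segments are (name, chromosome, start, end)
    tuples, with positions in millions of base pairs. The 0.8 cutoff is an
    arbitrary stand-in for "pretty much the same segment".
    """
    _, ref_chr, ref_start, ref_end = reference
    return [
        name
        for name, chrom, start, end in icw_segments
        if chrom == ref_chr
        and overlap_fraction(ref_start, ref_end, start, end) >= min_fraction
    ]


reference = ("Jack Bryant", 8, 1.0, 7.5)
icw_segments = [            # invented ICW matches and positions
    ("Match A", 8, 0.5, 7.5),
    ("Match B", 8, 6.0, 20.0),
    ("Match C", 2, 1.0, 7.5),
]

print(ibd_candidates(reference, icw_segments))   # ['Match A']
```

Positions alone cannot tell the maternal strand from the paternal one. Starting from the ICW list narrows the field, but anyone who passes this test is still only a candidate until the pedigrees agree.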
2. DON'T MAKE ASSUMPTIONS. Very often, as a beginner, you'll think you see something obvious and make an assumption. The vast majority of these assumptions will be proven wrong when you properly work an IBD, so don't make any assumptions and let the IBD reveal its answer.
Example: Don't assume that simply because you have matches with 5, 10, or 20 people who descend from the same set of CAs on paper, e.g. Cicely Reynolds and William Farrar, you also have them in your pedigree. Many early English and American Colonial couples have hundreds of thousands of descendants. Reynolds-Farrar may inhabit one of the voids in your pedigree if, and only if, you have an IBD in which at least 3 family groups trace back to them. And in the case of a progenitor family, I raise the threshold of proof to at least 5 family groups.
Example: If you share 3 significant IBDs with a match, don't assume they all come from the same set of CAs. They may come from 1-3 sets.
Example: If multiple people sharing an IBD have the same surname, don't assume they share the same lineage. Verify the genealogy and look for yDNA confirmation.
Example: Don't assume that any 2 people (including first cousins) have only one set of CAs. The people sharing an IBD usually have multiple CAs with each other. The only way to tell which set is relevant is by comparing complete pedigrees.
Example: Don't assume that the first set of CAs you find for an IBD are the solution. Most IBDs will reveal multiple sets of shared CAs. Only one of these is the solution.
Example: On FTDNA, if A matches B and the maternal first cousin of A does not match B, then don't assume A matches B on her paternal side. First cousins have less than 12.5% shared atDNA of significant length from one set of common grandparents. (My father and his first cousin have less than 11%.)
Example: If you and a cousin share a long segment, don't assume that entire segment came from only one set of common ancestors. More often than not, it will be composed of subsegments from different ancestors.
In short, never push your desires or assumptions into the data. Instead, gather the correct data and approach it curiously. Over time, the mystery will unfold, and you will often be amazed by what it reveals.
3. After you know the identity of the people who share a segment/IBD/same ancestor, the next step is to collect their pedigrees. Some people on FTDNA post gedcoms but most don't. Usually, the people I contact are responsive, but it does matter how I approach them.[5] Instead of asking a match about a particular surname, try to get her entire pedigree. Many will have trees on Ancestry. Others will have it in a different format. At the very least you want the full names of all 4 grandparents (maiden names of grandmothers). Most of the time, you can use this information to quickly find various trees on Ancestry that will -- in aggregate -- approximate your match's pedigree.
Note: Keep those records. Each match on FTDNA has an icon under her name that you can click and add notes. Keep your notes for each match current with date of contact, genealogical information for that person, links to their tree on Ancestry or elsewhere, information on their IBDs, and your progress toward solving them.
4. Beginners should not try to work IBDs that involve a lot of people. The solution will probably be a CA born in the 1500s or early 1600s and, thus, very difficult to find. IBDs of only 2 people can't be solved at all because you need at least 3 family groups -- and often more -- to solve an IBD. Look for IBDs with longer segments, rather than shorter segments, as there's a better chance (but certainly not a guarantee) of a more recent CA.
5. How do you know when an IBD is solved? The truth is that you can never solve one completely. All the atDNA in your chromosomes is very old; it didn't spontaneously generate in recent generations. Some of it has been so chopped up through combination and recombination, that you'll never know where it came from, but a surprisingly large amount is handed down in roughly intact segments generation after generation after generation. Each of these segments has a pattern or code that allows testing companies to match you with distant and more recent cousins. The hard part is finding overlaps in the relevant pedigrees. Since the interim solution to most IBDs will be someone born in the early 1700s, 1600s, or 1500s, you need fairly complete pedigrees to figure out the CA. Unfortunately, most people don't have very complete pedigrees. If you work with pedigrees that are only 10% complete, then you're likely to have a 90% error rate. So if it's important to solve a particular IBD, you'll have to flesh out the pedigrees of matches yourself. This doesn't mean creating trees from scratch but looking at existing trees and keeping a record of those that -- in aggregate -- show a match's pedigree back to the 1600s, if possible.
Aside: If you are serious about using atDNA for genealogical purposes, then, yes, you'll need an account on Ancestry, so go ahead and post a tree there. Keep in mind that most of the trees on Ancestry are done by amateurs and are riddled with errors, including a lot of people who are now erroneously claiming to have proven this or that lineage through atDNA. If you don't have skills in traditional genealogy, you'll have to build those as well. This is a lifetime hobby, and like most hobbies, it takes time to build skills, and there are expenses along the way.
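One way to put a number on how complete a pedigree is (tip 5) is to compare the ancestors identified in each generation with the number of slots that generation has: 2 parents, 4 grandparents, 8 great-grandparents, and so on. The Python sketch below is a simple illustration; the function name `completeness` and the counts in the usage line are invented.

```python
def completeness(known_by_generation):
    """Fraction of ancestor slots that are filled.

    known_by_generation[0] is the number of identified parents (out of 2),
    known_by_generation[1] the identified grandparents (out of 4), etc.
    """
    generations = range(1, len(known_by_generation) + 1)
    expected = sum(2 ** g for g in generations)
    return sum(known_by_generation) / expected


# Invented example: complete through great-grandparents, patchy after that.
known = [2, 4, 8, 10, 6]
print(f"{completeness(known):.0%} of {2 + 4 + 8 + 16 + 32} slots filled")
```

This counts slots rather than distinct people, so a pedigree in which the same couple appears more than once is treated as if each appearance were a separate ancestor.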
6. You'll find a lot of references online about "triangulation." That's the idea that, if you can find 3 people on an IBD with the same set of CAs, then you have solved the IBD. HOWEVER, this is a simplification that very often leads to errors. First, many IBDs include people who are closely related to each other. These people must be identified and treated as a single family group. If you have 3 different family groups that have the same set of CAs, then you have a possible solution to the IBD. Second, almost all IBDs will involve multiple CAs. For example, you may find 4 family groups that have one set of CAs, 3 family groups that share another set of CAs, 2 family groups with yet another proven CA, etc. This is extremely common, so don't stop when you find the first group of 2 or 3. You need to find all CAs. And you'll have to keep working until you have a clear solution to the IBD. Since multiple CAs are commonplace, you may have to wait until you find 4, 5, or more family groups with the same set of CAs. This is especially true of larger IBDs, involving many people.
Note: The biggest blunder for people who have mastered some of the basics is failure to fully develop the pedigrees of their matches. I'm currently working a small IBD and have already found 3 sets of proven CAs with a couple more possible CAs.
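The family-group counting in tip 6 can be sketched in Python. The sketch assumes you have already decided which matches are close relatives and given them the same family-group label; the function names, labels, match names, and couples below are all invented for illustration. The default threshold of 3 family groups, and the stricter 5 for a progenitor family, are the figures given in this article.

```python
from collections import defaultdict


def family_groups_per_ca(matches):
    """Count distinct family groups behind each set of common ancestors.

    matches is a list of (match_name, family_group, ca_couples) tuples.
    Close relatives carry the same family_group label, so they count once.
    """
    groups = defaultdict(set)
    for _name, family_group, ca_couples in matches:
        for couple in ca_couples:
            groups[couple].add(family_group)
    return {couple: len(fgs) for couple, fgs in groups.items()}


def candidate_solutions(matches, min_groups=3):
    """CA couples backed by at least min_groups independent family groups."""
    counts = family_groups_per_ca(matches)
    return sorted(c for c, n in counts.items() if n >= min_groups)


matches = [  # invented people on one IBD
    ("Ann", "FG1", {"Couple X", "Couple Y"}),
    ("Ann's brother", "FG1", {"Couple X", "Couple Y"}),
    ("Bob", "FG2", {"Couple X"}),
    ("Cy", "FG3", {"Couple X", "Couple Y"}),
    ("Dee", "FG4", {"Couple Z"}),
]

print(candidate_solutions(matches))                 # ['Couple X']
print(candidate_solutions(matches, min_groups=5))   # []
```

Ann and her brother count once, so Couple Y has only 2 family groups behind it even though 3 people list it. Passing the threshold marks a possible solution only; the other couples still have to be worked until one answer is clear.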
7. Important: Sharing a set of proven CAs on paper with Person A doesn't mean you inherited DNA from one of those CAs. Only by working a shared IBD can you be sure who bequeathed the relevant segment of DNA. (It might be someone you don't even have in your pedigree yet.)
8. When you have a solution to an IBD, it will look like this: Nathaniel Tilden b. ca 1583 m. Lydia Hucksteppe b. ca 1587 -- or someone upstream of them. Clearly, the atDNA segment didn't generate spontaneously at the birth of either Tilden or Hucksteppe, and you may later get a match -- on the same IBD -- with someone who has a John Hucksteppe b. 1546 m. Unknown in her pedigree. All solutions are temporary.
9. A child can carry only 50% of a parent's genome, so generation after generation, the genetic material of some ancestors is lost. Most of the time, you will have multiple IBDs that point to the same set of CAs. The more IBDs you work, the more clarity you will have about certain, limited parts of your pedigree. (These tend to be the areas where you have a doubling up of the same sets of CAs, whether you are aware of the doubling or not.) However, other parts of your pedigree were shortchanged by the DNA lottery game. To work them, you may eventually need to test additional family members.
10. If you and your father have both tested, always use your father's kit/results to work paternal matches. If you and your mother have both tested, always use your mother's kit to work maternal matches.
11. Keep records that map segments of your chromosome back to the CAs you've proven.
12. Every significant segment of atDNA will have originated in some remote crook of a remote branch of your pedigree. A rough estimate is that there are 200+ potentially relevant surnames for any given IBD. Posting a few surnames on your surname list at FTDNA won't help you or anyone else. Develop all branches of your pedigree as far back as possible and list all the surnames in your FTDNA profile. This will save you a lot of time over the long haul. When you flesh out new surnames, add them to your list.
13. Think of your atDNA information as a legacy, just like a tree posted online. If you provide a gedcom (or link to a tree) and a complete surname list, your relatives and other researchers can pick up where you left off. (Do this for all kits you manage.)
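The record keeping recommended in tip 11, a map from chromosome segments back to proven CAs, could be as simple as a spreadsheet. For readers who prefer code, here is a minimal Python sketch of the same idea. The `MappedSegment` layout and the `lookup` function are assumptions, and the second map entry is invented; the first entry is the Chromosome 3 example from the background section, recorded against the tested kit's maternal strand.

```python
from typing import NamedTuple


class MappedSegment(NamedTuple):
    chromosome: int
    side: str        # "maternal" or "paternal" strand of the tested kit
    start_mb: float  # positions in millions of base pairs
    end_mb: float
    ancestors: str   # proven CAs, or a note such as "unsolved"


def lookup(seg_map, chromosome, side, start_mb, end_mb):
    """Return mapped entries on the same strand that overlap the given range."""
    return [
        m for m in seg_map
        if m.chromosome == chromosome
        and m.side == side
        and m.start_mb < end_mb
        and start_mb < m.end_mb
    ]


seg_map = [
    MappedSegment(3, "maternal", 72, 99,
                  "upstream of Abington Felps and Rachel McElroy"),
    MappedSegment(3, "paternal", 70, 95, "unsolved"),  # invented entry
]

# A new match on the maternal strand of Chromosome 3, from 80 to 90:
for entry in lookup(seg_map, 3, "maternal", 80, 90):
    print(entry.ancestors)
```

Keeping the strand in every record is the point of the design: the two Chromosome 3 entries overlap in position but are never returned together, because they belong to different sides of the pedigree.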
Summary
Take your time and incrementally build skills. Remember: this is a long-term endeavor. If you can't solve a particular IBD, put it aside and wait for more matches or try to find additional matches on a different website. If things don't add up, after considerable effort, the most likely explanation is that you've combined matches from different strands, or you have an error in your tree. Sometimes, you'll solve an IBD without knowing exactly where the CAs fit into your pedigree. The more IBDs you work, the clearer that will become.
___________________________________________________
[1] Only males can do the yDNA test and it tells the lineage only of the father's father's father's father, etc. This is one bookend in your pedigree but only a tiny fraction of your family tree. Both men and women can do mtDNA. It tells the lineage of your mother's mother's mother etc. This is the other bookend in your pedigree but only a tiny fraction of your family tree. Both males and females can test for atDNA, a complex of myriad segments, tracing back to ancestors in ALL branches of your pedigree -- the bookends and everything in between.
[2] If you are an adoptee or missing a close family member, you probably need to use both FTDNA and Ancestry.
[3] In most cases, it's not necessary to test multiple siblings. Begin with one test of a parent or grandparent and add kits only as necessary and as your experience increases. A first cousin of the oldest generation tested is a good 2nd step. If possible, try to pick a cousin that has only one set of CAs with your first kit.
[4] Use the ADSA tool or FTDNA tools, rather than csv/Excel downloads showing all of your matches by chromosome, because the csv/Excel download jumbles both chromosome strands together.
[5] Don't overwhelm people with information in your first contact. Personalize emails. Don't send group emails. Offer a phone number or some other indication that you are not a scam artist. If you don't get a response the first time, wait a while and try again. I usually try three times before giving up. (When you first test, a large percentage of your matches will be people who tested years before you did. Some will be deceased. So don't delay in contacting people.) Even without a response or gedcom, I've been able to put on my detective hat and figure out the ancestry of a lot of people. You can too.