Whit Athey, creator of the Y-haplogroup predictor

Posts: 16
Joined: Wed Mar 14, 2012 2:30 pm
PostPosted: Sun Mar 18, 2012 6:50 pm
Whit Athey, creator of the Y-haplogroup predictor (in Russian)

http://www.molgen.org/index.php?name=Ne ... le&sid=158

Please use Google Translate.
Y: O3a3c
mt: J1c

Posts: 326
Joined: Thu Mar 15, 2012 1:14 am

YDNA:
R1b-Z12*
MtDNA:
I3b (FMS)
PostPosted: Sun Mar 18, 2012 10:04 pm
[courtesy of Google Translate. See original article for graphics.]

I started thinking about the problem of haplogroup prediction almost as soon as I saw my own DNA results at the end of 2003. The company that ran the tests could not predict my haplogroup, so I wanted to find a way to do it myself. I tried several methods before settling on the approach that I put online sometime in September 2004. This method was described in the Journal of Genetic Genealogy ...

"I have always been interested in molecular biology, and even my thesis, though in physics, included a molecular biology component. When the article «Mitochondrial DNA and Human Evolution» by Rebecca L. Cann, Mark Stoneking and Allan C. Wilson was published some 20 years ago, I saw for the first time a real opportunity to better understand human origins. At the time, however, I was heavily involved in other things, and as a result I remained only an observer of these developments for many years.

In 2001 I bought Bryan Sykes' book "The Seven Daughters of Eve", which had just gone on sale, and its contents made a strong impression on me. I almost decided to order the mtDNA test offered by his company, but at the time such services were quite expensive, and once again ... I remained a bystander. However, I did develop a presentation, which I called "The Human Family", and which I showed several times in 2001 and 2002 to small groups of people.

In 2003 I finally ordered the full set of tests for myself, Y-STR and mtDNA, and immediately organized a surname project for my own surname. Then, during 2004 and 2005, I organized five projects, mainly to compare Y-profiles (DNA passports) for other surnames of interest to me, and in parallel I studied my mtDNA origins, i.e. the origins of my female lines, by testing my cousins."

Thus began the story of Whit Athey, with whom I spoke in August-September 2008. To our own DNA genealogists he is better known not as an organizer of surname projects but as the creator of the Y-predictor (the predictor of Y-haplogroups).

DG - Tell us, what do you consider the most important and notable achievement in the development of your surname projects? What difficulties have you encountered along the way?
W.A. - In my opinion, the most important achievement of my project is that we finally found a group of people with a different surname whose Y-STR results match those of members of my «Athey / Athy / Atha / Athon» surname project. Until now it had not been possible to identify any family relationship between these two surnames, since they lived in different places, Ireland and England; besides, my Athey ancestor emigrated from Ireland to North America in 1661. No family relationship was known, but only until we traced the history of our families back to about the year 1500. This group of people bears the surname Whitfield, the same as the Rev. George Whitfield, one of the founders of Methodism, who had no sons. This coincidence is quite ironic, because my own given name is Whitfield. (Laughs)

The main problem I face in all my projects is recruiting participants. Many lines descending from my immigrant ancestors are not represented in my projects, and locating people suitable for a project is difficult. Persuading these people to take a DNA test for genealogy is then extremely challenging, even when I offer to pay for the testing. There are two people whom I managed to locate and who would be very valuable to one of my projects, but I cannot persuade them to test.

I will give more information on Whit Athey's projects in the second part of the story. But, looking ahead, here is a quote from that next part: "My own mtDNA profile places me in haplogroup U5a1a. This haplogroup is characteristic of Central and Eastern Europe, including Russia. In fact, I found the closest match to my mtDNA profile when comparing it with haplotypes from a recent paper by the Russian scientists Malyarchuk and Derenko. If my female line actually came from Russia (and, by the way, I have no evidence of this), the migration must have occurred many centuries ago. I do not know of any "recent" ancestors from Russia.

Through the male line I belong to Y-haplogroup G2a, and perhaps many centuries ago a very distant ancestor of mine arrived in Ireland from the Caucasus."

DG - Many people know you as the creator of the Y-haplogroup predictor. Tell us how and when you started work on the predictor. Do you work on it by yourself or with someone else? (Information as of August 2008.)
W.A. - I started thinking about the problem of haplogroup prediction almost as soon as I saw my own DNA results at the end of 2003. The company that ran the tests could not predict my haplogroup, so I wanted to find a way to do it myself. I tried several methods before settling on the approach that I put online sometime in September 2004. This method was described in the Journal of Genetic Genealogy (JoGG, http://www.jogg.info) in early 2005. Unfortunately, the journal is available only in English. The method calculated a "haplogroup fitness score" (I could not find a fully satisfactory translation of this term - note by DG), showing, as a percentage, how well a given Y-STR haplotype fits a haplogroup known to the predictor. A fitness score (see the predictor at http://www.hprg.com/hapest5/hapest5b/hapest5.htm) of 100 means that the haplotype exactly matches the modal (ancestral) haplotype. Most often the values fall in the range of 40-60.

The next working version of the program added a Bayesian probability calculation and was also published in the Journal of Genetic Genealogy (JoGG, 2006). This version of the predictor reported two results, the haplogroup fitness score and a result based on the Bayesian calculation, both of which indicate whether a haplotype belongs to a given haplogroup.
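
The Bayesian step can be illustrated with a short sketch. This is not the predictor's actual code: it assumes equal prior probabilities for all haplogroups, treats markers as independent, and uses invented allele frequencies for two markers. The names `FREQS`, `FLOOR` and `posterior` are hypothetical.

```python
from math import prod

# Hypothetical allele frequencies: FREQS[haplogroup][marker][allele] -> frequency.
# All numbers are invented for illustration; they are not real population data.
FREQS = {
    "G2a": {"DYS393": {13: 0.10, 14: 0.80, 15: 0.10},
            "DYS390": {22: 0.70, 23: 0.25, 24: 0.05}},
    "R1b": {"DYS393": {12: 0.15, 13: 0.80, 14: 0.05},
            "DYS390": {23: 0.25, 24: 0.60, 25: 0.15}},
}

FLOOR = 0.001  # assumed small frequency for alleles not seen in a haplogroup


def posterior(haplotype, freqs=FREQS, priors=None):
    """Return {haplogroup: posterior probability} for a {marker: allele} haplotype."""
    groups = list(freqs)
    if priors is None:
        # Assumption: every haplogroup is equally likely before seeing the data.
        priors = {g: 1.0 / len(groups) for g in groups}
    # Prior times likelihood, where the likelihood is the product of the
    # per-marker allele frequencies (markers treated as independent).
    joint = {
        g: priors[g] * prod(freqs[g][m].get(a, FLOOR) for m, a in haplotype.items())
        for g in groups
    }
    total = sum(joint.values())
    # Bayes' rule: normalise so the posteriors sum to 1.
    return {g: v / total for g, v in joint.items()}


result = posterior({"DYS393": 14, "DYS390": 22})
for group, p in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{group}: {p:.4f}")
```

With real data, each haplogroup would need a frequency table for every marker tested, which is why a new haplogroup cannot be scored until enough haplotypes have been collected for it.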

Up to the end of 2006 I programmed everything in Excel and then converted it into the program for the website. As more markers and haplogroups were added, this approach stopped being practical, because the program took much longer to run and to download.

Doug McDonald, a very good programmer, offered to take my program and rewrite it in C++ so that the new version would be much "lighter" and "faster", that is, better than the old one. This version of my predictor appeared in early 2007 and is still current.

As of August 2008 the predictor covered 23 haplogroups. The only limitation on adding more haplogroups is that the program requires the allele frequency distribution for each marker in each haplogroup. I cannot add a new haplogroup until I have gathered a sufficient number of haplotypes, covering all of the commonly tested markers, from people who belong to that haplogroup. In June 2008 haplogroups C3 and G1 were added to the program. I am trying to collect enough haplotypes from subclades of haplogroup O to add them to the program. I would also like to add haplogroup N2 to the predictor, which I think should interest many residents of central Russia. The program can currently assign a haplotype to haplogroup N, but in fact this is haplogroup N3, which is typical of North-West Russia. Again, I would like the program to include both N2 and N3.

You ask me about adding haplogroup O3 to the predictor. To add O3 or O3a3 I need a few dozen haplotypes that have been definitively assigned to these haplogroups. Using the characteristics of these haplotypes, I could work with the SMGF database and extend the set with additional markers for the same haplogroup. The same goes for haplogroup N2. By the way, for O3 I have a collection of "minimal" haplotypes gathered from various scientific publications. So a start has been made.

DG - You mentioned the journal JoGG. As its editor, you probably communicate with various experts in the field of genetic genealogy. Tell me, have you had contact with any Russian specialists?
W.A. - Unfortunately, I do not have much contact with Russian amateurs and professionals. As editor of JoGG I corresponded with one of your countrymen, who seemed to me a very experienced and knowledgeable expert in the field of mtDNA. He was a reviewer for the journal.

I also had a brief correspondence with B. A. Malyarchuk, who helped me obtain an article from a Russian journal. I cannot read Russian, but I now have important data tables that had been published only on paper. I think your population geneticists are very lucky: your region has so many diverse and interesting populations! In general, the territory of the former Soviet Union (FSU) is the best place for a population geneticist to work (possibly excluding salaries).

DG - Since you are seriously involved in genetic genealogy, you constantly follow the state of this new science and attend and take part in conferences. What trends do you see in DNA genealogy? What do you expect?
W.A. - It is hard to predict anything, since the most interesting events come as surprises, but some things can be expected. Like many others, I am concerned about the price of testing and look forward to positive developments here: as new DNA testing companies emerge and competition develops, test prices should drop. I believe that the price of the basic tests should fall, or that more information will be offered for the same price. And I hope that this trend will allow more people around the world to take part in DNA testing.
FTDNA once announced that it was working on a project that would allow sequencing of small sections of the Y chromosome. This could be a notable achievement for the company.

I am sure that we will soon see wider use of gene chips, which will deliver hundreds or even thousands of Y-SNP results at a time. This is already available but expensive, and it still does not focus enough on Y-chromosome SNPs.

I also strongly believe that amateurs like us, who are appearing all over the world, will make an invaluable contribution to this young science that we all love. In the beginning we were mere "consumers" of the information that professional scientists gave us in their publications; now the "amateurs" have themselves become professionals and are bringing more and more new discoveries and achievements to the public. I have no doubt that this phenomenon will continue to grow.

DG - I know that you are in active contact with many companies offering DNA tests, but you nevertheless settled on Family Tree DNA. Why?
W.A. - Yes, I have tested with almost all of the well-known companies. And I should note that customer support by e-mail is good almost everywhere, in some places better, in others worse. I have only good memories of these experiences.

FTDNA offers good service, and almost all of my current projects and their participants test with this company. My projects have an "official presence" on the site ftdna.com, because the company took care in advance to provide such a service to its customers, which is very convenient: everything is built on templates, and people find it easy to navigate. But as I said earlier, I have had experience with many companies, so my projects also have independent sites where I can post data from all laboratories.

Returning to FTDNA, I want to say that I have a good relationship with the company's president, Bennett Greenspan, and with Thomas Krahn. I can write an e-mail to either of them and be sure of getting a thoughtful answer. But I try not to abuse that trust, as I understand that their mailboxes fill up with letters every day. I should add that the rest of the staff are responsible and always on top of things.

Talking with Whit about different companies, we agreed that having a variety of such companies makes it possible to cover all aspects of DNA testing (including price), as well as the particular features of different regions. With your own project web site, or a "presence" in some community, you can gather information for your project from companies around the world without breaking any rules.
In addition to the surname projects, we discussed the Journal of Genetic Genealogy (JoGG, http://www.jogg.info), edited by Whit Athey. But I plan to cover both the surname projects and the journal in one of the following pieces.

Denis Grigoriev,
August 2008.
"Molecular Genealogy"
http://www.molgen.org
Use Profile/Edit Profile in User Control Panel to add your Y-DNA and mtDNA values.
Posts: 202
Joined: Sun Apr 01, 2012 5:38 am

YDNA:
R-L21+, L226+
PostPosted: Fri Apr 06, 2012 6:07 am
I really wish that DNA testing companies would invest more development effort in tools like Whit's Y-haplogroup predictor. Better prediction technology reduces the cost of their "deep clade" tests, since prediction tools reduce the need for random testing down the haplotree. As the number of Y haplogroups continues to explode, prediction tools will be needed even more. This is offset by the continual decline of testing costs towards the $1,000 (and below) full genome test.

However, there is a limit to how many Y-SNPs can be accurately predicted from Y-STR values. As SNPs become more recent, many no longer have unique Y-STR patterns, since Y-STRs are volatile and Y-STR patterns from recent times overlap among many SNPs. As ever more SNPs, and more recent SNPs, are discovered, this overlap becomes more and more common. For my SNP of interest (R-L21), only L226 and M222 have Y-STR patterns that are unique enough and a scope that is large enough to predict. Several other smaller-scope SNPs could be added as well, but their numbers are too small to significantly increase coverage. This is the old-to-recent approach, in which more recent SNPs cannot be reliably predicted because of overlapping Y-STR patterns: Mark Jobling's overlapping haplotypes scenario.

Over the last two years, I became very alarmed at the random nature of testing for SNPs under R-L21. I also became very excited about how these recent SNPs reveal significant information that most genealogists overlook. There are now 60 SNPs downstream of R-L21, with one or two being added every month. Mike W's R-L21 spreadsheet is a goldmine for any serious R-L21 researcher and now has around 5,500 67-marker submissions that are suspected to be R-L21. Of the 60 SNPs, around 50 have a single Y-STR pattern that makes them unique among all other R-L21 submissions. The remaining 10 older SNPs have two to twenty Y-STR patterns where the SNP tests positive. I have developed a recent-to-past R-L21 Y-SNP predictor based on logistic regression (curve fitting) methodology.

I would love to have some feedback on my R-L21 SNP predictor, as I am now attempting to automate and validate this prediction tool. For recent SNPs, the classic S-curve works very well: P = exp(a+bx)/(1+exp(a+bx)), where P is the probability of testing positive for the SNP and x is how well you match the MRCA of the R-L21 off-modal values for those submissions testing positive for the SNP.
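
The S-curve formula can be evaluated with a few lines of Python. This is only a sketch of the formula as written, not the predictor's implementation: the coefficients `A` and `B` are made-up values (in practice they would come from fitting the curve to tested submissions), and `x` is assumed to be a numeric match score.

```python
import math


def snp_probability(x, a, b):
    """Logistic S-curve: P = exp(a + b*x) / (1 + exp(a + b*x))."""
    z = a + b * x
    # Algebraically the same formula, rearranged so exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients, chosen only to show the shape of the curve.
A, B = -6.0, 1.5

for x in range(0, 9, 2):
    print(f"x = {x}: P = {snp_probability(x, A, B):.3f}")
```

With a positive `b`, the probability rises from near 0 to near 1 as the match score increases, and it crosses 0.5 where a + b*x = 0.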

Here are some curves using the above formula:

http://www.rcasey.net/DNA/R_L21/stats/M222.pdf

http://www.rcasey.net/DNA/R_L21/stats/L706_2.pdf

http://www.rcasey.net/DNA/R_L21/stats/L643.pdf

Here is a graphic of the R-L21 SNPs and how they are currently connected:

http://www.rcasey.net/DNA/R_L21_Descendant_Chart_Intro.html

Here is the actual R-L21 SNP tool:

http://www.rcasey.net/DNA/R-L21_SNP_Predictor_Intro.html

Here is the analysis for each "single fingerprint" R-L21 SNP:

http://www.rcasey.net/DNA/R_L21/R_L21_Private.html
Posts: 163
Joined: Wed Mar 14, 2012 6:14 pm
Location: Sault Ste Marie, Northern Ontario, Canada
YDNA:
L21-L513*
MtDNA:
H1
PostPosted: Fri Apr 06, 2012 6:10 pm
Robert
I tried this link you provided... http://www.rcasey.net/DNA/R-L21_SNP_Pre ... Intro.html
and I could not get to the page, is the link broken somehow?

fyi

Mike
Furthest Y line=Patrick Whealen 1816-1874, Tipperary Co. Ire. to Kincardine On

Y-DNA-RL21, R-513* (still looking for the 'lost Irish 'C' boys')

FTDNA=P312+ P25+ M343+ M269+ M207+ M173+ L513+ U198- U152- U106- SRY2627- P66- P107- M73- M65- M37- M222- M18- M160- M153- M126- L705- L577- L193- L159.2- L1333-
23&me=L21+
E.A.= S21-, S26-, S28-, S29-, S68-

Co Administrator of the Whalen/Phelan DNA Surname Project
http://www.worldfamilies.net/surnames/whalen
Posts: 202
Joined: Sun Apr 01, 2012 5:38 am

YDNA:
R-L21+, L226+
PostPosted: Fri Apr 06, 2012 9:05 pm
I had trouble with all my URLs and other URLs for an hour or so today. Try again, as the outage seems to have passed.
Posts: 163
Joined: Wed Mar 14, 2012 6:14 pm
Location: Sault Ste Marie, Northern Ontario, Canada
YDNA:
L21-L513*
MtDNA:
H1
PostPosted: Sat Apr 07, 2012 12:38 am
yep, works fine now
