Big Y-DNA test

H2a2a1 (rCRS)
Posted: Thu Nov 14, 2013 11:41 am
My Big Y-DNA test is in batch 542, due 12/31/2013.

I hope it will be my New Year's gift, with a lot of SNPs below L497 to improve the knowledge of this branch, which probably came from the Caucasus region to Central-Western Europe with the first Neolithic waves 7500-6000 years ago.

From oldest to newest mutations, I am G2a3b1-P303+ L140+ L497+ Z725+ Z726+ CTS4803+

Posted: Fri Nov 15, 2013 12:19 pm
Today I received the latest G newsletter from Ray Banks. I am copying the parts concerning the Big Y-DNA test.

"Saying Goodbye

This is nearly the final haplogroup G newsletter I will be sending out. Someone is welcome to take over the task.

There are multiple reasons. This was started multiple years ago primarily to let G Project members know of the availability of new tests to better define one’s place in the G tree. This worked well the first several years, but the last three newsletters have resulted in zero orders for new SNPs important to large numbers of project members. Thousands of dollars were spent identifying these SNPs, and the response has been a huge disappointment. Generous members have allowed us, however, through General Fund contributions to validate new subgroups. In addition, only a few members ask for help in ordering SNPs. As always, many inappropriate tests are ordered, though my annual newsletters have tried to combat this with emphasis on free help available.

The second problem is deteriorating relations with the labs. Full Genomes has locked me out of participation in many features there and on Facebook without providing any clear explanation. Earlier it was also difficult getting responses from Thomas Krahn, head of the Family Tree lab, but eventually we accomplished our goals. I have tried corresponding with the new management since August, with little to show for the effort.

The third problem involves the new testing that became available last Saturday at Family Tree DNA, called the Big Y. I will cover this in detail. But this test becomes the big elephant in the room, and the type of work I have been doing in identifying new subgroups is generally incompatible with this because I do not know precisely what areas of the chromosome are tested in Big Y. Any request for new tests needs to be compatible with those areas.

I will be phasing out administratorship of the various projects, and we will have to discuss this in coming weeks with those interested. It is unlikely that someone will want to take on the full G membership. This involves several hours of work a day. Logically it would be best to spin off the L293/P16, G1, L497 and M377 subgroups, some of which already have projects. Members of these can be identified in almost every case from 12-marker results. But logic and reality do not always come together because of different interests of different persons. I will be processing all current pending testing, and all testing done at Full Genomes and the 1000 Genomes Project remains 100% useful and will continue to be displayed. Offers for new management are welcome.

The Big Y Test
Family Tree made their Big Y test available last Saturday during the annual conference in Houston, Texas in the United States. Their web site tells customers almost nothing about this test. And as best as I can tell, no details of substance were provided during the conference. However, Thomas Krahn, who put the test together before his firing, has provided important information on a public site.

We know this test will sequence 10 million base pairs out of the 20 million mapped Y-DNA sites for about $500. Full Genomes charges $1250 and provides all 20 million plus other testing. Thomas included among the 10 million sites all the 25,000 SNPs of which he had record at the time of the assembly. The Geno 2.0 test, in contrast, excludes any SNPs with health connections. This has resulted in some haplogroups being tested for only half the known subgroups. The sequencing method being used is new. Apparently the samples are sequenced simultaneously at a depth of coverage similar to the high (50x) coverage at Full Genomes. This means a high percentage of the mutations will be identified. Krahn also indicated he was concerned about this method because the barcoding method used in the context of simultaneous processing could lead to crossover from one sample to the other. Family Tree has responded that they would not provide results that have not been proven as to quality (paraphrased). I will be able to tell whether results are truly high coverage and very consistent with what would be expected in any G subgroup.

Thomas Krahn indicated they had already run beta tests at the Illumina lab. And Family Tree is planning to run 15,000 samples in their own lab. I think this info has gone over the heads of everyone. They do not need that many samples for quality checks. Even if they could do this at $100/sample, the cost would be $1.5 million, and that calculation seems quite low since Illumina collects hefty fees for the chemicals used. More likely this represents selected Genographic Project samples to be tested. It is questionable that the GP could pay for these, and this suggests also there may be a corporate sponsor or a grant for these. And this is really the most important news because 15,000 is ten times as many samples as we have been able to assemble in the last three years. Genographic Project also has samples from rare populations which are needed for a truly comprehensive tree. In this context, any tree generated from this will overwhelm anything available and would finally allow GP and Family Tree to provide a tree. In recent years the ISOGG tree has become by default the tree everyone uses, but such a tree allows them to start the process anew. Early subgroups which are suspected to be mere private SNPs can be eliminated. And the lab manager on Sunday apparently indicated that they will be adding all new results, including those SNPs found in only one person, into a massive tree. So their proprietary, expanding tree would be the only game in town, so to speak. (my analysis of the situation) Therefore, the SNPs from other sources on which I have been working would become of no interest to them. Even if my analysis is incorrect, there is a great risk this will be the case.

Besides no info presently on the reliability of the testing, how they will assemble the data is also a big question mark. There have been huge problems with the output from the Genographic Project. Besides the lack of any tree corresponding to their reports, there were hundreds of items being reported to practically everyone which were artifacts, no assembly of the mass of SNPs into anything organized, and the persistence of displayed trees and recommendations at Family Tree that were years out of date. Family Tree hired the staff of a DNA analysis company, Arpeggi, over the summer. But assembling a valid tree based on sequencing data is quite a task. The Geno Project reported wrong categories for a few persons due to a “no read” situation for an item they had chosen though equivalent items to the no-read were all positive for that subgroup. So the software has to ignore missing results due to no reads to still assemble a valid tree. This requires inference software, which has been used for several publications. And inconclusive reads have been problems in other sequencing methods as well. It can be inferred often that an inconclusive item is more likely positive than negative by comparisons. Then there are the mutations that pop up in multiple haplogroups. These are often unstable sites, and it would seem they plan to omit these sites. But given enough samples, more and more sites will pop up in multiple haplogroups. So there are significant challenges. There may or may not be a Y tree of confirmed sites for which anyone can request tests, but if they cannot document a wide coverage, new tests will be rejected. So they want us to pay to confirm all branches. It is possible that the big donor may have included funds to validate the subgroups initially found with the 15,000 samples.

The fragments in the new test corresponding to known SNPs are 100 base pairs long. So 100 times the 25,000 SNPs they included is 2.5 million base pairs. That leaves 7.5 million additional base pairs available for identifying new mutations in each sample. This is about one mutation every 200 years in the full 10 million base pairs.
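
As a quick check of the arithmetic in the paragraph above, here is a minimal Python sketch. The figures are the approximate ones quoted in the newsletter; the function and constant names are made up for illustration, and the calculation assumes the 100-base-pair fragments do not overlap, which the newsletter does not state.

```python
# Approximate figures quoted in the newsletter (not official specifications).
TOTAL_SEQUENCED_BP = 10_000_000  # base pairs the test is said to sequence
KNOWN_SNPS = 25_000              # known SNPs said to be included
FRAGMENT_LENGTH_BP = 100         # fragment length around each known SNP


def base_pair_budget(total_bp, known_snps, fragment_bp):
    """Return (bp used by known-SNP fragments, bp left for new discoveries).

    Assumes fragments never overlap, so the "used" figure is an upper bound.
    """
    used = known_snps * fragment_bp
    remaining = total_bp - used
    return used, remaining


used, remaining = base_pair_budget(
    TOTAL_SEQUENCED_BP, KNOWN_SNPS, FRAGMENT_LENGTH_BP
)
print(f"Known-SNP fragments: {used:,} bp")
print(f"Left for new mutations: {remaining:,} bp")
```

If some fragments do overlap, slightly more than 7.5 million base pairs would remain for finding new mutations.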

We do not know whether all the known G SNPs (or equivalents) will be included or not. Most of the newer ones we added since January come from the Geno 2.0 test so we know those are included.

Since last Saturday, 8 of the 3400 G Project members have ordered Big Y tests. None asked me about the test, and so they were ordering blind in most cases. Many have not had much testing otherwise. Extrapolating this G Project figure throughout Family Tree, the total number ordered this first week could be 800. In the years that the partial sequencing program (Walk through the Y) existed, only about 500 total were ordered for a similar price. One big difference here is that Family Tree is prominently displaying Big Y as an item to be ordered, and the Walk through the Y was a word of mouth item which had to be requested. I would not think this volume of orders will continue. There is initially a surge when a new test is announced.

One additional consideration is that Family Tree is unlikely to provide results for all 10 million sites to the customers. Likely it will only be 25,000 known SNPs and new items specific to the tested person. Thus the customer may not know whether he has a no-read, inconclusive or negative at a site that may become important after additional tests or in the current testing. Full Genomes has a way to handle this for the current batch, and one can get a copy of the 10 gigabyte full data file for future querying with the relevant software.

So the Big Y has considerable potential, but we would like to check the first results for reliability and hope that the tree-type data to be generated is a big improvement over Geno 2.0.

Family Tree Sale and What Happened to 12 markers?
Family Tree has a sale to the end of December which dramatically reduces the cost of 37 markers and some other tests. But for the first time, the 12-marker test has disappeared. It would seem so far that customers will have to start with 37 markers.

News from Full Genomes
This company had problems with the BGI lab in the spring. There was a huge delay in getting results. There have been two batches received from BGI since then. Apparently they are now using UCLA for sequencing, but no results from there. One customer told me he has a message that his kit was received, but there was no info they had shipped it for processing. They may be just waiting to collect more kits for the batch. I do not know of any G men who have ordered Full Genomes testing since the summer.

Full Genomes did announce they will begin their own program of $39 type individual tests, but the price could be different.

Direct competition from Family Tree DNA’s Big Y for the sequencing is likely to have some impact on their business.

News from the Genographic Project
Spencer Wells, who heads this organization, has indicated in recent interviews that a replacement for Geno 2.0 will be offered next year. Hopefully this will be accompanied for the first time by a corresponding Y tree. He indicated that the Geno test will not use sequencing until the price drops to $100. I actually find it odd that the Genographic Project consented to the Big Y testing because it is somewhat in competition with Geno 2.0 though not for a large number of tests."


When I heard about the new FTDNA offer, I thought I would wait until the end of the month to make a decision, and I read comments in several forums. They stressed the rushed and improvised character of the announcement and pointed to the uncertainty about the delivery time and quality of the results, as with many newly launched products in new technologies.
In the end, curiosity, the wish to be among the first testees (not the very first, but not delayed by many months either), the reputation of FTDNA, and the desire to see some G users in the project quickly all weighed in. I decided that, whatever the delays and the unknown quality of the results, the price justifies the risk, and I ordered 3 days after the announcement.

I still have to read Ray's newsletter carefully again to understand it as best I can.

I must thank Ray for all the work and public relations he has carried out for more than 5 years. I hope he will soon be consoled and will be proud of all he gave us.

Posted: Sat Nov 16, 2013 3:40 am
I ordered Big Y for my maternal uncle in haplogroup G. He is in his 80s and has no biological sons. I figure I have to get all the information (Y-DNA results) I can out of him while he is still with us.
G P303+
Posted: Wed Nov 20, 2013 7:42 pm
I am sad to see Ray go... I hope all the old timers from Dna-forums end up in the same place again somewhere.
I don't know if I should do the Big Y. I did the WTY and ended up with two "new" SNPs which ended up being common to all G-men.
I still bear the P303* designation. But burning another $500 with not much to show for it seems unwise. I should save up for the FullY, or for prices to drop further in a couple of years.

Posted: Fri Nov 22, 2013 9:56 am
soulblighter wrote: I did the WTY and ended up with two "new" SNPs which ended up being common to all G-men.
I still bear the P303* designation. But burning another $500 with not much to show for it seems unwise.

If you did the WTY, you should have received a $50 coupon, to make the Big Y only $445.

Whereas WTY examined hundreds of thousands of locations, the Big Y will examine at least 10 million locations. One projection is that it will find roughly one SNP for every 150 years of history. For many of us, that is as much as we will ever need.
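
To put that projection in concrete terms, here is a rough Python sketch. It simply divides a time span by the quoted "years per SNP" figure, treating the rate as constant. The function name and the 3,000-year example span are hypothetical, chosen only for illustration; the 150-year figure is the projection quoted above and the 200-year figure is the one from Ray's newsletter.

```python
def expected_snps(years, years_per_snp):
    """Rough expected number of SNPs accumulated on a lineage over `years`.

    Assumes a constant rate of one SNP per `years_per_snp` years, which is
    a simplification: real mutation counts vary from lineage to lineage.
    """
    if years_per_snp <= 0:
        raise ValueError("years_per_snp must be positive")
    return years / years_per_snp


# Hypothetical span of 3,000 years since a branch point.
span_years = 3000
for rate in (150, 200):
    count = expected_snps(span_years, rate)
    print(f"One SNP per {rate} years -> about {count:.0f} SNPs in {span_years} years")
```

The two rates give noticeably different counts over the same span, so any date estimate based on SNP counts depends heavily on which projection turns out to be closer.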

