STR data from the 1K Project

Moderators: Rootsy, GregRM


Posts: 70
Joined: Wed Mar 14, 2012 7:43 pm
Location: Maine
YDNA:
R1b-M157.2 (under Z6
MtDNA:
J1c2g
Posted: Tue May 08, 2012 9:45 am
I noticed on another forum that someone had teased out STR data on 17 markers from some individuals' 1K results. It appears these results were from Haplogroup I.

I am wondering whether STR data can also be found in the 1K project data for other clades. I am working on the SNPs found in this project below R1b-L48, and even a partial STR result for some of these SNPs would be helpful.
Y Haplogroup: R1b - U106 - L48 - Z8 - Z6 - M157.2 (private sub-clade of Z6). 111 markers tested, Palindromic markers tested
mt Haplogroup J1c2g Full Mitochondrial test with an additional (private) mutation at 9974T
YSearch & Mitosearch ID: 3HSPN

Posts: 26
Joined: Sat Mar 17, 2012 10:48 am
Posted: Fri May 11, 2012 12:54 am
There is interest in seeing the DYF371 results for those samples that appear to be Z326+ null425. Does the missing T allele at DYF371 confirm the null425 status?
According to FTDNA's Y chromosome browser, these are the regions that would be looked at. Remember that the HUGO sequence is apparently null425.
DYF371_1 repeat ChrY:18485602..18485796 n/a
DYF371_2 repeat ChrY:18662760..18662954 n/a
DYF371_3 repeat ChrY:24589107..24589310 n/a
DYF371_4 repeat ChrY:26191891..26192097 n/a
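
For anyone who wants to pull reads for these four copies out of a BAM file, the listing above can be turned into region strings of the chr:start-end form that tools such as samtools accept. This is only a minimal sketch. It assumes the coordinates are 1-based and inclusive, and that the contig is named "Y" in the target file (pass a prefix such as "chr" if it is named "chrY"). The reference build is not stated above, so it has to be checked before use. The function name parse_copies is made up for this example.

```python
import re

# Coordinates copied from the listing in this post. The reference build is
# not stated there, so it must be verified before querying any file.
DYF371_COPIES = """\
DYF371_1 repeat ChrY:18485602..18485796 n/a
DYF371_2 repeat ChrY:18662760..18662954 n/a
DYF371_3 repeat ChrY:24589107..24589310 n/a
DYF371_4 repeat ChrY:26191891..26192097 n/a
"""

LINE_RE = re.compile(r"^(\S+)\s+repeat\s+Chr(\w+):(\d+)\.\.(\d+)")


def parse_copies(text, contig_prefix=""):
    """Parse browser lines into region strings (hypothetical helper).

    Assumes 1-based inclusive coordinates, so length = end - start + 1.
    """
    regions = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a coordinate line
        name, chrom = match.group(1), match.group(2)
        start, end = int(match.group(3)), int(match.group(4))
        regions.append({
            "name": name,
            "region": f"{contig_prefix}{chrom}:{start}-{end}",
            "length": end - start + 1,
        })
    return regions


for r in parse_copies(DYF371_COPIES):
    print(r["name"], r["region"], r["length"])
```

Note that the four copies differ in length under this assumption, which is one more reason to compare them copy by copy rather than as a single locus.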

There is interest in DYF371 results for the following 4 sequences.
Z9+: NA20524
Z326+: HG01274, HG01550, NA20524

Can someone do these comparisons??? Thanks!
R1b-Z319+, Z325+, CTS2509+ , L188+ Previously known as the R1b-Z326 null425 cluster
Posts: 287
Joined: Mon Apr 23, 2012 6:15 pm
Posted: Thu Aug 23, 2012 10:20 pm
Wing_Genealogist wrote:I noticed on another forum that someone had teased out STR data on 17 markers from some individuals' 1K results. It appears these results were from Haplogroup I.
I am wondering whether STR data can also be found in the 1K project data for other clades. I am working on the SNPs found in this project below R1b-L48, and even a partial STR result for some of these SNPs would be helpful.

May I bump this thread? I am also interested in whether it is generally possible to extract DYS values from the 1K Genomes Y data.

EDIT 10 Sept.: As I understand it from some sources, the 1K Genomes coverage is probably not high enough to give usable STR information for most of the DYS loci. With high-coverage data (see Complete Genomics) this may change.
DNA/Admixture Central Europe (Alps, Tyrol, Dolomites, Raetia); Y-DNA J2a-L1064, J2a-L210, R1a-M17, R1b-U106 (L48-); mtDNA J1b1b, J1c1d, U5a2b2, U5b1b1. Projects : J2-M172, J2a-PF5197, Alpine DNA, ISOGG Wiki
Posts: 287
Joined: Mon Apr 23, 2012 6:15 pm
Posted: Fri Nov 02, 2012 1:13 pm
I asked Anneleen Van Geystelen (PhD student at KU Leuven, AMY-tree software) whether public Y-chromosome sequences such as those from the 1K Genomes project have enough coverage to extract STR (DYS) values. Her answer:
Anneleen Van Geystelen wrote:even with a high sequencing coverage it is very difficult to predict correctly small insertions and deletions. Therefore, we didn't include them in AMY-tree.

The prediction of STRs is even harder because the repeats make the mapping of the reads to the reference extremely difficult and this mapping is the basis for the variant calling.

So, it is almost impossible to extract the correct STR-DYS values at the moment. I hope that technology evolves fast such that the quality of indel and STR-DYS value predictions will be good enough to incorporate in AMY-tree
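
To illustrate the mapping problem described in the answer above: a read only gives a repeat count if it spans the whole repeat block plus some unique flanking sequence on both sides. Here is a minimal sketch in Python. The flank sequences and the GATA motif are made up purely for illustration and are not the real sequence of any DYS locus. The function name count_repeats is also hypothetical, and real reads would additionally need handling of sequencing errors and reverse-complement orientation.

```python
import re


def count_repeats(read, left_flank, right_flank, motif):
    """Count motif copies between two flanks (hypothetical helper).

    Returns None when the read does not contain both flanks with an
    uninterrupted run of the motif between them, i.e. when the read
    cannot support a call on its own.
    """
    pattern = (
        re.escape(left_flank)
        + "((?:" + re.escape(motif) + ")+)"
        + re.escape(right_flank)
    )
    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // len(motif)


# Illustrative sequences only, not taken from any real marker.
LEFT, RIGHT, MOTIF = "ACGTTGCA", "TTCAGGAC", "GATA"

spanning_read = "CC" + LEFT + MOTIF * 12 + RIGHT + "GG"
truncated_read = LEFT + MOTIF * 9  # read ends inside the repeat block

print(count_repeats(spanning_read, LEFT, RIGHT, MOTIF))   # 12
print(count_repeats(truncated_read, LEFT, RIGHT, MOTIF))  # None
```

The second read shows the difficulty: it carries nine copies of the motif, but because it ends inside the repeat there is no way to tell from that read alone how long the allele really is. With short reads and long repeats, many reads end up in that situation.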
Posts: 123
Joined: Wed Mar 14, 2012 9:00 pm

YDNA:
J1-L858
MtDNA:
HV1a'b'c
Posted: Fri Nov 02, 2012 1:27 pm
Thank you Chris for posting this answer!
Paternal: J1-L858 (Z643+ P58+ L858+ L620+ L572+ L147.1+ L817+ Z644- Z642- Z641- Z640- M369- M368- M367- L93- L92- L897- L859- L829- L65.2- L616- L615- L585- L222.2- L174- L1279- L1253- YSC0000076-)
FTDNA Kit: E13058, Full Genomes Kit: FG1014
Maternal: HV1a'b'c
Father's Maternal: T1*
Mother's Paternal: R1b-Z142 (Z49+ Z142+ L2+ Z367- M228.2- L562- L553- L552- L21- L20- L196- L193-)
FTDNA Kit: E14371
Posts: 182
Joined: Wed Mar 14, 2012 12:30 pm

YDNA:
Z1297*
MtDNA:
J1c5a*
Posted: Fri Nov 02, 2012 2:16 pm
40-70% of STR loci can be extracted from low- and medium-coverage data :)
M102+ Project

---
Grandpa: R1a-L366
Posts: 284
Joined: Wed Mar 14, 2012 2:51 pm
Location: Russia
YDNA:
R1a [CTS3402]
MtDNA:
U4a2g (FMS)
Posted: Fri Nov 02, 2012 5:42 pm
Maximus wrote:40-70% of STR-loci can be extracted in low and medium coverage :)

Yes. That is a realistic figure. :)
Posts: 182
Joined: Wed Mar 14, 2012 12:30 pm

YDNA:
Z1297*
MtDNA:
J1c5a*
Posted: Fri Nov 02, 2012 5:50 pm
Cofgene wrote:Does the DYF371 missing T allele confirm the null425 status?

No, there is a T-to-C mutation in all DYS371 copies.
Posts: 287
Joined: Mon Apr 23, 2012 6:15 pm
Posted: Sat Nov 03, 2012 1:40 am
Maximus wrote:40-70% of STR-loci can be extracted in low and medium coverage :)

That sounds good, but also like a lot of manual work on the raw data. Is there a tutorial online somewhere on how to do this? I think that if all the major haplogroup admins could do this, it would help new SNP testing in the projects very much.

Posts: 143
Joined: Sun Mar 18, 2012 6:26 pm
Location: Paris region
YDNA:
G2a2b2a1b1a2a-CT4803
MtDNA:
H2a2a1 (rCRS)
Posted: Sat Nov 03, 2012 5:14 pm
ChrisR wrote:I asked Anneleen Van Geystelen (PhD student at KU Leuven, AMY-tree software) whether public Y-chromosome sequences such as those from the 1K Genomes project have enough coverage to extract STR (DYS) values. Her answer:
Anneleen Van Geystelen wrote:even with a high sequencing coverage it is very difficult to predict correctly small insertions and deletions. Therefore, we didn't include them in AMY-tree.

The prediction of STRs is even harder because the repeats make the mapping of the reads to the reference extremely difficult and this mapping is the basis for the variant calling.

So, it is almost impossible to extract the correct STR-DYS values at the moment. I hope that technology evolves fast such that the quality of indel and STR-DYS value predictions will be good enough to incorporate in AMY-tree

First, what is AMY-tree?
Second, I don't really understand what the issue is. The commercial companies report results for many Y-chromosome STRs with very high accuracy, and they do not seem to have any problem mapping the STR reads to get the number of repeats for a known STR.

Maximus wrote "40-70% of STR-loci can be extracted in low and medium coverage". I suppose he was talking about the chance of discovering an unknown STR in autosomal DNA, but was he talking about the same thing as Anneleen Van Geystelen?