We are happy to announce that our paper "Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls" is now published online in Nature Biotechnology at http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.2835.html. If you use our NIST-GIAB high-confidence benchmark calls, we ask that you cite this paper rather than (or in addition to) our arXiv preprint.
Thank you to everyone who attended our workshop on January 27-28, 2014 at Stanford University. Your contributions to Genome in a Bottle are very much appreciated! Below is a summary of the topics discussed at the workshop, including links to slides with more details.
Update and Consortium Progress
- NIST plans to release its pilot whole genome Reference Material (RM 8398) by May 2014. It is based on a large batch of ~8000 vials, each containing 10 µg of NA12878 DNA from Coriell.
I wanted to let you know that a gold standard variant data set is now available for NA12878, NA12877, and their 11 offspring. RTG developed it based on the consistency of the variant calls with the haplotype phases inferred for this family. Also available now is the free rtgTools package, which allows comparison of VCF calls against a gold standard.
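For readers who want a feel for what comparing VCF calls against a gold standard involves, here is a minimal Python sketch. It is not how rtgTools works: it simply matches records on chromosome, position, REF, and ALT, ignores genotypes, and does not reconcile different representations of the same complex variant. The function names `read_variants` and `compare_calls` are made up for this illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is identified here only by (chrom, pos, ref, alt).
Variant = Tuple[str, int, str, str]


def read_variants(lines: Iterable[str]) -> Set[Variant]:
    """Collect (chrom, pos, ref, alt) keys from VCF text lines.

    Header lines (starting with '#') and blank lines are skipped.
    Multi-allelic records are split into one key per ALT allele.
    """
    variants: Set[Variant] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            if alt != ".":  # '.' means no alternate allele was called
                variants.add((chrom, pos, ref, alt))
    return variants


def compare_calls(truth: Set[Variant], query: Set[Variant]) -> Dict[str, int]:
    """Count exact-match true positives, false positives, and false negatives."""
    return {
        "tp": len(truth & query),  # called and in the gold standard
        "fp": len(query - truth),  # called but not in the gold standard
        "fn": len(truth - query),  # in the gold standard but not called
    }
```

To use it, pass the lines of each VCF file (for example, from an opened file handle) to `read_variants`, then pass the two resulting sets to `compare_calls`. Because exact matching penalizes equivalent variants that are written differently, a dedicated comparison tool is a better choice for real benchmarking.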
Several of you helped us characterize the pilot candidate NIST Reference Material based on NA12878, and we have started uploading that data to our FTP site. This informal interlaboratory study is already teaching us a lot about variability within and between labs and instruments. We are now asking for volunteers to characterize the next set of 4 candidate NIST Reference Materials from the Personal Genome Project.
In preparation for our upcoming Genome in a Bottle Consortium workshop on January 27-28 at Stanford University, we welcome nominations for Steering Committee members. To help set the Consortium's priorities, we plan to hold our first Steering Committee meeting at the end of the workshop, from ~4pm to 5:30pm on January 28. If you would like to nominate yourself or someone else, please respond to this email or email Justin Zook and Marc Salit. Registration and an agenda for the workshop will be posted tomorrow.
We’ve uploaded a new version (v2.18) of our NIST highly confident SNP and indel genotypes for NA12878 to the GIAB FTP site at NCBI (ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/variant_calls/NIST). This new version contains some refinements around complex variant calling and includes an Ion Proton whole genome sequencing dataset in the integration.
For those of you using or interested in using AWS to analyze GIAB data in the cloud, we are happy to announce that data from the Genome in a Bottle FTP site at NCBI is mirrored on Amazon S3 in the bucket s3://giab. Please let us know if you have any questions or suggestions.
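As a rough illustration of browsing the mirror without any AWS tooling, the Python sketch below builds a standard S3 ListObjectsV2 request URL for the bucket and parses the returned XML. It assumes the bucket permits anonymous listing, which this announcement does not state, and the prefix `data/NA12878/` used in the usage note is a guess at the layout rather than a confirmed path. The helper names `listing_url` and `list_keys` are invented for this example.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode
from urllib.request import urlopen

BUCKET = "giab"  # bucket name from the announcement (s3://giab)
S3_NAMESPACE = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def listing_url(prefix: str, max_keys: int = 100) -> str:
    """Build a ListObjectsV2 URL for keys under the given prefix."""
    query = urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{BUCKET}.s3.amazonaws.com/?{query}"


def list_keys(prefix: str, max_keys: int = 100) -> List[str]:
    """Fetch one page of object keys under the prefix.

    Assumption: the bucket allows unauthenticated listing. If it does not,
    this call raises urllib.error.HTTPError.
    """
    with urlopen(listing_url(prefix, max_keys)) as response:
        root = ET.parse(response).getroot()
    return [key.text for key in root.findall("s3:Contents/s3:Key", S3_NAMESPACE)]
```

Calling `list_keys("data/NA12878/")` would return up to 100 object keys if that prefix exists and anonymous listing is allowed. Only one page of results is fetched; larger listings would need to follow the continuation token in the response.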
Jason Wang from Gene by Gene (formerly from Arpeggi, Inc.) recently uploaded the new version 2.18 release of the NIST Genome in a Bottle highly confident SNP and indel genotypes to the freely available GCAT website (www.bioplanet.com/gcat).