Simons Foundation and WuXi NextCODE Put World’s Largest Autism Dataset Online to Spur New Era of Massive Genomics Research
Mar 10, 2016
The Simons Simplex Collection is now accessible in the cloud with WuXi NextCODE clinical discovery analytics, uniting the genome and the internet for real-time queries and collaboration
- Comprising 10,000 exomes from families with one child with an autism spectrum disorder, the SSC is the largest dataset ever to be made fully usable over an ordinary internet connection
- SSC data can instantly be used in tandem with other major datasets around the world, as a standard reference and the hub for autism genomics research of unprecedented scale
SHANGHAI; CAMBRIDGE, Massachusetts; REYKJAVIK; and NEW YORK, 10 March 2016 – WuXi NextCODE, the global genomic information and precision medicine company, and the Simons Foundation today announced that the Simons Simplex Collection (SSC) is now accessible on WuXi NextCODE’s integrated cloud-based database, interpretation and discovery system, the WuXi NextCODE Exchange. The WuXi NextCODE SSC portal has been inaugurated by autism researchers from seventeen leading institutions from the US, Canada, China, France, Iceland, Austria, Ireland, Brazil and Qatar, and from today will be open to researchers worldwide. Those interested can apply for training and access via the WuXi NextCODE SSC signup or by writing to email@example.com.
The SSC comprises genomic sequence and detailed clinical phenotypic data from nearly 2,600 families with one child with an autism spectrum disorder (ASD) and unaffected parents and siblings. It is a fundamental resource for advancing the understanding of ASDs and one of the largest focused collections of genome sequence data anywhere. The computational efficiency of WuXi NextCODE’s database architecture makes it possible to utilize data of this scale online. SSC users can now directly interrogate individual genomes, families, or the entire collection using WuXi NextCODE’s integrated clinical discovery tools. They can also tap into both GATK and FreeBayes variant calls for all samples; view findings with always-on visualizations backed by normalized global reference data; and collaborate with colleagues, all without having to move or download the data files. The data is stored in WuXi NextCODE’s elastically scalable, HIPAA-compliant cloud powered by DNAnexus.
“The SSC was conceived and has succeeded as a large-scale, open-access discovery engine. We are excited to be partnering with WuXi NextCODE to realize the next phase in the SSC’s potential by making it directly accessible to the autism community worldwide,” said Dr. Louis Reichardt, Director of the Simons Foundation Autism Research Initiative (SFARI). “Usable online, it can serve as the hub of a network of major autism datasets and virtual cohorts of ever greater power, and we invite everyone in the field to take advantage of it.”
“We are thrilled and proud to be working with the Simons Foundation to inaugurate what we see as a new era of global collaboration to better understand, diagnose and address ASDs,” said Hannes Smarason, COO of WuXi NextCODE. “Putting the SSC on the Exchange is a landmark in creating a working internet of DNA, and it is appropriate that it should address autism. Unravelling its complexity demands the creation of truly vast datasets, and we look forward to working with the Foundation and the autism community to refine this resource, bring in whole genome data, and continue to expand the scale, scope and reach of the SSC in pursuit of this goal.”
“This is a game-changer and we are already using the WuXi NextCODE SSC portal to validate and extend new discoveries and confirm clinical diagnoses,” said Dr. Timothy Yu, a clinician and assistant professor of neurology at Boston Children’s Hospital. “This is the way the investment in big genomics is going to deliver on its potential to accelerate our understanding of autism and many other complex conditions. Since we and a growing number of our collaborators have our research and diagnostic data in GOR format, we are already seeing the impact of big virtual cohorts for rapidly advancing the field and are inviting our collaborators to do the same.”
The SSC comprises whole-exome sequence data and more than 2,000 phenotypic variables for some 2,600 ASD probands and their parents and unaffected siblings. Among the portal’s features are:
- All SFARIGene and other major ASD gene and variant lists
- All major public reference datasets
- Instant visualization of raw BAM sequence reads
- Variant aggregation to power statistical association of rare variants
- De novo and paralog detection
- Carrier analysis
- Toggle filters for predicted variant impact and allele frequencies
- Histographic selectors and report builders for phenotype definition
- Import and merge functionality to incorporate external datasets
About WuXi NextCODE
WuXi NextCODE is a genomic information company applying sequence data to deliver better health and precision medicine for people around the world. Our uniquely comprehensive open-access capabilities include CLIA- and CAP-certified sequencing; a novel database architecture that mines and manages more genomes than any other; the world’s leading genome interpretation and discovery system, available installed or in the cloud; a pioneering internet of DNA that enables users to query and collaborate using massive genomic datasets online with unrivalled resolution and efficiency; the know-how to apply genomics to optimize drug discovery and development; and a growing range of tests and scans to improve rare disease diagnosis, targeted cancer treatment, and wellness. With offices in Shanghai, Cambridge, Massachusetts and Reykjavik, we serve companies and health systems, clinicians and researchers, and people and populations worldwide. WuXi NextCODE is a subsidiary of WuXi AppTec, the open-access R&D capability and technology platform company serving the pharmaceutical, biotechnology, and medical device industries, with operations in China and the United States.
About the Simons Simplex Collection
The Simons Simplex Collection (SSC) is a unique, rigorously characterized data collection designed to support the discovery of rare, de novo genetic events that increase risk of developing autism spectrum disorders. The collection consists of 2,600 ‘simplex’ families, all of which have one child with autism, unaffected parents, and usually at least one unaffected sibling. SSC biospecimens and phenotypic and genetic data are available to approved researchers via SFARI Base or by emailing firstname.lastname@example.org. Sequencing of SSC data has already yielded 100 candidate genes for autism.
The SSC is supported by SFARI (Simons Foundation Autism Research Initiative), whose mission is to improve the understanding, diagnosis and treatment of autism spectrum disorders by funding innovative research of the highest quality and relevance.
Edward Farmer, NextCODE Health, (781) 775 6206
Stacey Greenebaum, Simons Foundation, (212) 524 6097