Archive by Author | Casey Geaney

Edico Genome’s DRAGEN Bio-IT Platform Now Available on BaseSpace® Sequence Hub

Removing the NGS Analytics Data Bottleneck with Field-Programmable Gate Arrays (FPGAs)

The following is a guest blog, written by our partners at Edico Genome.

Demand for next-generation sequencing (NGS) analysis is growing at an exponential rate, creating a shortage of computing power to analyze the rapidly growing body of data. Current projections [1] estimate that genomic data will continue doubling every seven months, a stark acceleration in comparison to Moore’s Law, under which CPU capabilities double every two years (Figure 1, below). The gap between the two creates a bottleneck for genomics labs.

Figure 1
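
As a rough illustration of the gap described above, the following Python sketch compares the two doubling rates quoted in the text (seven months for genomic data, two years for CPU capability). The function name and the seven-year horizon are illustrative choices, not figures taken from the cited projections.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Multiplicative growth after elapsed_months, given a fixed doubling time."""
    return 2 ** (elapsed_months / doubling_months)


# Doubling times quoted in the text above.
DATA_DOUBLING_MONTHS = 7   # genomic data
CPU_DOUBLING_MONTHS = 24   # Moore's Law, as stated in the text

# Illustrative horizon (an assumption, not from the source): seven years.
horizon_months = 7 * 12

data_growth = growth_factor(horizon_months, DATA_DOUBLING_MONTHS)
cpu_growth = growth_factor(horizon_months, CPU_DOUBLING_MONTHS)

print(f"Data volume grows about {data_growth:,.0f}x")
print(f"CPU capability grows about {cpu_growth:.1f}x")
print(f"Gap between the two: about {data_growth / cpu_growth:,.0f}x")
```

Under these assumptions, data volume would grow by a factor of 4,096 over seven years while CPU capability would grow by a factor of roughly 11, which is the kind of widening gap the paragraph above describes.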

Providing an alternative to traditional CPU-based systems, Edico Genome’s DRAGEN™ (Dynamic Read Analysis for Genomics) Platform uses field-programmable gate array (FPGA) technology to provide customers with hardware-accelerated implementations of genome pipeline algorithms. With FPGAs, DRAGEN allows customers to analyze NGS data at unprecedented speeds with extremely high accuracy [2] onsite, in the cloud, or through a blended hybrid cloud.

BaseSpace Sequence Hub, hosted on Amazon Web Services, enables the cloud-based deployment of the Edico Genome DRAGEN pipeline. Edico Genome’s DRAGEN Genome Pipeline is now readily available, enabling rapid analysis of whole genome sequencing and targeted resequencing panels.

How does DRAGEN work?

Unlike conventional CPU-based systems, which perform an algorithmic function by inefficiently executing lines of software code one after another, FPGAs implement these algorithms as logic circuits, providing an output almost instantaneously. In addition, these logic circuits are replicated thousands of times, allowing for massive parallelism, unlike CPUs, which are limited to running only one task per core. FPGAs are also fully reconfigurable, enabling customers to switch between functions and pipelines within seconds (see Figure 2).

Figure 2

As a result, FPGA-based solutions, such as DRAGEN, can deliver high accuracy while functioning with exceptional speed, efficiency, and parallelism.

DRAGEN can process an entire human genome at 30x coverage in about 20 minutes, as compared to over 20 hours using a traditional CPU-based system [3]. Edico Genome’s partnership with Rady Children’s Institute of Genomic Medicine is a testament to how DRAGEN’s FPGA platform is revolutionizing genomic testing. Using DRAGEN, Dr. Stephen Kingsmore set a Guinness World Record in 2016 for the Fastest Genetic Diagnosis by successfully diagnosing a critically ill newborn in 26 hours [4].

Such results are consistent among DRAGEN customers and are now readily available directly through BaseSpace Sequence Hub with the DRAGEN Genome Pipeline. Other widely adopted pipelines, for transcriptomes, methylation, cancer, and more, will all be made available on BaseSpace Sequence Hub.
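
To put the runtime figures quoted above in perspective, here is a minimal back-of-the-envelope sketch in Python. It treats "over 20 hours" as exactly 20 hours (so the speedup is a lower bound) and assumes genomes are processed one after another on a single system, ignoring data transfer and queueing. The function names are illustrative only.

```python
def speedup(baseline_minutes: float, accelerated_minutes: float) -> float:
    """How many times faster the accelerated run is than the baseline."""
    return baseline_minutes / accelerated_minutes


def genomes_per_day(minutes_per_genome: float) -> float:
    """Genomes one system could process in 24 hours, run back to back."""
    return 24 * 60 / minutes_per_genome


CPU_MINUTES = 20 * 60   # "over 20 hours", treated here as a lower bound
FPGA_MINUTES = 20       # "about 20 minutes"

print(f"Speedup: at least {speedup(CPU_MINUTES, FPGA_MINUTES):.0f}x")
print(f"Genomes per day at 20 minutes each: {genomes_per_day(FPGA_MINUTES):.0f}")
print(f"Genomes per day at 20 hours each: {genomes_per_day(CPU_MINUTES):.1f}")
```

With these simplifications, the quoted runtimes correspond to a speedup of at least 60x, or about 72 genomes per day versus roughly one.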

To learn more about DRAGEN, visit www.edicogenome.com/basespace. And to learn more about BaseSpace Sequence Hub, contact us.

For Research Use Only. Not for Use in Diagnostic Procedures.

References

  1. Stephens ZD, Lee SY, Faghri F, et al. Big Data: Astronomical or Genomical? PLoS Biology. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4494865/. Published July 2015. Accessed July 31, 2017.
  2. Miller NA, Farrow EG, Gibson M, et al. A 26-hour system of highly sensitive whole genome sequencing for emergency management of genetic diseases. Genome Medicine. https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-015-0221-8. Published September 30, 2015. Accessed July 31, 2017.
  3. Miller NA, Farrow EG, Gibson M, et al. A 26-hour system of highly sensitive whole genome sequencing for emergency management of genetic diseases. Genome Medicine. https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-015-0221-8. Published September 30, 2015. Accessed July 31, 2017.
  4. Edico Genome. Dr. Stephen Kingsmore Sets Guinness World Records Title for Fastest Genetic Diagnosis. PR Newswire. http://www.prnewswire.com/news-releases/dr-stephen-kingsmore-sets-guinness-world-records-title-for-fastest-genetic-diagnosis-300256566.html. Published 2017. Accessed June 21, 2017.

Introducing TruSeq® Amplicon 3.0

Also co-authored by Eric Allen.

Recent advancements in Illumina TruSeq Amplicon technology enable higher multiplexing of amplicons in a single assay. Combined with next-generation sequencing (NGS) from Illumina, this lets users perform high-throughput, high-sensitivity genotyping experiments on Illumina sequencers. The new TruSeq Amplicon 3.0 BaseSpace® Sequence Hub App introduces major improvements to support a variety of amplicon sequencing applications, including the recently launched TruSeq Genotype Ne product, a fully customizable targeted genotyping by sequencing (GBS) solution. Key GBS features of TruSeq Amplicon 3.0 include:

  • Support for custom reference genomes, allowing a user to analyze amplicon data against their choice of FASTA file (previously uploaded to Sequence Hub).
  • Genotypes of Interest reporting, allowing a user to generate a tabular report of genotypes for each sample, which is analogous to genotyping array outputs.

An example of the Genotypes of Interest feature can be found in the example Project below. The Input VCF (Variant Call Format) file in this Project (found in the test_NA12878_GOI output files) can be used as a template and customized for use with other datasets.
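
To make the idea of a tabular genotype report more concrete, the sketch below shows one way to turn the genotype (GT) fields of a standard VCF into per-sample allele calls, similar in spirit to a genotyping array output. It is not the app's implementation and does not reproduce the Input VCF template from the example Project. It assumes only the generic VCF column layout, and the sample names, variant IDs, and positions are hypothetical.

```python
import re


def genotype_table(vcf_text):
    """Return (sample_names, rows) where each row is
    (chrom, pos, variant_id, {sample: allele_call}) for one VCF record."""
    samples, rows = [], []
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow FORMAT
            continue
        chrom, pos, variant_id, ref, alt = fields[:5]
        alleles = [ref] + alt.split(",")  # index 0 = REF, 1.. = ALT alleles
        gt_index = fields[8].split(":").index("GT")
        calls = {}
        for sample, entry in zip(samples, fields[9:]):
            gt = entry.split(":")[gt_index]
            indices = re.split(r"[/|]", gt)  # unphased "/" or phased "|"
            calls[sample] = "/".join(
                "." if i == "." else alleles[int(i)] for i in indices
            )
        rows.append((chrom, int(pos), variant_id, calls))
    return samples, rows


# Hypothetical two-sample VCF, built with explicit tab separators.
HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(HEADER + ["sampleA", "sampleB"]),
    "\t".join(["chr1", "1000", "snp1", "A", "G", ".", "PASS", ".", "GT", "0/1", "1/1"]),
    "\t".join(["chr2", "2500", "snp2", "C", "T", ".", "PASS", ".", "GT:DP", "0/0:35", "./.:0"]),
])

samples, rows = genotype_table(EXAMPLE_VCF)
print("\t".join(["ID"] + samples))
for chrom, pos, variant_id, calls in rows:
    print("\t".join([variant_id] + [calls[s] for s in samples]))
```

Each printed row holds one variant and each column one sample, with missing calls shown as "./.".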

We also upgraded the TruSeq Amplicon 3.0 analysis pipeline to include:

  • Improved alignment and variant calling. Note that we have removed the outdated GATK v1.6 and Starling germline variant callers, replacing them with the new Pisces germline variant caller (optimized for amplicon data).
  • Improved variant annotation engine
  • Improved QC metrics engine
  • Support for up to 384 samples in a single analysis run

The improved small variant calling using the Pisces germline caller is demonstrated below for two data sets:

  • Coriell sample NA12878 with an internally developed panel that contains several challenging indels, sequenced on NextSeq®.
  • Coriell sample NA12878 with the TruSight® Myeloid Panel, sequenced on MiSeq®.

NextSeq® 550: TruSeq Amplicon (Replicates of NA12878)

https://basespace.illumina.com/s/tMYeNm6DLbaS

| VCF | BED | SNV Recall | SNV Precision | Indel Recall | Indel Precision |
|---|---|---|---|---|---|
| TSA 2-0 – GATK | TSAVP A v3 | 98.91% | 98.91% | 91.30% | 100.00% |
| TSA 2-0 – Starling | TSAVP A v3 | 94.57% | 100.00% | 43.48% | 86.96% |
| TSA 3-0 – Pisces | TSAVP A v3 | 98.91% | 100.00% | 93.48% | 100.00% |

MiSeq™ v3: TruSight® Myeloid (Coriell & HorizonDx, Pool 1)

https://basespace.illumina.com/s/6r2D3PnfkKQS

| VCF | BED | SNV Recall | SNV Precision | Indel Recall | Indel Precision |
|---|---|---|---|---|---|
| NA12878S6-TSAv2 | TruSight Myeloid v1.0 | 97.22% | 92.11% | 25.00% | 28.57% |
| NA12878S6-TSAv3 | TruSight Myeloid v1.0 | 94.44% | 100.00% | 87.50% | 100.00% |

  • Variant calls were compared by analyzing the same sample replicate with the TruSeq Amplicon v2.0 and v3.0 apps, and then using the Variant Calling Assessment Tool v3.0 app for accuracy assessment vs Platinum Genomes gold reference variant calls.
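
For readers less familiar with the recall and precision columns above, the following Python sketch shows the standard definitions of the two metrics. The source does not describe how the Variant Calling Assessment Tool computes them internally, so this is only the textbook calculation. The counts used are hypothetical and chosen for illustration; they are not taken from the runs above.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of reference (truth set) variants that the caller found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of the caller's variants that are in the reference set."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts (not from the source): 46 reference indels,
# of which 43 were called, 3 were missed, and no extra calls were made.
tp, fn, fp = 43, 3, 0

print(f"Indel recall:    {recall(tp, fn):.2%}")
print(f"Indel precision: {precision(tp, fp):.2%}")
```

Recall drops when true variants are missed, while precision drops when the caller reports variants that are absent from the reference set, so a caller needs to score well on both to be useful.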

We hope this update enables you to discover new insights. Stay tuned for more app announcements, and let us know if you have any questions.

Advancing cancer research with new BaseSpace® Sequence Hub Apps 

Analyzing the genetic basis of a given tumor is important for understanding the progression of cancer and developing new methods of treatment. Cancer researchers use a variety of methods, but none of them efficiently covers all of the variations present in our genes. To help researchers address this challenge, Illumina offers TruSight® Tumor 170, a next-generation sequencing (NGS) assay designed to cover 170 genes associated with cancer.

To help TruSight Tumor customers analyze data from this assay, we are excited to announce two new apps in BaseSpace Sequence Hub:

  • TruSight Tumor 170 App
  • TruSight Tumor 170 + Watson for Genomics Converter App

Additionally, we have made a significant update to the Tumor Normal app. All three apps expand our portfolio of cancer research applications by delivering advanced new methods for NGS data generation and analysis.

TruSight Tumor 170 App

The TruSight Tumor 170 app enables streamlined analysis of samples prepared using the TruSight® Tumor 170 library prep kit. This comprehensive somatic panel targets 170 genes and is based on hybrid capture technology optimized for formalin-fixed, paraffin-embedded (FFPE) samples. By using both DNA and RNA input sample pairs, TruSight Tumor 170 can detect small variants (single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels)), amplifications, structural variants (gene fusions), and splice variants. The TruSight Tumor 170 app performs alignment and variant calling for all variant types in a single analysis workflow and can analyze up to 16 samples (both DNA and RNA) in a single run.

TruSight Tumor 170 + Watson for Genomics Converter App

To efficiently extract information from a TruSight Tumor 170 sample, Illumina has partnered with IBM Watson for Genomics to expedite variant analysis. By leveraging natural language processing (a form of artificial intelligence), Watson for Genomics delivers an annotated, prioritized variant summary containing curated information on the significance of variants detected by sequencing. This curated information includes drug guidelines, clinical trials, and literature matches. TruSight Tumor 170 customers have the option to purchase add-on access to Watson for Genomics when buying the library prep kit. The TruSight Tumor 170 + Watson for Genomics app converts the output from the standard TruSight Tumor 170 app into variant calling files with a format suitable for upload into the Watson for Genomics portal. This app does not have a compute cost, but upload into Watson for Genomics requires purchase of the add-on license.

Tumor Normal App

Lastly, we have updated the Tumor Normal app to version 4.0. This app has improved performance, while providing updated variant callers for more accurate detection of somatic variants. As with version 3.0, the Tumor Normal app can detect small variants (SNPs and InDels), structural variants (gene fusions), and copy number variants (amplifications and deletions).

For more information, contact us.

For Research Use Only. Not for use in diagnostic procedures.