This update includes a recent version of hap.py, which (in combination with vcfeval) has been selected by the GA4GH as the recommended tool for small variant call benchmarking. For more details, see the publication at: https://www.ncbi.nlm.nih.gov/pubmed/30858580
The upgraded version of the MiLaboratories LLC flagship software product, MiXCR, is now officially available as an Illumina BaseSpace™ application.
MiXCR is a “gold standard” analytical package for T-cell receptor (TCR)
and immunoglobulin (IG) repertoire profiling. Analysts use MiXCR to
extract immune repertoires from any type of sequencing data with any level of TCR/IG
coverage, ranging from perfectly enriched libraries such as multiplex
PCR or targeted 5’RACE, to the “rare event”
datasets containing several target entities among hundreds of millions of reads,
such as RNA-Seq, and even Exome-Seq data.
In the new version, simple selection of species (Human or Mouse), template
material (RNA or DNA), library type (targeted or random), and basic information
on library preparation enables appropriate analysis settings for a variety of immune
repertoire experiment scenarios:
Either full-length VDJ or CDR3-only repertoires can be extracted for the TCR or IG chains of interest, with or without out-of-frame and stop codon-containing clonotype variants.
The application also provides post-analysis metrics in the form of interactive reports.
High extraction efficiency for any type of sequencing data and superb accuracy should make the new MiXCR BaseSpace version a highly useful resource for many. The ability to review basic parameters and immediately download the resulting figures for reports and publications makes it convenient for everyday work on immune repertoires.
Analysis of amplicon data
Recommended settings for panels (a) Immune repertoire panel and (b) TCR-beta SR panel:
Starting material: RNA for panel (a), RNA or Genomic DNA for panel (b)
Author: Eric Allen, Associate Director of Bioinformatics at Illumina
As part of the new DRAGEN v3.4 launch, the Illumina software development team has released a new BaseSpace-exclusive DRAGEN app: DRAGEN Enrichment v3.4. Combining the best of DRAGEN with Illumina’s legacy Enrichment 3 App, the DRAGEN Enrichment app provides ultra-rapid analysis and improved accuracy, all at a lower cost per sample.
The DRAGEN Enrichment app is the preferred method for analyzing enrichment data with DRAGEN, delivering a full suite of enrichment-specific metrics and reporting.
Here is what to know:
The DRAGEN Enrichment App is faster and more accurate than the Enrichment (Isaac/Starling) and BWA Enrichment (BWA/GATK) apps, as demonstrated via the visuals below
Small variant calling – The app includes germline and somatic (low-frequency) small variant calling (tumor only); outputs VCF and gVCF in the same analysis
Note: Tumor-normal analysis can be conducted by first running the DRAGEN Enrichment app on all normal and tumor samples, and then running the DRAGEN Somatic app on the resulting BAM files for the Tumor/Normal pairs.
Copy number variant (CNV) calling – utilizes CNV baseline files based on a panel of normals
Structural variant calling
Enrichment metrics generated:
Read/base enrichment padded/unpadded
% bases covered at 1x, 10x, 20x, 50x
Picard HsMetrics enabled by checkbox
Variety of reference options supported, including hg19, GRCh38 and custom references
Includes built-in targeted region BED files for common enrichment panels, and accepts custom targeted region BEDs
Reports available in-browser and as PDFs and CSVs
Single sample and aggregate reports
Integrated variant annotation (Nirvana) and variant browser
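To make the "% bases covered at 1x, 10x, 20x, 50x" metric listed above more concrete, here is a minimal sketch of how such a figure can be derived from per-base depths. It is not the app's implementation: the function name `percent_bases_covered` and the toy depth values are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, Sequence


def percent_bases_covered(
    depths: Sequence[int],
    thresholds: Iterable[int] = (1, 10, 20, 50),
) -> Dict[int, float]:
    """For each threshold, return the percentage of target bases with depth >= threshold."""
    if not depths:
        raise ValueError("depths must contain at least one target base")
    total = len(depths)
    return {
        t: 100.0 * sum(1 for d in depths if d >= t) / total
        for t in thresholds
    }


# Toy per-base depths for a hypothetical 8-base target region (not real data).
toy_depths = [0, 3, 12, 25, 25, 60, 75, 9]
coverage = percent_bases_covered(toy_depths)
for threshold, pct in coverage.items():
    print(f"% bases covered at {threshold}x: {pct:.1f}")
```

In practice the depths would come from the aligned reads over the targeted regions in the BED file; the toy list only stands in for that input.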
The improved small variant calling over other available BaseSpace app solutions is shown below for one replicate of Coriell sample NA12878 with 106x depth:
CNV calling is also enabled in the DRAGEN Enrichment app. The screenshot below from IGV shows a 937,697 bp CNV loss found in a melanoma cancer sample (Me01/ERR174231) around the chromosomal region chr9:125239269-126176965. The sample data was obtained from NCBI’s Sequence Read Archive (accession ERR174231) using the SRA Import BaseSpace App.
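As a quick sanity check, the quoted event size can be recomputed from the region coordinates. The sketch below assumes the coordinates are 1-based and inclusive (the post does not state the convention, but that reading reproduces the quoted size); `region_length` is a hypothetical helper, not part of any DRAGEN or IGV tooling.

```python
import re


def region_length(region: str) -> int:
    """Length in bp of a region such as 'chr9:100-200', assuming 1-based inclusive coordinates."""
    match = re.fullmatch(r"([^:]+):([\d,]+)-([\d,]+)", region)
    if match is None:
        raise ValueError(f"unrecognized region: {region!r}")
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if end < start:
        raise ValueError("region end precedes region start")
    # Inclusive coordinates: both endpoints count toward the length.
    return end - start + 1


print(region_length("chr9:125239269-126176965"))  # 937697
```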
Somatic/low-frequency variant calling is also enabled. The table below demonstrates the usefulness of this somatic calling tool:
We’ve also incorporated many of the comprehensive metrics and reporting features built into the legacy Enrichment 3.1.0 app, including read-, base-, and target-level enrichment metrics, as well as the variant table for simple variant call browsing and filtering.
We hope this update enables you to discover new insights. Stay tuned for more app announcements, and let us know if you have any questions.
FOR RESEARCH USE ONLY. NOT FOR USE IN DIAGNOSTIC PROCEDURES.
We’ve been busy over the last few months! Back in May, Illumina announced the acquisition of Edico Genome and the DRAGEN™ (Dynamic Read Analysis for GENomics) technology. Since then, we have been hard at work expanding DRAGEN’s capabilities to provide more advanced, robust and performant pipelines for our customers. With the inclusion of DRAGEN into the Illumina ecosystem, we are now able to take advantage of the expertise of both teams to build out an expanded chest of tools that offer added functionality, benefits and ease-of-use.
The team has come a long way since we last published about DRAGEN on the BaseSpace™ Blog, and we are excited to share some insight into what we have been working on. Over the coming months, we will continue to post about our latest updates and activities to keep you informed.
Earlier this month, we released DRAGEN v3.2.8, which introduces a variety of new capabilities designed to deliver more insights from your data.
Join us for our upcoming webinar: High-volume sequence analysis with BaseSpace™ Sequence Hub and Edico DRAGEN apps, on Dec 13 at 10AM (PT)
The latest sequencing technologies enable unprecedented throughput and redefine limits for many labs. To adapt, these labs must redefine how they work – by automating tasks to reduce touchpoints and by simplifying workflows with integration and robust analysis tools.
In this webinar, we describe BaseSpace™ Sequence Hub and how the newest features support high throughput, high-volume sequencing. We demonstrate how customers can progress from flowcell loading to variant analysis with zero touchpoints by using the Whole Genome Sequencing or Edico DRAGEN apps. Additionally, we describe how the integration with BaseSpace™ Variant Interpreter enables users to interpret and generate reports of identified variants.
Bioinformatics tools are a key component in the Next-generation Sequencing (NGS) workflow and can have a significant impact on the results. Alignment and variant calling, in particular, involve complex algorithms, each with unique strengths and weaknesses. The Broad Institute’s BWA+GATK application is among the most popular, but over the last few years more alignment+variant calling methods have been released by companies including Illumina, Edico Genome, and Sentieon. With the emergence of multiple methods comes a clear need to compare their results so that people who use these tools can select the best one for their purpose.
The new Hap.py app available on BaseSpace Sequence Hub enables users to compare diploid genotypes at the haplotype level by generating and matching alternate sequences in a small region of the genome that contains one or more variants. Hap.py makes it easy to compare any variant call set against a range of packaged gold-standard truth sets [1,2] to perform routine benchmarking.
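Comparisons of a call set against a truth set are commonly summarized as precision, recall, and F1. The sketch below shows only that final arithmetic, starting from true-positive, false-positive, and false-negative counts. It does not reproduce Hap.py's haplotype matching or its interface; the `BenchmarkCounts` class and the toy counts are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkCounts:
    """Counts from comparing a query call set against a truth set."""

    true_positives: int   # truth variants matched by the query
    false_positives: int  # query variants with no match in the truth set
    false_negatives: int  # truth variants missed by the query

    @property
    def precision(self) -> float:
        called = self.true_positives + self.false_positives
        return self.true_positives / called if called else 0.0

    @property
    def recall(self) -> float:
        truth = self.true_positives + self.false_negatives
        return self.true_positives / truth if truth else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Toy counts, invented for illustration only.
toy = BenchmarkCounts(true_positives=90, false_positives=10, false_negatives=30)
print(f"precision={toy.precision:.3f} recall={toy.recall:.3f} F1={toy.f1:.3f}")
```

Reporting both precision and recall matters when comparing pipelines, because a caller can raise one at the expense of the other.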
Removing the NGS Analytics Data Bottleneck with Field-Programmable Gate Arrays (FPGAs)
The following is a guest blog, written by our partners at Edico Genome.
Demand for next-generation sequencing (NGS) analysis is growing at an exponential rate, creating a shortage of computing power to analyze the rapidly growing body of data. Current projections [1] indicate that genomic data will continue to double every seven months, a stark acceleration in comparison to Moore’s Law, which states CPU capabilities will double every two years (Figure 1, below). The resulting gap creates a bottleneck for genomics labs.
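The size of that gap follows directly from the two doubling periods. The sketch below simply compounds each rate over a few time horizons, taking the seven-month and two-year figures from the paragraph above at face value; `growth_factor` is a hypothetical helper name.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplicative growth after `months`, given a fixed doubling period."""
    return 2.0 ** (months / doubling_period_months)


DATA_DOUBLING_MONTHS = 7    # genomic data, per the projection cited above
CPU_DOUBLING_MONTHS = 24    # Moore's Law, as stated above

for years in (1, 2, 5):
    months = 12 * years
    data = growth_factor(months, DATA_DOUBLING_MONTHS)
    compute = growth_factor(months, CPU_DOUBLING_MONTHS)
    print(f"{years} yr: data x{data:.1f}, CPU x{compute:.1f}, gap x{data / compute:.1f}")
```

Because both curves are exponential, the ratio between them is exponential too, so the shortfall widens every year rather than holding steady.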
Providing an alternative to traditional CPU-based systems, Edico Genome’s DRAGEN™ (Dynamic Read Analysis for Genomics) Platform leverages FPGA (Field-Programmable Gate Array) technology to provide customers with hardware-accelerated implementations of genome pipeline algorithms. Leveraging FPGAs, DRAGEN allows customers to analyze NGS data at unprecedented speeds with extremely high accuracy [2] onsite, in the cloud, or through a blended hybrid cloud.
BaseSpace Sequence Hub, hosted on Amazon Web Services, enables the cloud-based deployment of the Edico Genome DRAGEN pipeline. Edico Genome’s DRAGEN Genome Pipeline is now readily available, enabling rapid analysis of whole genome sequencing and targeted resequencing panels.