Rounding out 2014 with new apps for the BaseSpace platform

We look forward to 2015, when we will continue to launch new Apps and support additional applications. In the meantime, we are excited to close out 2014 with the release of three new Illumina Core Apps in BaseSpace:

image

The Amplicon-DS App enables analysis of data generated with the Illumina TruSight Tumor library prep kit. This solution is specifically designed for analysis of all tumor samples, including FFPE. Using targeted TruSeq Amplicon chemistry and a unique, mirrored dual strand (“DS”) assay, researchers can easily detect low frequency somatic mutations. Amplicon-DS also leverages the mirrored dual strand design to reconcile variant calls and capture deamination events due to FFPE, providing confident measurements even in degraded samples.

The Isaac and BWA Enrichment v2.0 Apps add significant functionality over the Enrichment v1.0 Apps. Both Isaac and BWA can now analyze Nextera Rapid Capture Custom panels built in Illumina’s DesignStudio. Isaac Enrichment v2.0 includes Illumina’s own Isaac pipeline for alignment and variant calling. BWA Enrichment v2.0 incorporates the latest aligner, BWA-MEM, which provides improved accuracy (especially when calling structural variants) and increased speed. Both the Isaac and BWA Enrichment v1.0 Apps remain available alongside the v2.0 Apps in BaseSpace Cloud.

In addition to the above Illumina Core Apps, we are also launching a BaseSpace Labs App called FASTQ Toolkit v1.0.

image

This App gives users enhanced control over their data, allowing manipulation of FASTQ files including adapter trimming, quality trimming, length filtering, and down-sampling. Users can now down-sample or quality-trim their data and determine what effect that has on their variants, gene expression results, or bacterial classifications. Users can also assess their sample data with the FastQC App and then use that information to optimize their samples with the FASTQ Toolkit v1.0.

Specs for the FASTQ Toolkit v1.0 are as follows:

Input- BaseSpace samples (maximum 200 GB per analysis) and user-specified parameters that define how the input sample(s) should be processed.

Output- Samples that can be accessed on the “Samples” page of the selected output project. In addition, the App generates a statistics summary file in JSON format that is used to generate the BaseSpace report.

Adapter Trimming- Performed using the approximate matching approach described in TagCleaner. The adapter sequence can be specified separately for the 5′- and 3′-end. Poly-A/T tails are considered repeats of As or Ts at the sequence ends. Trimming them can reduce the number of false positives during database searches, as long tails tend to align well to sequences with low complexity or sequences with tails (e.g. viral sequences) in the database.
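To make the poly-A/T tail definition concrete, here is a minimal Python sketch of trimming runs of As or Ts from the ends of a read. It is an illustration only, not the App's code, and it does not implement TagCleaner's approximate adapter matching. The function name `trim_poly_tails` and the `min_tail` threshold are hypothetical.

```python
def trim_poly_tails(seq, qual, min_tail=5):
    """Trim a run of As or Ts from each end of a read.

    min_tail is a hypothetical threshold: runs shorter than this are
    left alone so that ordinary sequence ending in a few As or Ts is
    not clipped. Returns the trimmed (sequence, quality) pair.
    """
    start, end = 0, len(seq)

    # 3' end: count how far a single-base run of A (or T) extends inward.
    for base in "AT":
        run = 0
        while end - run > start and seq[end - run - 1] == base:
            run += 1
        if run >= min_tail:
            end -= run
            break

    # 5' end: same idea, scanning from the start of the read.
    for base in "AT":
        run = 0
        while start + run < end and seq[start + run] == base:
            run += 1
        if run >= min_tail:
            start += run
            break

    # Sequence and quality strings must stay the same length.
    return seq[start:end], qual[start:end]
```

For example, a read ending in eight As would lose that tail, while a read ending in only three As would be returned unchanged under the default threshold.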

Bases can be trimmed from either the 5′- or 3′-end. Alternatively, reads can be trimmed to a maximum read length. Quality trimming on the 3′-end is also available. Note: Aligners such as BWA and Isaac perform their own trimming internally during alignment. The trimming logic in this App was adapted from BWA.
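For readers curious about what BWA-style 3′ quality trimming looks like, the sketch below follows the running-sum approach used by BWA's read trimming: walk in from the 3′ end, accumulate (threshold minus base quality), and cut where that sum peaks. This is a hedged illustration rather than the App's implementation. The function names, the `min_keep` parameter, and the Phred+33 encoding are assumptions.

```python
def quality_trim_cut(qualities, threshold, min_keep=0):
    """Return the length to keep after 3' quality trimming.

    qualities: list of Phred scores, 5' to 3'.
    threshold: quality cutoff (e.g. 30 for Q30).
    min_keep:  hypothetical floor; the read is never cut shorter than this.
    """
    running = 0            # sum of (threshold - quality) from the 3' end
    best = 0               # largest running sum seen so far
    cut = len(qualities)   # default: keep the whole read
    for i in range(len(qualities) - 1, min_keep - 1, -1):
        running += threshold - qualities[i]
        if running < 0:    # good bases now outweigh the bad ones: stop
            break
        if running > best: # new best cut point: drop bases i..end
            best = running
            cut = i
    return cut


def trim_read(seq, qual, threshold, min_keep=0):
    """Trim a read given a Phred+33 quality string (assumed encoding)."""
    scores = [ord(c) - 33 for c in qual]
    cut = quality_trim_cut(scores, threshold, min_keep)
    return seq[:cut], qual[:cut]
```

A read whose last four bases have qualities 10, 10, 35 and 5 is cut before the first of them at a threshold of 30, because the single Q35 base is not enough to outweigh its low-quality neighbours.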

Down-sampling is performed when only a subset of the sample is needed for an application, such as de novo assembly with memory constraints, or when it is not necessary to process a full sample, like validating an approach at varying levels of genomic coverage.
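One common way to down-sample paired-end data is reservoir sampling, which picks a fixed number of read pairs uniformly at random in a single pass and keeps Read 1 and Read 2 together. The sketch below assumes that approach; the App's actual sampling method is not described here, and the function name and `seed` parameter are hypothetical.

```python
import random


def downsample_pairs(pairs, target, seed=0):
    """Pick `target` read pairs uniformly at random from an iterable.

    pairs:  iterable of (read1, read2) records; mates stay together.
    target: number of pairs to keep.
    seed:   hypothetical parameter so that a run can be reproduced.
    """
    rng = random.Random(seed)
    reservoir = []  # holds (original_index, pair)
    for i, pair in enumerate(pairs):
        if i < target:
            reservoir.append((i, pair))
        else:
            # Replace an existing entry with probability target / (i + 1).
            j = rng.randint(0, i)
            if j < target:
                reservoir[j] = (i, pair)
    # Restore the original read order before returning.
    reservoir.sort(key=lambda item: item[0])
    return [pair for _, pair in reservoir]
```

Because only `target` pairs are held in memory at any time, this style of sampling suits large samples, and repeating it with different targets is one way to test an approach at several coverage levels.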

Filtering- A read pair is removed from the sample only if both of its reads are filtered out. If only one read of the pair is filtered, that mate is replaced by a sequence of Ns (the number of Ns equals the minimum read length) so that the order of pairs in the FASTQ files is preserved, which is necessary for many secondary analysis tools.
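The pair-filtering rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: reads are (sequence, quality) tuples, length is the only filter applied, and the placeholder mate gets the lowest Phred+33 quality character ("!"). The quality value used for the Ns and the function name are assumptions, not taken from the App.

```python
def filter_pair(read1, read2, min_len):
    """Apply the paired-end filtering rule to one read pair.

    Each read is a (sequence, quality) tuple. Returns None when the
    whole pair should be removed, otherwise the (read1, read2) pair
    with any filtered mate replaced by a run of Ns.
    """
    keep1 = len(read1[0]) >= min_len
    keep2 = len(read2[0]) >= min_len

    # Both mates failed the filter: drop the pair entirely.
    if not keep1 and not keep2:
        return None

    # Placeholder of min_len Ns; "!" (Phred 0) is an assumed quality.
    placeholder = ("N" * min_len, "!" * min_len)
    return (read1 if keep1 else placeholder,
            read2 if keep2 else placeholder)
```

Keeping a placeholder instead of deleting the single failed mate means the Read 1 and Read 2 files still contain the same number of records in the same order.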

Nextera Mate-pair conversion- The App supports conversion of Nextera Mate-Pair oriented reads to paired-end oriented reads.
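As a rough illustration of what such a conversion involves, the sketch below assumes that mate-pair reads point outward (reverse-forward) and that converting them to the inward-facing paired-end orientation amounts to reverse-complementing each read and reversing its quality string. The App's exact procedure is not spelled out here, so treat this as an assumption; the function names are hypothetical.

```python
# Translation table for complementing bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mate_pair_to_paired_end(read1, read2):
    """Flip both mates of a pair; each read is a (sequence, quality) tuple.

    Assumption: the conversion is a reverse complement of each read,
    with the quality string reversed so it still lines up base by base.
    """
    (seq1, qual1), (seq2, qual2) = read1, read2
    return ((reverse_complement(seq1), qual1[::-1]),
            (reverse_complement(seq2), qual2[::-1]))
```

Applying the function twice returns the original reads, which is a quick sanity check that sequence and quality stay in step.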

The output of the App contains a set of before and after metrics so you can quickly see the properties of your new data. The table below is an example of the results of down-sampling 2,957,468 read pairs to 500,000 read pairs and at the same time performing quality trimming (< Q30) from the 3′ end of the reads.

image

A read length distribution is also provided, as shown below for Read 1. It shows the distribution of read lengths in your data before and after trimming and allows the user to quickly assess what effect the trimming had on their data.

image

Finally, a read filtering summary is provided as shown below. The summary contains numbers only if an option that turns on read filtering is selected, such as quality trimming (which filters reads shorter than 32 bp).

image

We are very proud of the hard work our team has put into providing these Apps for the NGS community and look forward to an even more exciting 2015.

5 responses to “Rounding out 2014 with new apps for the BaseSpace platform”

  1. nandita says :

    I’ve a question regarding BWA enrichment apps

    I have three independent data sets from 3 different MiSeq runs, all for the same set of 9 samples. I would like to merge these data and then run BWA enrichment app.

    1) My data sets are of different read lengths (two sets are 150×2 and one set is 75×2)

    2) I realise that I will have to rename the reads to conform to Illumina’s FASTQ file format standards (reads must appear to come from the same flowcell ID).

    However, I’d like to know if I can use the variable read lengths together (75×2 and 150×2) in the same fastq file?

    Thanks and regards
    Nandita

    • Eric Allen says :

      Hi Nandita,

      I suggest you try combining your samples into a single new sample, then run the desired analysis app on that sample. We think it should work. If it doesn’t work, perhaps contact tech support via email.

      Thanks,
      Eric

  2. Nandita says :

    Thanks Eric, I tried it and it worked.

    Regarding the new BWA enrichment app (v2.0) – where can I find the differences between BWA enrichment v1.0 and v2.0? In particular, for the same data set, running v1.0 produces a gaps.csv file that reports several gaps, whereas running v2.0 produces an empty gaps.csv file. I was wondering if others have noticed this? Thanks

    • Eric Allen says :

      Hi Nandita,

      We expect the gaps.csv file to be restored in our next release of the BaseSpace Enrichment Apps. Sorry I can’t give you an ETA at this time. Please report any future BaseSpace support issues or questions to Illumina tech support or use the Contact Support form within BaseSpace by clicking on the Help button or going directly to https://support.basespace.illumina.com/ .

      Thanks,
      Eric
