Introducing fast, free alignment and variant calling with the Isaac Human Whole Genome Sequencing App

With the widespread adoption of the HiSeq 2500 and its lightning speed, enabling biologists to quickly and inexpensively extract biological information from sequences has become a critical need (1, 2). However, the management and analysis of large data sets is widely recognized as an obstacle to wider adoption of next-generation sequencing: it requires a large IT investment and bioinformatics expertise to set up, maintain, and run software at reasonable speed, especially for the most demanding applications such as Whole Genome Sequencing (WGS).

To address this, Illumina has developed a user-friendly human WGS analysis workflow that enables scientists with no bioinformatics experience to align and call variants in whole human-genome data 4-6 times faster than existing methods. Combined with Illumina’s PCR-free sample preparation and the HiSeq 2500, the workflow provides a sample-to-answer time of less than 2 days.

In the words of Waibhav Tembe, Ph.D., Director of the Collaborative Bioinformatics Center at TGen, “For whole genome sequencing, the aligner did an awesome job in cutting down the time to align 30x data against human genome and in using available hardware resources effectively.”

With the Isaac Human WGS app in BaseSpace, HiSeq users can now analyze and store WGS data without bioinformatics expertise, Linux experience or IT infrastructure. The workflow is free to use and can be accessed here (access requires a free BaseSpace account).

For those who prefer to keep their data on premises, the workflow is available as part of the HiSeq Analysis Software (HAS), freely available on the Illumina website here. HAS can analyze WGS data in a few hours on a commodity PC, using either a single command line or an easy-to-use graphical user interface.

The component algorithms for the Isaac aligner and Variant Caller are released as open source here so that developers can re-use and improve them. The open source version of the Isaac aligner is not commercially supported and is provided as is under the Illumina Open Source Software License, available here.

Finally, data generated by Illumina’s IGN services uses the Isaac Human WGS workflow.

You can find more details in Table 1 below and in our white paper, available for download here.

Table 1: Isaac Human WGS workflow on premises with the HiSeq Analysis Software. Comparison of analysis metrics with the BWA + GATK workflow showing comparable data is generated 6 times faster.


(1)    Saunders, C. J. et al. (2012) Rapid whole-genome sequencing for genetic disease diagnosis in neonatal intensive care units. Sci. Transl. Med. 4, 154ra135.

(2)    Jones, S. J. et al. (2010) Evolution of an adenocarcinoma in response to selection by targeted kinase inhibitors. Genome Biol. 11, R82.

BaseSpace – Open for Business!

We’re happy to announce the commercial availability of BaseSpace, Illumina’s genomics cloud computing and storage platform.  The commercial release represents a major software update and includes a fully supported BaseSpace Apps store for quick and easy access to Illumina and third-party bioinformatics applications.  As introduced in our previous blog post, BaseSpace e-commerce transactions are based on BaseSpace iCredits, a currency that can be purchased and spent within BaseSpace.

This release also moves BaseSpace out of beta status and includes many new features for both MiSeq and HiSeq users.  Here’s a preview of the new features you will see in the commercial release of BaseSpace:

New BaseSpace Apps Store

The first feature is the BaseSpace Apps store, which is now updated to use iCredits for apps.  Starting today, some apps will charge iCredits, and others will remain free.

Running your data through a new set of software tools is often difficult. Even if the software package is widely used, installing it, formatting your data, and simply evaluating the tools can be overwhelming.  Clicking on an app within the BaseSpace Apps store brings up a dedicated page with a description, screenshots, and, coming soon for some apps, video overviews.  Launching an app is also easy from the Apps store – simply select an app and click on the green button labeled “Launch” to try it out on your own data or on publicly available data in BaseSpace.   We made the Apps store intuitive and easy to use so you can find the right apps for your sequence analysis!

appStoreNew

New BaseSpace Apps

BaseSpace continues to attract a large number of app developers who can rapidly make their bioinformatics tools available to enable genomics research in diverse markets including cancer, microbiology, and genetic disease. We’ve added three new apps to BaseSpace with this release, adding to the existing MiSeq Reporter workflows and the new Isaac alignment and variant calling app.  Be sure to check back soon for future blog posts detailing the functionality of these new apps.  Here’s a preview:

The first is an HLA typing application from Omixon Biocomputing (“HLA Typing”).  From Omixon’s app description: The Omixon app performs state of the art HLA typing with accuracy exceeding that of Sanger based typing and is able to do typing based on targeted data, whole exome or whole genome data.  It works with all known HLA alleles, supports 4, 6 or 8 digit typing, and the resolution depends only on the availability of reference and sequencing data.  Additionally, the app is exceptionally fast and easy to use – typing is done within minutes for targeted data, and in 1-2 hours for whole genome data, and the app requires no configuration.

Another new application is from Elsevier Publishing, the world’s leading publisher of scientific and health information.  In a world first, Elsevier has released a BaseSpace App that enables users to submit data directly into the peer-review publication process.

From Elsevier: The Elsevier BaseSpace App works with Elsevier’s new journal Genomics Data, which provides genome-centric reporting on the increasing volume of genomic and functional genomic data that is without a formal report in the scientific, technical or medical (STM) literature. The goal is for authors to provide concise and highly standardized reports summarizing results of microarray or sequencing studies, such as what is put into the BaseSpace cloud. The genomic and functional genomic data/results are intended to serve as points of record that are enriched with interpretative commentary, validated by peer review and verified by a standards-focused editorial team.

Finally, we are introducing a new Illumina BaseSpace App called PicardSpace, which calculates alignment QC metrics from a BAM file. The app runs the open-source tool Picard, which was developed at the Broad Institute.

PicardSpace is also a great example for developers of a fully functional web app for BaseSpace. The source code is available on GitHub, and anyone can launch the app from the Apps store.

BaseSpace Public Data area

Essential to BaseSpace Apps is the ability to try apps on publicly available shared data in BaseSpace.  Because of this, we’re introducing the Public Data tab in BaseSpace, which enables you to search and filter datasets by research area and by analytical category.  The screenshot below includes the new filtering interface, which will soon be used for filtering apps in the BaseSpace Apps store as well.  Now it’s easy to locate example datasets and match them to the right apps to explore a new analytical workflow or approach.  With BaseSpace’s Public Data area, evaluation of new bioinformatics tools and approaches just became a lot easier!

publicData

Sample Sheet Editing and Run Re-Queuing

One of the most requested features has been the ability to fix sample sheets and re-submit sequencing runs for data analysis in BaseSpace.  Included in this release is a live sample sheet editor that lets you correct errors and validate the final sample sheet – all within BaseSpace.  Once the sample sheet passes validation, the data, along with the final sample sheet, is automatically queued for analysis.

sampleSheet

BaseSpace Download Tool

While we believe that the BaseSpace Apps ecosystem represents a new way to avoid moving, handling, or downloading large sequencing datasets, we recognize that sometimes you just need local access to your data.  While BaseSpace has always allowed you to download your runs or project data, doing so meant relying on your browser’s built-in functionality.  We are now releasing the BaseSpace Downloader, which can handle multiple files simultaneously and resume downloads after a severed internet connection.  The BaseSpace Downloader is available for Windows users, and we will continue to expand the downloader to other platforms in the future.

downloader

New BaseSpace Web resources

The BaseSpace web pages have received a major overhaul, making it easier to find the information you need about BaseSpace.  This includes major updates to the Illumina BaseSpace home pages (http://www.illumina.com/software/basespace.ilmn), the BaseSpace login page (http://www.basespace.com) and the BaseSpace developer portal (https://developer.basespace.illumina.com/).  On these pages you’ll find links to documentation, technical notes, tutorials and videos.  And we’re not done!  The BaseSpace web pages will be your go-to site for future information, screencasts, and use cases on BaseSpace.

Exciting new additions coming to the webpages are new BaseSpace videos and screencasts.  Ranging from overviews of BaseSpace to focused tutorials, the videos will comprise a set of “push-button bioinformatics” tutorials intended to help get you up and running in BaseSpace.  You can find more information on the BaseSpace home pages as well as on the Illumina.com website such as within the cancer genomics tumor sequencing pages.

TN

Finally, we’ve put together a set of BaseSpace data sheets, technical notes on data transfer and data security, and a comprehensive, authoritative user guide.  Available on the BaseSpace home page and through the BaseSpace help button, the user guide includes information to get started in BaseSpace, interact with your sequencing data and launch apps.

userGuide

What’s Next

As you can see, the commercial launch of BaseSpace includes a large number of platform enhancements, and this is only the beginning.  As we move forward, we’ll be detailing each new feature in subsequent blog posts.  We’ll also be introducing new BaseSpace Apps as they come onboard.  Stay tuned!

Coming Soon – BaseSpace iCredits!

Spring is in the air, and with the warmer weather comes new changes to the BaseSpace platform.  These changes come at a time when genomics data upload to BaseSpace has reached a critical mass.  As mentioned in Illumina’s recent earnings call, both our MiSeq and HiSeq platforms are enabled to stream data to BaseSpace, and data on more than 40,000 sequencing runs has been uploaded.  We now have more than 6,000 registered users, many of whom actively share data with their BaseSpace collaborators within their institutions and around the globe.  We’ve also had tremendous interest in the BaseSpace Apps Beta, with almost 4,000 App test drives.

We’re excited to announce the next major update to BaseSpace that will roll out in the coming weeks.  This update will include e-commerce capability, allowing the purchase of apps in our BaseSpace store, and will move BaseSpace out of beta status.

This is an important milestone for both Illumina instrument users and BaseSpace App developers.  For users, BaseSpace is emerging from beta status to a fully supported Illumina software platform featuring an ever-growing collection of apps to process data, storage room to collect data, and interfaces to share data.  For App developers, this release strengthens a vibrant ecosystem in which to offer best-in-class tools for next-generation sequencing data analysis.  We’re very excited as we look forward to this new level of platform functionality!

In preparation for the commercial launch of BaseSpace, we are going to have a series of “What’s New” blog posts for both users and developers, so that you can prepare for the upcoming platform changes. In today’s blog post, we’re introducing a currency called BaseSpace iCredits.

Introducing BaseSpace iCredits

Beginning next week, you will be able to purchase iCredits through our online store.  Much like an online bank account, iCredits can then be used within BaseSpace to purchase various services. At first, you can use iCredits to buy apps in BaseSpace to analyze, annotate, or process your data. In the future, iCredits will also be used to purchase additional data storage options.

Purchasing BaseSpace iCredits

Purchasing iCredits is easy to do in BaseSpace through your account dashboard. Simply clicking on the button labeled ‘Add More iCredits’ will bring you to a purchase screen where you can enter the number of iCredits you want to buy and the purchasing method you want to use:

Image

A BaseSpace iCredit is equivalent to one U.S. Dollar, and can be purchased online using credit cards or by submitting a purchase order to Illumina customer support:

Image

Using BaseSpace iCredits

Once you have iCredits in your account, you can allocate these in the App store, which gives an overview of the App functionality as well as the ability to launch the App directly from within the store:

Image

Each App will list the price for the App (which might be a per-use price or a subscription period for the App):

iCredits5

When you launch the app, the system checks that you have enough iCredits for the transaction and then deducts them from your account. At that point, you will get access to the app for your data analysis.

Managing your iCredit account with BaseSpace’s Digital Wallet

One of the additions to managing your account in BaseSpace is the concept of a ‘digital wallet’. The wallet keeps information about your iCredit balance and purchases – including the ability to print out receipts at any time. The digital wallet is accessible through your account information found in the BaseSpace top banner.

Image
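As a rough illustration of the flow described above (check the balance, deduct the price, keep a receipt), here is a minimal Python sketch. The `Wallet` class, its method names, and the example app name and price are hypothetical and are not part of any BaseSpace API; the only detail taken from this post is that one iCredit corresponds to one U.S. dollar.

```python
from dataclasses import dataclass, field
from typing import List


class InsufficientCredits(Exception):
    """Raised when the wallet balance cannot cover an app's price."""


@dataclass
class Wallet:
    # Hypothetical model: iCredits tracked as whole units (1 iCredit = 1 USD).
    balance: int = 0
    receipts: List[str] = field(default_factory=list)

    def add_credits(self, amount: int) -> None:
        """Record a purchase of iCredits and keep a receipt line for it."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount
        self.receipts.append(f"purchased {amount} iCredits")

    def launch_app(self, app_name: str, price: int) -> None:
        """Check the balance first; deduct only if the price is covered."""
        if price > self.balance:
            raise InsufficientCredits(
                f"{app_name} costs {price} iCredits, balance is {self.balance}"
            )
        self.balance -= price
        self.receipts.append(f"launched {app_name} for {price} iCredits")


# Example usage with made-up numbers.
wallet = Wallet()
wallet.add_credits(50)
wallet.launch_app("Example App", 20)
print(wallet.balance)
print(wallet.receipts)
```

The check happens before the deduction, so a launch that cannot be paid for leaves both the balance and the receipt history untouched.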

iCredits are just the first step forward to the larger BaseSpace commercial launch, and we are very excited to share a preview. Stay tuned for more news about upcoming functionality as we roll out the new BaseSpace release.


Tumor-Normal PCR-Free WGS Data (HCC1187)

A variety of Illumina technologies can be used to help understand cancer markers and progression.  To illustrate this, we are publishing a series of Tumor-Normal datasets in BaseSpace analyzed using several approaches.  Read this tech note for additional details on how to visualize this particular tumor/normal data set using the integrated set of tools in BaseSpace.

Materials and Methods: The DNA was extracted from the HCC1187 breast ductal carcinoma cell line and a matching lymphoblastoid cell line. 500 ng of DNA were prepped using an early access version of Illumina’s TruSeq DNA PCR-Free kit, and sequenced on a HiSeq 2000. The data was analyzed using a pre-release version of the Cancer Sequencing Workflow. This data is being shared in accordance with the terms of a licensing agreement with UT Southwestern, the owners of the cell lines.

To access the data, click on the link below to see the project folder. You will be asked to “Accept” the Project into your BaseSpace account: this is the same mechanism you will use to share specific real-life projects or runs with your colleagues/collaborators via a dedicated URL.

  • TumorNormal_WGS_HiSeq2000_CSW_0.23: Project (Cancer Sequencing Workflow (pre-release version) output files).

Below is a preview of what can be found in the data set and the related technical note:

Summary table from the Somatic Summary Report, automatically generated using Illumina’s Cancer Sequencing Workflow (pre-release):

image

Circos Plot showing HCC1187 Somatic Mutations, automatically generated using Illumina’s Cancer Sequencing Workflow (pre-release):

image

Customized version of the Broad IGV, fully integrated into BaseSpace, and showing some VCF and BAM tracks for this Tumor-Normal dataset:

image


Note: HCC cell lines were invented by Drs. Adi F. Gazdar and John D. Minna at the University of Texas Southwestern Medical Center. Rights in and to the HCC cell lines, progeny, and unmodified derivatives thereof belong to the Board of Regents of The University of Texas System. Illumina, Inc. has obtained permission from the Board of Regents of The University of Texas System through the University of Texas Southwestern Medical Center to use the HCC cell lines and publish the data and results herein displayed.

Run QC Chart Enhancements

This week we’ve introduced enhancements to one of the most popular BaseSpace features: the Run QC charts (familiar to many users from our Sequencing Analysis Viewer desktop program).

We’ve begun to add interactivity and polish to these charts. The charts will work even on your iPad or Android tablet, and they can be a powerful way to monitor the quality of both MiSeq and HiSeq runs. You can learn more about interpreting run QC metrics by checking out the Sequencing Analysis Viewer user guide.

Below are some examples:

  • Multiseries charts will now highlight the series and its corresponding legend item when you move the mouse over them:

  • We’ve added gradients and smooth transitions between data, so you can quickly distinguish what’s changed:

  • And to improve their legibility, we’ve given every chart a subtle makeover in terms of colors, shapes, and typography:

Later we’ll be adding the ability to zoom on the charts, among other enhancements. Stay tuned for more visualization goodness, and let us know what you think!

Nextera Rapid Capture Exome Data Sets

We are happy to announce the BaseSpace availability of three Nextera Rapid Capture Exome data sets:

  • A standard 37Mb exome sequenced on two lanes of the HiSeq 2500®
  • An expanded 62Mb exome sequenced on two lanes of the HiSeq 2500
  • A standard 37Mb exome sequenced on MiSeq

These data sets demonstrate the high uniformity and accuracy of Illumina’s new Nextera Rapid Capture Exome sample prep and Illumina sequencing. The data sets also demonstrate Illumina’s updated Enrichment analysis, which is available either through a new version of MiSeq Reporter or through the new HiSeq Analysis Software product.

Click on the links below to see the project and run folders. You will be asked to “Accept” the Run/Project into your BaseSpace account: this is the same mechanism you will use to share specific real-life projects or runs with your colleagues/collaborators via a dedicated URL.

  • NexteraRapidCaptureExome_HiSeq2500: Run (QC plots & summary metrics), Project (Enrichment workflow output files).
  • NexteraRapidCaptureExpandedExome_HiSeq2500: Run (QC plots & summary metrics), Project (Enrichment workflow output files).
  • NexteraRapidCaptureExome_MiSeq_NA18507: Run (QC plots & summary metrics), Project (Enrichment workflow output files).

Summary of the HiSeq 2500 standard exome run:

image

Summary of exome analysis metrics from this run:

image

Materials and Methods: Human Coriell samples NA18507, NA10859, and NA12144; Nextera Rapid Capture Exome and Expanded Exome; analysis with HiSeq Analysis Software and MiSeq Reporter.

Learn more about the Nextera Rapid Capture Exomes here, MiSeq Reporter software here, and HiSeq Analysis Software here.

Transfer of Runs and Projects

We are pleased to announce the transfer feature in BaseSpace.  Transfer adds to the collaboration feature set of BaseSpace, allowing users to transfer ownership of runs and/or projects to other users.  With an ownership transfer, the new owner controls all permissions for that project or run, and the project or run is associated only with their account.

How is transfer different from our current sharing feature?  If I share a project or run, I remain the owner, controlling all of the access rights for my collaborators.  The project or run remains associated with my account.

If you work in a Core Facility or provide sequencing services, Transfer may be just what you need for prompt data delivery to your customers.

Let’s take a deep dive into how transfer works in BaseSpace.

I’ll start with a scenario where I want to transfer ownership of a project to a fellow collaborator.  The transfer feature is only available for projects/runs that I own.  If I own the project, the transfer ownership feature is available for that project…

In the project list view…

image

Or from within the project itself…

image

Once I select “transfer ownership” I will be presented with the transfer ownership dialog.

image

I must provide an email address for the user that I would like to transfer ownership to.  I may also provide an optional message to that user.  There is a note here warning me to make sure that there are no Apps currently running on data in the project.  If a project is transferred to a new owner while an App that saves to that project is still running, the App will not be able to save its results and will not complete successfully.

After I provide an email and select continue, I will be presented with a final confirmation dialog.

image

I can cancel the transfer, or move forward and initiate it.  When I select “Transfer Now”, the transfer invitation is sent to the user I specified, both as an email and as a BaseSpace dashboard notification.

I have the capability to view my transfers as well as cancel an invitation from my user settings page.  To get to “my account”, just select the dropdown next to your username in BaseSpace.

image

Once at my transfer history, I am able to cancel transfer invitations.  Cancelling an invitation is available up until the new owner accepts the invitation.

I remain the owner of the project until the invitee accepts the transfer.  Below is a screenshot of my transfer history, where I may view all of my transfer invitations, as well as the status of each invitation.

image

The invitation has been sent.  Now, let’s switch over to the invitee’s view.

When a user transfers a project to me, I will receive both an email and a dashboard notification.

The Dashboard notification…

image

And the email…

image

I am able to view the invitation by selecting the hyperlink in the dashboard notification or the link from the email.

After selecting the hyperlink, I am presented with the invitation.

image

I have a choice to either accept or ignore the invitation.  If I choose to ignore it, the invitation will still be accessible, as long as the original owner does not cancel it.  When I choose accept, I become the new owner of the project; BaseSpace navigates to the project and updates the transfer history for the original owner to “accepted”.  From this point forward I am the new owner of the project.  I have control over collaborators as well as the ability to transfer the project.  The previous owner no longer has access to the project.

Transfer is complete!  Here are some more pointers and tips on transfer in BaseSpace.

  • Transferring runs only transfers the run files, not the projects in that run.  This use case supports the transferring of Run QC information (SAV) but not the projects and contents of those projects.
  • Transferring of runs is only available for runs that have been completed.  Aborted or incomplete runs are not available for transfer.
  • Be aware that if your collaborator accepts ownership of the project while an App you launched is still in progress, that app won’t be able to save data into that project anymore.
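The invitation lifecycle walked through above can be summarized as a small state model. The Python sketch below is only an illustration of the rules stated in this post (only the owner can start a transfer, the owner can cancel until the invitee accepts, ignoring leaves the invitation open, and accepting changes the owner). The class names, method names, and the "pending" and "cancelled" status labels are hypothetical; only "accepted" appears in the transfer history described above.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # hypothetical label
    ACCEPTED = "accepted"    # label used in the transfer history
    CANCELLED = "cancelled"  # hypothetical label


@dataclass
class Project:
    name: str
    owner: str


@dataclass
class TransferInvitation:
    project: Project
    sender: str
    recipient: str
    status: Status = Status.PENDING

    def __post_init__(self) -> None:
        # Transfer is only offered for projects the sender owns.
        if self.sender != self.project.owner:
            raise PermissionError("only the owner can transfer a project")

    def cancel(self) -> None:
        # The owner can cancel only until the invitee accepts.
        if self.status is not Status.PENDING:
            raise ValueError("invitation is no longer pending")
        self.status = Status.CANCELLED

    def ignore(self) -> None:
        # Ignoring changes nothing: the invitation stays open.
        pass

    def accept(self) -> None:
        # Ownership changes hands only at the moment of acceptance.
        if self.status is not Status.PENDING:
            raise ValueError("invitation is no longer pending")
        self.project.owner = self.recipient
        self.status = Status.ACCEPTED


# Example usage with made-up user names.
project = Project(name="demo-project", owner="alice")
invite = TransferInvitation(project, sender="alice", recipient="bob")
invite.ignore()
print(project.owner)   # still the original owner
invite.accept()
print(project.owner, invite.status.value)
```

Keeping the ownership change inside `accept` mirrors the point made earlier: the original owner remains the owner until the invitee accepts.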

Enjoy using the transfer feature, and please provide feedback on how we may make transfer a better experience for you in BaseSpace.