Register Now for the BaseSpace Developers Meeting at EMBL in Heidelberg, Germany


Registration is now open for the BaseSpace Developers Meeting at the European Molecular Biology Laboratory (EMBL) in Heidelberg, Germany, on May 7, 2014. This free, one-day forum is a great opportunity for both experienced and novice developers to network, exchange ideas, and learn more about the world’s most widely used cloud-based bioinformatics platform for next-generation sequencing. Participants will use the BaseSpace Native App Engine to launch their own bioinformatics apps in BaseSpace.

Why develop for BaseSpace? Because 90% of the world’s next-gen sequencing data is produced on Illumina instruments, and your novel algorithms, open-source tools, and applications for BaseSpace users can directly impact the growth of genomic research. In short, you can change the way the world analyzes genomic data.


Welcome to EMBL & Illumina’s Co-Hosting of 2014 BaseSpace WWDC
Jonathon Blake, Ph.D., Bioinformatics, EMBL
Raymond Tecotzky, Market Manager, BaseSpace, Illumina, Inc.

Keynote: BaseSpace and The Next Frontier for Genomics Storage, Sharing, and Analysis
Elliott Margulies, Ph.D., Product Owner, BaseSpace, Illumina, Inc.

Biomax PEDANT – Pathway Analysis for NGS Data
Dimitrij Frishman, Ph.D., Professor of Bioinformatics, Technical University in Munich, Germany

New Frontiers of Genome Assembly with SPAdes 3.0 on Illumina BaseSpace Platform
Anton Korobeynikov, Ph.D., Associate Professor, Saint Petersburg State University, St. Petersburg, Russian Federation

ABL (Advanced Biological Laboratories/Therapy Edge) DeepChek® Hep B & C Detection App
Dr. Chalom Sayada, CEO, Advanced Biological Laboratories SA

Hands-On Session: Build Your Own BaseSpace App
Greg Roberts, Senior Staff Software Engineer, Illumina, Inc.
Mayank Tyagi, Senior Applications Support Engineer, Illumina, Inc.

Hands-On “Hackathon” Build Your Own BaseSpace App (Choose from Open-Source, Command-Line, or Bring Your Own Code)
Ilya Chorny, Sequencing Application Marketing, Illumina, Inc.

BaseSpace Onsite Introduction – Storage, Sharing, & Analysis in a Box
John Duddy, Senior Staff Software Engineer, Illumina, Inc.

Will be followed by a Networking Reception

Link to Agenda

Event Details:

Date: Wednesday, 7 May 2014, 9:00 AM-6:30 PM

EMBL Heidelberg
Meyerhofstrasse 1
69117 Heidelberg, Germany
Flex Lab A+B

Cost: Free

Twitter: #basedev2014

Got a killer NGS app? Enter your original idea and win an iPad mini at the conference!

Register Here

BaseSpace Update from AGBT

We are humbled and excited by the overwhelming attention BaseSpace, BaseSpace Onsite, and the BaseSpace Core Apps have received at AGBT. Things were set in motion on Wednesday by a review of the BaseSpace RNA-Seq Apps (TopHat and Cufflinks) by James Hadfield from Cancer Research UK as part of his presentation at the Illumina User Meeting. Then on Thursday, during the standing-room-only Illumina Workshop, our own Gary Schroth gave a “User’s Perspective” talk on RNA-Seq right after Sheila Fisher’s hot-off-the-press presentation of HiSeq X10 and NextSeq datasets from the Broad Institute. Gary gave a deep dive into the TopHat and Cufflinks Apps on BaseSpace and emphasized the high usability and the end-to-end workflow now enabled by BaseSpace. The workflow starts with the creation of samples, libraries, and runs for the NextSeq on the Prep tab, followed by real-time monitoring of sequencing metrics, and finally the streamlined analysis of data, resulting in interactive graphical plots of expression profiles. Gary also mentioned that he is not a bioinformatician, but can now perform RNA-Seq analysis all by himself.

On Thursday evening, the first-ever AGBT “Electronic Poster Session” was held, where about 30 software vendors showcased their solutions in a large, well-catered room (chocolate fountain, crabs, sushi and all). The two of us who were demoing BaseSpace were kept busy throughout the two hours, and we definitely got the sense that the value proposition of the BaseSpace platform and Apps resonated with all users who stopped by our booth.

Finally, we would like to respond to some questions that have come up in the Twitterverse based on the James Hadfield and Gary Schroth presentations:


1. To cite BaseSpace in journal manuscripts: we recommend citing the specific URL as appropriate

  • To cite BaseSpace in general:
  • To cite a particular App:  etc. (Each App has a dedicated “App description page” that is accessible from the App tab on the top menu bar)
  • To cite algorithms/methods used in the Apps:  All the methods used within apps are referenced within the corresponding App description page
  • To cite a particular Project, with embedded datasets and App analysis sessions, again use the particular URL associated with the dataset. Here is an example of a publicly shared Nextera Rapid Capture Exome project, with 12 exome samples run on the HiSeq 2500 (you will notice the associated “Analyses” and “Samples” along the tab on the left) :

2. Timing of availability of the BaseSpace Core Apps: The BWA/GATK whole-genome (WGS) App, the BWA/GATK Exome (Enrichment) App, and the Strelka-based Tumor-Normal App are available on BaseSpace today. The Isaac-based WGS and exome Apps, along with the RNA-Seq Apps, will be available by the end of February.

Wishing you Happy Holidays with a Festive Competition!

As we near the end of the year, we’d like to share a fun competition put together by James Hadfield, director of the genomics core facility at the University of Cambridge, Cancer Research UK Cambridge Institute, and sponsored by Illumina and BaseSpace.  James is a blogger at CoreGenomics, and is running a holiday contest that will test your knowledge of library prep and sequencing applications. How much do you know about the cost of library prep and sequencing for various applications, including genome sequencing, RNA-Seq, Exome Seq and others? Take the CoreGenomics logic challenge and find out!

The winner of this festive competition will receive a MiSeq 600-cycle run to be performed in James’ lab, and 250 iCredits towards BaseSpace analysis. You’re in charge of library prep and, of course, the sample for sequencing, although James may lean towards seasonally appropriate species such as the cranberry or fir tree genome (our pick: a follow-up study on the reindeer rumen microbiome).

So check out the CoreGenomics post here, with a link to enter the contest, as well as all the rules and regs. We’re happy to help support this fun exercise by covering the kit and the iCredits.

Happy Holidays from Illumina and the BaseSpace team!

Whole-genome and cancer analysis: Datasets from Illumina’s FastTrack Services Laboratory


We are pleased to announce the availability of data from two sequencing projects conducted in the Illumina FastTrack Services Laboratory through the Illumina Genome Network (IGN).   Whole-genome and Cancer Analysis Demo Datasets can now be accessed within or downloaded from BaseSpace for free through BaseSpace’s Public Data repository.

Whole-Genome Analysis Dataset:

Results from the ENCODE project reveal that many DNA variants previously associated with disease lie outside of the coding regions of genomic DNA. Because whole-genome sequencing (WGS) gives researchers the most complete view, we offer the Illumina FastTrack Services Whole-genome Demo Dataset containing three WGS example datasets using the CEPH family trio sequenced to a depth of ~30x coverage and analyzed using the Whole-Genome Sequencing Informatics Pipeline v2.0. The project includes archival BAM files, variant calls (CNV, SV, & SNPs), a sample PDF summary report, and Illumina Omni2.5M genotyping data.

To access the shared whole-genome dataset in your BaseSpace account, click the following shared project link:

Cancer Analysis Dataset:

Cancer possesses significant heterogeneity at the genetic and histological levels. The Illumina FastTrack Services Cancer Analysis Demo Dataset uses the IGN variant calling and sequencing methodology to address this complexity using ATCC_HCC samples sequenced to 40x coverage for the normal tissue sample and 80x coverage for the tumor tissue sample. The data is analyzed using Cancer Analysis Pipeline v2.0, which uses a Bayesian combined variant calling method that provides the most accurate models for real-life tumor samples, recovering 97% of known SNVs. The datasets include the standard WGS deliverable, as well as somatic variant data and a somatic PDF summary report.

To access the shared cancer analysis dataset in your BaseSpace account, click the following shared project link:

More About the Illumina Genome Network:

The IGN, consisting of CSPro-certified organizations and Illumina FastTrack Services, offers highly accurate, affordable, end-to-end human whole-genome sequencing services.  The IGN laboratories have experienced scientists using TruSeq technology for superior coverage and quality of even challenging regions, and industry-leading HiSeq systems for the highest throughput.  IGN Services are finalized with data analysis by skilled bioinformaticians to accelerate researchers’ opportunities to discover more from the whole human genome.

We invite you to view these example IGN projects using BaseSpace Apps such as the Broad’s IGV, or by downloading files and exploring the data using your favorite tools.  See for yourself the unmatched performance, data quality and expertise of the Illumina Genome Network.

See you at #basedev2013 in San Francisco

In the San Francisco Bay area this week? Drop by the BaseSpace Developer Conference on Monday, December 9.

Network, exchange ideas, and learn about making genomic apps for BaseSpace, the world’s most widely used cloud-based bioinformatics platform for next-generation sequencing. Build an app in less than three hours in the afternoon hands-on session.

Novice and experienced developers, researchers, engineers, academic and industry professionals welcome. Learn from the community, and start changing the way the world analyzes genomic data. In one day. For free.

Register here, or join us Monday from 8:00 AM–6:00 PM at the JW Marriott San Francisco Union Square, Metropolitan room.

If you can’t join us in person, follow #basedev2013 from @illumina for highlights.

BaseSpace Update: Analysis User Interface Upgraded

We are excited to announce a brand new look for Projects and Analyses in BaseSpace.  We have made significant improvements to the ways in which users interact with their Analysis data.  We strive to bring our users the best experience possible and we hope that you enjoy these changes.

Here is a brief summary of all of the changes you will see:

  • New look and feel to the Project page: With this new look, you should be able to navigate around your projects more efficiently.


  • Streamlined left-hand navigation: Everything you need from your project, whether it’s your input, output, or samples, is accessible through this navigation.


  • What we used to call AppSessions, we now call Analyses: the output of BaseSpace Apps.
  • Improvements to the Analysis page


  • Improvements to the file browser


  • Visual improvements to the file browser allow for simpler navigation of your results. Select a file, and open it right in BaseSpace.


A Note About Apps and Analyses

A new Analysis is created each time you launch an app.  You can find a list of Analyses within each Project.  Each time data in a Project is selected as input to run an app, an Analysis is created in that Project.  In addition, each time a Project is selected for storing result data, an Analysis will also be created in that Project. On each Analysis page, you will find:

  • Information about the app that performed the analysis
  • A running status update from the app
  • The result files that were uploaded to your account once the app’s analysis was completed
  • The input data selected for the app via the Analysis Inputs page


You can also rename each Analysis you own by clicking on the Edit Analysis button on the Analysis page.


This will allow you to more easily organize your results in BaseSpace.
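
The relationship between apps, Projects, and Analyses described above can be pictured with a small sketch. This is only an illustrative model, not the BaseSpace API: the `Analysis` and `Project` classes, the `launch_app` and `complete` helpers, and all app, project, and file names are hypothetical. It also assumes that one launch yields a single Analysis that is listed in both the input Project and the output Project.

```python
from dataclasses import dataclass, field


@dataclass
class Analysis:
    """Hypothetical record of one app launch (not a BaseSpace API type)."""
    name: str
    app_name: str
    status: str = "Running"
    input_names: list = field(default_factory=list)
    result_files: list = field(default_factory=list)


@dataclass
class Project:
    """Hypothetical container listing the Analyses associated with it."""
    name: str
    analyses: list = field(default_factory=list)


def launch_app(app_name, input_project, output_project, input_names):
    """Create one Analysis and list it in the input and output Projects."""
    analysis = Analysis(
        name=f"{app_name} analysis",
        app_name=app_name,
        input_names=list(input_names),
    )
    input_project.analyses.append(analysis)
    # Avoid listing the Analysis twice when one Project plays both roles.
    if output_project is not input_project:
        output_project.analyses.append(analysis)
    return analysis


def complete(analysis, result_files):
    """Mark the Analysis finished and attach the uploaded result files."""
    analysis.status = "Complete"
    analysis.result_files.extend(result_files)


# Example: input data lives in one Project, results go to another.
samples = Project("Example samples")
results = Project("Example results")
run = launch_app("Example App", samples, results, ["Sample 1"])
run.name = "Sample 1 first pass"  # renaming, as with the Edit Analysis button
complete(run, ["output.txt"])
```

In this sketch, the same `run` object appears in both `samples.analyses` and `results.analyses`, which mirrors the idea that an Analysis shows up in every Project it touched.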

For more information, please refer to the BaseSpace Release Notes. Please do not hesitate to send us feedback via the Contact Us button; we constantly strive to improve BaseSpace for our users.

MiSeq Trio Data: TruSight One Sequencing Panel

We’re happy to introduce a dataset from the TruSight One Sequencing Panel, which provides comprehensive coverage of 4,813 genes with an associated clinical phenotype. With TruSight One, you can analyze all of the genes included in the panel, or focus on a specific subset of them. Labs can use this panel to expand existing assay offerings, streamline workflows, or create an entire portfolio of sequencing options.

This is a shared, downloadable dataset in BaseSpace representing a family trio prepared with TruSight One and sequenced on a MiSeq using v3 reagents. This trio was obtained from the Coriell Institute’s NIGMS Human Genetic Cell Repository, and is from the CEPH/Utah Pedigree 1463.

Sample ID   Sample Name   Details
1           NA12880       Daughter
2           NA12878*      Mother
3           NA12877       Father

* From the Coriell Institute: “Donor subject has a single bp (G-to-A) transition at nucleotide 681 in exon 5 of the CYP2C19 gene (CYP2C19*2) which creates an aberrant splice site. The change altered the reading frame of the mRNA starting with amino acid 215 and produced a premature stop codon 20 amino acids downstream, resulting in a truncated, nonfunctional protein. Because of the aberrant splice site, a 40-bp deletion occurred at the beginning of exon 5 (from bp 643 to bp 682), resulting in deletion of amino acids 215 to 227. The truncated protein had 234 amino acids and would be catalytically inactive because it lacked the heme-binding region.”

This data, along with other example datasets from a variety of sequencing runs, is available for import into your BaseSpace account from BaseSpace’s Public Data repository:


To access the data, click on the links (the full URLs are shown below) to view the Project (enrichment results) or the Run (sequencing results).  You will be asked to “Accept” the Project or Run into your BaseSpace account.  Once there, you can access or download the data.

Some stats for this dataset:

Library Prep Kit                            TruSight One Sequencing Panel
Sequencing Kits                             MiSeq Reagent Kit, v3
Sequencer                                   MiSeq
Read Length                                 2 x 151 bp
Sample Multiplexing                         3 samples
Sequencing Output                           9.4 Gb
Cluster Density                             1325 +/- 27 K/mm²
Cluster PF                                  92.25 +/- 0.74%
Reads PF                                    29.84 M
Total Aligned Reads                         15,589,287 – 20,773,100
Read Enrichment                             63.4% – 66.1%
Uniformity of Coverage (% > 0.2 of mean)    95.5% – 96.4%

Run data link:

Project data link: