
Happy 10-Year Anniversary to BaseSpace® Correlation Engine

Deep sequencing and high-throughput microarray technologies have enabled scientists to routinely generate hundreds of thousands, if not millions, of new data points in a single experiment. The extraordinary rate of data generation, finite resources, and focused research interests mean that most investigations follow up on only a small fraction of the data generated from next-generation sequencing (NGS) instruments.

Ten years ago, there were no services available to curate data. Researchers relied on homegrown tools to perform the cumbersome task of matching information to publications, but they didn’t have the expertise to do a re-analysis. A group of entrepreneurial scientists and bioinformaticians envisioned a need for solutions to handle the deluge of data that was coming as more and more genomes, from different kinds of species, were being sequenced. That vision became NextBio® Research, a genomics software platform that could match variant to variant sets and gene expression to DNA methylation to protein-DNA binding across a spectrum of organisms, saving researchers time and resources. By leveraging biomedical ontologies coupled with proto-machine learning algorithms, dynamic data-driven applications were added to aid in the discovery of novel relationships among diseases, compounds, gene perturbations, and pathways.

After the acquisition of NextBio by Illumina in 2014, one of the primary NextBio Research utilities was rebranded as BaseSpace® Correlation Engine. Today it stands as a key pillar in the BaseSpace® Informatics Suite.

The BaseSpace Correlation Engine public study library has steadily grown over the years, approaching 21,000 studies, with more than 130,000 experimental gene signatures that have been collected and curated by a highly skilled team of scientists.* Illumina has been working hard to engage with ecosystem partners to improve the quality of the user experience. Illumina has partnered with Elsevier over the past year to create connectivity between BaseSpace Correlation Engine and Elsevier’s Pathway Studio. Users can apply data filters to their results or work with the public data and visualize functional relationships among genes found in an experiment.

BaseSpace Correlation Engine results have found their way into hundreds of peer-reviewed citations from distinguished universities around the world and from many of the top 25 large pharmaceutical organizations including:

Universities

  • Sanford Burnham
  • Mayo Clinic
  • University of Pittsburgh
  • University of Southern California
  • Stanford
  • Harvard
  • Weill Cornell Medical College
  • Karolinska Institute
  • Emory University
  • Kyoto University
  • Vanderbilt
  • Yale
  • University of California – Davis

Pharmaceutical Organizations

  • Celgene
  • Sanofi
  • Regeneron
  • Boehringer-Ingelheim
  • Pfizer
  • Johnson & Johnson
  • Merck
  • MedImmune

Government Organizations

  • NIH-National Institute of Environmental Health Sciences
  • Health Canada
  • Environmental Protection Agency

If you would like to learn more about BaseSpace Correlation Engine:

*Based on internal database read as of 5/2017.

For Research Use Only. Not for use in diagnostic procedures.

Introducing unlimited data and compute plans for new BaseSpace® Sequence Hub customers

Next-Generation Sequencing (NGS) users often adopt BaseSpace Sequence Hub at times of change within their organizations. They may be new to NGS, or in the process of scaling up their operations, such as having purchased a new sequencing instrument – perhaps a NovaSeq™ Series instrument. In these situations, it can be challenging to estimate data storage and compute costs, creating uncertainty in the budgeting process.

To address this concern, we are excited to announce an unlimited data storage and compute plan for BaseSpace Sequence Hub that takes the uncertainty away. New Sequence Hub customers can choose either the traditional pay-for-use plan or a fixed-price, unlimited plan covering all data storage and compute costs in the first year.

With the plan, new customers get unlimited data storage and have access to all of the apps in BaseSpace Sequence Hub without any additional cost. The plan includes Illumina-developed apps as well as third-party apps, such as the recently announced whole genome sequencing Apps from Edico Genome (coming soon). The unlimited plan eliminates any ambiguity associated with the cost of using BaseSpace Sequence Hub and allows customers to understand their usage patterns so they can comfortably estimate their expenses in subsequent years.

These plans are available for both the U.S. and Frankfurt sites. Please contact us to learn more.

For Research Use Only. Not for use in diagnostic procedures.

BaseSpace® Clarity LIMS integration with NovaSeq™ Series Instruments

Despite advances in sequencing technology, conducting studies from an informatics perspective can still be challenging. Managing, analyzing, and interpreting the large volume of data generated from genomic studies calls for a systematic, standardized, and pipeline-centric approach.1

To accommodate this type of approach, we have integrated BaseSpace Clarity LIMS and the NovaSeq Series instruments. The integration helps expedite genomic workflows and can potentially reduce the human error inherent in handling and managing samples in a laboratory.

Ready to use, this integration connects BaseSpace Clarity LIMS to the NovaSeq instrument with automated tracking and file generation. Users of both systems can:

  • Apply a pipeline-based approach from sample accessioning to secondary analysis of the data.
  • Positively track samples from sample accessioning to secondary analysis through automation and validation of sample indexes and reagent barcodes.
  • Automate sequencing run information and parse key sequencing metrics from the instrument back into BaseSpace Clarity LIMS.
  • Initiate secondary analysis by streaming sequencing information directly to BaseSpace Sequence Hub.

Protocol

The protocol provides a series of validated steps, as shown in the illustration below.


The NovaSeq Series preconfigured protocol as seen in BaseSpace Clarity LIMS.

Additionally, the integration has several points at which users can validate it and efficiently test it before putting it into production. These points include:

Integration validation point 1

BaseSpace Clarity LIMS automatically calculates library normalization and pooling volumes. BaseSpace Clarity LIMS generates the run info file and the NovaSeq Sample Sheet including the Library tube ID, which are automatically placed into a specific network folder on the instrument.
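
As a rough illustration of the kind of arithmetic this step automates, the sketch below dilutes each library to a common target concentration using C1 × V1 = C2 × V2 and then pools equal volumes. The function names, the 2 nM target, and the volumes are assumptions made for the example; the calculations BaseSpace Clarity LIMS actually performs are not described in this post.

```python
def normalization_volumes(stock_nm, target_nm, final_ul):
    """Return (library_ul, diluent_ul) needed to reach target_nm.

    Applies C1 * V1 = C2 * V2. Names and units are illustrative only.
    """
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_nm * final_ul / stock_nm
    return library_ul, final_ul - library_ul


def equal_volume_pool(library_ids, pool_ul):
    """Split a pool volume equally across already-normalized libraries."""
    per_library = pool_ul / len(library_ids)
    return {lib: per_library for lib in library_ids}


# Hypothetical libraries with measured concentrations in nM.
stocks = {"LIB-A": 10.0, "LIB-B": 4.0}
plan = {
    lib: normalization_volumes(conc, target_nm=2.0, final_ul=20.0)
    for lib, conc in stocks.items()
}
pool = equal_volume_pool(list(stocks), pool_ul=10.0)
```

Here `plan` maps each library to the volume of library and diluent to combine, and `pool` gives the volume of each normalized library to add to the pool.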

Integration validation point 2

Key primary sequencing metrics, such as Yield, %Q30, %Reads PF, and Number of reads, are automatically parsed into BaseSpace Clarity LIMS. This parsing enables users to generate sequencing statistics and monitor sequencing instrument performance over time.
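
To make the monitoring idea concrete, here is a minimal sketch of how parsed run metrics might be summarized once they are available. The record layout, field names, and the 85% threshold are all assumptions for illustration, not the BaseSpace Clarity LIMS data model or a recommended cutoff.

```python
# Hypothetical per-run records; read counts are in millions.
runs = [
    {"run": "R1", "pct_q30": 92.0, "reads_m": 4000, "reads_pf_m": 3000},
    {"run": "R2", "pct_q30": 90.0, "reads_m": 4000, "reads_pf_m": 3200},
    {"run": "R3", "pct_q30": 80.0, "reads_m": 4000, "reads_pf_m": 2000},
]


def pct_reads_pf(run):
    """Percentage of reads passing filter for one run."""
    return 100.0 * run["reads_pf_m"] / run["reads_m"]


def flag_low_q30(run_records, threshold=85.0):
    """Return the names of runs whose %Q30 falls below the threshold."""
    return [r["run"] for r in run_records if r["pct_q30"] < threshold]


pf_by_run = {r["run"]: pct_reads_pf(r) for r in runs}
low_quality = flag_low_q30(runs)
```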

Integration validation point 3

NovaSeq 6000 integrates with BaseSpace Sequence Hub, where sequencing run details and sequencing data are automatically sent, thus making the triggering of downstream analysis even easier.

The complete integration is available for BaseSpace Clarity LIMS Gold users, although a simplified version is available to all BaseSpace Clarity LIMS users. Additionally, the integration is currently compatible with S2 flowcells; additional functionality will be added to the integration when new flowcells are available.

For more information about this integration, please contact us.

For Research Use Only. Not for use in diagnostic procedures.
References
  1. “Big Biological Data: Challenges And Opportunities”. Sciencedirect.com. N.p., 2017. Web. 17 Apr. 2017.

Singling out solutions for single-cell analysis

To date, most of what we know about our genome comes from studying populations of cells. Although few would argue with how far we have come to understand our genome, many researchers now realize that it may be just as important to fully examine the heterogeneity that exists within the population of cells. Evidence suggests that bulk sequencing methods can mask the contribution of individual cells. As a result, many researchers are turning to an evolving technique: single-cell sequencing.

Pioneered in the 1990s by James Eberwine2 and made more robust by the analytical sensitivity and specificity of next-generation sequencing (NGS) methods,3 single-cell sequencing enables researchers to examine the heterogeneity of cells, and promises to reveal what role individual cells play in disease and complex biological systems.

How? For every cell sequenced, researchers have a comprehensive map of the transcriptome that can be analyzed in several different ways to characterize cells at single-cell resolution. Currently, three primary applications stand out:

  • Assessing cell-to-cell heterogeneity. In this application, researchers dissect cell subtypes in a heterogeneous population of cells using cell surface markers to characterize cell types within a population. Using this method, cells can be bioinformatically classified based on expression levels of thousands of genes using dimensionality reduction, such as principal component analysis (PCA), followed by clustering. This process has even enabled discovery of new cell types that were not previously known.4
  • Mapping cell trajectories. Using this application, researchers can investigate cell lineage trajectories over time and possibly detect expression changes occurring in only a subset of cells or substates along a development path. Notably, in traditional bulk-cell sequencing approaches, these trajectories would be missed as they would be averaged across the population.
  • Dissecting transcriptional mechanics. Using this application, researchers can classify individual cells according to a gene’s transcription state, such as presence or absence of a transcription factor.

Yet researchers who conduct single-cell sequencing still face throughput and analysis challenges, so with the potential for this method comes the need for more refined sequencing and bioinformatics tools.

A scalable, high-throughput, and straightforward solution

To deliver on the promise of single-cell biology, the Illumina® Bio-Rad® Single-Cell Sequencing Solution combines the Bio-Rad Droplet Digital™ Technology with Illumina NGS library preparation, sequencing, and analysis technologies. This new platform provides a comprehensive workflow for single-cell RNA-Seq that enables controlled experiments with multiple samples, treatment conditions, and time points.

This co-developed solution enables transcriptome analysis of hundreds to thousands of single cells in one experiment, enabling researchers to apply the sensitivity and precision of RNA-Seq to questions that can only be answered by interrogating individual cells.

[Figure: FlowJo workflow]

After sequencing, the single-cell sequencing data can be instantly transferred, stored, and analyzed securely in BaseSpace Sequence Hub. There, users can access the SureCell RNA Single-Cell App, which was specifically designed to support data analysis for the Illumina Bio-Rad Single-Cell Sequencing Solution. This app enables streamlined data analysis for up to 96 samples across multiple sequencing runs and performs:

  • Read 2 alignment using the STAR aligner
  • Cell barcode and unique molecular identifier (UMI) identification
  • UMI counting for each gene and associated statistics
  • Identification of good barcodes corresponding to single cells
  • Calculation of alignment, cell, and gene metrics

The app generates a BAM, cell and gene counts table, and a report including analysis metrics and plots.
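
The UMI counting step listed above can be sketched in a few lines. This is a simplified illustration, not the SureCell RNA Single-Cell App's algorithm: it assumes each aligned read has already been reduced to a (cell barcode, UMI, gene) tuple, and it skips barcode and UMI error correction entirely. The function names and example sequences are hypothetical.

```python
from collections import defaultdict


def count_umis(records):
    """Count distinct UMIs per (cell barcode, gene).

    `records` is an iterable of (cell_barcode, umi, gene) tuples, one per
    aligned read. Reads that share all three values are treated as copies
    of the same molecule and are counted once.
    """
    seen = defaultdict(set)
    for barcode, umi, gene in records:
        seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


def umis_per_cell(counts):
    """Sum the per-gene UMI counts for each cell barcode."""
    totals = defaultdict(int)
    for (barcode, _gene), n in counts.items():
        totals[barcode] += n
    return dict(totals)


reads = [
    ("AAAC", "TTG", "GAPDH"),
    ("AAAC", "TTG", "GAPDH"),  # duplicate read of the same molecule
    ("AAAC", "CCA", "GAPDH"),
    ("AAAC", "TTG", "ACTB"),
    ("GGGT", "TTG", "GAPDH"),
]
counts = count_umis(reads)
totals = umis_per_cell(counts)
```

Counting distinct UMIs rather than reads is what removes amplification duplicates: the two identical GAPDH reads in cell AAAC contribute a single molecule.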


The UMI cell plot indicates the total number of cells passing filter; the vertical threshold (red line) must pass through the first knee. The defining features are the two distinct curves, or knees, and the threshold, which indicate the number of valid cells detected in the sample.
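
One simple way to think about a knee-based threshold is to rank barcodes by total UMI count and cut at the steepest drop. The sketch below does exactly that on a log scale. It is a stand-in heuristic chosen for illustration and should not be read as the method the app uses to place its threshold; the barcode names and counts are made up.

```python
import math


def call_cells(umi_totals):
    """Return barcodes ranked above the steepest drop in log10 UMI count."""
    ranked = sorted(umi_totals.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return [barcode for barcode, _ in ranked]
    logs = [math.log10(max(n, 1)) for _, n in ranked]
    # Drop between each barcode and the next one in the ranking.
    drops = [logs[i] - logs[i + 1] for i in range(len(logs) - 1)]
    knee = max(range(len(drops)), key=drops.__getitem__)
    return [barcode for barcode, _ in ranked[: knee + 1]]


# Hypothetical totals: three cell-like barcodes and three background ones.
umi_totals = {"bc1": 5200, "bc2": 4800, "bc3": 4100,
              "bc4": 60, "bc5": 45, "bc6": 30}
cells = call_cells(umi_totals)
```

Real data are noisier than this toy input, so a production tool would typically smooth the curve or model the background before choosing a cutoff.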


The t-Distributed Stochastic Neighbor Embedding (t-SNE) plot is a two-dimensional projection of cells illustrating potential clusters (populations) of neighboring cells with similar expression profiles.

Downstream analysis with FlowJo SeqGeq

We’ve worked with another one of our partners – FlowJo – to develop an integration between the SureCell RNA Single-Cell App and the SeqGeq toolset. SeqGeq is a set of tools for exploring single-cell NGS data with an intuitive drag-and-drop interface. Users of both systems can transfer files into SeqGeq for additional visualization and analysis, including gene tables and heat maps.


Within SeqGeq, you can directly import data from BaseSpace Sequence Hub.

For more information, and to learn how Illumina instruments and bioinformatics are integrated with the solutions from Bio-Rad and FlowJo, download the technical note titled “Illumina® Bio-Rad® SureCell™ WTA 3′ Library Prep Kit for the ddSEQ™ System” or visit the FlowJo website.

For Research Use Only. Not for use in diagnostic procedures.
References
  1. Macaulay, Iain C. and Thierry Voet. “Single Cell Genomics: Advances And Future Perspectives”. PLoS Genet 10(1): e1004126. doi:10.1371/journal.pgen.1004126
  2. Eberwine J, Yeh H, Miyashiro K et al. Analysis of gene expression in single live neurons. PNAS. 2017. Available at: http://www.pnas.org/content/89/7/3010.short. Accessed March 14, 2017.
  3. Liu S, Trapnell C. Single-cell transcriptome sequencing: recent advances and remaining challenges. 2017.
  4. Macosko E, Basu A, Satija R et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. 2017.

BaseSpace Suite Summit

Join us for our BaseSpace® Suite Informatics Summit in Copenhagen, DK on 31 May and 1 June. The summit takes place immediately after the European Society of Human Genetics (ESHG) annual meeting, and attendance is free. Learn more about our informatics tools and how they’re designed to help you transform complex genomic data into meaningful insights quickly and easily.

Why attend a BaseSpace Suite Summit?

  • Share your perspectives on applying informatics tools in your lab
  • Attend informative sessions and learn how other customers use informatics
  • Get important product information for BaseSpace Clarity LIMS, BaseSpace Sequence Hub, BaseSpace Variant Interpreter (Beta), BaseSpace Cohort Analyzer, and BaseSpace Correlation Engine
  • Learn best practices, including how an integrated approach to informatics can expedite workflows
  • Connect with your peers

Register here.


Updated Command Line Tools – Wait for App Dependencies

We are pleased to announce a minor release of BaseSpaceCLI (0.8.10) with some improvements to existing tools and a new tool – bs wait.

‘bs wait’

The new wait command for BaseSpaceCLI is analogous to the shell command wait and was designed to help connect together separate app launches. The wait command accepts as arguments one or more appsessions and will then wait for these appsessions to finish, polling based on a specified interval (default 60 seconds). Once they have all finished, bs wait returns the appresults that have been generated by the provided appsessions. The intention is that these appresults can then be passed into another app launch, providing some limited app-chaining capabilities.
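
The polling behaviour described above follows a familiar pattern, sketched here in Python. This is not BaseSpaceCLI's implementation: the `get_status` callback, the status names, and the function name are all hypothetical, and the real command additionally returns the appresults produced by each appsession.

```python
import time

# Hypothetical names for states in which a session is no longer running.
TERMINAL_STATES = {"Complete", "Aborted"}


def wait_for_sessions(session_ids, get_status, interval=60, sleep=time.sleep):
    """Block until every session reaches a terminal state.

    `get_status` is a caller-supplied function (an assumption of this
    sketch) that maps a session id to its current status string.
    Returns a dict of session id -> final status.
    """
    pending = set(session_ids)
    final = {}
    while pending:
        for sid in sorted(pending):
            status = get_status(sid)
            if status in TERMINAL_STATES:
                final[sid] = status
        pending.difference_update(final)
        if pending:
            sleep(interval)  # wait before polling the remaining sessions
    return final


# Demo with scripted statuses and a recorded (not real) sleep.
scripted = {"a": iter(["Running", "Complete"]), "b": iter(["Complete"])}
naps = []
result = wait_for_sessions(
    ["a", "b"], lambda sid: next(scripted[sid]), interval=60, sleep=naps.append
)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in the demo, one 60-second wait is recorded before session "a" completes.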

BaseSpace Informatics Suite Summit 2016


You’re invited to an exclusive informatics event

Advancing Precision Medicine efforts relies on the ability to make sense of a growing body of genomic data. Robust informatics tools and an integrated approach to acquiring, storing, distributing, and analyzing data are essential.

Join us for our BaseSpace® Suite Summit in Rochester, MN on October 3 and 4. The summit takes place immediately before the Individualizing Medicine Conference, and registration is free. Learn more about our informatics tools and how they’re designed to help you transform complex genomic data into meaningful insights quickly and easily.

  • Share your perspectives on applying informatics tools in your lab
  • Attend your choice of sessions on informatics topics
  • Learn best practices for laboratory information management, including how an integrated approach can expedite workflows
  • Connect with your peers

Venue and Format
Lodging and Summit activities take place at the Kahler Grand Hotel in Rochester, MN. All-day sessions on October 3 and morning sessions on October 4 include a variety of hands-on, introductory, and training sessions.

More Information
If you have questions, please contact us.
