New Sequence Quality Metrics in BaseSpace™ Sequence Hub

The Run Monitoring features in BaseSpace™ Sequence Hub (BSSH) enable users to remotely monitor the quality of their sequencing runs and troubleshoot sequencing errors. As part of our efforts to extend real-time Run Monitoring capabilities, we recently released new data quality metrics in BSSH.

 

% Occupancy for iSeq™ and MiniSeq™ instruments

In a previous release, we added the % Occupied measure to the Charts section of Run Monitoring for NovaSeq™ systems. With this release, the metric is also visible for iSeq and MiniSeq systems in BaseSpace Sequence Hub. This measure can be used to understand loading concentrations on the flow cell.

For patterned and non-patterned flow cells, % Occupancy is the percentage of possible clusters on the flow cell that have DNA that can ultimately be sequenced. With patterned flow cells (such as iSeq), the number of nanowells on the patterned grid determines the total number of possible clusters. For non-patterned flow cells (such as MiniSeq), the total number of possible clusters is the number of non-duplicated spots identified by Real Time Analysis (RTA) during template generation.
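The definition above reduces to a simple ratio. The following is a minimal sketch of that arithmetic only; the function name, parameter names, and sample numbers are hypothetical, and it is not a description of how RTA or BaseSpace Sequence Hub computes the metric internally.

```python
def percent_occupied(occupied_count: int, total_possible: int) -> float:
    """Return occupied positions as a percentage of all possible clusters.

    total_possible is the nanowell count for a patterned flow cell, or the
    number of non-duplicated spots for a non-patterned flow cell.
    """
    if total_possible <= 0:
        raise ValueError("total_possible must be positive")
    if not 0 <= occupied_count <= total_possible:
        raise ValueError("occupied_count must be between 0 and total_possible")
    return 100.0 * occupied_count / total_possible


# Hypothetical numbers, for illustration only.
print(f"{percent_occupied(850_000, 1_000_000):.1f}%")
```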

 

new metrics

 

% Pass Filter (%PF) settings for all instruments

The Flow Cell chart in BaseSpace Sequence Hub has also been updated to include % Pass Filter (%PF) for all instruments. This additional information allows users to determine whether particular tiles of a flow cell have unusual levels of %PF.
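As an illustration of what "unusual" could mean in practice, the sketch below flags tiles whose %PF differs from the flow cell median by more than a chosen number of percentage points. The tile names, the values, and the 10-point threshold are assumptions made for this example; BaseSpace Sequence Hub presents the data visually and does not prescribe a threshold.

```python
from statistics import median


def flag_unusual_tiles(pf_by_tile: dict[str, float],
                       max_deviation: float = 10.0) -> list[str]:
    """Return tiles whose %PF is more than max_deviation points from the median.

    max_deviation is an arbitrary illustrative threshold, not an Illumina value.
    """
    if not pf_by_tile:
        return []
    center = median(pf_by_tile.values())
    return sorted(tile for tile, pf in pf_by_tile.items()
                  if abs(pf - center) > max_deviation)


# Hypothetical per-tile %PF values.
tiles = {"1101": 88.2, "1102": 87.5, "1103": 61.0, "1104": 89.1}
print(flag_unusual_tiles(tiles))
```

Using the median rather than the mean keeps a single bad tile from shifting the reference point it is compared against.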

%PF

With these enhancements, we have added capabilities that are currently not available in Sequencing Analysis Viewer (SAV). SAV will be updated in the future so our users have a consistent experience across SAV and BSSH.

 

#QB6200

Putting Your Privacy First

BaseSpace™ Sequence Hub is used by investigators around the world to facilitate and scale their sequencing and genomic data analysis operations. At Illumina, we understand that security, privacy, and confidentiality are complex issues, and we are committed to protecting our software-as-a-service (SaaS) customers’ data.

To ensure that our customers remain compliant with upcoming changes to the EU General Data Protection Regulation (GDPR), we’ve made a number of updates to our privacy practices, policies, and agreements that are effective May 25, 2018 for all users globally. These changes include explaining in more detail how we use your information, including your choices, rights, and controls.

Privacy and compliance are a shared responsibility between Illumina and our customers. We are responsible for the security of the BaseSpace Sequence Hub platform. Our cloud provider, Amazon Web Services (AWS), is responsible for providing the tools, services, and functionality that enable both the data controller (our customers) and the data processor (Illumina) to be successful.

 AWS-ILMN_Shared_Responsibility_Model

Figure 1: Shared Responsibility Model

 

A short summary of our changes:

  • GDPR and Terms & Conditions (T&Cs). GDPR places new obligations on organizations that process EU personal data. As a result, we have updated our business operational practices. Our Privacy Policy (Link) and Terms & Conditions (Link) now better explain our customers’ and users’ rights, and their relationship with Illumina. In addition, all our NGS product support pages have been updated with a Privacy & Security section (Link).
  • Improved clarity and transparency. As a key part of GDPR compliance, we’ve described our data processing practices in clear language. For instruments sending Instrument Performance Data (IPD) to BaseSpace Sequence Hub, or connected in the Run Monitoring or Storage and Analysis mode, our updated Illumina® Proactive Technical Note (Link) clearly explains what data is sent to BaseSpace in each of the connectivity modes.
  • Data Protection Addendum. BaseSpace Sequence Hub leverages AWS to deliver its services. The updated AWS Service Terms (Link) incorporate the GDPR Data Processing Addendum (DPA) and will automatically apply to all customers. Illumina is willing to sign a DPA for customers who ask for it.
  • Opt-in & Opt-out. Sharing data with BaseSpace Sequence Hub, irrespective of connectivity mode, is entirely controlled by our customers. If you would like to opt out of sharing Instrument Performance Data (IPD), Run Monitoring, or Storage and Analysis mode, you can do so at any time.

In addition, we are continually reviewing and updating our security best practices to safeguard your data and the services we provide. We are ISO 27001 certified, which has a direct emphasis on international compliance and governance. Please review our security and data privacy whitepaper (Link) to learn more about our security practices.

We hope this makes your use of our SaaS products much easier. As always, please contact us at informatics@illumina.com if you have any questions.

QB#6005

Enhanced Run Monitoring in BaseSpace™ Sequence Hub

The ability to monitor sequencing runs in real time helps users identify issues early and prevent costly sequencing errors. Many users rely on the Sequencing Analysis Viewer (SAV) to access detailed quality metrics generated by the real-time analysis software on Illumina instruments.

BaseSpace Sequence Hub has enabled users to remotely monitor their sequencing runs with the Run Charts function with a very similar interface to that of SAV. We have recently released a synchronized update with SAV to offer an expanded set of metrics for monitoring run quality. At the same time, we have added a few capabilities previously only present in SAV. These enhancements provide a consistent experience and enable users to make informed decisions on the quality of their sequencing runs – whether they are standing in front of their instrument accessing SAV or monitoring the run remotely using BaseSpace Sequence Hub.

Expanded menu of metrics that maintains consistency with SAV

BaseSpace Sequence Hub now includes per-cycle Phasing and Pre-phasing metrics, % No Call, and Median QScore measures in the Charts section of Run Monitoring. These measures were also released as part of SAV 2.4.5. % No Call and Median QScore are available for all sequencing platforms. The new Phasing/Pre-phasing metrics are available for all platforms except MiSeq and HiSeq 2000/2500.

expanded menu.png

Traditional Phasing (and pre-phasing) metrics, which were calculated once at cycle 25, are now listed as “Legacy Phasing Rate.” The new per-cycle weights are listed as “Phasing Weight” in the Run Charts.

traditional phasing.png

Improved usability

The Charts section of Run Monitoring now includes the same menu structure as SAV 2.4.5. Metrics in the drop-down menus now appear only if they are available for the cycle, significantly improving the usability of the charts.

Extracted, Called, and Scored cycles have a minimum-maximum range

Run Monitoring now provides Extracted, Called, and Scored cycles as a minimum-maximum range during an instrument run. Previously, Run Monitoring showed only the maximum cycles. A wide spread between the leading and lagging tiles might be an indication of a run problem. Now users can easily spot a problem with their run on both SAV and BaseSpace Sequence Hub.
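To make the minimum-maximum idea concrete, here is a minimal sketch that derives the range and spread from a per-tile cycle count. The tile names and cycle numbers are hypothetical, and how large a spread should raise concern is left to the user; the source does not give a threshold.

```python
def cycle_range(cycles_by_tile: dict[str, int]) -> tuple[int, int, int]:
    """Return (minimum, maximum, spread) of the last completed cycle across tiles."""
    if not cycles_by_tile:
        raise ValueError("no tiles supplied")
    lowest = min(cycles_by_tile.values())    # lagging tile
    highest = max(cycles_by_tile.values())   # leading tile
    return lowest, highest, highest - lowest


# Hypothetical per-tile Extracted cycle counts.
extracted = {"1101": 151, "1102": 151, "1103": 138}
lowest, highest, spread = cycle_range(extracted)
print(f"Extracted: {lowest}-{highest} (spread {spread})")
```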

New Metrics in Both SAV and BaseSpace Sequence Hub

In addition to the changes enumerated above, both SAV and BaseSpace Sequence Hub now include Occupied Count (K) and % Occupied measures in the Charts section of Run Monitoring for NovaSeq systems. The Occupied Count is a measure of the number of wells on the flow cell with DNA. Adding these new metrics will help users understand their loading concentrations and identify issues with their sequencing run.

new metrics

 

For Research Use Only. Not for use in diagnostic procedures.

BaseSpace™ Clarity LIMS NovaSeq™ Integration Now Supports the S1 Flow Cell

Integration and interoperability between laboratory systems, or the lack thereof, remain a challenge for those performing next-generation sequencing (NGS) or other genomics studies.[i] To address this challenge, we developed version 2.2 of the integration between BaseSpace Clarity LIMS and the NovaSeq 6000 instrument. This integration now supports the NovaSeq S1 flow cell.

The NovaSeq S1 flow cell delivers up to 0.5 TB of output in two days and is ideally suited for high-intensity sequencing applications. Users can now sequence up to 8 human genomes or 80 exomes per run in approximately 24 hours.[ii] And now, users of both BaseSpace Clarity LIMS and the NovaSeq 6000 instrument can access this out-of-the-box integration to quickly get up and running with their system.

fun format.png

The NovaSeq 6000 version 2.0 Workflow in BaseSpace Clarity LIMS that supports the integration version 2.2.1

 

The integration helps users track samples throughout the workflow. Specifically, it:

  • Supports S1, S2, and S4 flow cells per sample
  • Supports different applications on the same flow cell
  • Calculates sample and reagent volumes based on the flow cell type
  • Creates an output file for use with liquid handling robots
  • Validates every step in the workflow

The integration also tracks sequencing run information in BaseSpace Clarity LIMS to help with troubleshooting or trending:

  • Run recipe files (JSON) are automatically generated to set up and initiate the run
  • Sample sheets, which are compatible with BaseSpace Sequence Hub and bcl2fastq v 2.19, are automatically generated and placed directly on the NovaSeq 6000 instrument
  • Sequencing runs are tracked and run metrics are parsed per lane and per flow cell

If you have questions about this integration, please contact Technical Support.

For Research Use Only. Not for use in diagnostic procedures.


 

[i] Next-Generation Sequencing Informatics: Challenges and … http://www.archivesofpathology.org/doi/10.5858/arpa.2015-0507-RA. Accessed November 14, 2017.

[ii] Illumina.com. (2017). Illumina Releases NovaSeq S4 Flow Cell and NovaSeq Xp Workflow. [online] Available at: https://www.illumina.com/company/news-center/press-releases/2017/2308795.html [Accessed 16 Nov. 2017].

 

 

Announcing the New Data Uploader in BaseSpace™ Cohort Analyzer

BaseSpace Cohort Analyzer enables users to automatically aggregate and analyze subjects with genomics and phenotype data in a few clicks. Ultimately, users can analyze and share data for biomarker discovery, translational research, and clinical trials.

One of the most powerful features of BaseSpace Cohort Analyzer is the ability to centralize all available information for a subject into a single record. This includes phenotype data obtained from various phenotypic databases, lab and image data, and genomic, methylation, proteomics, and expression data, to name a few. Breaking down siloed data in this way enables users to perform integrative analyses and make meaningful discoveries in aggregated data. Now, users of BaseSpace Cohort Analyzer can take advantage of a new beta feature: the Data Uploader.

Data Uploader: Import Somatic, CNV, RNA-Seq and >500 Phenotypical Attributes

You can now easily import your genomic data (somatic mutation or copy number variations between tumor and normal samples), or RNA-Seq data into BaseSpace Cohort Analyzer for analysis. Either upload your own files or directly import from a BaseSpace Sequence Hub Enterprise account. The uploader supports >500 phenotype and subject measurements.

Uploading and Analyzing Data

1. Upload in 2 Steps through the Data Uploader (beta)

  • Load data with >500 phenotypic attributes, including age, gender, condition, therapies, overall survival, and other outcomes.
  • Load genomic data and RNA-seq data directly from BaseSpace Sequence Hub, or from a desktop in multiple formats.
  • Check your data to catch formatting errors prior to ingestion.

ch1

2. Process and integrate your data so you can analyze it in real time within BaseSpace Cohort Analyzer.

  • Monitor and view study import status through a user interface
  • Automatically add meaningful content for analysis such as calculating tumor mutation burden for all uploaded somatic mutation data
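The source does not describe how BaseSpace Cohort Analyzer calculates tumor mutation burden. As a hedged illustration, the sketch below uses the commonly cited definition of somatic mutations per megabase of covered sequence; the function name, parameters, and numbers are hypothetical.

```python
def tumor_mutation_burden(somatic_mutation_count: int, covered_bases: int) -> float:
    """Return somatic mutations per megabase of covered sequence.

    This is one common definition of TMB, assumed here for illustration;
    it is not taken from the BaseSpace Cohort Analyzer documentation.
    """
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    if somatic_mutation_count < 0:
        raise ValueError("somatic_mutation_count cannot be negative")
    return somatic_mutation_count / (covered_bases / 1_000_000)


# Hypothetical sample: 300 somatic mutations over 30 Mb of covered sequence.
print(tumor_mutation_burden(300, 30_000_000))
```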

ca2

3. Analyze Data in BaseSpace Cohort Analyzer

After your data is uploaded, perform cohort analysis using over 100 bioinformatic workflows to:

  • Compare your data with other datatypes or technologies
  • Load and view everything associated with a single subject in one place
  • Filter and select a cohort based on any phenotype or molecular marker(s)
  • Integrate and analyze your data with clinical outcomes and therapies
  • Understand the survival, molecular, and clinical differences between two groups
  • Find expression outliers in your cohort of interest
  • Research meaningful biomarkers and drug targets

ca3

For more information about BaseSpace Cohort Analyzer or the Data Uploader, or to sign up for a free trial, please contact us at techsupport@illumina.com.

 

For Research Use Only. Not for use in diagnostic procedures.

Characterizing Bacterial Single Isolates with BaseSpace™ Sequence Hub Apps

A guest blog, written by GoSeqIt

In an increasingly globalized world, bacteria can spread rapidly and easily. Furthermore, they often contain genes that make them resistant to antibiotics or confer high virulence. Sequencing the entire genome of bacteria enables a thorough characterization and thus makes it possible for researchers to monitor the spread of particular strains of bacteria or sets of genes.

In collaboration with the Illumina BaseSpace Sequence Hub development team, GoSeqIt has published two apps for characterization of bacterial single isolates. Both of these apps are now available to BaseSpace Sequence Hub users: the Bacterial Analysis Pipeline app and the E. coli Serotyping app.

The input for both apps is a bacterial complete or draft genome in FASTA format (only files with the extension .fa or .fasta are accepted).
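A quick pre-upload check of the file extension can be sketched as follows. The function name is hypothetical, and the match is treated as case-sensitive, which is an assumption; the source only states that .fa and .fasta are accepted.

```python
from pathlib import Path

# Extensions the apps accept, per the text above.
ACCEPTED_EXTENSIONS = {".fa", ".fasta"}


def has_accepted_extension(filename: str) -> bool:
    """Return True if the file's final extension is .fa or .fasta.

    Case-sensitive by assumption; compressed files such as genome.fasta.gz
    are rejected because their final extension is .gz.
    """
    return Path(filename).suffix in ACCEPTED_EXTENSIONS


for name in ("isolate.fa", "isolate.fasta", "isolate.fastq", "isolate.fasta.gz"):
    print(name, has_accepted_extension(name))
```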

The genomes may have been generated by either the BaseSpace SPAdes Genome Assembler app or the Velvet de novo Assembly app.

Bacterial Analysis Pipeline App

The Bacterial Analysis Pipeline app first predicts the species of the bacterial draft genome based on the number of k-mers (oligonucleotides of length k) co-occurring between the input genome and bacterial genomes in a reference database (1). Next, acquired antimicrobial resistance genes are identified using a BLAST-based approach, where the nucleotide sequence of the input genome is compared to the genes in the ResFinder database (2). Depending on the identified species, Multilocus Sequence Typing (MLST) is performed, also using a BLAST-based approach (3). One hundred twenty-five (125) MLST schemes are currently available.
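The k-mer co-occurrence idea behind the species prediction step can be sketched in a few lines. This is a toy illustration only, not the app's implementation: the function names, the tiny sequences, and the choice of k = 4 are hypothetical, and the published method (1) works on full genomes against a large reference database.

```python
def kmers(sequence: str, k: int) -> set[str]:
    """Return the set of distinct k-mers in a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def predict_species(query: str, references: dict[str, str], k: int) -> tuple[str, int]:
    """Return the reference sharing the most k-mers with the query, and that count."""
    if not references:
        raise ValueError("no reference genomes supplied")
    query_kmers = kmers(query, k)
    scores = {name: len(query_kmers & kmers(genome, k))
              for name, genome in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy sequences, far shorter than real genomes.
references = {"species_A": "ACGTACGTGGCC", "species_B": "TTTTGGGGCCCC"}
print(predict_species("ACGTACGTGG", references, k=4))
```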

If the input genome is recognized as belonging to Enterobacteriaceae or the Gram-positive bacteria (Enterococcus, Streptococcus, or Staphylococcus), BLAST is used to search for plasmid replicons using the PlasmidFinder database (4). Identified plasmids of the IncF, IncHI1, IncHI2, IncI1, IncN, or IncA/C type are further subtyped by plasmid MLST (4). Finally, identified Escherichia coli, Enterococcus sp., Listeria sp., and Staphylococcus aureus are compared to the VirulenceFinder database containing known virulence genes (5). For more information, refer to the article titled “A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance” (6). Figure 1 illustrates the output for species prediction and MLST, while Figure 2 illustrates the output for the prediction of acquired antimicrobial resistance genes.

fig1

Figure 1: Example of output from the Bacterial Analysis Pipeline app for species prediction and MLST of the input genome.

fig2.png

Figure 2: Example of output from the Bacterial Analysis Pipeline app for acquired antimicrobial resistance genes in the input genome.

E. coli Serotyping App

The E. coli Serotyping app uses a BLAST-based approach to predict the serotype of E. coli isolates by comparing the input genome with a database of specific O-antigen processing system genes for O typing and flagellin genes for H typing (7). The app outputs the predicted serotype along with the identified O-antigen genes (wzx, wzy, wzm, and wzt) and flagellin genes (fliC, flkA, fllA, flmA, and flnA).

fig3.png

Figure 3: Example of output from the E. coli Serotyping app. So far, only E. coli isolates can be serotyped in silico in this way.

Using the New Apps

The price for using the Bacterial Analysis Pipeline app is 5 iCredits per uploaded file plus the cost of computing. The E. coli Serotyping app costs 1 iCredit per uploaded file plus the cost of computing.

Both apps use methods that have been thoroughly described and published in renowned scientific journals.

References

1) Larsen MV, Cosentino S, Lukjancenko O, Saputra D, Rasmussen S, Hasman H, Sicheritz-Pontén T, Aarestrup FM, Ussery DW, Lund O. Benchmarking of methods for genomic taxonomy. J Clin Microbiol. 2014 May;52(5):1529-39. PMID: 24574292.

2) Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, Aarestrup FM, Larsen MV. Identification of acquired antimicrobial resistance genes. J Antimicrob Chemother. 2012 Nov;67(11):2640-4. PMID: 22782487.

3) Larsen MV, Cosentino S, Rasmussen S, Friis C, Hasman H, Marvig RL, Jelsbak L, Sicheritz-Pontén T, Ussery DW, Aarestrup FM, Lund O. Multilocus sequence typing of total-genome-sequenced bacteria. J Clin Microbiol. 2012 Apr;50(4):1355-61. PMID: 22238442.

4) Carattoli A, Zankari E, García-Fernández A, Voldby Larsen M, Lund O, Villa L, Møller Aarestrup F, Hasman H. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob Agents Chemother. 2014 Jul;58(7):3895-903. PMID: 24777092.

5) Joensen KG, Scheutz F, Lund O, Hasman H, Kaas RS, Nielsen EM, Aarestrup FM. Real-time whole-genome sequencing for routine typing, surveillance, and outbreak detection of verotoxigenic Escherichia coli. J Clin Microbiol. 2014 May;52(5):1501-10. PMID: 24574290.

6) Thomsen MC, Ahrenfeldt J, Cisneros JL, Jurtz V, Larsen MV, Hasman H, Aarestrup FM, Lund O. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance. PLoS One. 2016 Jun 21;11(6):e0157718. PMID: 27327771.

7) Joensen KG, Scheutz F, Lund O, Hasman H, Kaas RS, Nielsen EM, Aarestrup FM. Real-time whole-genome sequencing for routine typing, surveillance, and outbreak detection of verotoxigenic Escherichia coli. J Clin Microbiol. 2014 May;52(5):1501-10. PMID: 24574290.

 

For Research Use Only. Not for use in diagnostic procedures.

BaseSpace™ Clarity LIMS NovaSeq™ Integration v2.2

Integration and interoperability between laboratory systems, or the lack thereof, remain a challenge for those performing next-generation sequencing (NGS) or other genomics studies.[1] To address this challenge, we developed version 2.2 of the integration between BaseSpace Clarity LIMS and the NovaSeq 6000 instrument. This integration now supports the NovaSeq S4 flow cell, as well as the NovaSeq Xp protocol.

Picture1

Figure 1: The NovaSeq 6000 version 2.0 Workflow in BaseSpace Clarity LIMS that supports the integration version 2.2

The NovaSeq S4 flow cell delivers up to 6 TB of output in two days and is ideally suited for high-intensity sequencing applications. Users can now sequence up to 48 human genomes or 384 exomes per run in less than 48 hours. This innovation paves the way for large-population-scale initiatives at the lowest price per sample, and enables labs to cost-effectively perform human whole-genome sequencing.[2] And now, users of both BaseSpace Clarity LIMS and the NovaSeq 6000 instrument can access this out-of-the-box integration to get up and running with their system sooner.

The new integration helps users track samples throughout the workflow. Specifically, it:

  • Supports S1 [3], S2, and S4 flow cells per sample
  • Supports different applications on the same flow cell
  • Calculates sample and reagent volumes based on the flow cell type
  • Creates an output file for use with liquid handling robots
  • Validates every step in the workflow

The new integration also tracks sequencing run information in BaseSpace Clarity LIMS to help with troubleshooting or trending:

  • Run recipe files (JSON) are automatically generated to set up and initiate the run
  • Sample sheets, which are compatible with BaseSpace Sequence Hub and bcl2fastq v 2.19, are automatically generated and placed directly on the NovaSeq 6000 instrument
  • Sequencing runs are tracked and run metrics are parsed per lane and per flow cell

If you have questions about this integration, please email Illumina Technical Support.

References

  1. Next-Generation Sequencing Informatics: Challenges and … http://www.archivesofpathology.org/doi/10.5858/arpa.2015-0507-RA. Accessed November 14, 2017.
  2. Illumina.com. (2017). Illumina Releases NovaSeq S4 Flow Cell and NovaSeq Xp Workflow. [online] Available at: https://www.illumina.com/company/news-center/press-releases/2017/2308795.html [Accessed 16 Nov. 2017].
  3. Upcoming flow cell in the NovaSeq 6000 instrument portfolio

For Research Use Only. Not for Use in Diagnostic Procedures.