Enabling Cancer Interpretation At Scale For The Genomics England 100K Genomics Project

Perspectives on training and on-boarding users of the Genomics England Cancer Program

By Jawahar Swaminathan, Ph.D., Program Manager – Population Genomics (aided by Keira Cheetham, Ph.D., Staff Bioinformatics Scientist)

Illumina and Genomics England announced the Bioinformatics and Clinical Interpretation partnership (BCIP) in February 2016 with the aim to “develop a platform and knowledge base that can be used to improve and automate genome interpretation.” As part of this collaboration, Illumina developed a customized version of BaseSpace Variant Interpreter (BSVI)[1] for cancer and rare disease, including various backend services to allow integration between the Genomics England case dispatch pipeline and Illumina systems. What followed was a rigorous schedule of meetings between Genomics England and Illumina (read as long hours, late nights, lots of coffee and many meetings at Genomics England HQ in London!) leading to the development of essential features for cancer interpretation.

In June 2017, following multiple rounds of user acceptance testing and concordance checks, BSVI was adopted by Genomics England as the default interpretation solution. Illumina then began on-boarding users at the 13 Genomic Medicine Centres (GMCs), the recruiting hubs for various regions of England, by organizing training sessions on the use of the software, with particular focus on the unique way data entered and left the system. This article looks back on these activities and how they are helping in the development of genome interpretation software that meets the diverse needs of the Genomics England end users.

Figure 1: The Genomics England Genomic Medicine Centres (Image Courtesy: Genomics England Ltd.)

The GMC training sessions

Over the course of 2018, we carried out training and outreach activities across most of the GMCs. The GMCs are the recruitment hubs for the Genomics England 100,000 Genomes Project and comprise multiple hospitals centered around a geographical area that has the necessary expertise. All training activities were organized by the Genomics England Cancer Interpretation team and were also attended by a representative from Genomics England.

Some humorous takeaways:

  1. Long hours on an early morning packed train from Cambridge (where we are situated) to our destination city, including a hurriedly eaten lunch at a busy Costa Coffee (yes almost every hospital in the UK has one of these) at the hospital before the training! Throw in the occasional aborted visit due to an alarmingly growing windscreen crack on a rental car or boarding the wrong train and you have the makings of a long and interesting day.
  2. Every NHS hospital looks the same. The usual 1960s concrete exterior, the same typeface on the signs and the same warren of corridors to the Clinical Genetics department.
  3. Working out how to use the different display equipment in different hospitals before attempting to figure out internet connectivity on the slow and ageing hospital computer systems.
  4. Hot chocolate or a burrito on the return leg at the local train station as a treat for a job well done.
  5. Never work with children, animals, or live demos. Although we always got the live demo to work!

All training activities were conducted by my colleague Keira Cheetham and me and involved a mix of presentations and live demos using cases specific to the GMC, followed by hands-on instructions on how to use the software and send results back to Genomics England for reporting. The training was also an opportunity for us to talk about the science around interpreting cancer genomes and how Illumina is facilitating greater insights into cancers with whole genome sequencing (WGS).

This was also a great opportunity to see how the BCIP tools were used by GMC users, and any feedback (both good and bad) was gratefully received. We also spoke about upcoming features in these sessions. Attendance at these events varied from 2 to 10 users per GMC, and the venues ranged from really tight spaces (sometimes with windows!) to large meeting rooms and everything in between. However, what was consistent throughout was the motivation and dedication of the NHS staff in delivering the best possible care to their patients recruited into the Genomics England 100,000 Genomes Project cancer program.

Illumina continues to work with Genomics England to extend its BCIP tools for Rare Disease interpretation. This offering will soon be available for user acceptance testing and, following that, could be used in Genomics England’s suite of clinical interpretation systems. In the meantime, the UK NHS has announced the commissioning of WGS for rare disease and cancer, to be offered throughout the health system. The cancer outreach activities that Keira and I carried out in 2018 will stand us in good stead for the next round of training for rare disease.

The Genomics England Cancer Outreach Program by numbers

  • ~76 GMC users across 11 GMCs trained
  • ~34 hours of training imparted
  • ~4000 miles travelled (all by British Rail barring Belfast, Northern Ireland)

[1] The version of BSVI co-developed with Genomics England as part of the BCIP contains extensive customizations for their use cases and is not openly accessible to the public. Please contact your Illumina sales representative for guidance on how to use the publicly available version of BSVI.

Doing more with DRAGEN™ v3.2.8

Advancing Workflows through Relentless Innovation

We’ve been busy over the last few months! Back in May, Illumina announced the acquisition of Edico Genome and the DRAGEN™ (Dynamic Read Analysis for GENomics) technology. Since then, we have been hard at work expanding DRAGEN’s capabilities to provide more advanced, robust and performant pipelines for our customers. With the inclusion of DRAGEN into the Illumina ecosystem, we are now able to take advantage of the expertise of both teams to build out an expanded chest of tools that offer added functionality, benefits and ease-of-use.

The team has come a long way since we last published about DRAGEN on the BaseSpace™ Blog, and we are excited to share some insight into what we have been working on. Over the coming months, we will continue to post about our latest releases and activities to keep you informed.

Earlier this month, we released DRAGEN v3.2.8, which introduces a variety of new capabilities designed to deliver more insights from your data.


New Sequence Quality Metrics in BaseSpace™ Sequence Hub

The Run Monitoring features in BaseSpace™ Sequence Hub (BSSH) enable users to remotely monitor the quality of their sequencing runs and troubleshoot sequencing errors. As part of our efforts to extend real-time Run Monitoring capabilities, we recently released new data quality metrics in BSSH.

 

% Occupancy for iSeq™ and MiniSeq™ instruments

In a previous release, we added the % Occupied measure to the Charts section of Run Monitoring for the NovaSeq™ systems. As part of this release, this metric is now also visible for iSeq and MiniSeq systems in BaseSpace Sequence Hub. This measure can be used to understand loading concentrations on the flow cell.

For both patterned and non-patterned flow cells, % Occupancy is the percentage of possible clusters on the flow cell that contain DNA that can ultimately be sequenced. With patterned flow cells (such as iSeq), the number of nanowells on the patterned grid determines the total number of possible clusters. For non-patterned flow cells (such as MiniSeq), the total number of possible clusters is the number of non-duplicated spots identified by Real Time Analysis (RTA) during template generation.
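The definition above is a simple ratio. The following is a minimal sketch of that arithmetic only, not how RTA or BaseSpace Sequence Hub computes the metric internally; the function name, parameter names, and the counts in the example are hypothetical.

```python
def percent_occupied(occupied: int, total_possible: int) -> float:
    """Return occupied positions as a percentage of all possible cluster positions.

    total_possible stands for the nanowell count (patterned flow cells) or the
    number of non-duplicated spots found during template generation
    (non-patterned flow cells). Both names are illustrative assumptions.
    """
    if total_possible <= 0:
        raise ValueError("total_possible must be positive")
    if not 0 <= occupied <= total_possible:
        raise ValueError("occupied must be between 0 and total_possible")
    return 100.0 * occupied / total_possible


# Made-up counts for illustration only.
print(f"{percent_occupied(850_000, 1_000_000):.1f}% occupied")
```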

 


 

% Pass Filter (%PF) settings for all instruments

The Flow Cell chart in BaseSpace Sequence Hub has also been updated to include the % Pass Filter (%PF) for all instruments. This additional information allows users to determine whether particular tiles of a flow cell have unusual levels of %PF.
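For readers who export per-tile values and want to screen them outside the chart, here is a minimal sketch of one way to flag unusual tiles. It assumes you already have %PF per tile in a dictionary; the function name, the tile IDs, and the 10-point threshold are hypothetical choices, not values used by BaseSpace Sequence Hub.

```python
from statistics import median


def unusual_pf_tiles(pf_by_tile: dict[str, float], max_deviation: float = 10.0) -> list[str]:
    """Return tile IDs whose %PF differs from the median by more than max_deviation points."""
    if not pf_by_tile:
        return []
    mid = median(pf_by_tile.values())
    return sorted(tile for tile, pf in pf_by_tile.items() if abs(pf - mid) > max_deviation)


# Made-up %PF values keyed by tile ID.
example = {"1101": 82.4, "1102": 81.9, "1103": 55.0, "1104": 83.1}
print(unusual_pf_tiles(example))  # tile 1103 is far below the median
```

Comparing against the median rather than the mean keeps a single bad tile from shifting the baseline it is measured against.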


With these enhancements, we have added capabilities that are currently not available in Sequence Analysis Viewer (SAV). SAV will be updated in the future so our users have a consistent experience across SAV and BSSH.

 

#QB6200

Putting Your Privacy First

BaseSpace™ Sequence Hub is used by investigators around the world to facilitate and scale their sequencing and genomic data analysis operations. At Illumina, we understand that security, privacy, and confidentiality are complex issues, and we are committed to protecting our software-as-a-service (SaaS) customers’ data.

To ensure that our customers remain compliant with the EU General Data Protection Regulation (GDPR), we’ve made a number of updates to our privacy practices, policies and agreements that are effective May 25, 2018 for all users globally. These changes include explaining in more detail how we use your information, including your choices, rights, and controls.

Privacy and compliance are a shared responsibility between Illumina and our customers. We are responsible for the security of the BaseSpace Sequence Hub platform. Our cloud provider, Amazon Web Services (AWS), is responsible for providing the tools, services and functionality that enable both the data controller (our customers) and the data processor (Illumina) to be successful.


Figure 1: Shared responsibility Model

 

A short summary of our changes:

  • GDPR and Terms & Conditions (T&Cs): GDPR places new obligations on organizations that process EU personal data. As a result, we have updated our business operational practices. The following documents (Privacy Policy (Link), and Terms & Conditions (Link)) better explain our customers’ and users’ rights, and their relationship with Illumina. In addition, all our NGS product support pages have been updated with a Privacy & Security section (Link).
  • Improved clarity and transparency: As a key part of GDPR compliance, we’ve described our data processing practices in clear language. For instruments sending Instrument Performance Data (IPD) to BaseSpace Sequence Hub, or connected in the Run Monitoring or Storage and Analysis mode, our updated Illumina® Proactive Technical Note (Link) clearly explains what data is sent to BaseSpace in each of the connectivity modes.
  • Data Protection Addendum: BaseSpace Sequence Hub leverages AWS to deliver its services. The updated AWS Service Terms (Link) incorporate the GDPR Data Processing Addendum (DPA) and will automatically apply to all customers. Illumina is willing to sign a DPA for customers who ask for it.
  • Opt-in & Opt-out: Sharing data with BaseSpace Sequence Hub, irrespective of connectivity mode, is entirely controlled by our customers. If you would like to opt out of sharing Instrument Performance Data (IPD), Run Monitoring, or Storage and Analysis mode, you can do so at any time.

In addition, we are continually reviewing and updating our security best practices to safeguard your data and the services we provide. We are ISO 27001 certified, which has a direct emphasis on international compliance and governance. Please review our security and data privacy whitepaper (Link) to learn more about our security practices.

We hope this makes your use of our SaaS products much easier. As always, please contact us at informatics@illumina.com if you have any questions.

QB#6005

Enhanced Run Monitoring in BaseSpace™ Sequence Hub

The ability to monitor sequencing runs in real time helps users identify issues early and prevent costly sequencing errors. Many users rely on the Sequencing Analysis Viewer (SAV) to access detailed quality metrics generated by the real-time analysis software on Illumina instruments.

BaseSpace Sequence Hub has enabled users to remotely monitor their sequencing runs with the Run Charts function with a very similar interface to that of SAV. We have recently released a synchronized update with SAV to offer an expanded set of metrics for monitoring run quality. At the same time, we have added a few capabilities previously only present in SAV. These enhancements provide a consistent experience and enable users to make informed decisions on the quality of their sequencing runs – whether they are standing in front of their instrument accessing SAV or monitoring the run remotely using BaseSpace Sequence Hub.

Expanded menu of metrics that maintains consistency with SAV

BaseSpace Sequence Hub now includes per cycle Phasing and Pre-phasing metrics, % No Call, and Median QScore measures in the Charts section of Run Monitoring. These measures were also released as part of SAV 2.4.5. % No Call & Median QScores are available for all sequencing platforms. The new Phasing/Pre-phasing metrics are available for all platforms except MiSeq and HiSeq 2000/2500.


Traditional Phasing (and pre-phasing) metrics, which were calculated once at cycle 25, are now listed as “Legacy Phasing Rate.” The new per-cycle weights are listed as “Phasing Weight” in the Run Charts.


Improved usability

The Charts section of Run Monitoring now includes the same menu structure as SAV 2.4.5. Metrics now appear in the drop-down menus only if they are available for the cycle, significantly improving the usability of the charts.

Extracted, Called, and Scored cycles have a minimum-maximum range

Run Monitoring now provides Extracted, Called, and Scored cycles as a minimum-maximum range during an instrument run. Previously, Run Monitoring showed only the maximum cycles. A wide spread between the leading and lagging tiles might be an indication of a run problem. Now users can easily spot a problem with their run on both SAV and BaseSpace Sequence Hub.
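The idea behind the minimum-maximum range can be sketched in a few lines. This is an illustration under stated assumptions, not the Run Monitoring implementation: the per-tile cycle counts, the function names, and the spread threshold of 5 cycles are all hypothetical.

```python
def cycle_range(cycles_by_tile: dict[str, int]) -> tuple[int, int]:
    """Return (minimum, maximum) completed cycle across all tiles."""
    if not cycles_by_tile:
        raise ValueError("no tiles supplied")
    values = list(cycles_by_tile.values())
    return min(values), max(values)


def has_wide_spread(cycles_by_tile: dict[str, int], max_spread: int = 5) -> bool:
    """Flag a run when the leading and lagging tiles are more than max_spread cycles apart."""
    low, high = cycle_range(cycles_by_tile)
    return (high - low) > max_spread


# Made-up extracted-cycle counts per tile.
extracted = {"1101": 120, "1102": 119, "1103": 101}
print(cycle_range(extracted), has_wide_spread(extracted))
```

Showing only the maximum (120 here) would hide the lagging tile at cycle 101, which is why the range is more informative.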

New Metrics in Both SAV and BaseSpace Sequence Hub

In addition to the changes enumerated above, both SAV and BaseSpace Sequence Hub now include Occupied Count (K) and % Occupied measures in the Charts section of Run Monitoring for NovaSeq systems. The Occupied Count is a measure of the number of wells on the flow cell with DNA. Adding these new metrics will help users understand their loading concentrations and identify issues with their sequencing run.


 

For Research Use Only. Not for use in diagnostic procedures.

BaseSpace™ Clarity LIMS NovaSeq™ Integration Now Supports the S1 Flow Cell

Integration and interoperability between laboratory systems, or the lack thereof, remain a challenge for those performing next-generation sequencing (NGS) or other genomics studies.[i] To address this challenge, we developed version 2.2 of the integration between BaseSpace Clarity LIMS and the NovaSeq 6000 instrument. This integration now supports the NovaSeq S1 flow cell.

The NovaSeq S1 flow cell delivers up to 0.5 TB of output in two days and is ideally suited for high-intensity sequencing applications. Users can now sequence up to 8 human genomes or 80 exomes per run in approximately 24 hours.[ii] And now, users of both BaseSpace Clarity LIMS and the NovaSeq 6000 instrument can access this out-of-the-box integration to quickly get up and running with their system.


The NovaSeq 6000 version 2.0 Workflow in BaseSpace Clarity LIMS that supports the integration version 2.2.1

 

The integration helps users track samples throughout the workflow. Specifically, it:

  • Supports S1, S2, and S4 flow cells per sample
  • Supports different applications on the same flow cell
  • Calculates sample and reagent volumes based on the flow cell type
  • Creates an output file for use with liquid handling robots
  • Validates every step in the workflow

The integration also tracks sequencing run information in BaseSpace Clarity LIMS to help with troubleshooting or trending:

  • Run recipe files (JSON) are automatically generated to set up and initiate the run
  • Sample sheets, which are compatible with BaseSpace Sequence Hub and bcl2fastq v2.19, are automatically generated and placed directly on the NovaSeq 6000 instrument
  • Sequencing runs are tracked and run metrics are parsed per lane and per flow cell

If you have questions about this integration, please contact Technical Support.

For Research Use Only. Not for use in diagnostic procedures.


 

[i] Next-Generation Sequencing Informatics: Challenges and … http://www.archivesofpathology.org/doi/10.5858/arpa.2015-0507-RA. Accessed November 14, 2017.

[ii]  Illumina.com. (2017). Illumina Releases NovaSeq S4 Flow Cell and NovaSeq Xp Workflow. [online] Available at: https://www.illumina.com/company/news-center/press-releases/2017/2308795.html [Accessed 16 Nov. 2017].

 

 

Announcing the New Data Uploader in BaseSpace™ Cohort Analyzer

BaseSpace Cohort Analyzer enables users to automatically aggregate and analyze subjects with genomics and phenotype data in a few clicks. Ultimately, users can analyze and share data for biomarker discovery, translational research, and clinical trials.

One of the most powerful features of BaseSpace Cohort Analyzer is the ability to centralize all available information for a subject into a single record. This includes phenotype data obtained from various phenotypic databases, lab and image data, and genomic, methylation, proteomics, and expression data, to name a few. Breaking down siloed data in this way enables users to perform integrative analyses to make meaningful discoveries in aggregated data. Now, users of BaseSpace Cohort Analyzer can take advantage of a new beta feature: the Data Uploader.

Data Uploader: Import Somatic, CNV, RNA-Seq and >500 Phenotypical Attributes

You can now easily import your genomic data (somatic mutation or copy number variations between tumor and normal samples), or RNA-Seq data into BaseSpace Cohort Analyzer for analysis. Either upload your own files or directly import from a BaseSpace Sequence Hub Enterprise account. The uploader supports >500 phenotype and subject measurements.

Uploading and Analyzing Data

1. Upload in 2 Steps through the Data Uploader (beta)

  • Load data with >500 phenotypic attributes, including age, gender, condition, therapies, overall survival and other outcomes.
  • Load genomic data and RNA-seq data directly from BaseSpace Sequence Hub, or from a desktop in multiple formats.
  • Check your data to catch formatting errors prior to ingestion.
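If you want to pre-check a phenotype file yourself before uploading, a small script can catch the most common problems. The sketch below is an assumption-heavy illustration: the CSV layout, the column names (`subject_id`, `age`, `gender`), and the rules are hypothetical and do not describe the Data Uploader's actual file format or its built-in checks.

```python
import csv
import io

# Hypothetical required columns; the Data Uploader's real template may differ.
REQUIRED_COLUMNS = ("subject_id", "age", "gender")


def find_format_errors(csv_text: str) -> list[str]:
    """Return a list of human-readable formatting problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    errors = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        subject_id = (row.get("subject_id") or "").strip()
        age = (row.get("age") or "").strip()
        if not subject_id:
            errors.append(f"line {line_no}: empty subject_id")
        if not age.isdigit():
            errors.append(f"line {line_no}: age {age!r} is not a whole number")
    return errors


# Made-up rows for illustration.
sample = "subject_id,age,gender\nS1,54,F\n,61,M\nS3,unknown,F\n"
for problem in find_format_errors(sample):
    print(problem)
```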


2. Process and integrate your data so you can analyze it in real time within BaseSpace Cohort Analyzer.

  • Monitor and view study import status through a user interface
  • Automatically add meaningful content for analysis such as calculating tumor mutation burden for all uploaded somatic mutation data
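The post does not say how BaseSpace Cohort Analyzer calculates tumor mutation burden. As a rough sketch, the commonly used definition is the number of somatic mutations per megabase of sequenced territory; the function and parameter names and the numbers below are hypothetical.

```python
def tumor_mutation_burden(somatic_mutation_count: int, covered_megabases: float) -> float:
    """Return somatic mutations per megabase (a common TMB definition).

    This is an assumed formula for illustration; real pipelines typically also
    filter which mutation types are counted.
    """
    if somatic_mutation_count < 0:
        raise ValueError("somatic_mutation_count cannot be negative")
    if covered_megabases <= 0:
        raise ValueError("covered_megabases must be positive")
    return somatic_mutation_count / covered_megabases


# Made-up example: 300 somatic mutations across 30 Mb of covered sequence.
print(f"TMB = {tumor_mutation_burden(300, 30.0):.1f} mutations/Mb")
```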


3. Analyze Data in BaseSpace Cohort Analyzer

After your data is uploaded, perform cohort analysis using over 100 bioinformatic workflows to:

  • Compare your data with other datatypes or technologies
  • Load and view everything associated to a single subject in one place
  • Filter and select a cohort based on any phenotype or molecular marker(s)
  • Integrate and analyze your data with clinical outcomes and therapies
  • Understand the survival, molecular, and clinical differences between two groups
  • Find expression outliers in your cohort of interest
  • Research meaningful biomarkers and drug targets


For more information about BaseSpace Cohort Analyzer or the Data Uploader, or to sign up for a free trial, please contact us at techsupport@illumina.com.

 

For Research Use Only. Not for use in diagnostic procedures.