Perspectives on training and on-boarding users of the Genomics England Cancer Program
By Jawahar Swaminathan, Ph.D., Program Manager – Population Genomics (aided by Keira Cheetham, Ph.D., Staff Bioinformatics Scientist)
Illumina and Genomics England announced the Bioinformatics and Clinical Interpretation partnership (BCIP) in February 2016 with the aim to “develop a platform and knowledge base that can be used to improve and automate genome interpretation.” As part of this collaboration, Illumina developed a customized version of BaseSpace™ Variant Interpreter (BSVI) for cancer and rare disease, including various backend services to allow integration between the Genomics England case dispatch pipeline and Illumina systems. What followed was a rigorous schedule of meetings between Genomics England and Illumina (read: long hours, late nights, lots of coffee and many meetings at Genomics England HQ in London!), leading to the development of essential features for cancer interpretation.
In June 2017, following multiple rounds of user acceptance testing and concordance checks, BSVI was adopted by Genomics England as the default interpretation solution. Illumina then began on-boarding users at the 13 Genomic Medicine Centres (GMCs), the recruiting hubs for various regions of England, by organizing training sessions on the use of the software, with particular focus on the unique way data entered and left the system. This article looks back on these activities and how they are helping in the development of genome interpretation software that meets the diverse needs of the Genomics England end users.
The GMC training sessions
Over the course of 2018, we carried out training and outreach activities across most of the GMCs. The GMCs are the recruitment hubs for the Genomics England 100,000 Genomes Project and comprise multiple hospitals centered around a geographical area that has the necessary expertise. All training activities were organized by the Genomics England Cancer Interpretation team and were also attended by a representative from Genomics England.
Some humorous takeaways:
- Long hours on an early morning packed train from Cambridge (where we are situated) to our destination city, including a hurriedly eaten lunch at a busy Costa Coffee (yes almost every hospital in the UK has one of these) at the hospital before the training! Throw in the occasional aborted visit due to an alarmingly growing windscreen crack on a rental car or boarding the wrong train and you have the makings of a long and interesting day.
- Every NHS hospital looks the same: the usual 1960s concrete exterior, the same typeface on the signs and the same warren of corridors to the Clinical Genetics department.
- Working out how to use the different display equipment in different hospitals before attempting to figure out internet connectivity on the slow and ageing hospital computer systems.
- Hot chocolate or a burrito on the return leg at the local train station as a treat for a job done well.
- Never work with children, animals, or live demos. Although we always got the live demo to work!
All training activities were conducted by my colleague Keira Cheetham and me and involved a mix of presentations and live demos using cases specific to the GMC, followed by hands-on instruction on how to use the software and send results back to Genomics England for reporting. The training was also an opportunity for us to talk about the science around interpreting cancer genomes and how Illumina is facilitating greater insights into cancers with whole genome sequencing (WGS).
This was also a great opportunity to see how the BCIP tools were used by GMC users, and any feedback (both good and bad) was gratefully received. We also spoke about upcoming features in these sessions. Attendance at these events varied from 2 to 10 users per GMC, and the venues ranged from really tight spaces (sometimes with windows!) to large meeting rooms and everything in between. However, what was consistent throughout was the motivation and dedication of the NHS staff in delivering the best possible care to their patients recruited into the Genomics England 100,000 Genomes Project cancer program.
Illumina continues to work with Genomics England to extend its BCIP tools for Rare Disease interpretation. This offering will soon be available for user acceptance testing and, following that, could be used in Genomics England’s suite of clinical interpretation systems. In the meantime, the UK NHS has announced the commissioning of WGS for rare disease and cancer, to be offered throughout the health system. The outreach activities that Keira and I carried out for cancer in 2018 will stand us in good stead for the next round of training for rare disease.
The Genomics England Cancer Outreach Program by numbers
- ~76 GMC users across 11 GMCs trained
- ~34 hours of training delivered
- ~4000 miles travelled (all by British Rail, barring Belfast, Northern Ireland)
The version of BSVI co-developed with Genomics England as part of the BCIP contains extensive customizations for their use cases and is not openly accessible to the public. Please contact your Illumina sales representative for guidance on how to use the publicly available version of BSVI.
Advancing Workflows through Relentless Innovation
We’ve been busy over the last few months! Back in May, Illumina announced the acquisition of Edico Genome and the DRAGEN™ (Dynamic Read Analysis for GENomics) technology. Since then, we have been hard at work expanding DRAGEN’s capabilities to provide more advanced, robust and performant pipelines for our customers. With the inclusion of DRAGEN into the Illumina ecosystem, we are now able to take advantage of the expertise of both teams to build out an expanded chest of tools that offer added functionality, benefits and ease-of-use.
The team has come a long way since we last published about DRAGEN on the BaseSpace™ Blog, and we are excited to share some insight into what we have been working on. Over the coming months, we will continue to post about our latest updates and activities to keep you updated.
Earlier this month, we released DRAGEN v3.2.8, which introduces a variety of new capabilities designed to deliver more insights from your data.
Next-generation sequencing (NGS) systems now produce more data than ever before. Additionally, a typical NGS workflow involves manual, time-consuming touchpoints for quality control, analysis setup, and results review. As a result, labs that perform NGS or other complex, high-volume processing of samples can be overwhelmed by managing the workflows and the data generated. To address these issues and simplify NGS research, we are happy to announce the new version of BaseSpace Sequence Hub. It is designed to enhance your laboratory’s efficiency and support the needs of high-throughput labs.
This update includes a biosample-centric data model that provides tracking of all biosample activity from lab preparation through analysis delivery. We’re also introducing the following features:
- New automated quality control features
- Automated app launches and workflows
- An updated Application Programming Interface (API) to help you streamline your NGS workflows
- An improved user interface that helps you access your data and perform functions more quickly
Biosample-centric Data Model
Our new biosample-centric data model enables easy tracking of all biosample activity from lab preparation through analysis delivery. Biosamples are the data containers that represent the original DNA source material. They are used to trace all sequencing activities, including lab preparation (with optional laboratory information management system (LIMS) integration), sequencing runs, data analysis, and delivery of analysis results.
Biosamples can be used as inputs to multiple sequencing runs, and they can contain multiple datasets, which can live within separate projects.
Important Note: Biosamples with the same name (Sample ID in the sample sheet) are automatically aggregated: all FASTQ datasets with the same Sample ID are combined into a single biosample. If you want samples to remain separate, give each one a unique name in your sample sheet; otherwise they will be aggregated together. Learn more about automatic data aggregation here.
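To make the aggregation rule concrete, here is a minimal Python sketch of grouping by Sample ID. It is an illustration only, not BaseSpace Sequence Hub code: the record layout, the run names, and the helper names `aggregate_by_sample_id` and `sample_ids_spanning_runs` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical records: (run name, Sample ID from the sample sheet, FASTQ dataset name).
DATASETS = [
    ("Run_A", "NA12878", "NA12878_L001"),
    ("Run_A", "NA12891", "NA12891_L001"),
    ("Run_B", "NA12878", "NA12878_L002"),
]


def aggregate_by_sample_id(datasets):
    """Group FASTQ dataset names under the Sample ID they share."""
    biosamples = defaultdict(list)
    for _run, sample_id, dataset in datasets:
        biosamples[sample_id].append(dataset)
    return dict(biosamples)


def sample_ids_spanning_runs(datasets):
    """Return Sample IDs that appear in more than one run, sorted by name."""
    runs = defaultdict(set)
    for run, sample_id, _dataset in datasets:
        runs[sample_id].add(run)
    return sorted(sid for sid, seen in runs.items() if len(seen) > 1)


if __name__ == "__main__":
    print(aggregate_by_sample_id(DATASETS))
    print(sample_ids_spanning_runs(DATASETS))
```

A check like `sample_ids_spanning_runs` could be run over your sample sheets before sequencing to spot Sample IDs that would be merged across runs, whether or not that merge is intended.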
Automated Lane QC, App Launch, and Analysis QC
After sequencing, much of the work required to process biosamples can be automated in bulk. By setting up automation ahead of time using the command line interface (CLI), sequencing runs can be automatically passed or failed based on their sequencing quality, converted to FASTQ datasets, and used as inputs to an app, and the resulting analyses can then be passed or failed based on their app metrics. Automation removes much of the time-consuming and error-prone manual work of processing sequencing data into downstream results.
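The pass/fail step described above boils down to comparing metrics against thresholds chosen in advance. The Python sketch below shows only that logic. It does not use the BaseSpace CLI or API, and the metric names, threshold values, and status strings are hypothetical placeholders rather than product settings.

```python
from dataclasses import dataclass


@dataclass
class LaneMetrics:
    lane: int
    percent_q30: float  # hypothetical metric: % of bases at or above Q30
    yield_gb: float     # hypothetical metric: lane yield in gigabases


# Hypothetical thresholds; a real lab would choose its own values.
MIN_PERCENT_Q30 = 80.0
MIN_YIELD_GB = 50.0


def lane_qc_status(metrics):
    """Return "pass" only if every metric meets its threshold, else "fail"."""
    passed = (
        metrics.percent_q30 >= MIN_PERCENT_Q30
        and metrics.yield_gb >= MIN_YIELD_GB
    )
    return "pass" if passed else "fail"


def lanes_ready_for_analysis(lanes):
    """Return the lane numbers that passed QC, in input order."""
    return [m.lane for m in lanes if lane_qc_status(m) == "pass"]


if __name__ == "__main__":
    lanes = [
        LaneMetrics(lane=1, percent_q30=92.5, yield_gb=60.0),
        LaneMetrics(lane=2, percent_q30=71.0, yield_gb=65.0),
        LaneMetrics(lane=3, percent_q30=85.0, yield_gb=40.0),
    ]
    print(lanes_ready_for_analysis(lanes))
```

The same pattern would apply a second time after the app has run, with app metrics in place of lane metrics.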
Improved User Interface
The updated interface provides quick access to all of your data from the My Data menu, while the new Action Toolbar contains new and improved app functions such as requeues, QC status changes, workflows, and collaboration tools.
The Analyses page provides a listing of all analyses in your account. The filters on this page help you quickly narrow your search for specific analyses by their current status.
The Projects and Runs pages function the same as before, providing quick access to all of your sequencing projects and instrument runs.
Advanced Automation and Integration Toolset
Alongside our updated data model, we’ve introduced version 2 of the API, which enables you to interact directly with your data and integrate systems together with your BaseSpace Sequence Hub account.
The new automation tools in version 2 of the API:
- Correspond to the new biosample-centric data model
- Improve performance and robustness of the solution
- Include new documentation
Note: The version 1 API is still fully supported and maintained, although our development effort is focused primarily on the version 2 API. The version 1 API documentation is maintained here.
Version 2 of BaseSpace CLI has been built using the version 2 API. BaseSpace CLI can be used to read data from your BaseSpace Sequence Hub account and to create new data by uploading data and launching apps. In addition, the new BaseSpace CLI can be used to create automated analysis workflows and import biosamples.
BaseMount is a command-line tool that allows you to explore runs, projects, biosamples, and datasets, and to interact directly with the associated files exactly as you would with any other file system.
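Because BaseMount presents your account as a file system, ordinary file APIs should work on it. The Python sketch below uses only the standard library to list FASTQ files under a directory. The helper name `find_fastq_files` and the `*.fastq.gz` pattern are assumptions for this example, and the function makes no assumption about how BaseMount lays out its directories: you pass in whatever mount point you chose.

```python
from pathlib import Path


def find_fastq_files(mount_point):
    """Return FASTQ files under mount_point as sorted, relative POSIX-style paths."""
    root = Path(mount_point)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.fastq.gz")  # recurse through every subdirectory
        if p.is_file()
    )


if __name__ == "__main__":
    import sys

    # Usage: python find_fastq.py <path to your mount point>
    for relative_path in find_fastq_files(sys.argv[1]):
        print(relative_path)
```

A recursive walk over a large account may be slow on a mounted, network-backed file system, so pointing the function at a single project or run directory is likely the more practical choice.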
We hope the new functionality of BaseSpace Sequence Hub enables your lab to boost productivity and discovery. View a video or visit our updated Support Site to learn more about how to use all the new features and tools. Please contact us at firstname.lastname@example.org if you have any questions or comments.
The BaseSpace Sequence Hub Team
- CLI documentation https://developer.basespace.illumina.com/docs/content/documentation/cli/cli-overview
- CLI automated workflow creation docs https://developer.basespace.illumina.com/docs/content/documentation/cli/cli-examples
- Link to v1 API docs https://developer.basespace.illumina.com/docs/content/documentation/rest-api/v1-api-reference
- Link to v2 API docs https://developer.basespace.illumina.com/docs/content/documentation/rest-api/api-reference
To date, most of what we know about our genome comes from studying populations of cells. Although few would argue with how far we have come to understand our genome, many researchers now realize that it may be just as important to fully examine the heterogeneity that exists within the population of cells. Evidence suggests that bulk sequencing methods can mask the contribution of individual cells. As a result, many researchers are turning to an evolving technique: single-cell sequencing.
Pioneered in the 1990s by James Eberwine² and made more robust by the analytical sensitivity and specificity of next-generation sequencing (NGS) methods,³ single-cell sequencing enables researchers to examine the heterogeneity of cells, and promises to reveal what role individual cells play in disease and complex biological systems.
How? For every cell sequenced, researchers have a comprehensive map of the transcriptome that can be analyzed in several different ways to characterize cells at single-cell resolution. Currently, 3 primary applications stand out:
Join us for our BaseSpace® Suite Informatics Summit in Copenhagen, DK on 31 May and 1 June. The summit takes place immediately after the European Society of Human Genetics (ESHG) annual meeting, and attendance is free. Learn more about our informatics tools and how they’re designed to help you transform complex genomic data into meaningful insights quickly and easily.
Why attend a BaseSpace Suite Summit?
- Share your perspectives on applying informatics tools in your lab
- Attend informative sessions and learn how other customers use informatics
- Get important product information for BaseSpace Clarity LIMS, BaseSpace Sequence Hub, BaseSpace Variant Interpreter (Beta), BaseSpace Cohort Analyzer, and BaseSpace Correlation Engine
- Learn best practices, including how an integrated approach to informatics can expedite workflows
- Connect with your peers
You’re invited to an exclusive informatics event
Advancing Precision Medicine efforts relies on the ability to make sense of a growing body of genomic data. Robust informatics tools and an integrated approach to acquiring, storing, distributing, and analyzing data are essential.
Join us for our BaseSpace® Suite Summit in Rochester, MN on October 3 and 4. The summit takes place immediately before the Individualizing Medicine Conference, and registration is free. Learn more about our informatics tools and how they’re designed to help you transform complex genomic data into meaningful insights quickly and easily.
- Share your perspectives on applying informatics tools in your lab
- Attend your choice of sessions on informatics topics
- Learn best practices for laboratory information management, including how an integrated approach can expedite workflows
- Connect with your peers
Venue and Format
Lodging and Summit activities take place at the Kahler Grand Hotel in Rochester, MN. The program runs all day on October 3 and on the morning of October 4, and includes a variety of hands-on, introductory, and training sessions.
If you have questions, please contact us.
The new Enrichment v3.0 BaseSpace® App (formerly called Isaac Enrichment) introduces major improvements and new features including:
- Improved small variant calling
- Copy number variant (CNV) calling
- Structural variant calling
- Somatic/low-frequency variant calling
- Ability to start from FASTQ or BAM
- GRCh38 reference added
- Variant table CSV file including variant frequencies
- Improved variant annotation engine
- Improved metrics engine