VCAT v4.0 – Support for DRAGEN and GA4GH best practices for small variant benchmarking

Author: Eric Allen, Associate Director of Bioinformatics at Illumina

The Variant Calling Assessment Tool (VCAT) v4.0 BaseSpace™ App adds an updated gold reference, several new gold references, and more built-in panel BED files.

This update includes a recent version of which (in combination with vcfeval) has been selected by the GA4GH as the recommended tool for small variant call benchmarking. For more details, see the publication at:

Updated and improved tools

Gold Reference Additions and Updates

  • Updated Platinum Genomes reference data to version 2017-1.0
  • Added all the NIST GIAB (Genome in a Bottle) gold references, at v3.3.2:
    • Caucasian female: NA12878 (mother)
    • Ashkenazi Jewish trio: NA24143 (mother), NA24149 (father), NA24385 (son)
    • Han Chinese trio: NA24695 (mother), NA24694 (father), NA24631 (son)

More built-in panel BED files

  • Illumina, IDT, and Twist exome panels
  • Several AmpliSeq panels

The diagram below shows how VCAT,, and vcfeval work together.

See the screenshots below displaying some of the new features.

We hope you find VCAT 4.0 useful. Questions can be directed to

For Research Use Only.  Not for use in diagnostic procedures.


MiXCR Immune Repertoire Analyzer version 2.1.11 from MiLaboratories is now available at Illumina BaseSpace™

  • Youting Sun, Senior Bioinformatics Scientist at Illumina
  • Dmitriy Chudakov, CSO at MiLaboratories

The upgraded version of the MiLaboratories LLC flagship software product, MiXCR, is now officially available as an Illumina BaseSpace™ application.

MiXCR is a “gold standard” analytical package in the area of T-cell receptor (TCR) and immunoglobulin (IG) repertoire profiling. Analysts apply MiXCR to extract immune repertoires from any type of sequencing data with any level of TCR/IG coverage, ranging from perfectly enriched libraries such as multiplex PCR or targeted 5’RACE, to “rare event” datasets containing several target entities among hundreds of millions of reads, such as RNA-Seq and even Exome-Seq data.

In the new version, simple selection of species (Human or Mouse), template material (RNA or DNA), library type (targeted or random), and basic information on library preparation enables appropriate analysis settings for a variety of immune repertoire experiment scenarios:

MiXCR Immune Repertoire Analyzer v2.1.11 Input Form Parameters.

Either full-length VDJ or CDR3-only repertoires can be extracted for the TCR or IG chains of interest, with or without out-of-frame and stop codon-containing clonotype variants:

MiXCR Immune Repertoire Analyzer v2.1.11 Analysis Settings.

The application also provides post-analysis metrics in the form of interactive reports, including:

  • Basic statistics
  • Spectratype with major clonotypes
  • Quantile statistics on clonotype frequencies
  • Clonotypes with colorized V, D, and J segments
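
For readers new to the term, a spectratype is the distribution of CDR3 lengths across a repertoire. The sketch below is only an illustration of that idea, not MiXCR's implementation: the function name `spectratype` and the `(cdr3_nt, read_count)` pairs are hypothetical stand-ins for an exported clonotype table.

```python
from collections import Counter


def spectratype(clonotypes):
    """Return {CDR3 length in nt: fraction of reads} for (cdr3_nt, read_count) pairs."""
    reads_by_length = Counter()
    for cdr3_nt, read_count in clonotypes:
        reads_by_length[len(cdr3_nt)] += read_count

    total_reads = sum(reads_by_length.values())
    if total_reads == 0:
        return {}  # empty repertoire: nothing to report

    # Sort by length so the result reads like a histogram from short to long CDR3s
    return {
        length: reads / total_reads
        for length, reads in sorted(reads_by_length.items())
    }


# Hypothetical clonotypes, for illustration only
example = [
    ("TGTGCCAGCAGTTTC", 60),
    ("TGTGCCAGCAGCTTC", 20),
    ("TGTGCCAGCAGTGAAGCTTTC", 20),
]
print(spectratype(example))
```

Weighting each length by read count (rather than counting clonotypes) keeps large clones visible as peaks, which is presumably what "with major clonotypes" refers to in the report.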

High extraction efficiency for any type of sequencing data and superb accuracy should make the new MiXCR BaseSpace version a highly useful resource for many. The ability to look at basic parameters and immediately download the resulting figures for reports and publications makes it convenient for efficient everyday work on immune repertoires.

Analysis of amplicon data

Recommended settings for panels (a) Immune repertoire panel and (b) TCR-beta SR panel:

  • Starting material: RNA for panel (a), RNA or Genomic DNA for panel (b)
  • Library type: Targeted TCR/IG library amplification (5’RACE, Amplicon, Multiplex, etc)
  • 5’-end of the library: V gene single primer/multiplex
  • 3’-end of the library: J gene single primer/multiplex
  • Presence of PCR primers and/or adapter sequence: Absent / nearly absent / trimmed
  • Target region: CDR3



BaseSpace™ CLI v1.0.0 is here!

By Swathi A. Ramani, Staff Product Manager – BaseSpace Sequence Hub

If you’ve been using BaseSpace Sequence Hub for some time now, then you probably know that there is a lot more to the platform than the browser console. The Command Line Interface (CLI) is an easy-to-use command line tool that lets users do more with BaseSpace by managing common (and not so common) tasks associated with their genomic data and analysis.

The CLI has been in development for over 4 years, and was created by our talented UK team. They needed automation tools to help sequence more than 20 petabases of data for the 100k Genomes Project. Over the past few months, we’ve been hard at work on the next generation of the CLI. We are thrilled to announce that our CLI is no longer in Beta! In our latest release, we have launched the officially supported BaseSpace (BS) Sequence Hub (BSSH) CLI v1.0.0, and all the exciting features that come with it. In the years since the initial release, we saw incredible product uptake and a lot of positive feedback from the BaseSpace community. With this launch announcement, we are delivering on some of your biggest requests for a robust feature set that simplifies data and analysis wrangling and process automation. This is a great foundation on which we can continue expanding our toolset.

Rich Built-in Features

BS CLI v1.0.0 is a completely different beast from its previous version. With just one file to download and configure, you can control multiple BaseSpace services and automate them through scripts, including uploading samples, downloading runs, launching or stopping apps and analysis workflows, setting custom quality filters for your runs, generating pre-signed URLs, and much more!

Key features include:

  • Flexible install process: The CLI is installed by downloading a single binary with no additional dependencies, which enables you to install the CLI in an environment where you do not have administrator privileges.
  • Support for Linux, Mac and Windows (32 and 64 bit) operating systems
  • Rich options for listing details and filtering with customized output for seamless multi-command pipelines and scripts
  • Powerful data management features including creation, renaming and deletion of BSSH entities
  • Efficient upload of FASTQ datasets or any other file types, coupled with fast download of runs, projects, biosamples and datasets
  • Parameterize, launch, monitor and kill analyses running remotely in BSSH

Importantly, we’ve made sure the above features work nicely together so you don’t have to do the plumbing yourself. For a full list of worked examples visit our help site.

Try It Out Today! 

Our new BS CLI v1.0.0 is ready to serve as your standard toolchain to programmatically read, create and manipulate data in your BSSH account, automate routine tasks, and efficiently manage your applications. You can try it out right now by following the instructions on our help site.

If you are using existing tools like BaseMount or BaseSpace Copy, these will continue to work. However, as we continue to improve the developer experience, we hope to consolidate our existing tools and add new features to the BS CLI v1.0.0 toolchain. 

The more you use BS CLI v1.0.0, the more you will see how powerful it is. We can’t wait to see what you build with it! As always, let us know how we are doing. We want to incorporate best practices into the toolchain as much as possible so that they become customary, so please submit any requests via this blog, Twitter or Happy hacking! 

  • The BaseSpace Sequence Hub Team






DRAGEN™ Enrichment App – Accurate, rapid analysis for germline and somatic exome experiments

Author: Eric Allen, Associate Director of Bioinformatics at Illumina

As part of the new DRAGEN v3.4 launch, the Illumina software development team has released a new BaseSpace-exclusive DRAGEN app: DRAGEN Enrichment v3.4. Combining the best of DRAGEN with Illumina’s legacy Enrichment 3 App, the DRAGEN Enrichment app provides ultra-rapid analysis and improved accuracy, all at a lower cost per sample.

The DRAGEN Enrichment app is the preferred method for analyzing enrichment data with DRAGEN, delivering a full suite of enrichment-specific metrics and reporting.

Here is what to know:

  • The DRAGEN Enrichment App is faster and more accurate than the Enrichment (Isaac/Starling) and BWA Enrichment (BWA/GATK) apps, as demonstrated in the results below
  • Variant Calling:
    • Small variant calling – The app includes germline and somatic (low-frequency, tumor-only) small variant calling, and outputs VCF and gVCF in the same analysis
      • Note: Tumor-normal analysis can be conducted by first running the DRAGEN Enrichment app on all normal and tumor samples, and then running the DRAGEN Somatic app on the resulting BAM files for the Tumor/Normal pairs.
    • Copy number variant (CNV) calling – utilizes CNV baseline files based on a panel of normals
    • Structural variant calling
  • Enrichment metrics generated:
    • Read/base enrichment padded/unpadded
    • Uniformity
    • % bases covered at 1x, 10x, 20x, 50x
    • Picard HsMetrics enabled by checkbox
  • Variety of reference options supported, including hg19, GRCh38 and custom references
  • Includes built-in targeted region BED files for common enrichment panels, and accepts custom targeted region BEDs
  • Extensive reporting:
    • In-browser, PDFs, and CSVs
    • Single sample and aggregate reports
  • Integrated variant annotation (Nirvana) and variant browser
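
To make the "% bases covered at Nx" metric concrete, the sketch below computes it from per-base depths over the targeted regions. It is illustrative only: the function name and the toy depth values are assumptions, and the app's own computation may differ in detail (for example in how padding or duplicates are handled).

```python
def percent_bases_covered(depths, thresholds=(1, 10, 20, 50)):
    """Percentage of target bases whose depth is at least each threshold."""
    if not depths:
        raise ValueError("no target bases supplied")
    return {
        threshold: 100.0 * sum(1 for d in depths if d >= threshold) / len(depths)
        for threshold in thresholds
    }


# Toy per-base depths for eight target positions (hypothetical values)
toy_depths = [0, 5, 12, 25, 60, 60, 75, 110]
print(percent_bases_covered(toy_depths))
```

In practice the depths would come from the aligned reads restricted to the targeted region BED file, which is why the choice of BED (built-in or custom) directly affects these numbers.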

The improved small variant calling over other available BaseSpace app solutions is shown below for one replicate of Coriell sample NA12878 with 106x depth:

Analysis App                       | App Execution Time | DRAGEN-only Execution Time | SNV Recall | SNV Precision | Indel Recall | Indel Precision
DRAGEN Enrichment v3.4.5           | 16m 4s             | 6m 50s                     | 95.04%     | 99.49%        | 86.90%       | 92.18%
(Isaac/Starling) Enrichment v3.1.0 | 53m 20s            | NA                         | 93.26%     | 99.38%        | 78.29%       | 86.90%
BWA Enrichment v2.1.2              | 1h 23m 2s          | NA                         | 90.66%     | 99.78%        | 72.85%       | 89.44%

• Example sample (s01-NFE-CEX-NA12878-demo.vcf) was prepared using the Nextera Flex for Enrichment Library Preparation kit with dual indices and sequenced on a NovaSeq™ S2 flow cell.
• Variant accuracy comparison was performed using the Variant Calling Assessment Tool v3.2.0 app.
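
Recall and precision in comparisons like this follow the usual benchmarking definitions: recall = TP / (TP + FN) and precision = TP / (TP + FP), where TP, FP and FN are counted against the gold reference. A minimal sketch, with hypothetical counts that are not taken from the run above:

```python
def recall(true_positives, false_negatives):
    """Fraction of gold-reference variants that the caller found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives, false_positives):
    """Fraction of the caller's variants that match the gold reference."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts, for illustration only
tp, fp, fn = 950, 10, 50
print(f"recall={recall(tp, fn):.2%} precision={precision(tp, fp):.2%}")
```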

CNV calling is also enabled in the DRAGEN Enrichment app. The screenshot below from IGV shows a 937,697 bp CNV loss found in a melanoma cancer sample (Me01/ERR174231) around the chromosomal region chr9:125239269-126176965. The sample data was obtained from NCBI’s Sequence Read Archive (accession ERR174231) using the SRA Import BaseSpace App.
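
As a quick check of the size quoted above, the length of a region can be computed from its coordinates. This sketch assumes the coordinates are 1-based and inclusive (an assumption, not stated in the text); the helper name `region_length` is hypothetical.

```python
def region_length(region):
    """Length in bp of a 'chrom:start-end' region, assuming inclusive coordinates."""
    _, span = region.split(":")
    start, end = (int(part.replace(",", "")) for part in span.split("-"))
    if end < start:
        raise ValueError(f"end before start in {region!r}")
    return end - start + 1


print(region_length("chr9:125239269-126176965"))  # 937697
```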

Project: SRA: ERP001844 (Agilent SureSelect – Exome CNV Detection – Melanoma). Publication: Magi et al.

Somatic/low-frequency variant calling is also enabled. The table below compares the expected variant frequency (VF) of known variants in sample HD753 with the VF measured by the DRAGEN Enrichment app:

Variant Type   | Chr    | Pos       | Gene   | Variant          | HD753 – Expected VF (%) | HD753 – Measured VF (DRAGEN Enrichment) (%)
SNV Low GC     | chr.3  | 178936091 | PIK3CA | E545K            | 5.6                     | 3.8
SNV High GC    | chr.19 | 3118942   | GNA11  | Q209L            | 5.6                     | 6
Long Deletion  | chr.7  | 55242464  | EGFR   | ΔE746 – A750     | 5.3                     | 3.3
Long Insertion | chr.7  | 55248998  | EGFR   | V769_D770insASV  | 5.6                     | 3.7
SNV High GC    | chr.14 | 105246551 | AKT1   | E17K             | 5                       | 5.7

Project: NovaSeq S4: Nextera Flex for Enrichment (HCC1187, HCC1395, HCC1954, HD753, Coriell Mixture). 1% VF cutoff

We’ve also incorporated many of the comprehensive metrics and reporting features built into the legacy Enrichment 3.1.0 app, including read-, base-, and target-level enrichment metrics, as well as the variant table for simple variant call browsing and filtering.

We hope this update enables you to discover new insights. Stay tuned for more app announcements, and let us know if you have any questions.


Interpreting structural variation in cancer genomes

A user story from the Genomics England 100,000 Genomes Project cancer programme[1]

By Jawahar Swaminathan, Ph.D., Program Manager – Population Genomics

Illumina and Genomics England announced a Bioinformatics and Clinical Interpretation partnership (BCIP) in 2016 to “develop a platform and knowledge base that can be used to improve and automate genome interpretation.” In 2017, following months of directed development and rigorous testing, BaseSpace™ Variant Interpreter (BSVI) was adopted by Genomics England as the default interpretation partner for cancer cases in the 100,000 Genomes Project. A previous blog article presented a light-hearted take on the on-boarding and outreach activities by the Illumina team across the 13 constituent Genomic Medicine Centres (GMCs) of the 100,000 Genomes Project. In this article, we showcase how Dr. Patrick Tarpey and Jamie Trotman from the Cancer Genetics Group at Addenbrookes Hospital, Cambridge used the BCIP tools to interpret biologically relevant structural variants in cancer cases from the East of England GMC.

Case 1

An 11-year-old boy presented with a complex brain tumor, initially described as a biphasic neuroepithelial tumour (low-grade with features similar to those of a desmoplastic infantile ganglioglioma and high-grade astrocytic tumour). This case was recruited into the 100,000 Genomes Project, where whole-genome sequencing of the tumor and matched normal samples was performed. No variants of interest were initially identified in the static report presented back to the GMC, using analysis from the Genomics England standard pipeline. However, upon deeper analysis with the BCIP tools, Patrick and Jamie were able to identify a novel ZNF394-BRAF gene fusion by visualizing the structural variants and confirming it through filtering and review of the variant call metrics. This fusion is highly reminiscent of the activating KIAA1549-BRAF fusion (Faulkner et al, 2015, PMID: 26222501), a leading cause of pilocytic astrocytoma. The fusion product was predicted to result in the formation of an unregulated kinase domain of BRAF (Exons 10-18). BRAF helps transmit chemical signals from outside the cell to the cell’s nucleus and forms part of a signaling pathway known as the RAS/MAPK pathway, which regulates cell differentiation, migration and apoptosis.

Figure 1: Data visualization showing ZNF394-BRAF fusion
Figure 2: ZNF394-BRAF gene fusion as shown in variant grid

The identified ZNF394-BRAF fusion was subsequently confirmed via an orthogonal test, which led the clinician to treat the patient with MEK inhibitors. MEK is downstream of BRAF in the growth factor activation pathway (Figure 3), and the inhibitor is expected to target BRAF-activated cells.

Figure 3: Growth factor activation pathway (Picture courtesy: J. Trotman)

Case 2

This case was an ambiguous diagnosis that presented as a glioblastoma but was reviewed and diagnosed two years after treatment by histology as a pilocytic astrocytoma. Whilst the static reports from Genomics England did not find any variants of interest, a review of the case in BSVI led to the discovery of a large 26.1 Mb deletion in chromosome 2, suggesting a CCDC88A-ALK fusion. ALK is a neuronal receptor tyrosine kinase that plays a critical role in the development of the nervous system and is selectively expressed in the peripheral and central nervous systems. The domain architecture of ALK shows that it is primarily composed of two MAM domains (MAM1 and MAM2) and a tyrosine kinase domain. The MAM (meprin, A-5 protein, and receptor protein-tyrosine phosphatase mu) domains are predicted to play a role in homodimerization of the receptor kinase and to regulate the function of the enzyme (Marchand et al, 1996, PMID: 8798668).

Figure 4: Domain organization of the ALK gene

The fusion product seen in the case suggests that the breakpoints are intronic and lead to the production of a chimeric protein that eliminates the MAM domains of ALK, thereby leading to an activated kinase. Structural variants with intronic breakpoints support the use of whole-genome sequencing in cancer, since these events are unlikely to be identified via targeted, hybrid-capture methods such as whole-exome sequencing.

Figure 5: Visualization and variant grid showing the CCDC88A-ALK gene fusion (Courtesy: J. Trotman)

Prior studies have identified the CCDC88A-ALK fusion as recurrent in ependymoma-like gliomas characterized by both ependymal and astrocytic features (Olsen et al, 2015, PMID: 25795305). This is a critical finding as there are selective ALK inhibitors that could be administered in this case. The fusion was subsequently confirmed in the tumor via orthogonal methods, i.e., PCR and Sanger sequencing.


The two examples show how visualization of cases, accompanied by appropriate use of filters (MGE10KB) and gene lists and coupled with a manual check of the variant calls, has resulted in the identification of biologically relevant variants and insight into disease mechanism. As a power user of the BCIP tools deployed to support the 100,000 Genomes Project, Dr. Tarpey says, “BSVI analysis of cancer genomes is invaluable to access all variants (regardless of vcf filter status), and to visualise variants (particularly SVs) to inform validity. The numerous opportunities for triage facilitate appropriate analysis strategies across the diverse array of cancer types.”


Jamie Trotman is a pre-registered Clinical Scientist. His role is in the analysis and interpretation of 100,000 Genomes Project cancer programme data and report writing for the East of England GMC and the East Midlands and East of England Genomics Laboratory Hub.

Patrick Tarpey is a group leader in the Department of Clinical Genetics at the Cambridge University Hospitals NHS Trust. After a brief period in clinical diagnostics, Patrick moved to Mike Stratton’s team at the Sanger Institute to pursue a project on hereditary X-linked disease via sequencing of the entire genic X-chromosome in a cohort of 100 probands with X-linked disease. This endeavor identified multiple new disease genes which have since been incorporated into routine diagnostics.

He later migrated onto the cancer genome project and pursued multiple projects aimed at unravelling the landscape of somatically acquired variation in breast, bone and other cancer types. This led to the discovery of multiple novel cancer genes, including those of clinical potential. Patrick has a lead role in developing and expanding cancer genome services (familial and acquired) in the recently formed East Anglia and East Midlands Genomic Laboratory Hub (GLH).


[1] This version of BaseSpace Variant Interpreter co-developed with Genomics England as part of the BCIP contains extensive customizations for their use cases and is not openly accessible to the public. Please contact your Illumina sales representative for guidance on how to use the publicly available version of Variant Interpreter.

Somatic Pipeline Improvements with DRAGEN v3.3

by Severine Catreux – Associate Director, Bioinformatics FPGA Development

Significant accuracy gains and speed improvements with DRAGEN v3.3, released April 2019

The DRAGEN engineering and bioinformatics team is excited to announce a new DRAGEN release, v3.3. The second of several releases scheduled for 2019, DRAGEN v3.3 contains improvements across the many pipeline offerings now supported by the DRAGEN platform. This includes accuracy improvements in the germline and somatic pipelines, new features (e.g. CNV DeNovo calling and RNA quantification) and speed gains (Somatic T/N, BCL conversion).

Please see the DRAGEN v3.3 Release Notes for more details. This blog highlights the significant updates to the DRAGEN Somatic Pipeline for small variants that are part of the v3.3 release.

As one of DRAGEN’s core pipelines, the DRAGEN Somatic Pipeline for small variants is utilized by cancer research institutes around the globe. Expanding on the existing functionality, accuracy and speed of the DRAGEN Somatic Pipeline, the v3.3 release placed a high focus on the somatic tumor/normal WGS mode, producing step-function improvements in both accuracy and speed.

Accuracy Improvements:

During the development cycle for v3.3, the DRAGEN engineering and bioinformatics teams took a deep dive into the DRAGEN Somatic Pipeline tumor/normal mode, strengthening the existing algorithm for accuracy improvements. Specific improvements were made in the genotyping module, to replace point estimation of the variant allele frequency with continuous integration over a range of possible frequencies. This led to significant gains in both sensitivity and precision. Additionally, downstream filtering rules were improved to optimize both sensitivity and precision (less stringency on clustered variants, filter variants positioned at the edge of reads, filter variants with low median base quality and MAPQ). Finally, the indel PCR error model autocalibration module was made independent between the tumor and normal control, to allow for differences in library preparation between the tumor sample and the control sample.
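
The difference between a point estimate of the allele frequency and integration over possible frequencies can be illustrated with a toy model. The sketch below is not DRAGEN's genotyper: it assumes a simple binomial read model, a fixed per-read error rate, and a uniform prior over the allele frequency, all of which are assumptions made here for illustration, as are the function names.

```python
import math


def binom_pmf(k, n, p):
    """Probability of k alt-supporting reads out of n when each is alt with probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def alt_read_prob(allele_freq, error_rate):
    """Chance that a read shows the alt allele, given the true allele frequency."""
    return allele_freq * (1 - error_rate) + (1 - allele_freq) * error_rate


def point_estimate_likelihood(k, n, error_rate=0.001):
    """Likelihood evaluated only at the single best-guess frequency k / n."""
    return binom_pmf(k, n, alt_read_prob(k / n, error_rate))


def integrated_likelihood(k, n, error_rate=0.001, steps=2000):
    """Likelihood averaged over all frequencies in [0, 1] (uniform prior, midpoint rule)."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        allele_freq = (i + 0.5) * width
        total += binom_pmf(k, n, alt_read_prob(allele_freq, error_rate)) * width
    return total


# Hypothetical site: 3 alt reads out of 60
print(point_estimate_likelihood(3, 60), integrated_likelihood(3, 60))
```

The point estimate scores the data only at the frequency that fits it best, so it is always at least as large as the average over all frequencies. Integrating instead accounts for uncertainty in the frequency itself, which matters most at low depth or low allele fraction.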

These changes are precursors to further accuracy improvements planned for the DRAGEN v3.4 release, specifically in the area of liquid tumor support, where tumor-in-normal contamination will be taken into account.

Accuracy gains of DRAGEN 3.3 over previous DRAGEN versions (3.2) as well as other pipelines (GATK4 MuTect2 and Strelka2) are shown in the plot below. Gains are measured for both SNVs and indels on most datasets.

Figure 1: Comparison of False-Positives (FP) and False-Negatives (FN) between GATK4, Strelka2, DRAGEN 3.2 and DRAGEN 3.3. Lower values are better.

Figure 2: The above chart showcases sensitivity improvements in DRAGEN v3.3 in comparison to DRAGEN v3.2 for indels and SNPs.

Speed Gains

DRAGEN v3.3 delivers unprecedentedly fast run times for somatic T/N WGS processing. Users of previous DRAGEN versions will notice substantial speed gains in DRAGEN 3.3 (see graph below). For datasets that were previously HMM-limited, v3.3 delivers up to 6-fold speed improvements, with a typical 100x (tumor) and 40x (normal) run finishing within 1 hour and 40 minutes on an on-premise DRAGEN server. In the cloud, run times average 2 hours and 30 minutes.

The run time gains were obtained from optimizations in the upstream stages of the pipeline (a more efficient way of defining regions of interest, and a higher MAPQ threshold for reads passed downstream, i.e., fewer reads get passed downstream, without loss of sensitivity). Additionally, the accelerated HMM engines were optimized to consume less of the FPGA footprint, such that more engines could be run in parallel.

Run-time comparison for T/N WGS Somatic Calling

Figure 3: The above chart compares DRAGEN v3.2 (Jan. 2019) and v3.3 for tumor-normal whole genome sequencing somatic calling. DRAGEN v3.3 introduces significant speed improvements.

About the DRAGEN Somatic Pipeline

The DRAGEN Somatic Pipeline provides highly accurate, ultra-rapid secondary analysis for tumor-only and tumor/normal experiments to identify cancer-associated mutations.

Tumor/Normal Mode

The DRAGEN Somatic Pipeline offers flexible data analysis to suit the specific needs of users. DRAGEN accepts FASTQ, BAM/CRAM, and BCL files and supports NGS input from whole genome, whole exome, and targeted cancer panels. In the tumor/normal pipeline, both samples go through identical processing steps of mapping, aligning, sorting, and duplicate marking. Then, both sets of tumor and normal reads are passed through the somatic variant caller which looks for sites exhibiting a mutation in the tumor reads while showing little to no evidence of the mutation in the normal reads, thus producing a VCF file containing tumor-specific mutations. The Somatic Pipeline also reports allele frequency, allowing users to assess the prevalence of a specific mutation.
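
The tumor-versus-normal idea described above can be sketched with a deliberately naive filter on allele fractions. This is not how the DRAGEN somatic caller works internally (it uses a probabilistic model); the function name and both thresholds are hypothetical values chosen for illustration.

```python
def is_tumor_specific(tumor_alt, tumor_depth, normal_alt, normal_depth,
                      min_tumor_vaf=0.05, max_normal_vaf=0.01):
    """Naive check: clear evidence in the tumor, little to none in the normal."""
    if tumor_depth == 0 or normal_depth == 0:
        return False  # cannot judge a site without coverage in both samples

    tumor_vaf = tumor_alt / tumor_depth
    normal_vaf = normal_alt / normal_depth
    return tumor_vaf >= min_tumor_vaf and normal_vaf <= max_normal_vaf


# Hypothetical sites: (tumor_alt, tumor_depth, normal_alt, normal_depth)
print(is_tumor_specific(12, 100, 0, 40))   # alt reads only in the tumor
print(is_tumor_specific(12, 100, 10, 40))  # also present in the normal, likely germline
```

The tumor allele fraction computed here corresponds to the allele frequency the pipeline reports for each call.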

Figure 4: Tumor-Normal pipeline diagram

Tumor-only Mode

In the tumor-only pipeline, users input NGS data from a tumor sample and run it through the same pipeline as for tumor/normal analysis, but without a matched normal sample. The somatic variant caller contains algorithms that distinguish low-frequency alleles from background noise. Although the resulting VCF file does not distinguish germline from somatic variants, it allows researchers and clinicians to determine if a mutation is present in a tumor sample and its allele frequency.

Figure 5: Tumor-only pipeline diagram

Have any feedback, suggestions or data that you’d like to share with the DRAGEN team? Our new community forum is an active, collaborative hub for connecting and sharing feedback.


Enabling Cancer Interpretation At Scale For The Genomics England 100K Genomes Project

Perspectives on training and on-boarding users of the Genomics England Cancer Program

By Jawahar Swaminathan, Ph.D., Program Manager – Population Genomics (aided by Keira Cheetham, Ph.D., Staff Bioinformatics Scientist)

Illumina and Genomics England announced the Bioinformatics and Clinical Interpretation partnership (BCIP) in February 2016 with the aim to “develop a platform and knowledge base that can be used to improve and automate genome interpretation.” As part of this collaboration, Illumina developed a customized version of BaseSpace Variant Interpreter (BSVI)[1] for cancer and rare disease, including various backend services to allow integration between the Genomics England case dispatch pipeline and Illumina systems. What followed was a rigorous schedule of meetings between Genomics England and Illumina (read as long hours, late nights, lots of coffee and many meetings at Genomics England HQ in London!) leading to development of essential features for cancer interpretation.

In June 2017, following multiple rounds of user acceptance testing and concordance checks, BSVI was adopted by Genomics England as the default interpretation solution. Illumina then began the process of on-boarding various users at the 13 Genomic Medicine Centres (GMCs), the recruiting hubs for various regions of England, by organizing training sessions on the use of the software, with particular focus on the unique way data entered and left the system. This article is a look back on these activities and how they are helping in the development of genome interpretation software that meets the diverse needs of the Genomics England end users.

Figure 1: The Genomics England Genomic Medicine Centres (Image Courtesy: Genomics England Ltd.)

The GMC training sessions

Over the course of 2018, we carried out training and outreach activities across most of the GMCs. The GMCs are the recruitment hubs for the Genomics England 100,000 Genomes Project and comprise multiple hospitals centered around a geographical area that has the necessary expertise. All training activities were organized by the Genomics England Cancer Interpretation team and were also attended by a representative from Genomics England.

Some humorous takeaways:

  1. Long hours on an early morning packed train from Cambridge (where we are situated) to our destination city, including a hurriedly eaten lunch at a busy Costa Coffee (yes almost every hospital in the UK has one of these) at the hospital before the training! Throw in the occasional aborted visit due to an alarmingly growing windscreen crack on a rental car or boarding the wrong train and you have the makings of a long and interesting day.
  2. Every NHS hospital looks the same. The usual 1960s concrete exterior, the same typeface on the signs and the same warren of corridors to the Clinical Genetics department.
  3. Working out how to use the different display equipment in different hospitals before attempting to figure out internet connectivity on the slow and ageing hospital computer systems.
  4. Hot chocolate or a burrito on the return leg at the local train station as a treat for a job well done.
  5. Never work with children, animals, or live demos. Although we always got the live demo to work!

All training activities were conducted by my colleague Keira Cheetham and me, and involved a mix of presentations and live demos using cases specific to the GMC, followed by hands-on instructions on how to use the software and send results back to Genomics England for reporting. The training was also an opportunity for us to talk about the science around interpreting cancer genomes and how Illumina is facilitating greater insights into cancers with whole genome sequencing (WGS).

This was also a great opportunity to see how the BCIP tools were used by GMC users, and any feedback (both good and bad) was gratefully received. We also spoke about upcoming features in these sessions. Attendance at these events varied from 2-10 users per GMC and the venues ranged from really tight spaces (sometimes with windows!) to large meeting rooms and everything in between. However, what was consistent throughout was the motivation and dedication of the NHS staff in delivering the best possible care to their patients recruited into the Genomics England 100,000 Genomes Project cancer program.

Illumina continues to work with Genomics England to extend its BCIP tools for Rare Disease interpretation. This offering will soon be available for user acceptance testing and, following that, could be used in Genomics England’s suite of clinical interpretation systems. In the meantime, the UK NHS has announced the commissioning of WGS for rare disease and cancer, to be offered throughout the health system. The outreach activities of 2018 carried out by Keira and me for cancer will stand us in good stead for the next round of training for rare disease.

The Genomics England Cancer Outreach Program by numbers

  • ~76 GMC users across 11 GMCs trained
  • ~34 hours of training imparted
  • ~4000 miles travelled (all by British Rail barring Belfast, Northern Ireland)

[1] The version of BSVI co-developed with Genomics England as part of the BCIP contains extensive customizations for their use cases and is not openly accessible to the public. Please contact your Illumina sales representative for guidance on how to use the publicly available version of BSVI.