Taking Out the Trash

If you’ve been using BaseSpace for a while, you may have noticed that there wasn’t a way to permanently remove data from your account. I say that in the past tense because it is no longer true. The wait is over! “Move to Trash” is now available on Runs and Analyses.


 

Trash Overview

This has been one of the most important features for us to get right because it has to do with removing your data and we take that very seriously.  That is why we are introducing a two-step delete process that will help prevent accidental deletes and give you the confidence you need to safely manage your data.

First, you will notice a new action available on run and analysis list and detail pages, called “Move to Trash”. On the list pages, you must first highlight the row that you want before it’s available.


This action is very similar to moving files on your desktop to the trash or recycle bin. Just like your desktop, the data can be recovered, but it can no longer be viewed or acted upon.

Trashed Items Side-Effects:

  • If the items were shared, all share recipients will lose access to that data
  • All API access is removed immediately, and requests for the data return HTTP status code 410 (“Gone”)
  • Any attempt to view this data on the website will take the user to a 410 error page stating the content is “Gone”
  • Data, while in the trash, can only be “Restored” or “Emptied” by the owner.
  • Emptying the trash permanently removes the data and cannot be undone.
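
For anyone calling the API from a script, the 410 behavior above might be handled along these lines. This is only a minimal sketch: the function name and the messages are made up for illustration, and the only detail taken from this post is that trashed items return 410 (“Gone”).

```python
from http import HTTPStatus


def describe_response(status_code: int) -> str:
    """Turn an HTTP status code into a short message for the caller.

    Only the 410 ("Gone") case comes from this post; the other branches
    are generic HTTP handling added for illustration.
    """
    if status_code == HTTPStatus.GONE:  # 410: item was moved to the trash or emptied
        return "gone: item is in the trash or has been permanently deleted"
    if 200 <= status_code < 300:
        return "ok"
    return f"error: unexpected status {status_code}"
```

Treating 410 separately from other errors lets a script stop retrying and tell the user to ask the owner whether the item can be restored.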

 

Moving Runs to the Trash

  • Runs can be put in the trash from the list or the detail pages.
  • Runs cannot be moved to the trash while they are in a non-terminal state. The most common non-terminal states are running, uploading, and analyzing.
  • The dialog may also present you with the option to remove all associated analyses that used the run as input.
    • All sequencing runs will have at least one associated analysis unless they failed or were used only for remote monitoring.
  • If you are not the owner of the run, moving it to the trash simply removes your access; you cannot restore it from the trash yourself.
    • To restore access, contact the owner or click the previously sent share link if it’s still active.
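
The non-terminal state rule above can be pictured as a simple guard. This is a hypothetical sketch, not BaseSpace code: the state strings are assumed spellings of the three states named in this post, and the post only calls them the most common ones, so the set is not exhaustive.

```python
# Assumed spellings of the non-terminal run states named in the post.
# The post lists these as "most common", so other states may exist.
NON_TERMINAL_RUN_STATES = {"running", "uploading", "analyzing"}


def can_move_run_to_trash(state: str) -> bool:
    """Return True when the run is not in a known non-terminal state."""
    return state.strip().lower() not in NON_TERMINAL_RUN_STATES
```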

 


 

Moving Analyses to the Trash

  • Analyses can be put in the trash from the list and detail pages.
  • Analyses cannot be removed if in a non-terminal state. The most common non-terminal states would be: pending execution and running.
  • If a project is being transferred, some of the analyses may not be removed until after the transfer has been completed.
  • Apps that are using the data as input may fail if those items are moved to the trash.
  • If you have items in the trash, we prevent project transfers until all items in that project are restored or emptied.

 

Emptying and Restoring Items in the Trash

The trash page can be accessed from most of the project and run list pages. The icon is always on the right side of the grid and is labeled “View Trash”.


There are only two actions currently on the Trash page: Empty and Restore.

Empty permanently deletes all items, and Restore returns the items to active status.

Restored items will keep all of their original attributes except for the share recipients.


User Agreement Updates

Because of all of these changes, we have also updated our User Agreement to reflect the behavior of these new features. In particular, item 7 states that even though data can be removed, it may have been previously shared with other users or apps and subsequently downloaded or copied. You will be prompted to accept these new terms upon your next login. If you have any questions, don’t hesitate to ask!

Thank you,

-Greg

Introducing our First BaseSpace Labs Applications – FastQC and Velvet de novo Assembly

We are excited to announce two new applications in BaseSpace, FastQC and Velvet de novo Assembly.

 denovo_assembly_100                                     FastQC_icon_100

     Velvet de novo Assembly                                      FastQC

Both applications are currently available for all users and were built using the BaseSpace Native App Engine by our internal R&D groups. These two applications are also the first of many BaseSpace Labs Apps to come. The concept behind BaseSpace Labs Apps is explained in more detail below.

BaseSpace Labs Apps are Illumina’s internally developed applications that extend the functionality within BaseSpace.  Some BaseSpace Labs applications will be experimental or research focused, while others will be used as a step in a greater workflow.  The Apps are reviewed regularly by our team and put through the same review process as third-party apps.

BaseSpace Labs Apps are developed using an accelerated development process in order to make them available to BaseSpace users faster than the BaseSpace Core Apps.  It is important to note that, unlike BaseSpace Core Apps, BaseSpace Labs Apps are not officially supported by Illumina Customer Service.  Support for BaseSpace Labs applications is provided at the developer’s discretion and the apps are provided as-is without any warranty of any kind.

The FastQC app provides a quality assessment of the sequence data generated using Illumina sequencers. FastQC for BaseSpace is based on the FastQC software developed by the Bioinformatics Group at the Babraham Institute. It provides a modular set of analyses that can be used to quickly assess whether there are any problems with the sequencing data before doing any additional analysis.

fastqcscreenshot

The above figure shows an example output from the FastQC app depicting the quality score across all bases at a given position in the reads.  For an example of additional output generated by FastQC, please view this FastQC demo project.

The Velvet de novo Assembly app is a de novo assembly pipeline for bacterial samples using the Velvet assembler. One of the key features of this app is that it has an adapter trimming protocol that has been optimized for the Nextera Mate-Pair library prep kit. An application note describing the de novo assembly of nine different bacterial genomes using the Velvet de novo Assembly app can be found here. In many cases, a single contig representing the entire bacterial genome can be assembled. The figure below is an example of the output generated by the Velvet de novo Assembly app.

rsz_denovoscreenshot

Example output generated by the Velvet de novo Assembly can be found here.  We hope you enjoy the FastQC and Velvet de novo Assembly apps.  For any questions, feedback, or feature requests for these applications, please send an email to basespacelabs@illumina.com and include the name of the application.  Thank you!

FASTQ upload is now available in BaseSpace

We are excited to announce the availability of a data upload feature for FASTQ files that were previously generated on Illumina sequencing instruments. This simple-to-use feature is accessible from any project to which the user has write access by first clicking on the project and then selecting the Import tab shown below.

ProjectTab

The user will then be prompted to select their import type. The user can upload a single sample by clicking on “Sample” as shown below.

Samples

The user can then either drag and drop one or more files into the webpage or click on “select files” and choose the files to upload from a file browser. Note that the FASTQ files need to adhere to Illumina standards, as specified below. Data for a single sample can span multiple files. Each sample is limited to 16 files with a combined size of 25 GB. Uploading a 25 GB sample takes 1-2 hours over a relatively fast internet connection.

dranganddrop

The user will then see a progress bar as the files are uploaded. Once the progress bar completes, the user can add additional files. The user can also set the sample name and associate a genome with the sample in the upper-left corner of the screen.

upload_screen

Once the user has added all of the files and they have finished uploading, the user will need to click on the “Complete Import” button (shown above) to complete the session.

FASTQ file standards

  • The uploader will only support gzipped FASTQ files generated on Illumina instruments
  • The name of the FASTQ files must conform to the following convention:
    • SampleName_SampleNumber_Lane_Read_FlowCellIndex.fastq.gz (e.g., SampleName_S1_L001_R1_001.fastq.gz / SampleName_S1_L001_R2_001.fastq.gz)
  • The read descriptor in the FASTQ files must conform to the following convention:
    • @Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber:
      • Read 1 descriptor would look like this:
        @M00900:62:000000000-A2CYG:1:1101:18016:2491 1:N:0:13
      • Read 2 would have a 2 in the ReadNum field, like this:
        @M00900:62:000000000-A2CYG:1:1101:18016:2491 2:N:0:13
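
If you’d like to check your files before uploading, the two conventions above can be expressed as regular expressions. This is a rough sketch rather than the uploader’s actual validation: the patterns are inferred from the examples in this post (three-digit lane and index fields, a read number of 1 or 2, no trailing colon in the descriptor), the Y/N filter flag follows Illumina’s usual FASTQ header convention, and the function names are hypothetical.

```python
import re

# SampleName_SampleNumber_Lane_Read_FlowCellIndex.fastq.gz
# Field widths are assumptions inferred from the example file names.
FILENAME_RE = re.compile(
    r"^(?P<sample>.+)_S(?P<number>\d+)_L(?P<lane>\d{3})"
    r"_R(?P<read>[12])_(?P<index>\d{3})\.fastq\.gz$"
)

# @Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber
HEADER_RE = re.compile(
    r"^@(?P<instrument>[^:\s]+):(?P<run>[^:\s]+):(?P<flowcell>[^:\s]+)"
    r":(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
    r" (?P<read>[12]):(?P<filter>[YN]):0:(?P<sample>\d+)$"
)


def parse_filename(name):
    """Return the file name fields as a dict, or None if the name does not conform."""
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None


def parse_header(line):
    """Return the read descriptor fields as a dict, or None if it does not conform."""
    match = HEADER_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None
```

For example, parse_filename("SampleName_S1_L001_R2_001.fastq.gz") reports read "2", while a name such as reads.fastq returns None.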

Quality considerations

  • The number of base calls for each read must equal the number of quality scores
  • The number of entries for Read 1 must equal the number of entries for Read 2
  • The uploader will determine if files are paired-end based on the matching file names in which the only difference is the ReadNum
  • For paired-end reads, the descriptor must match for every entry for both reads 1 and 2
  • Every read must have passed filter
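
A few of these checks are easy to run yourself before uploading. The sketch below is an illustration under stated assumptions, not the uploader’s own logic: it works on FASTQ text already read into memory, it assumes four lines per read, it compares descriptors only up to the first space (since the ReadNum field necessarily differs between Read 1 and Read 2), and it does not check the filter flag. The function names are hypothetical.

```python
def read_records(text):
    """Split FASTQ text into (descriptor, bases, qualities) tuples."""
    lines = text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ text must contain four lines per read")
    return [(lines[i], lines[i + 1], lines[i + 3]) for i in range(0, len(lines), 4)]


def check_pair(read1_text, read2_text):
    """Return a list of problems found in a Read 1 / Read 2 pair (empty if none)."""
    problems = []
    r1, r2 = read_records(read1_text), read_records(read2_text)

    # The number of entries for Read 1 must equal the number for Read 2.
    if len(r1) != len(r2):
        problems.append("Read 1 and Read 2 have different numbers of entries")

    # The number of base calls must equal the number of quality scores.
    for label, records in (("R1", r1), ("R2", r2)):
        for n, (_, bases, quals) in enumerate(records, start=1):
            if len(bases) != len(quals):
                problems.append(f"{label} entry {n}: base calls and quality scores differ in length")

    # Descriptors must match entry by entry (compared up to the first space).
    for n, ((h1, _, _), (h2, _, _)) in enumerate(zip(r1, r2), start=1):
        if h1.split(" ", 1)[0] != h2.split(" ", 1)[0]:
            problems.append(f"entry {n}: Read 1 and Read 2 descriptors do not match")

    return problems
```

For real files you would decompress the .fastq.gz first (for instance with Python’s gzip module) and, for large samples, stream the records instead of loading everything into memory.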

Upload parameters

  • Only one sample can be uploaded at a time
  • A maximum of 16 files can be uploaded in a session
  • The combined size of the uploaded files cannot exceed 25 GB
  • A detailed description of how to use the uploader can be found in the BaseSpace user guide
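
To see whether a set of files fits in one upload session, and what the 1-2 hour estimate for a 25 GB sample implies about connection speed, here is a small sketch. The function names are hypothetical, and treating 1 GB as 10^9 bytes is an assumption; the post does not say whether the limit is in decimal or binary gigabytes.

```python
MAX_FILES = 16                  # files per session, from the post
MAX_TOTAL_BYTES = 25 * 10**9    # 25 GB, assuming decimal gigabytes


def check_session(file_sizes):
    """file_sizes maps file name -> size in bytes. Returns a list of problems."""
    problems = []
    if len(file_sizes) > MAX_FILES:
        problems.append(f"{len(file_sizes)} files exceeds the limit of {MAX_FILES}")
    total = sum(file_sizes.values())
    if total > MAX_TOTAL_BYTES:
        problems.append(f"{total} bytes exceeds the 25 GB limit")
    return problems


def required_mbps(total_bytes, hours):
    """Average throughput in megabits per second needed to finish in `hours`."""
    return total_bytes * 8 / (hours * 3600) / 1e6
```

By this arithmetic, uploading 25 GB in one hour needs an average of roughly 56 Mbit/s, and two hours needs roughly 28 Mbit/s.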

DeepChek®-HIV – App for genotyping by NGS and inferred drug resistance testing – for research use only


DeepChek®-HIV

HIV genotyping and inferred drug resistance testing has become an integral part of the clinical management of patients infected with HIV. Detecting minority populations of resistant viruses is now routinely done. Next-generation sequencing (NGS) technology is replacing Sanger sequencing methodology, and end-to-end solutions combining sensitive genomic tests with advanced data management software platforms are in high demand.

DeepChek®-HIV is easy-to-use downstream analysis software for NGS data management, interpretation, and reporting for Research Use Only. DeepChek is a reliable software and database solution that is capable of handling the complexity of NGS data for all the key genomic regions involved in HIV drug resistance (reverse transcriptase, protease, integrase, GP41, and GP120/V3). The database is regularly updated with the most recent drug resistance information and provides an efficient downstream analysis platform for clinical laboratories involved in routine HIV-1 genotyping and drug resistance testing.

 

Link to App in BaseSpace:

https://basespace.illumina.com/apps/414414/DeepChek-HIV 

 

Link to example dataset with example input data and output results:

https://basespace.illumina.com/s/krdEqmmpwTrn

 

 

Introducing VariantStudio on BaseSpace

VariantStudio v2.2

Sequence and stream data to BaseSpace: done. Run quality check: done. Alignment and variant calling: done. Congratulations, you now have a set of variants! But what good is a set of variants if you can’t describe what they mean, how they might explain the phenotype of the specimen, and which ones aren’t worth worrying about? The good news is that now that Illumina VariantStudio is available on BaseSpace, deciphering the biological meaning of genomic variants is no longer a huge challenge. If you’re not familiar with it, VariantStudio [1] is an easy-to-use tool for variant annotation, filtering, and reporting.

  • Step 1: Import data into VariantStudio. Click the import button and you will be able to browse and select VCF and gVCF files stored in BaseSpace to import into VariantStudio.  You can import DNA variant data from targeted, exome, or whole-genome sequencing, but VariantStudio only supports human SNP and indel analysis at this time.
  • Step 2: Annotate variants using the Illumina Annotation Service, which aggregates annotations from a broad range of public sources including ClinVar, COSMIC, OMIM, and 1000 Genomes Project. Variants will be richly annotated with biological information including transcript consequence, functional impact, known disease association, population allele frequency, and more.
  • Step 3:  Apply a cascade of filtering options to quickly create a short list of candidate variants that are likely associated with the disease or phenotype. In addition to single sample analysis, you can perform tumor/normal comparisons to identify somatic mutations or family-based analyses to investigate variants underlying rare disease.
  • Step 4: Use the provided annotations to classify variants based on their presumed biological impact. A common scheme is pathogenic, likely pathogenic, benign, or unknown significance. Or you can use your own, customizable classification scheme.
  • Step 5:  Generate a customizable report that summarizes your important variants, along with any additional metadata.

Import variants > Annotate > Filter > Classify and Interpret > Generate Report. Watch our analysis videos and see how quickly you can go through this workflow. VariantStudio is a powerful, secure [2] tool to simplify genomic data interpretation, and accessing it on BaseSpace is just a click away. With the addition of VariantStudio to the BaseSpace Core Apps, BaseSpace users can now execute the entire sample-to-answer workflow, from generating sequence reads to reporting biologically significant results.

Related Information:

1. datasheet_illumina_variantstudio_software.pdf

2. technote_variantstudio_data_security.pdf

 

Two applications for Illumina’s synthetic long reads: TruSeq Long-Read Assembly and TruSeq Phasing Analysis

We’re excited to announce the release of two apps that support unique applications for Illumina’s TruSeq Synthetic Long-Read Technology.

Using data generated by the TruSeq Synthetic Long-Read DNA Library Prep Kit (also released this week), the TruSeq Long-Read Assembly App executes the assembly of synthetic long reads, and the TruSeq Phasing Analysis App performs whole human genome phasing. 

 The TruSeq Long-Read Assembly App constructs synthetic long reads from shorter sequencing reads, providing FASTQ files for accurate genome assembly, genome finishing, de novo assembly and metagenomics analysis. 

The TruSeq Phasing Analysis App performs whole human genome phasing, identifying haplotype information, co-inherited alleles and phasing de novo mutations. The application reports haplotype blocks across the genome and phasing confidence scores in a phased VCF file.


 

Together with the TruSeq Synthetic Long-Read DNA Library Prep Kit and Illumina’s sequencing technology, these new apps provide a solution for long reads that spans library prep, sequencing, and informatics. See some sample data on the Illumina Blog, and learn more about synthetic long-read and phasing technology on the Illumina website.

New Features in the BaseSpace Prep Tab

We constantly strive to improve the experience of everyone who uses our tools, and today we are excited to announce a few updates to the BaseSpace Prep Tab. The Prep Tab makes it easier to prepare and plan a sequencing Run through a rich web interface that communicates directly with the instrument. Setup takes four steps: preparing Biological Samples, Libraries, and Pools, and then planning a Run that the instrument can discover. At the moment, the Prep Tab supports only NextSeqs, but support for MiSeqs and HiSeqs is coming in the future. The new features, explained in more detail below, are:

  1. The ability for users to Import Custom Library Prep Kits
  2. The ability for users to Import prepped Libraries in one step
  3. The Prep Libraries section under the Libraries tab is now supported on small laptop screens

For those that would like more information about the Prep Tab, please view the original NextSeq and Prep Tab blog post to learn more.

 

Import a Custom Library Prep Kit

  • When prepping Libraries, under Library Prep Kit drop-down list, you can now choose to add a Custom Library Prep Kit.

 prep-tab-blog-post-1

  • After selecting Custom Library Prep Kit, in addition to naming your kit, you will be asked to specify a few basic options for supported read types, indexing strategies, and default read cycles.

 prep-tab-blog-post-2

  • Now, click on Choose .csv file to select a template file.  You can customize your template file with the following information for the new kit:
    • Custom adapter sequences
    • Custom indexes (name and sequences)
    • Custom default layout

    Here’s a simple example template file:

prep-tab-blog-post-3

  • When complete, click Create New Kit to add this kit to the drop-down list that appears for your account.
  • You can also view all of your Library Prep Kits under the My Account section of BaseSpace.

 prep-tab-blog-post-4

Import Prepped Libraries

  • Users can now import prepped libraries in a single step, instead of importing biological samples and prepping them in two separate steps.
  • First, access the Import feature from the Libraries page within the Prep tab by clicking on Import:

 prep-tab-blog-post-5

  • Click the Choose .csv File button.
  • In your .csv file, specify plate information and a list of libraries to import.  The following is a simple example file:

 prep-tab-blog-post-6

  • Once a file is selected, click Open.
  • The page will now populate with prepped libraries; check that the information displayed is correct.

prep-tab-blog-post-7

  • When you’re done, you can save the plate for later use or proceed directly to pooling libraries!

We hope you enjoy these changes and look forward to more updates in BaseSpace in the future.