FASTQ upload is now available in BaseSpace

We are excited to announce the availability of a data upload feature for FASTQ files that were previously generated on Illumina sequencing instruments. This simple-to-use feature is accessible from any project to which the user has write access by first clicking on the project and then selecting the Import tab shown below.

[Screenshot: project Import tab]

The user will then be prompted to select their import type. The user can upload a single sample by clicking on “Sample” as shown below.

[Screenshot: import type selection, "Sample"]

The user can then either “Drag and drop” one or more files onto the webpage or click on “select files” and choose the files to upload from a file browser. Note that the FASTQ files need to adhere to Illumina standards, as specified below. Data for a single sample can span multiple files. The total number of files per sample and their combined size are limited to 16 and 25 GB, respectively. Uploading a 25 GB sample takes 1-2 hours on a relatively fast internet connection.
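
As a rough sanity check on the 1-2 hour figure, the sketch below works out the sustained upload rate it implies. The function name `required_mbit_per_s` is made up for this illustration, and the calculation assumes 1 GB = 10^9 bytes and ignores protocol overhead, neither of which is stated in this post.

```python
def required_mbit_per_s(size_gb: float, hours: float) -> float:
    """Sustained rate in Mbit/s needed to move size_gb gigabytes in the given hours."""
    bits = size_gb * 1e9 * 8          # assumption: 1 GB = 10^9 bytes
    return bits / (hours * 3600) / 1e6


for hours in (1, 2):
    print(f"25 GB in {hours} h needs about {required_mbit_per_s(25, hours):.1f} Mbit/s")
```

Under those assumptions, a full 25 GB sample needs roughly 28 to 56 Mbit/s of sustained upstream bandwidth, so slower connections should expect proportionally longer uploads.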

[Screenshot: drag and drop]

The user will then see a progress bar as the file(s) are uploaded. Once the progress bar completes, the user can add additional files. The user can also set the sample name and associate a genome with the sample in the upper left-hand corner of the screen.

[Screenshot: upload screen]

Once all of the files have finished uploading, the user will need to click on the “Complete Import” button (shown above) to complete the session.

FASTQ file standards

  • The uploader will only support gzipped FASTQ files generated on Illumina instruments
  • The name of the FASTQ files must conform to the following convention:
    • SampleName_SampleNumber_Lane_Read_FlowCellIndex.fastq.gz (e.g. SampleName_S1_L001_R1_001.fastq.gz / SampleName_S1_L001_R2_001.fastq.gz)
  • The read descriptor in the FASTQ files must conform to the following convention:
    • @Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber:
      • Read 1 descriptor would look like this:
        @M00900:62:000000000-A2CYG:1:1101:18016:2491 1:N:0:13
      • Read 2 would have a 2 in the ReadNum field, like this:
        @M00900:62:000000000-A2CYG:1:1101:18016:2491 2:N:0:13
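
Before uploading, it can help to check names and descriptors locally. The following is a minimal sketch, not the uploader's actual validation: the patterns are inferred only from the conventions and examples listed above, and the names `FILENAME_RE`, `DESCRIPTOR_RE`, `is_valid_filename` and `is_valid_descriptor` are hypothetical. It assumes that the sample name contains no underscores, that the lane and final fields are three digits, and that the filter flag is either Y or N.

```python
import re

# Assumed pattern for SampleName_SampleNumber_Lane_Read_FlowCellIndex.fastq.gz
FILENAME_RE = re.compile(
    r"^(?P<sample>[^_]+)_S(?P<number>\d+)_L(?P<lane>\d{3})"
    r"_R(?P<read>[12])_(?P<index>\d{3})\.fastq\.gz$"
)

# Assumed pattern for
# @Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber
DESCRIPTOR_RE = re.compile(
    r"^@(?P<instrument>[^:\s]+):(?P<run>\d+):(?P<flowcell>[^:\s]+):"
    r"(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+) "
    r"(?P<read>[12]):(?P<filter>[YN]):0:(?P<sample_number>\d+):?$"
)


def is_valid_filename(name: str) -> bool:
    """Return True if the file name fits the assumed naming pattern."""
    return FILENAME_RE.match(name) is not None


def is_valid_descriptor(line: str) -> bool:
    """Return True if a FASTQ header line fits the assumed descriptor pattern."""
    return DESCRIPTOR_RE.match(line.rstrip("\r\n")) is not None


print(is_valid_filename("SampleName_S1_L001_R1_001.fastq.gz"))
print(is_valid_descriptor("@M00900:62:000000000-A2CYG:1:1101:18016:2491 1:N:0:13"))
```

A name that lacks the .gz extension or the S-prefixed sample number does not match these patterns.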

Quality considerations

  • The number of base calls for each read must equal the number of quality scores
  • The number of entries for Read 1 must equal the number of entries for Read 2
  • The uploader determines whether files are paired-end by matching file names that differ only in the Read field (R1 versus R2)
  • For paired-end reads, the descriptors of Read 1 and Read 2 must match for every entry, apart from the ReadNum field
  • Every read must have passed filter
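
The checks above can be approximated locally before an upload. This is a sketch under stated assumptions rather than a description of what the uploader does: the function names `read_fastq`, `check_record` and `check_pair` are hypothetical, each record is assumed to span exactly four lines, and, as in the example descriptors above, N in the FilterFlag field is taken to mark a read that passed filter.

```python
import gzip
from itertools import zip_longest


def read_fastq(path):
    """Yield (descriptor, bases, qualities) for each four-line record in a gzipped FASTQ."""
    with gzip.open(path, "rt") as handle:
        while True:
            descriptor = handle.readline().rstrip("\r\n")
            if not descriptor:
                return  # end of file
            bases = handle.readline().rstrip("\r\n")
            handle.readline()  # skip the '+' separator line
            qualities = handle.readline().rstrip("\r\n")
            yield descriptor, bases, qualities


def check_record(descriptor, bases, qualities):
    """Return a list of problems found in a single record."""
    problems = []
    if len(bases) != len(qualities):
        problems.append("base calls and quality scores differ in number")
    parts = descriptor.split(" ")
    # The second part looks like "1:N:0:13"; its second field is the FilterFlag.
    if len(parts) != 2 or parts[1].split(":")[1:2] != ["N"]:
        problems.append("read did not pass filter or descriptor is malformed")
    return problems


def check_pair(r1_path, r2_path):
    """Return a list of problems found across a Read 1 / Read 2 file pair."""
    problems = []
    pairs = zip_longest(read_fastq(r1_path), read_fastq(r2_path))
    for index, (rec1, rec2) in enumerate(pairs, start=1):
        if rec1 is None or rec2 is None:
            problems.append("Read 1 and Read 2 have different numbers of entries")
            break
        for rec in (rec1, rec2):
            problems += [f"entry {index}: {p}" for p in check_record(*rec)]
        # Compare only the part before the space, which excludes ReadNum.
        if rec1[0].split(" ")[0] != rec2[0].split(" ")[0]:
            problems.append(f"entry {index}: descriptors do not match")
    return problems
```

Calling `check_pair` on the R1 and R2 files of a sample returns an empty list when no problems are found.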

Upload parameters

  • Only one sample can be uploaded at a time
  • A maximum of 16 files can be uploaded in a session
  • The combined size of the uploaded files cannot exceed 25 GB
  • A detailed description of how to use the uploader can be found in the BaseSpace user guide
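
The session limits above are easy to check before starting an upload. The sketch below is illustrative only: `check_limits` and `check_session` are hypothetical helper names, and the byte limit assumes 1 GB = 10^9 bytes, which this post does not specify.

```python
import os

MAX_FILES = 16
MAX_TOTAL_BYTES = 25 * 10**9  # assumption: 1 GB = 10^9 bytes


def check_limits(sizes):
    """Return a list of problems for a session, given the file sizes in bytes."""
    problems = []
    if len(sizes) > MAX_FILES:
        problems.append(f"{len(sizes)} files exceeds the limit of {MAX_FILES}")
    total = sum(sizes)
    if total > MAX_TOTAL_BYTES:
        problems.append(f"{total} bytes exceeds the limit of {MAX_TOTAL_BYTES}")
    return problems


def check_session(paths):
    """Apply check_limits to the files on disk that make up one sample."""
    return check_limits([os.path.getsize(p) for p in paths])
```

For example, `check_session(["SampleName_S1_L001_R1_001.fastq.gz", "SampleName_S1_L001_R2_001.fastq.gz"])` returns an empty list when both files exist and together stay within the limits.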


10 responses to “FASTQ upload is now available in BaseSpace”

  1. Nandita says :

    Good to hear; this is very useful. Can I also get this on my BaseSpace Onsite? Thanks

  2. Barry Murphy says :

    Great News. Really looking forward to seeing how this works.

  3. Dr.K says :

    Do you have a delete or remove function for unsuccessfully uploaded FASTQ files or items?

  4. Chentha Vasu says :

    Why does the failed sign appear when I drag and drop a file in the correct format (like: 44_S44_L001_R1_001.fastq)? Thanks, Chentha

  5. Chentha Vasu says :

    Not able to import files and failed sign appears. Thanks

    • Ilya Chorny says :

      The files need to be gzipped (GNU zip) with a .gz extension. Please see the BaseSpace user guide for further information about making sure the files conform to Illumina standards.

  6. michelmfarah says :

    Hi. I’m trying to import the fastq.gz files and it failed… It says that the names of the files were wrong. Here is the name format of my files.

    CH12_AGTCAAA_L005_R1_001.fastq.gz

    Anyone know what is wrong?

    Thanks,

    • Ilya Chorny says :

      Hi.

      You should change the name to CH12-AGTCAA_S1_L005_R1_001.fastq.gz. Note the dash and the underscore.

      Thanks,

      Ilya
