New MiSeq Datasets
We are happy to announce the BaseSpace availability of two 2×250 bp datasets, generated with the recently released MiSeq v2 reagents.
As mentioned previously, we have introduced the concept of Projects in BaseSpace, which, together with Runs, provides a flexible working environment that accommodates many types of users. On the one hand, the new UI is ideal for biologists or clinicians working on specific projects, perhaps with a few collaborators. On the other hand, the Project concept allows lab managers to securely assign specific datasets or projects to various remote users or customers.
Each of the datasets discussed below therefore has links to its respective project and run folders. When you click a link, you will be asked to “Accept” the Run or Project into your BaseSpace account. This is the same mechanism you will use to share specific real-life projects or runs with your colleagues and collaborators via a dedicated URL.
2×250 bp PhiX
The total yield is 9.2 Gb, and 87.2% of bases are at or above Q30. PhiX is a bacteriophage with a small and well-defined genome. This sample is often used by our customers to QC the process of cluster generation, sequencing and alignment.
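For readers less familiar with the "at or above Q30" metric, here is a minimal Python sketch of what it measures. It assumes the standard Phred definition, in which a quality score Q corresponds to an error probability of 10^(-Q/10), so Q30 means roughly a 1 in 1,000 chance that a base call is wrong. The function names and the sample scores are hypothetical and for illustration only; this is not how BaseSpace computes the figure internally.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to an estimated base-call error probability."""
    return 10 ** (-q / 10)


def fraction_at_or_above(quality_scores, threshold=30):
    """Return the fraction of bases whose quality score is >= threshold."""
    if not quality_scores:
        raise ValueError("quality_scores must not be empty")
    passing = sum(1 for q in quality_scores if q >= threshold)
    return passing / len(quality_scores)


# Hypothetical per-base quality scores for a short read (illustration only).
example_scores = [10, 30, 35, 40]
print(phred_to_error_prob(30))
print(fraction_at_or_above(example_scores))
```

Applied to every base in a run, the second function would give the kind of percentage reported for these datasets.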
2×250 bp B. cereus NC_003909.8
This is a multiplexed experiment using TruSeq Dual Indexing kits. The flow cell contained 24 samples, resulting in a total yield of 7.9 Gb with 84.6% of bases at or above Q30. The average fragment length for this dataset was 590 bp. Data from all 24 samples are provided here.
Benefits of Long, Paired-End Data
As indicated in this Technical Note, longer read lengths, as well as the paired-end nature of Illumina data, increase the quality of bacterial de novo assemblies, based on metrics such as N50, contig size and genome coverage. (The Technical Note is from 2010, but the findings and implications for longer reads generated with the tried and trusted SBS chemistry are still very much valid.)
Long, paired-end reads are also critical for detecting gene fusions, and for characterizing structural variations more accurately. These genomic rearrangement events have been implicated in cancer and other diseases.
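As a reminder of what the N50 metric captures, here is a minimal Python sketch. It assumes the common definition: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the total assembly length. The function name and the contig lengths are hypothetical, and individual assemblers or evaluation tools may differ in details such as minimum contig length cutoffs.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given a list of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("contig_lengths must not be empty")
    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        # The first contig that brings the running total to at least
        # half of the assembly length defines the N50.
        if running_total >= half_total:
            return length


# Hypothetical contig lengths (illustration only).
print(n50([80, 70, 50, 40, 30, 20]))
```

A higher N50 means that more of the assembly sits in long contigs, which is why it is often used alongside contig size and genome coverage when comparing assemblies.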