Align Custom Genomes, Improve Scaffolding for de novo Assemblies, and more with our latest BaseSpace Labs Apps
The BaseSpace Labs team has kept busy in recent months, bringing multiple new apps and improved functionality for existing apps to BaseSpace. Our new and improved tools enable alignment to custom genomes with BWA, improved scaffolding for de novo assemblies, coordinate conversion from one human assembly to another, and more!
Let’s take a look at our latest Labs apps:
The new BWA Aligner app aligns samples (consisting of FASTQ files) to a reference genome using the BWA-MEM aligner. With this app, you can choose one of the references available within the app, use the output of an assembly app such as SPAdes, or supply your own custom genome using the FASTA Uploader app.
When you use a custom reference, BWA Aligner also outputs a BWA index, which helps speed up future alignments against the same reference.
Let’s look at an example:
- The monarch butterfly reference genome available at MonarchBase was uploaded using the FASTA Upload app.
- Sequence data was imported from the NCBI SRA using the SRA Import app.
- Imported sequence data was aligned against the custom reference genome using BWA Aligner.
- View the results here.
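For readers who want to see the index-then-align pattern outside BaseSpace, here is a minimal sketch of how the standalone bwa command line is typically driven. It only builds the command lines and does not run them. The file names are hypothetical placeholders, and this is not a description of how the BWA Aligner app works internally.

```python
import shlex
from typing import List, Optional


def bwa_index_cmd(reference_fasta: str) -> List[str]:
    """Build the command that indexes a reference FASTA (done once per reference)."""
    return ["bwa", "index", reference_fasta]


def bwa_mem_cmd(
    reference_fasta: str,
    read1: str,
    read2: Optional[str] = None,
    threads: int = 4,
) -> List[str]:
    """Build a BWA-MEM alignment command for single-end or paired-end FASTQ files."""
    cmd = ["bwa", "mem", "-t", str(threads), reference_fasta, read1]
    if read2 is not None:
        # Paired-end data: the second FASTQ follows the first.
        cmd.append(read2)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    reference = "monarch_reference.fa"
    print(shlex.join(bwa_index_cmd(reference)))
    print(shlex.join(bwa_mem_cmd(reference, "sample_R1.fastq", "sample_R2.fastq")))
```

Because the index is built once and stored alongside the reference, later alignments against the same reference can skip the indexing step, which is the saving the app's BWA index output provides.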
CrossMap is an open-source utility for convenient conversion of genome coordinates between different assemblies (such as hg19 to GRCh38), also referred to as a liftover. CrossMap can lift over many of the common bioinformatics file formats, including .vcf, .bedGraph, .bigwig, and more. It is used by Ensembl’s web-based liftover tool, and currently supports conversion between GRCh assemblies.
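To make the idea of a liftover concrete, here is a toy sketch of the underlying coordinate arithmetic. It assumes a hypothetical set of ungapped aligned blocks between a source and a target assembly, with 0-based, half-open coordinates. The block values are invented for illustration, and the code is not CrossMap's implementation: real liftover also handles strand, multiple chains, and file-format details.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    """An ungapped aligned block: source [src_start, src_end) maps to dst_start onward."""
    src_start: int
    src_end: int
    dst_start: int


# Hypothetical blocks for one chromosome, sorted by src_start.
BLOCKS: List[Block] = [
    Block(0, 1000, 0),        # unchanged region
    Block(1000, 2000, 1050),  # target has 50 extra bases before this block
    Block(2100, 3000, 2050),  # source bases 2000-2099 are absent from the target
]


def lift_position(pos: int, blocks: List[Block] = BLOCKS) -> Optional[int]:
    """Map a source position to the target assembly, or None if it falls in a gap."""
    starts = [b.src_start for b in blocks]
    i = bisect_right(starts, pos) - 1  # last block starting at or before pos
    if i < 0:
        return None
    block = blocks[i]
    if pos >= block.src_end:
        return None  # position lies between blocks, so it has no counterpart
    return block.dst_start + (pos - block.src_start)


if __name__ == "__main__":
    for p in (500, 1500, 2050, 2100):
        print(p, "->", lift_position(p))
```

Positions that fall between blocks return None here, which mirrors the way a liftover reports some records as unmapped rather than guessing a coordinate.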
Rescaf is an app that takes the scaffolds of a draft assembly and the reads used to create that assembly, and outputs an improved set of scaffold sequences in FASTA format. The Rescaf workflow consists of popular open-source tools, including BWA-MEM for mapping reads to their scaffolds, the BESST scaffolder to join repaired scaffolds, gap2seq to fill assembly gaps, and QUAST for computing assembly metrics. This tool is designed to work best with bacterial assemblies generated from Nextera Mate Pairs using the SPAdes Assembler app. It is intended for use with small- to medium-sized genomes, such as bacteria and model organisms such as Drosophila and C. elegans.
You can view example results in this project.
We also released updates to a couple of popular apps:
We have moved the Integrative Genomics Viewer app (formerly IGV) into our BaseSpace Labs development program to enable quick updates. With this update, you can now load and view BigWig and BedGraph files from your BaseSpace projects.
Lastly, the NextBio Annotates RNA-Seq app annotates differentially expressed genes from the Cufflinks and RNA Express apps, using the curated public data from NextBio Research. You can find more details about this app in our previous post. Enhancements to the NextBio Annotates RNA-Seq app include the following:
- Support for RNA Express results in addition to Cufflinks results
- Support for results with non-default group labels
- Speed and performance improvements
We are working hard to bring additional analysis functionality into BaseSpace, and will announce more tools in the near future.
Do you have questions, comments, or suggestions? Submit your feedback to the BaseSpace Labs team at email@example.com.