BaseMount: Directly Linking NGS Data to R Packages for RNA-Seq Differential Expression Analyses

With the recent launch of BaseMount, access to your NGS data has never been so convenient. This early access release is available for all Linux-based operating systems and uses a command line interface (CLI) to access the Projects, Samples, Runs, and AppResults in your BaseSpace account. The steps below take RNA-Seq results straight from our RNA Express app and turn them into a normalized count plot, an MA-plot, and a Principal Component Analysis (PCA) plot. We use the popular Bioconductor DESeq2 package to construct the plots, and the example is a differential expression analysis comparing two sample types: Human Brain Reference RNA (HBRR) and Universal Human Reference RNA (UHRR). Follow the steps below to get started:

Setting Up

  1. Download BaseMount, create a ‘BaseSpace’ directory, and initialize the connection with BaseSpace.
  2. Open an R session in your terminal or RStudio (recommended) and set the working directory to the folder containing genes.count.csv.

  3. Import the counts table from BaseMount and provide the sample metadata (a file you supply yourself).
  4. Download and import the DESeq2 package from Bioconductor.
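
The commands for steps 2 to 4 were shown as screenshots, so here is a minimal sketch of what they might look like. It assumes the counts file has gene IDs in the first column and one column per sample; the helper names read_counts, make_coldata, and ensure_deseq2, and the metadata columns tissue and kit, are hypothetical and not taken from the original post.

```r
# Hypothetical helper: read a counts table whose first column holds gene IDs
# and whose remaining columns hold one sample each (an assumed layout).
read_counts <- function(path) {
  counts <- read.csv(path, row.names = 1, check.names = FALSE)
  as.matrix(counts)
}

# Hypothetical helper: build the sample metadata table.
# Rows must be in the same order as the columns of the count matrix.
make_coldata <- function(sample_names, tissue, kit) {
  data.frame(
    tissue = factor(tissue),
    kit = factor(kit),
    row.names = sample_names
  )
}

# Hypothetical helper: install DESeq2 from Bioconductor if it is missing,
# then load it. BiocManager is the current Bioconductor installer.
ensure_deseq2 <- function() {
  if (!requireNamespace("DESeq2", quietly = TRUE)) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install("DESeq2")
  }
  library(DESeq2)
}
```

In a session you would call setwd() on the BaseMount folder that holds genes.count.csv, then read_counts("genes.count.csv"), build the metadata with make_coldata(), and confirm that the metadata row names match the count matrix column names before going further.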

Figures and Analysis

  1. Use the DESeq function to perform standard differential expression analysis.

    *The following examples use the BaseSpace public dataset "HiSeq 4000: RNA-Seq 64-plex (MAQC HBRR and UHRR)", and the design argument is based on the metadata.

    A.    Plot normalized counts for a differentially expressed gene.

    Figure (1): Comparison of normalized counts for the SDF4 gene in HBRR vs. UHRR. This plot visualizes differential expression for a specific gene. In this example, the gene with the smallest adjusted p-value (SDF4) was plotted.

    B.    Construct an MA-plot.

    Figure (2): MA-plot of UHRR vs. HBRR. Many differentially expressed genes are highlighted in red (p-adj < 0.1), as expected with two different sample types.

    C.    Construct a PCA plot.

    Figure (3): Principal Component Analysis (PCA) plot of HBRR and UHRR samples. The PCA shows clustering and clear separation by sample type (PC1) and by library prep kit (PC2). This type of analysis is very useful for visualizing which metadata components contribute most to the variability in RNA-Seq results.

  2. Learn about further analyses in the DESeq2 vignette: http://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf
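
The analysis steps above were also shown as screenshots. The sketch below is one way they could be written with DESeq2, not necessarily the exact commands used for the figures. The function name run_de_analysis, the metadata column names tissue and kit, and the design formula ~ tissue are assumptions; the variance stabilizing transformation used before the PCA is likewise a choice made here for a 64-sample dataset.

```r
# Sketch only: `counts` is a gene-by-sample integer matrix and `coldata` a
# data frame whose rows match the columns of `counts`. The column names
# "tissue" and "kit" are assumptions, not taken from the original post.
run_de_analysis <- function(counts, coldata) {
  # Step 1: standard differential expression analysis.
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts,
    colData = coldata,
    design = ~ tissue
  )
  dds <- DESeq2::DESeq(dds)
  res <- DESeq2::results(dds)

  # A. Normalized counts for the gene with the smallest adjusted p-value.
  top_gene <- rownames(res)[which.min(res$padj)]
  DESeq2::plotCounts(dds, gene = top_gene, intgroup = "tissue")

  # B. MA-plot; genes with adjusted p-value below alpha are highlighted.
  DESeq2::plotMA(res, alpha = 0.1)

  # C. PCA on variance-stabilized counts, grouped by the assumed
  #    metadata columns. plotPCA returns a ggplot object, so print it.
  vsd <- DESeq2::vst(dds, blind = FALSE)
  print(DESeq2::plotPCA(vsd, intgroup = c("tissue", "kit")))

  # Return the fitted object and results table for further inspection.
  invisible(list(dds = dds, res = res))
}
```

Calling run_de_analysis(counts, coldata) with the count matrix and metadata from the setup steps draws the three plots in order and returns the DESeqDataSet and results table, which can then be passed to the other analyses described in the vignette.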
