Over the last few years the number of methods for analysing scRNA-seq data has exploded, and there are now well over 200 software tools available. Each of these tools needs to make a choice about how it stores and represents the data used during its analysis. One attempt to standardise the data structures that are used is the SingleCellExperiment package created by Davide Risso and Aaron Lun, with help from Keegan Korthauer.
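To make the idea of a shared data structure a bit more concrete, here is a minimal sketch in Python. It is not the SingleCellExperiment API (that package is written in R); the class `SingleCellContainer` and its fields are hypothetical names made up for illustration. The point it tries to show is that keeping the counts and the cell annotations in one object lets them be subset together, so they cannot drift out of sync.

```python
from dataclasses import dataclass


@dataclass
class SingleCellContainer:
    """Hypothetical container for illustration, not the SingleCellExperiment API."""
    counts: list      # rows are genes, columns are cells
    gene_names: list  # one name per row of counts
    cell_meta: list   # one dict of annotations per column of counts

    def __post_init__(self):
        # Refuse to build an object whose pieces do not line up.
        if len(self.counts) != len(self.gene_names):
            raise ValueError("one row of counts is needed per gene")
        for row in self.counts:
            if len(row) != len(self.cell_meta):
                raise ValueError("one column of counts is needed per cell")

    def subset_cells(self, keep):
        """Keep cells whose metadata satisfies `keep`, subsetting counts to match."""
        idx = [i for i, meta in enumerate(self.cell_meta) if keep(meta)]
        return SingleCellContainer(
            counts=[[row[i] for i in idx] for row in self.counts],
            gene_names=list(self.gene_names),
            cell_meta=[self.cell_meta[i] for i in idx],
        )


# Toy data: two genes measured in three cells.
sce = SingleCellContainer(
    counts=[[0, 3, 1], [5, 0, 2]],
    gene_names=["GeneA", "GeneB"],
    cell_meta=[{"cluster": 1}, {"cluster": 2}, {"cluster": 1}],
)
cluster1 = sce.subset_cells(lambda meta: meta["cluster"] == 1)
print(cluster1.counts)
```

Tools that agree on a container like this can pass results to each other without each one inventing its own conversion step, which is roughly the motivation for a standard structure.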
The Bioconductor 3.7 release was announced this week. I thought I would have a look through the new packages and changes to existing packages and point out some of my highlights. The descriptions below are my summaries; if you want to see more detail you can read the full release notes here.
Single-cell RNA-seq
My interest is in single-cell RNA-seq analysis, so I am going to start off with packages related to this.
For my PhD I am working on methods for analysing single-cell RNA-sequencing (scRNA-seq) data, which measure the expression of genes in individual cells. One of the most common analyses done on this type of data is clustering the cells, often in an attempt to find out what cell types are present in a sample. In a recent seminar I showed some images of what I am calling a “clustering tree” (you can see the slides here if you are interested).
Gantt charts are a project management tool designed to visualise the tasks in a project, how long they will take and the order in which they must be completed. If you haven’t seen one before, they essentially look like a modified horizontal bar chart. Time runs along the horizontal axis, with tasks along the vertical. Each task is drawn as a bar whose ends mark its start and end times. Often there are also arrows indicating dependencies and a line showing the current date.
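The layout described above can be sketched in a few lines of Python. This is only a rough text version under some assumptions of my own: the `Task` class, the `render_gantt` and `dependency_violations` helpers, and the example dates are all made up for illustration, with `#` for a task's bar and `|` for the current date.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Task:
    name: str
    start: date
    end: date  # last day of the task, inclusive
    depends_on: list = field(default_factory=list)  # names of prerequisite tasks


def render_gantt(tasks, today=None):
    """Return a text Gantt chart: one row per task, one column per day."""
    first = min(t.start for t in tasks)
    last = max(t.end for t in tasks)
    n_days = (last - first).days + 1
    width = max(len(t.name) for t in tasks)
    rows = []
    for t in tasks:
        offset = (t.start - first).days
        length = (t.end - t.start).days + 1
        cells = ["."] * n_days
        cells[offset:offset + length] = ["#"] * length
        if today is not None and first <= today <= last:
            cells[(today - first).days] = "|"  # line showing the current date
        rows.append(f"{t.name:<{width}} {''.join(cells)}")
    return "\n".join(rows)


def dependency_violations(tasks):
    """List (task, prerequisite) pairs where the task starts before the prerequisite ends."""
    by_name = {t.name: t for t in tasks}
    return [
        (t.name, dep)
        for t in tasks
        for dep in t.depends_on
        if by_name[dep].end >= t.start
    ]


tasks = [
    Task("Design", date(2024, 1, 1), date(2024, 1, 3)),
    Task("Build", date(2024, 1, 4), date(2024, 1, 8), depends_on=["Design"]),
    Task("Test", date(2024, 1, 9), date(2024, 1, 10), depends_on=["Build"]),
]
chart = render_gantt(tasks, today=date(2024, 1, 5))
print(chart)
```

A real chart would draw arrows for the dependencies; here they are only checked, which is enough to catch a task scheduled before the work it relies on is finished.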
Bioconductor 3.3 has just been released. You can find the complete list of new packages (and changes to existing packages) here, but below are a few I thought might be interesting based on their descriptions. I might have more to say once I’ve had time to try a few out.
debrowser – interactive plots and tables for differential expression
DEFormats – convert between differential expression formats
EBSEA – exon-based differential expression
EmpiricalBrownsMethod – combining dependent p-values
Linnorm – normalisation for parametric tests, simulation of RNA-seq data
multiClust – feature selection and clustering analysis for transcriptomic data
RGraph2js – interactive network visualisations with D3
tximport – import and summarise transcript-level estimates
Single-cell
These packages are specific to single-cell RNA-seq analysis.