Today I attended the Joining the Dots visualisation symposium. You can see the slides for my talk about clustering trees here. It was a great event and I hope we see more meetings like this in the future. Here is an analysis of the Twitter activity on the #jtdwehi hashtag, thanks to code from Neil Saunders. You can see it on GitHub.

Introduction

An analysis of tweets from the Joining the Dots symposium.

Continue reading

For my PhD I am working on methods for analysing single-cell RNA-sequencing (scRNA-seq) data, which measure the expression of genes in individual cells. One of the most common analyses done on this type of data is to cluster the cells, often in an attempt to find out what cell types are present in a sample. In a recent seminar I showed some images of what I am calling a “clustering tree” (you can see the slides here if you are interested).

Continue reading

PyConAU 2016

Over the weekend I attended PyCon Australia. This was my first time at a purely tech conference and I couldn’t help but compare it to my previous experiences at scientific conferences. DISCLAIMER: Like I said, this was my first tech conference and my scientific conference experience is also fairly limited, so some of the comments I make might be generalisations that don’t always apply. PyCon started with miniconfs on Friday and continued with coding sprints on Monday and Tuesday.

Continue reading

Gantt charts are a project management tool designed to visualise the tasks in a project, how long they will take and in what order they must be completed. If you haven’t seen one before, they essentially look like a modified horizontal bar chart. Time runs along the horizontal axis, with tasks along the vertical. Each task is drawn as a bar whose ends mark its start and end times. Often there are also arrows indicating dependencies and a line showing the current date.
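To make the structure described above concrete, here is a minimal text-only sketch in Python. It is not the code from the post: the `Task` class, the `schedule` and `render` functions, the example tasks and the use of whole days as the time unit are all assumptions made for illustration. Each task starts once everything it depends on has finished, and each row is a bar running from the task's start to its end.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    duration: int  # length in days (hypothetical unit)
    depends_on: list = field(default_factory=list)


def schedule(tasks):
    """Return {name: (start, end)}; a task starts when all its dependencies end."""
    by_name = {t.name: t for t in tasks}
    times = {}

    def finish(name, seen=()):
        if name in seen:
            raise ValueError(f"circular dependency at {name!r}")
        if name not in times:
            task = by_name[name]
            start = max(
                (finish(dep, seen + (name,)) for dep in task.depends_on),
                default=0,
            )
            times[name] = (start, start + task.duration)
        return times[name][1]

    for t in tasks:
        finish(t.name)
    return times


def render(tasks, today=None):
    """Draw one row per task: time runs left to right, '#' marks the bar."""
    times = schedule(tasks)
    total = max(end for _, end in times.values())
    width = max(len(t.name) for t in tasks)
    rows = []
    for t in tasks:
        start, end = times[t.name]
        cells = ["."] * start + ["#"] * (end - start) + ["."] * (total - end)
        if today is not None and 0 <= today < total:
            cells[today] = "|"  # vertical line for the current date
        rows.append(f"{t.name.ljust(width)} {''.join(cells)}")
    return "\n".join(rows)


# Made-up example project.
tasks = [
    Task("Analysis", 3),
    Task("Writing", 4, ["Analysis"]),
    Task("Review", 2, ["Writing"]),
]
print(render(tasks, today=4))
```

Dependencies are only used here to compute start times; a graphical version would also draw them as arrows between bars.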

Continue reading

Bioconductor 3.3 has just been released. You can find the complete list of new packages (and changes to existing packages) here, but below are a few I thought might be interesting based on their descriptions. I might have more to say once I’ve had time to try a few out.

debrowser – interactive plots and tables for differential expression
DEFormats – convert between differential expression formats
EBSEA – exon based differential expression
EmpiricalBrownsMethod – combining dependent p-values
Linnorm – normalisation for parametric tests, simulation of RNA-seq data
multiClust – feature selection and clustering analysis for transcriptomic data
RGraph2js – interactive network visualisations with D3
tximport – import and summarise transcript-level estimates

Single-cell

These packages are specific to single-cell RNA-seq analysis.

Continue reading

Recently this paper by Ilicic et al. suggested a method for assessing the quality of individual cells in a single-cell RNA-seq experiment. The basic idea is to extract various biological and technical features from the reads for each cell, then use PCA with outlier detection or an SVM to classify cells as “high” or “low” quality. There are two pieces of software associated with the paper: cellity, an R package that performs the classification, and Celloline, a Python script that performs alignment, summarisation and extraction of alignment statistics such as the number of reads aligned to exons, introns, intergenic regions etc.
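As a rough illustration of the "PCA with outlier detection" idea, here is a small pure-Python sketch. It is not the cellity API and does not reproduce the paper's actual procedure: it approximates only the first principal component by power iteration, and it swaps in a simple median absolute deviation (MAD) rule as the outlier test. The function names, the three features (fractions of reads in exons, introns and intergenic regions), the threshold and all the numbers are made up for the example.

```python
import math
import statistics


def standardise(rows):
    """Z-score each column so features on different scales are comparable."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def first_component(rows, iterations=50):
    """Approximate the first principal component by power iteration."""
    n, p = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(p)] for i in range(p)]
    # Uneven starting vector; assumes it is not orthogonal to the component.
    vec = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec


def flag_outliers(features, threshold=3.0):
    """Return names of cells whose PC1 score lies far from the median score."""
    names = list(features)
    rows = standardise([features[name] for name in names])
    pc1 = first_component(rows)
    scores = [sum(x * w for x, w in zip(row, pc1)) for row in rows]
    centre = statistics.median(scores)
    mad = statistics.median(abs(s - centre) for s in scores)
    return [n for n, s in zip(names, scores) if abs(s - centre) > threshold * mad]


# Hypothetical per-cell features: fraction of reads in exons, introns, intergenic.
cells = {
    "cell_1": [0.80, 0.15, 0.05],
    "cell_2": [0.78, 0.16, 0.06],
    "cell_3": [0.82, 0.13, 0.05],
    "cell_4": [0.79, 0.15, 0.06],
    "cell_5": [0.81, 0.14, 0.05],
    "cell_6": [0.30, 0.30, 0.40],  # mostly non-exonic reads
}
print(flag_outliers(cells))
```

In this toy data only `cell_6` sits far from the others along the first component, so it is the one flagged as potentially "low" quality. A real analysis would use many more features and cells, and more than one component.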

Continue reading

It’s come to the stage in my Master’s where I have to start thinking about writing my thesis. Apart from all the analysis I have to do before I can do that, there is also the question of what I am going to use to construct the document itself. For the last year or so I have been writing in Markdown, which is converted to TeX using Pandoc and then used to produce a PDF.

Continue reading


Luke Zappia

Bioinformatician in training

PhD student

Australia