Luke Zappia

Bioinformatician (he/him)


I am a bioinformatics postdoctoral researcher in the Theis Lab at the Helmholtz Zentrum München Institute of Computational Biology and the Technische Universität München. My research focuses on the analysis of single-cell RNA sequencing data, including the development and benchmarking of computational methods. I am also interested in how best to visualise data more generally.

I completed my PhD in the Oshlack Lab.


  • scRNA-seq analysis
  • Data visualisation
  • R programming


  • Doctor of Philosophy (Bioinformatics), 2019

    The University of Melbourne/Murdoch Children's Research Institute

  • Master of Science (Bioinformatics), 2015

    The University of Melbourne

  • Bachelor of Science (Chemistry), 2011

    The University of Melbourne

  • Diploma in Informatics, 2011

    The University of Melbourne



NBA Positions

Dataset of NBA positions designed to replace the iris dataset.

PhD Commits

Functions for scraping git commits from repositories associated with a PhD (or anything else) and plotting them.

R Package Development Workshop

Materials for the COMBINE Australia R package development workshop


An R package for setting up a website to display analysis of Twitter hashtags

Twitter Stats

Analysis of Twitter activity for hashtags from various events, usually academic conferences.


CRAN R package for creating clustering trees, a visualisation for looking at clustering across resolutions.


Database and website cataloguing software tools for analysing single-cell RNA sequencing data.


Bioconductor R package for simulating scRNA-seq data.


A Python script for pretty printing of TeXcount output

Recent Posts

triple j's Requestival

Analysis of the songs played during triple j’s Requestival event

Back to the SCE-verse!

Updated analysis of packages that use the SingleCellExperiment object in 2020

Bioconductor 3.11 wrap-up

My wrap-up of the Bioconductor 3.11 release.

Caching blogdown posts

Description of how I cache blogdown Markdown files.

Exploring the SCE-verse

Analysis of packages that use the SingleCellExperiment object

Recent & Upcoming Talks

Tools and techniques for single-cell RNA sequencing data

My PhD completion seminar

Tools, simulations and trees for scRNA-seq

Presentation at the Institute of Computational Biology where I described my PhD projects.

Using clustering trees to visualise scRNA-seq data

Selected talk at the Genome Informatics 2018 conference where I described how clustering trees can be used with scRNA-seq data.

clustree: producing clustering trees using ggraph

Presentation at the useR! 2018 conference introducing the clustree package and demonstrating how it makes use of the ggraph package.

Clustering trees for visualising scRNA-seq data

Selected talk at the Oz Single Cells 2018 conference where I presented clustering trees and the clustree package.

Recent Publications

HiTIME: An efficient model-selection approach for the detection of unknown drug metabolites in LC-MS data

The identification of metabolites plays an important role in understanding drug efficacy and safety; however, these compounds are often …

Benchmarking atlas-level data integration in single-cell genomics

Cell atlases often include samples that span locations, labs, and conditions, leading to complex, nested batch effects in data. Thus, …

Opportunities and challenges in long-read sequencing data analysis

Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. …

Tools and techniques for single-cell RNA sequencing data

RNA sequencing of individual cells allows us to take a snapshot of the dynamic processes within a cell and explore differences between …