Luke Zappia

Bioinformatician (he/him)


I am a bioinformatics postdoctoral researcher in the Theis Lab at the Helmholtz Zentrum München Institute of Computational Biology and the Technische Universität München. My research focuses on the analysis of single-cell RNA sequencing data, including the development and benchmarking of computational methods. I am also interested in how best to visualise data more generally.

I completed my PhD in the Oshlack Lab.


  • scRNA-seq analysis
  • Data visualisation
  • R programming


  • Doctor of Philosophy (Bioinformatics), 2019

    The University of Melbourne/Murdoch Children's Research Institute

  • Master of Science (Bioinformatics), 2015

    The University of Melbourne

  • Bachelor of Science (Chemistry), 2011

    The University of Melbourne

  • Diploma in Informatics, 2011

    The University of Melbourne




Bioconductor R package converting between scRNA-seq objects.

NBA Positions

Dataset of NBA positions designed to replace the iris dataset.

Scanpy in R

Tutorial describing how to interact with the Scanpy Python package from R.

PhD Commits

Functions for scraping git commits from repositories associated with a PhD (or anything else) and plotting them.

R Package Development Workshop

Materials for the COMBINE Australia R package development workshop.


An R package for setting up a website to display analysis of Twitter hashtags.

Twitter Stats

Analysis of Twitter activity for hashtags from various events, usually academic conferences.


clustree

CRAN R package for creating clustering trees, a visualisation for looking at clustering across resolutions.


Database and website cataloguing software tools for analysing single-cell RNA sequencing data.


Bioconductor R package for simulating scRNA-seq data.


A Python script for pretty printing of TeXcount output.

Recent Posts

Bioconductor 3.12 wrap-up

My wrap-up of the Bioconductor 3.12 release.

triple j's Requestival

Analysis of the songs played during triple j’s Requestival event.

Back to the SCE-verse!

Updated analysis of packages that use the SingleCellExperiment object in 2020.

Bioconductor 3.11 wrap-up

My wrap-up of the Bioconductor 3.11 release.

Caching blogdown posts

Description of how I cache blogdown Markdown files.

Recent & Upcoming Talks

Interoperability between Bioconductor and Python for scRNA-seq analysis

Invited keynote at the European Bioconductor meeting 2020.

Tools and techniques for single-cell RNA sequencing data

My PhD completion seminar.

Tools, simulations and trees for scRNA-seq

Presentation at the Institute of Computational Biology where I described my PhD projects.

Using clustering trees to visualise scRNA-seq data

Selected talk at the Genome Informatics 2018 conference where I described how clustering trees can be used with scRNA-seq data.

clustree: producing clustering trees using ggraph

Presentation at the useR! 2018 conference introducing the clustree package and demonstrating how it makes use of the ggraph package.

Recent Publications

HiTIME: An efficient model-selection approach for the detection of unknown drug metabolites in LC-MS data

The identification of metabolites plays an important role in understanding drug efficacy and safety however these compounds are often …

Benchmarking atlas-level data integration in single-cell genomics

Cell atlases often include samples that span locations, labs, and conditions, leading to complex, nested batch effects in data. Thus, …

Opportunities and challenges in long-read sequencing data analysis

Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. …

Tools and techniques for single-cell RNA sequencing data

RNA sequencing of individual cells allows us to take a snapshot of the dynamic processes within a cell and explore differences between …