<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>lazappi</title>
<link>https://lazappi.id.au/posts/</link>
<atom:link href="https://lazappi.id.au/posts/index.xml" rel="self" type="application/rss+xml"/>
<description>Luke Zappia&#39;s personal website</description>
<generator>quarto-1.9.36</generator>
<lastBuildDate>Mon, 11 May 2026 22:00:00 GMT</lastBuildDate>
<item>
  <title>Bioconductor 3.23 wrap-up</title>
  <link>https://lazappi.id.au/posts/2026-05-12-bioconductor-3-23-wrap-up/</link>
  <description><![CDATA[ 





<p>The Bioconductor 3.23 release was a couple of weeks ago. Here is my wrap-up of new packages and updates. These are only the things I found interesting based on the release notes and they don’t come with any particular recommendations. If there is something else you are interested in, have a look at the full release notes <a href="https://bioconductor.org/news/bioc_3_23_release/" title="Bioconductor 3.23 release notes">here</a>.</p>
<section id="my-packages" class="level1">
<h1>My packages</h1>
<section id="anndataranndatar" class="level2">
<h2 class="anchored" data-anchor-id="anndataranndatar"><a href="https://bioconductor.org/packages/release/bioc/html/anndataR.html" title="anndataR"><strong>{anndataR}</strong></a></h2>
<ul>
<li>Add initial support for reading and writing Zarr stores. This was a major effort from members of the community.</li>
<li>Improvements to chunking when writing H5AD files</li>
<li>Improvements to performance when reading sparse matrices</li>
<li>Improved warnings and error handling</li>
<li>Improved tests, documentation and CI</li>
</ul>
</section>
<section id="splattersplatter" class="level2">
<h2 class="anchored" data-anchor-id="splattersplatter"><a href="https://bioconductor.org/packages/release/bioc/html/splatter.html" title="splatter"><strong>{splatter}</strong></a></h2>
<ul>
<li>Replace deprecated functions from <a href="https://bioconductor.org/packages/release/bioc/html/scuttle.html" title="scuttle"><strong>{scuttle}</strong></a> with equivalents in <a href="https://bioconductor.org/packages/release/bioc/html/scrapper.html" title="scrapper"><strong>{scrapper}</strong></a></li>
<li>Deprecate the MFA simulation functions now that the <strong>{mfa}</strong> package is deprecated</li>
<li>Minor maintenance updates</li>
</ul>
</section>
<section id="zellkonverterzellkonverter" class="level2">
<h2 class="anchored" data-anchor-id="zellkonverterzellkonverter"><a href="https://bioconductor.org/packages/release/bioc/html/zellkonverter.html" title="zellkonverter"><strong>{zellkonverter}</strong></a></h2>
<ul>
<li>Minor updates to tests and documentation</li>
</ul>
</section>
</section>
<section id="new-packages" class="level1">
<h1>New packages</h1>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/BatChef.html" title="BatChef"><strong>{BatChef}</strong></a> - benchmark batch correction methods for scRNA-seq data and help pick an appropriate one</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/Battlefield.html" title="Battlefield"><strong>{Battlefield}</strong></a> - low-level utilities for working with spatial transcriptomics regions, interfaces and layers</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/betterChromVAR.html" title="betterChromVAR"><strong>{betterChromVAR}</strong></a> - faster chromVAR-style inference of TF activity for bulk and single-cell ATAC-seq</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/CellMentor.html" title="CellMentor"><strong>{CellMentor}</strong></a> - supervised dimensionality reduction that tries to preserve known cell-type structure</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/DenoIST.html" title="DenoIST"><strong>{DenoIST}</strong></a> - removes neighbourhood contamination from image-based spatial transcriptomics data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/dominatR.html" title="dominatR"><strong>{dominatR}</strong></a> - visualises feature dominance using concepts from physics</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/GraphExperiment.html" title="GraphExperiment"><strong>{GraphExperiment}</strong></a> - extends <code>SingleCellExperiment</code> with infrastructure for storing feature-level networks</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/hammers.html" title="hammers"><strong>{hammers}</strong></a> - utilities package for scRNA-seq analysis using both <code>Seurat</code> and <code>SingleCellExperiment</code></li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/jvecfor.html" title="jvecfor"><strong>{jvecfor}</strong></a> - faster nearest-neighbour search for large single-cell datasets with drop-in replacements for common Bioconductor workflows</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/RankMap.html" title="RankMap"><strong>{RankMap}</strong></a> - fast reference-based cell type annotation for single-cell and spatial transcriptomics data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scConform.html" title="scConform"><strong>{scConform}</strong></a> - cell type annotation with conformal prediction intervals and uncertainty quantification</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scECODA.html" title="scECODA"><strong>{scECODA}</strong></a> - workflow for analysing cell type proportions as compositional data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scLang.html" title="scLang"><strong>{scLang}</strong></a> - developer-facing helpers for writing scRNA-seq packages that work with both Seurat and <code>SingleCellExperiment</code></li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scPassport.html" title="scPassport"><strong>{scPassport}</strong></a> - stores a persistent metadata passport inside <code>Seurat</code> and <code>SingleCellExperiment</code> objects</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scTypeEval.html" title="scTypeEval"><strong>{scTypeEval}</strong></a> - tools for evaluating cell type assignments with limited ground truth data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SpatialArtifacts.html" title="SpatialArtifacts"><strong>{SpatialArtifacts}</strong></a> - quality control for identifying spatial artifacts in Visium and Visium HD data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SpNeigh.html" title="SpNeigh"><strong>{SpNeigh}</strong></a> - neighbourhood-aware spatial transcriptomics analysis including boundary detection and spatial differential expression</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/tidyprint.html" title="tidyprint"><strong>{tidyprint}</strong></a> - tidier print methods for <code>SummarizedExperiment</code> objects</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/VISTA.html" title="VISTA"><strong>{VISTA}</strong></a> - wraps differential expression workflows and visualisation in a <code>SummarizedExperiment</code>-based container</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/ZarrArray.html" title="ZarrArray"><strong>{ZarrArray}</strong></a> - <code>DelayedArray</code>-backed infrastructure for working with Zarr datasets in R</li>
</ul>
</section>
<section id="updates" class="level1">
<h1>Updates</h1>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/DropletUtils.html" title="DropletUtils"><strong>{DropletUtils}</strong></a> - <code>downsampleReads()</code> now uses the faster downsampling algorithm from <a href="https://bioconductor.org/packages/release/bioc/html/scuttle.html" title="scuttle"><strong>{scuttle}</strong></a></li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/edgeR.html" title="edgeR"><strong>{edgeR}</strong></a> - new <code>DGEListFromTximport()</code> and <code>DGEListFromTximeta()</code> helpers plus a <code>sampleWeights()</code> function</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/limma.html" title="limma"><strong>{limma}</strong></a> - <code>voom()</code> gains offset support and <code>topTableF()</code> is now finally removed</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/Rarr.html" title="Rarr"><strong>{Rarr}</strong></a> - major updates including moving the <code>DelayedArray</code> backend into the new <a href="https://bioconductor.org/packages/release/bioc/html/ZarrArray.html" title="ZarrArray"><strong>{ZarrArray}</strong></a> package and improved Zarr v3 support</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/Rhdf5lib.html" title="Rhdf5lib"><strong>{Rhdf5lib}</strong></a> - build updates and an update to HDF5 1.14.6</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/rhdf5.html" title="rhdf5"><strong>{rhdf5}</strong></a> - updated to HDF5 1.14.6 and various fixes and improvements</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scran.html" title="scran"><strong>{scran}</strong></a> - deprecates several functions in favour of <a href="https://bioconductor.org/packages/release/bioc/html/scrapper.html" title="scrapper"><strong>{scrapper}</strong></a> and fixes overflow bugs</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scrapper.html" title="scrapper"><strong>{scrapper}</strong></a> - multiple updates to several functions, continues the migration of functionality out of <a href="https://bioconductor.org/packages/release/bioc/html/scran.html" title="scran"><strong>{scran}</strong></a> and <a href="https://bioconductor.org/packages/release/bioc/html/scuttle.html" title="scuttle"><strong>{scuttle}</strong></a></li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scuttle.html" title="scuttle"><strong>{scuttle}</strong></a> - faster <code>downsampleMatrix()</code> and <code>summarizeAssayByGroup()</code>, more deprecations in favour of <code>scrapper</code></li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html" title="SingleCellExperiment"><strong>{SingleCellExperiment}</strong></a> - improved warnings for named assay getters and setters</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/tximeta.html" title="tximeta"><strong>{tximeta}</strong></a> - matching updates for <a href="https://bioconductor.org/packages/release/bioc/html/edgeR.html" title="edgeR"><strong>{edgeR}</strong></a>’s new <code>DGEListFromTximeta()</code> workflow</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/tximport.html" title="tximport"><strong>{tximport}</strong></a> - vignette updates around the new <a href="https://bioconductor.org/packages/release/bioc/html/edgeR.html" title="edgeR"><strong>{edgeR}</strong></a> integration</li>
</ul>
<div class="callout callout-style-default callout-note callout-titled" title="AI Disclaimer">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>AI Disclaimer
</div>
</div>
<div class="callout-body-container callout-body">
<p>AI was used to research and draft this post. It did a reasonable job on matching the style of previous posts but it missed several packages I had to tell it to add later. Overall, it was moderately successful.</p>
</div>
</div>


</section>

 ]]></description>
  <category>bioconductor</category>
  <category>R</category>
  <guid>https://lazappi.id.au/posts/2026-05-12-bioconductor-3-23-wrap-up/</guid>
  <pubDate>Mon, 11 May 2026 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Benchmarking feature selection for integration</title>
  <link>https://lazappi.id.au/posts/2025-03-15-feature-selection-benchmark/</link>
  <description><![CDATA[ 





<p>Our paper <a href="https://doi.org/10.1038/s41592-025-02624-3">“Feature selection methods affect the performance of scRNA-seq data integration and querying”</a><span class="citation" data-cites="Zappia2025-dq"><sup>1</sup></span> was recently published. As the title suggests, this study looked at different methods of selecting features from scRNA-seq data and how those features affected integrated datasets and using an integrated reference. In this post I wanted to give a summary of what we found and provide some more insights into the process of developing the benchmark.</p>
<section id="motivation" class="level1">
<h1>Motivation</h1>
<p>As with many steps in scRNA-seq analysis, integration methods typically recommend selecting a subset of genes to combine datasets. In a <a href="https://doi.org/10.1038/s41592-021-01336-8">previous benchmark</a><span class="citation" data-cites="Luecken2021-jo"><sup>2</sup></span> we found that using highly variable genes did in fact improve integration performance compared to using all features but there remained some open questions:</p>
<ol type="1">
<li>What feature selection method to use? In the scIB benchmark we used 2000 batch-aware highly variable genes but there are many other feature selection methods, some of which have been shown to perform better in benchmarks of other tasks.</li>
<li>What effect does feature selection have on using an integrated reference? The scIB benchmark looked at how well datasets were integrated but didn’t consider mapping new datasets to that reference.</li>
</ol>
<p>The second question was of particular interest to me. We had already shown that feature selection improves integration but what about using that integrated dataset as a reference? Whenever I had seen an example of mapping data to a reference I had also wondered about what happens to any variation in the features not included in the model. For example, can we reliably identify new populations in a mapped query dataset if the variation that separates them is in genes that the model knows nothing about?</p>
<p>As always, the scope of the benchmark grew a bit beyond this but this was the starting point for the study.</p>
</section>
<section id="study-design" class="level1">
<h1>Study design</h1>
<p>We followed the standard benchmarking setup of selecting some test datasets, finding methods to compare and using a set of metrics to evaluate them. The datasets were split into query and reference sets. Features were then selected on the reference and used to integrate it, before mapping the query and classifying the query cells. We selected some cell populations to be only present in the query which we tried to identify after mapping.</p>
<p>There are a few things about how we did the study that I think are interesting.</p>
<section id="paper-format" class="level2">
<h2 class="anchored" data-anchor-id="paper-format">Paper format</h2>
<p>From early on in the study, we decided to submit it as a <a href="https://www.nature.com/nmeth/submission-guidelines/registered-reports">registered report</a>. This format is designed for benchmarking studies and changes the review process. Instead of reviewing the completed study, a detailed plan for the benchmark is reviewed with a guarantee that if the accepted plan is followed the paper will be published regardless of the results. This removes some pressure from the authors, allowing you to present the results you found without needing to find an angle to interest reviewers and editors. However, having to follow a pre-approved plan presents some challenges.</p>
<p>For a computational benchmark, it means you have to do significant engineering work in advance to be confident that you can scale to the full benchmark. It also (by design) removes a lot of the flexibility to adjust things as you go, which is challenging as you never really know how computational tools are going to behave in advance. This is part of the reason that we included the metric selection step as part of the study, as we couldn’t do it before the proposal without effectively running the whole benchmark. However that ended up being one of the most interesting parts for me and something I think every benchmarking study should show.</p>
</section>
<section id="benchmarking-pipeline" class="level2">
<h2 class="anchored" data-anchor-id="benchmarking-pipeline">Benchmarking pipeline</h2>
<p>To implement the benchmark we built a pipeline using <a href="https://www.nextflow.io/">Nextflow</a>. For any project, being able to reliably re-run things as needed is useful but for a large benchmarking project where you know things will need to be run multiple times this is especially important. Using Nextflow allowed us to have one workflow which could adapt to different parameter sets and could be run locally for testing or on different HPC clusters (as ended up being necessary after an IT incident 😅). Each step in the workflow was implemented as a separate R or Python script which could be run separately or using a Nextflow process in the workflow. We also created a conda environment file for each tool or package, which could be used for multiple processes.<sup>1</sup> Nextflow automatically handles creating the environments when they are needed or updated.</p>
<p>The structure of the pipeline and some example scripts were written and tested before a hackathon that we used to kick off the project. This allowed everyone to start contributing straight away. By using separate scripts for each step everyone could work on separate components in their preferred language without conflicting with what other people were doing.</p>
</section>
</section>
<section id="metric-selection" class="level1">
<h1>Metric selection</h1>
<p>The first results section in the paper doesn’t involve feature selection methods at all, instead it focuses on comparing and selecting the metrics to use. As mentioned above, this was somewhat forced upon us by the paper format but I think it was a worthwhile exercise that should receive more attention. In most benchmarks this would probably be done informally (if at all) but because we had to include it in the project proposal we had to formalise the process.</p>
<p>In the development phase of the project we had collected as many metrics as possible that had been used in previous benchmarks or method comparisons.<sup>2</sup> We divided these into five categories: removal of batch effects during integration (Integration (Batch)), conservation of biological variance during integration (Integration (Bio)), mapping of a query to the reference (Mapping), label projection to cells in the query (Classification) and detection of new populations in the query (Unseen). There were some that we excluded straight away (because they didn’t have a useable implementation or didn’t fit into our framework) but we tried to implement as many as possible. In the metric selection step, we then wanted to decide which metrics to consider. There were a few criteria we considered:</p>
<ul>
<li>Does the metric have a useable dynamic range (i.e.&nbsp;does it actually measure something in our scenario)?</li>
<li>Is the metric correlated with technical factors like the number of selected features?</li>
<li>Is the metric redundant (i.e.&nbsp;is it overly correlated with other metrics in the same category)?</li>
<li>Is the metric correctly categorised (i.e.&nbsp;is it correlated with metrics in the same category and anti-correlated with metrics in other categories)?</li>
</ul>
<p>We did this by running the benchmark using sets of randomly selected features of different sizes which allowed us to investigate metric behaviour without biasing towards any particular method. We also used a highly variable gene method to check correlation with the number of selected features as random sets do not have any inherent ordering. This worked well in our benchmark but could be more challenging in other scenarios where it is not as easy to simulate the methods you are trying to evaluate.</p>
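<p>As a rough illustration of the first three criteria, here is a minimal Python sketch. The metric names and scores are hypothetical toy values, not results from the paper, and the helper functions are my own stand-ins rather than code from the benchmark pipeline.</p>

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def dynamic_range(scores):
    """Spread between the best and worst observed score for a metric."""
    return max(scores) - min(scores)


# Hypothetical scores for three metrics across feature sets of increasing size
n_features = [500, 1000, 2000, 5000, 10000]
scores = {
    "metric_a": [0.50, 0.55, 0.62, 0.70, 0.78],
    "metric_b": [0.51, 0.56, 0.61, 0.71, 0.77],
    "metric_c": [0.90, 0.90, 0.91, 0.90, 0.91],
}

# Criterion 1: does the metric move at all in this scenario?
ranges = {name: dynamic_range(vals) for name, vals in scores.items()}

# Criterion 2: does the metric track the number of selected features?
size_corr = {name: pearson(n_features, vals) for name, vals in scores.items()}

# Criterion 3: are two metrics in the same category telling us the same thing?
redundancy = pearson(scores["metric_a"], scores["metric_b"])
```

<p>In this toy example <code>metric_c</code> barely moves, <code>metric_a</code> rises with the number of features, and <code>metric_a</code> and <code>metric_b</code> are close to interchangeable. Where to draw the thresholds is a judgement call that the sketch does not attempt to make.</p>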
<p>The metric selection step allowed us to choose a set of metrics that we were confident reliably measured what we were interested in evaluating.<sup>3</sup></p>
<section id="baselines-and-scaling" class="level2">
<h2 class="anchored" data-anchor-id="baselines-and-scaling">Baselines and scaling</h2>
<p>Combining scores from multiple metrics is inevitably challenging because they have different effective ranges. Even though we had implemented each metric so that the theoretical worst score was 0 and the theoretical best score was 1 we knew they would have different ranges in practice. In the scIB study<span class="citation" data-cites="Luecken2021-jo"><sup>2</sup></span> we had min-max scaled each metric before combining them but this made analysing them difficult as adding, removing or modifying any method changed the results. For this study, we instead used a process more similar to what has been implemented in <a href="https://openproblems.bio/">Open Problems</a><span class="citation" data-cites="Luecken2024-ur"><sup>3</sup></span>.</p>
<p>This involved a set of baseline methods that we expected to perform well or poorly (depending on the metric). These were used to establish an effective range for each metric (for a dataset) and other methods were scaled to this range. As well as making the metrics comparable, this also provides context to the metric scores. We can easily see that any score above 1 performs better than all the baseline methods and any score less than 0 performs worse than all the baselines.</p>
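<p>The scaling itself is simple arithmetic. The sketch below assumes that the worst baseline maps to 0 and the best baseline maps to 1 for a given metric and dataset. The baseline names, method names and scores are hypothetical and only there to show the idea.</p>

```python
def scale_to_baselines(score, baseline_scores):
    """Rescale a raw score so the worst baseline maps to 0 and the best to 1."""
    low = min(baseline_scores)
    high = max(baseline_scores)
    if high == low:
        raise ValueError("baselines must span a non-zero range")
    return (score - low) / (high - low)


# Hypothetical raw scores for one metric on one dataset
baselines = {"baseline_x": 0.40, "baseline_y": 0.55, "baseline_z": 0.80}
methods = {"method_a": 0.70, "method_b": 0.90, "method_c": 0.30}

scaled = {
    name: scale_to_baselines(raw, baselines.values())
    for name, raw in methods.items()
}
# method_a lands inside the baseline range, method_b ends up above 1
# (better than every baseline) and method_c below 0 (worse than every baseline)
```

<p>Unlike min-max scaling over the evaluated methods, the range here only depends on the baselines, so adding or removing a method leaves every other scaled score unchanged.</p>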
<p>Once the metric scores were on the same scale we could combine them to create category and overall scores that we used for most of the evaluation.</p>
</section>
</section>
<section id="results" class="level1">
<h1>Results</h1>
<p>Now that I have explained how we did the study, let’s talk about the results! I’m going to mention some of the key points but you should really read the paper for the details.</p>
<section id="how-many-features-to-use" class="level2">
<h2 class="anchored" data-anchor-id="how-many-features-to-use">How many features to use?</h2>
<p>Most of the methods we evaluated require the user to set how many features to select. Using a subset of methods from commonly used packages<sup>4</sup> we selected different numbers of features and evaluated the performance. We saw different patterns for different datasets and for each metric category. Integration was slightly better with fewer features and the query categories scored higher with more features.</p>
<p>In the end, we chose to use 2000 features for comparing methods. It was reassuring to see that the number we found was consistent with common practice but there was enough variation here that we recommend tuning the number of features for your dataset and use case.</p>
</section>
<section id="which-method-to-use" class="level2">
<h2 class="anchored" data-anchor-id="which-method-to-use">Which method to use?</h2>
<p>Finally, we have reached the actual comparison between methods 🎉!</p>
<p>The results here are reassuring but maybe not surprising. We found that the standard highly variable feature selection methods were the best performers, particularly the variance stabilising transformation approach in the <a href="https://satijalab.org/seurat/">Seurat</a><span class="citation" data-cites="Satija2015-or"><sup>4</sup></span> and <a href="https://scanpy.readthedocs.io/en/stable/">scanpy</a><span class="citation" data-cites="Wolf2018-na"><sup>5</sup></span> packages. There is maybe some bias here as the reference labels we used for evaluation were likely to have come from one of these methods but I don’t think it is a significant issue given all the other analysis steps that go into annotating cells. More likely, it is just that this approach intuitively makes sense and works pretty well most of the time.</p>
<p>As a comparison we included one supervised method using the cell labels based on a filtered set of Wilcoxon marker genes. This also performed very well which is probably to be expected given we use the same labels for evaluation. However, in most cases cell labels are not available before integration and we saw more variation across datasets (possibly depending on the quality and resolution of the labels). I wouldn’t rule out using supervised features, but I think you need to have confidence in the labels and some motivation for why it would be better than an unsupervised approach. Possibly, using marker genes to augment highly variable features would be a good combined approach.</p>
<p>The other top performer was <a href="https://triku.readthedocs.io/en/latest/">triku</a><span class="citation" data-cites="M_Ascension2022-yp"><sup>6</sup></span> which operates on a neighbourhood graph and is worth considering if you are looking for an alternative approach.</p>
</section>
<section id="do-i-need-to-use-batch-aware-features" class="level2">
<h2 class="anchored" data-anchor-id="do-i-need-to-use-batch-aware-features">Do I need to use batch-aware features?</h2>
<p>Some packages implement batch-aware variants of their feature selection methods where they are applied to each batch separately and the results combined in some way (usually by choosing the most commonly selected features). This is the approach we considered in the scIB paper<span class="citation" data-cites="Luecken2021-jo"><sup>2</sup></span> and has since become recommended practice. The intuition is that by selecting features within each batch you avoid choosing those that are different between batches, conserving more biology. We compared batch-aware variants of methods in scanpy but didn’t see any consistent differences that let us say what the effect is.</p>
<p>Personally, I probably wouldn’t bother with batch-aware features, unless it was for computational reasons (because you can process each batch separately and combine the results).</p>
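<p>For readers who haven’t seen a batch-aware variant, here is a toy Python sketch of the general idea. It ranks genes by plain variance within each batch and then keeps the genes chosen in the most batches. Both the variance ranking and the tie-breaking by gene name are simplifying assumptions of mine, not how scanpy or any other package implements it.</p>

```python
from collections import Counter


def top_variable(expr, genes, n_top):
    """Rank genes by plain variance across cells and keep the n_top highest."""
    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    ranked = sorted(genes, key=lambda g: variance(expr[g]), reverse=True)
    return ranked[:n_top]


def batch_aware_selection(batches, genes, n_top):
    """Select within each batch, then keep the genes chosen in the most batches."""
    counts = Counter()
    for expr in batches.values():
        counts.update(top_variable(expr, genes, n_top))
    # Sort by how many batches chose each gene, breaking ties by gene name
    ranked = sorted(genes, key=lambda g: (-counts[g], g))
    return ranked[:n_top]


# Hypothetical expression values: one list of per-cell values per gene per batch
genes = ["A", "B", "C", "D"]
batches = {
    "batch1": {"A": [0, 10, 0, 10], "B": [1, 5, 1, 5], "C": [2, 2, 2, 2], "D": [3, 3, 3, 4]},
    "batch2": {"A": [0, 8, 0, 8], "B": [2, 2, 2, 3], "C": [1, 1, 1, 1], "D": [0, 6, 0, 6]},
    "batch3": {"A": [0, 9, 1, 9], "B": [0, 4, 0, 4], "C": [5, 5, 5, 5], "D": [1, 1, 2, 1]},
}

selected = batch_aware_selection(batches, genes, n_top=2)
```

<p>Because each batch is processed independently before the counts are combined, the per-batch step can be run separately, which is the computational convenience mentioned above.</p>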
</section>
<section id="should-i-integrate-lineages-separately" class="level2">
<h2 class="anchored" data-anchor-id="should-i-integrate-lineages-separately">Should I integrate lineages separately?</h2>
<p>This question moves a bit beyond the original scope of the paper and comes out of reviewer comments to look more into the interaction between feature selection and biological factors. Here, we compared performance on the full <a href="https://doi.org/10.1038/s41591-023-02327-2">Human Lung Cell Atlas</a><span class="citation" data-cites="Sikkema2023-ia"><sup>7</sup></span> (HLCA) dataset to the immune and epithelial subsets. This section was to test the idea that restricting the biology in the dataset could improve feature selection (by selecting more specific biological features) and integration.</p>
<p>A full benchmark study is needed to properly answer this question but our results don’t support that idea. We saw better performance on the full HLCA, particularly when identifying previously unseen populations using the <a href="https://doi.org/10.1038/s41587-021-01033-z">Milo</a><span class="citation" data-cites="Dann2022-nz"><sup>8</sup></span> metric. My explanation is that by showing the integration model a wider variety of biology it learns more about the possible cell space and can therefore better separate new types of cells. But like I said, you could do a full study just to answer this question and we didn’t consider selecting features on each lineage but then integrating the full dataset.</p>
</section>
<section id="how-do-selected-features-interact-with-the-integration-method" class="level2">
<h2 class="anchored" data-anchor-id="how-do-selected-features-interact-with-the-integration-method">How do selected features interact with the integration method?</h2>
<p>In the last section of the results we look at interactions with different integration models. Comparing integration methods has been done previously and was specifically outside the scope of this study but we included a small comparison at the prompting of the reviewers and to investigate some specific questions. We used <a href="https://doi.org/10.1038/s41592-018-0229-2">scVI</a><span class="citation" data-cites="Lopez2018-au"><sup>9</sup></span> as the integration and mapping method for most of the study but here we compared to <a href="https://doi.org/10.15252/msb.20209620">scANVI</a><span class="citation" data-cites="Xu2021-dh"><sup>10</sup></span> and <a href="https://doi.org/10.1038/s41592-019-0619-0">Harmony</a><span class="citation" data-cites="Korsunsky2019-ex"><sup>11</sup></span> followed by <a href="https://doi.org/10.1038/s41467-021-25957-x">Symphony</a><span class="citation" data-cites="Kang2021-ac"><sup>12</sup></span> mapping.</p>
<p>Overall, scANVI performed slightly better than scVI, particularly in the biological categories. This is probably because the model knowing something about cell labels allowed it to overcome deficiencies in the selected features. In contrast, Symphony performed worse in general but particularly in unseen population detection. A proper evaluation would be needed to work out why this is<sup>5</sup> but I would be cautious about using this integration approach for a reference mapping application.</p>
</section>
</section>
<section id="summary" class="level1">
<h1>Summary</h1>
<p>Ok, so after all that, what did we learn?</p>
<ul>
<li>The registered report is an interesting format but it requires you to do a lot of work in advance and doesn’t necessarily make things quicker</li>
<li>Setting up a proper workflow takes some effort but is invaluable when you need to re-run things, especially in a large, computationally-intense project</li>
<li>Choosing the metrics to use is vital and something that more time should be spent on in benchmarking papers</li>
<li>Think carefully about how to scale and combine metric scores, use baseline methods if you can</li>
<li>Around 2000 features generally works well but you should tune this for your dataset and application</li>
<li>Using batch-aware features doesn’t add much</li>
<li>Integrating lineages separately results in worse detection of unseen populations</li>
<li>scANVI generally improves performance over scVI regardless of the feature selection method but Symphony struggled with detecting new populations</li>
</ul>
<p>Thanks for reading! You can find more about this study here:</p>
<ul>
<li><a href="https://doi.org/10.1038/s41592-025-02624-3">Publication</a></li>
<li><a href="https://github.com/theislab/atlas-feature-selection-benchmark">GitHub repository</a></li>
<li><a href="https://figshare.com/projects/Benchmarking_feature_selection_for_scRNA-seq_integration/214819">figshare collection</a></li>
</ul>



</section>


<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body" data-entry-spacing="0" data-line-spacing="2">
<div id="ref-Zappia2025-dq" class="csl-entry">
<div class="csl-left-margin">1. </div><div class="csl-right-inline">Zappia, L., Richter, S., Ramírez-Suástegui, C., Kfuri-Rubens, R., Vornholz, L., Wang, W., Dietrich, O., Frishberg, A., Luecken, M. D. &amp; Theis, F. J. <span class="nocase">Feature selection methods affect the performance of scRNA-seq data integration and querying</span>. <em>Nature methods</em> 1–11 (2025). doi:<a href="https://doi.org/10.1038/s41592-025-02624-3">10.1038/s41592-025-02624-3</a></div>
</div>
<div id="ref-Luecken2021-jo" class="csl-entry">
<div class="csl-left-margin">2. </div><div class="csl-right-inline">Luecken, M. D., Büttner, M., Chaichoompu, K., Danese, A., Interlandi, M., Mueller, M. F., Strobl, D. C., Zappia, L., Dugas, M., Colomé-Tatché, M. &amp; Theis, F. J. <span class="nocase">Benchmarking atlas-level data integration in single-cell genomics</span>. <em>Nature methods</em> (2021). doi:<a href="https://doi.org/10.1038/s41592-021-01336-8">10.1038/s41592-021-01336-8</a></div>
</div>
<div id="ref-Luecken2024-ur" class="csl-entry">
<div class="csl-left-margin">3. </div><div class="csl-right-inline">Luecken, M. D., Gigante, S., Burkhardt, D. B., Cannoodt, R., Strobl, D. C., Markov, N. S., Zappia, L., Palla, G., Lewis, W., Dimitrov, D., Vinyard, M. E., Magruder, D. S., Andersson, A., Dann, E., Qin, Q., Otto, D. J., Klein, M., Botvinnik, O. B., Deconinck, L., Waldrant, K., Open Problems Jamboree Members, Bloom, J. M., Pisco, A. O., Saez-Rodriguez, J., Wulsin, D., Pinello, L., Saeys, Y., Theis, F. J. &amp; Krishnaswamy, S. <span class="nocase">Defining and benchmarking open problems in single-cell analysis</span>. <em>Research square</em> (2024). doi:<a href="https://doi.org/10.21203/rs.3.rs-4181617/v1">10.21203/rs.3.rs-4181617/v1</a></div>
</div>
<div id="ref-Satija2015-or" class="csl-entry">
<div class="csl-left-margin">4. </div><div class="csl-right-inline">Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. &amp; Regev, A. <a href="https://doi.org/10.1038/nbt.3192"><span class="nocase">Spatial reconstruction of single-cell gene expression data</span></a>. <em>Nature Biotechnology</em> <strong>33,</strong> 495–502 (2015).</div>
</div>
<div id="ref-Wolf2018-na" class="csl-entry">
<div class="csl-left-margin">5. </div><div class="csl-right-inline">Wolf, F. A., Angerer, P. &amp; Theis, F. J. <a href="https://doi.org/10.1186/s13059-017-1382-0"><span class="nocase">SCANPY: large-scale single-cell gene expression data analysis</span></a>. <em>Genome biology</em> <strong>19,</strong> 15 (2018).</div>
</div>
<div id="ref-M_Ascension2022-yp" class="csl-entry">
<div class="csl-left-margin">6. </div><div class="csl-right-inline">M Ascensión, A., Ibáñez-Solé, O., Inza, I., Izeta, A. &amp; Araúzo-Bravo, M. J. <a href="https://doi.org/10.1093/gigascience/giac017"><span class="nocase">Triku: a feature selection method based on nearest neighbors for single-cell data</span></a>. <em>GigaScience</em> <strong>11,</strong> (2022).</div>
</div>
<div id="ref-Sikkema2023-ia" class="csl-entry">
<div class="csl-left-margin">7. </div><div class="csl-right-inline">Sikkema, L., Ramírez-Suástegui, C., Strobl, D. C., Gillett, T. E., Zappia, L., Madissoon, E., Markov, N. S., Zaragosi, L.-E., Ji, Y., Ansari, M., Arguel, M.-J., Apperloo, L., Banchero, M., Bécavin, C., Berg, M., Chichelnitskiy, E., Chung, M.-I., Collin, A., Gay, A. C. A., Gote-Schniering, J., Hooshiar Kashani, B., Inecik, K., Jain, M., Kapellos, T. S., Kole, T. M., Leroy, S., Mayr, C. H., Oliver, A. J., Papen, M. von, Peter, L., Taylor, C. J., Walzthoeni, T., Xu, C., Bui, L. T., De Donno, C., Dony, L., Faiz, A., Guo, M., Gutierrez, A. J., Heumos, L., Huang, N., Ibarra, I. L., Jackson, N. D., Kadur Lakshminarasimha Murthy, P., Lotfollahi, M., Tabib, T., Talavera-López, C., Travaglini, K. J., Wilbrey-Clark, A., Worlock, K. B., Yoshida, M., Lung Biological Network Consortium, Berge, M. van den, Bossé, Y., Desai, T. J., Eickelberg, O., Kaminski, N., Krasnow, M. A., Lafyatis, R., Nikolic, M. Z., Powell, J. E., Rajagopal, J., Rojas, M., Rozenblatt-Rosen, O., Seibold, M. A., Sheppard, D., Shepherd, D. P., Sin, D. D., Timens, W., Tsankov, A. M., Whitsett, J., Xu, Y., Banovich, N. E., Barbry, P., Duong, T. E., Falk, C. S., Meyer, K. B., Kropski, J. A., Pe’er, D., Schiller, H. B., Tata, P. R., Schultze, J. L., Teichmann, S. A., Misharin, A. V., Nawijn, M. C., Luecken, M. D. &amp; Theis, F. J. <a href="https://doi.org/10.1038/s41591-023-02327-2"><span class="nocase">An integrated cell atlas of the lung in health and disease</span></a>. <em>Nature medicine</em> <strong>29,</strong> 1563–1577 (2023).</div>
</div>
<div id="ref-Dann2022-nz" class="csl-entry">
<div class="csl-left-margin">8. </div><div class="csl-right-inline">Dann, E., Henderson, N. C., Teichmann, S. A., Morgan, M. D. &amp; Marioni, J. C. <a href="https://doi.org/10.1038/s41587-021-01033-z"><span class="nocase">Differential abundance testing on single-cell data using k-nearest neighbor graphs</span></a>. <em>Nature biotechnology</em> <strong>40,</strong> 245–253 (2022).</div>
</div>
<div id="ref-Lopez2018-au" class="csl-entry">
<div class="csl-left-margin">9. </div><div class="csl-right-inline">Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. &amp; Yosef, N. <a href="https://doi.org/10.1038/s41592-018-0229-2"><span class="nocase">Deep generative modeling for single-cell transcriptomics</span></a>. <em>Nature methods</em> <strong>15,</strong> 1053–1058 (2018).</div>
</div>
<div id="ref-Xu2021-dh" class="csl-entry">
<div class="csl-left-margin">10. </div><div class="csl-right-inline">Xu, C., Lopez, R., Mehlman, E., Regier, J., Jordan, M. I. &amp; Yosef, N. <a href="https://doi.org/10.15252/msb.20209620"><span class="nocase">Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models</span></a>. <em>Molecular systems biology</em> <strong>17,</strong> e9620 (2021).</div>
</div>
<div id="ref-Korsunsky2019-ex" class="csl-entry">
<div class="csl-left-margin">11. </div><div class="csl-right-inline">Korsunsky, I., Millard, N., Fan, J., Slowikowski, K., Zhang, F., Wei, K., Baglaenko, Y., Brenner, M., Loh, P.-R. &amp; Raychaudhuri, S. <span class="nocase">Fast, sensitive and accurate integration of single-cell data with Harmony</span>. <em>Nature methods</em> (2019). doi:<a href="https://doi.org/10.1038/s41592-019-0619-0">10.1038/s41592-019-0619-0</a></div>
</div>
<div id="ref-Kang2021-ac" class="csl-entry">
<div class="csl-left-margin">12. </div><div class="csl-right-inline">Kang, J. B., Nathan, A., Weinand, K., Zhang, F., Millard, N., Rumker, L., Moody, D. B., Korsunsky, I. &amp; Raychaudhuri, S. <a href="https://doi.org/10.1038/s41467-021-25957-x"><span class="nocase">Efficient and precise single-cell reference atlas mapping with Symphony</span></a>. <em>Nature communications</em> <strong>12,</strong> 5890 (2021).</div>
</div>
</div></section><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>I would probably use Docker containers or <a href="https://viash.io/">Viash</a> components for this now, but this worked well enough↩︎</p></li>
<li id="fn2"><p>We wanted to avoid developing new metrics but there were some that we significantly modified↩︎</p></li>
<li id="fn3"><p>See the paper for exactly which metrics we selected and why↩︎</p></li>
<li id="fn4"><p>It was computationally infeasible to do this for all methods, given the computation required for each integration run↩︎</p></li>
<li id="fn5"><p>Our workflow was designed around scVI but Symphony is a completely different class of integration model and there may be some bias there↩︎</p></li>
</ol>
</section><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@misc{zappia2025,
  author = {Zappia, Luke},
  title = {Benchmarking Feature Selection for Integration},
  date = {2025-03-15},
  url = {https://lazappi.id.au/posts/2025-03-15-feature-selection-benchmark/},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-zappia2025" class="csl-entry quarto-appendix-citeas">
<div class="">Zappia, L. Benchmarking feature selection for
integration. (2025). at &lt;<a href="https://lazappi.id.au/posts/2025-03-15-feature-selection-benchmark/">https://lazappi.id.au/posts/2025-03-15-feature-selection-benchmark/</a>&gt;</div>
</div></div></section></div> ]]></description>
  <category>scrna-seq</category>
  <category>benchmarking</category>
  <category>feature selection</category>
  <category>integration</category>
  <category>publication</category>
  <guid>https://lazappi.id.au/posts/2025-03-15-feature-selection-benchmark/</guid>
  <pubDate>Fri, 14 Mar 2025 23:00:00 GMT</pubDate>
  <media:content url="https://lazappi.id.au/posts/2025-03-15-feature-selection-benchmark/thumbnail.png" medium="image" type="image/png" height="102" width="144"/>
</item>
<item>
  <title>scverse conference</title>
  <link>https://lazappi.id.au/posts/2024-09-15-scverse-conference/</link>
  <description><![CDATA[ 





<section id="summary" class="level1">
<h1>Summary</h1>
<p>This week was the first ever <a href="https://scverse.org/conference2024">scverse conference</a>, held in Munich from 10-12 September. While it was based around the <a href="https://scverse.org/">scverse</a> software community, the conference covered more than just those core packages for single-cell analysis and included talks on a range of biological topics as well as a diverse range of workshops. Putting together any conference is a lot of work, but particularly the first attempt for what is still a new community. I was very impressed with what the organisers were able to put together, how smoothly everything ran and the number of attendees from around the world. It is great to see this growing into a real community that is more than just the core maintainers from a few packages and I look forward to seeing what they do in the years ahead. As with any effort built on the work of students, the difficulty is always maintaining momentum as those people move on and everything is passed to a new generation, but I am hopeful they are building something that will be sustainable with support from senior academics and industry.</p>
</section>
<section id="keynote-sketchotes" class="level1">
<h1>Keynote sketchnotes</h1>
<p>Here are my sketchnotes summarising the keynote talks (click to expand):</p>
<div class="quarto-layout-panel" data-layout-nrow="3">
<div class="quarto-layout-row">
<div class="quarto-layout-cell" style="flex-basis: 50.0%;justify-content: flex-start;">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="1-RobPatro.jpg" class="lightbox" data-gallery="scverse-sketchnotes" title="Rob Patro - “Upstream of the single-cell data deluge: On the importance of accurate, efficient and open methods for preprocessing single-cell data”"><img src="https://lazappi.id.au/posts/2024-09-15-scverse-conference/1-RobPatro.jpg" class="img-fluid figure-img" alt="Sketchnotes of Rob Patro's scverse conference keynote 'Upstream of the single-cell data deluge: On the importance of accurate, efficient and open methods for preprocessing single-cell data'"></a></p>
<figcaption>Rob Patro - “Upstream of the single-cell data deluge: On the importance of accurate, efficient and open methods for preprocessing single-cell data”</figcaption>
</figure>
</div>
</div>
<div class="quarto-layout-cell" style="flex-basis: 50.0%;justify-content: flex-start;">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="2-AngelaOlivieraPisco.jpg" class="lightbox" data-gallery="scverse-sketchnotes" title="Angela Oliveira Pisco - “Multimodal Atlas for Biological Data Analysis and Drug Discovery”"><img src="https://lazappi.id.au/posts/2024-09-15-scverse-conference/2-AngelaOlivieraPisco.jpg" class="img-fluid figure-img" alt="Sketchnotes of Angela Oliveira Pisco's scverse conference keynote 'Multimodal Atlas for Biological Data Analysis and Drug Discovery'"></a></p>
<figcaption>Angela Oliveira Pisco - “Multimodal Atlas for Biological Data Analysis and Drug Discovery”</figcaption>
</figure>
</div>
</div>
</div>
<div class="quarto-layout-row">
<div class="quarto-layout-cell" style="flex-basis: 50.0%;justify-content: flex-start;">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="3-ChristinaLeslie.jpg" class="lightbox" data-gallery="scverse-sketchnotes" title="Christina Leslie - “Machine learning for regulatory genomics at single-cell resolution”"><img src="https://lazappi.id.au/posts/2024-09-15-scverse-conference/3-ChristinaLeslie.jpg" class="img-fluid figure-img" alt="Sketchnotes of Christina Leslie's scverse conference keynote 'Machine learning for regulatory genomics at single-cell resolution'"></a></p>
<figcaption>Christina Leslie - “Machine learning for regulatory genomics at single-cell resolution”</figcaption>
</figure>
</div>
</div>
<div class="quarto-layout-cell" style="flex-basis: 50.0%;justify-content: flex-start;">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="4-AlexWolf.jpg" class="lightbox" data-gallery="scverse-sketchnotes" title="Alex Wolf - “Many anecdotes make a novel? Study-centered analysis &amp; training models”"><img src="https://lazappi.id.au/posts/2024-09-15-scverse-conference/4-AlexWolf.jpg" class="img-fluid figure-img" alt="Sketchnotes of Alex Wolf's scverse conference keynote 'Many anecdotes make a novel? Study-centered analysis &amp; training models'"></a></p>
<figcaption>Alex Wolf - “Many anecdotes make a novel? Study-centered analysis &amp; training models”</figcaption>
</figure>
</div>
</div>
</div>
<div class="quarto-layout-row">
<div class="quarto-layout-cell" style="flex-basis: 50.0%;justify-content: flex-start;">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="5-MariaBrbic.jpg" class="lightbox" data-gallery="scverse-sketchnotes" title="Maria Brbic - “Towards AI-driven discoveries in Single-Cell genomics”"><img src="https://lazappi.id.au/posts/2024-09-15-scverse-conference/5-MariaBrbic.jpg" class="img-fluid figure-img" alt="Sketchnotes of Maria Brbic's scverse conference keynote 'Towards AI-driven discoveries in Single-Cell genomics'"></a></p>
<figcaption>Maria Brbic - “Towards AI-driven discoveries in Single-Cell genomics”</figcaption>
</figure>
</div>
</div>
<div class="quarto-layout-cell" style="flex-basis: 50.0%;justify-content: flex-start;">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="6-FabianTheis.jpg" class="lightbox" data-gallery="scverse-sketchnotes" title="Fabian Theis - “From scanpy to the virtual cell: the coming-of-age of single cell analysis”"><img src="https://lazappi.id.au/posts/2024-09-15-scverse-conference/6-FabianTheis.jpg" class="img-fluid figure-img" alt="Sketchnotes of Fabian Theis's scverse conference keynote 'From scanpy to the virtual cell: the coming-of-age of single cell analysis'"></a></p>
<figcaption>Fabian Theis - “From scanpy to the virtual cell: the coming-of-age of single cell analysis”</figcaption>
</figure>
</div>
</div>
</div>
</div>


</section>

 ]]></description>
  <category>scverse</category>
  <category>conference</category>
  <category>sketchnotes</category>
  <guid>https://lazappi.id.au/posts/2024-09-15-scverse-conference/</guid>
  <pubDate>Sat, 14 Sep 2024 22:00:00 GMT</pubDate>
</item>
<item>
  <title>scRNA-tools low-maintenance mode</title>
  <link>https://lazappi.id.au/posts/2024-03-04-scRNAtools-low-maintenance/</link>
  <description><![CDATA[ 





<p>You are probably here because you have seen or heard that the <a href="https://www.scrna-tools.org/">scRNA-tools database</a> will be entering low-maintenance mode. This post explains a bit more about what that means and some of the motivation behind the decision.</p>
<section id="hang-on-whats-scrna-tools" class="level2">
<h2 class="anchored" data-anchor-id="hang-on-whats-scrna-tools">Hang on, what’s scRNA-tools?</h2>
<p>scRNA-tools is a database that catalogues software tools for analysing scRNA-seq data which I created during my PhD and have curated and maintained ever since. You can read more about it on the <a href="../../projects/scRNA-tools">project page</a>, in the <a href="../../publications/2018-zappia-scrnatools">original paper</a> and in our analysis of the first <a href="../../publications/2021-zappia-1000-tools">1000 tools</a> in the database.</p>
</section>
<section id="so-whats-happening" class="level2">
<h2 class="anchored" data-anchor-id="so-whats-happening">So, what’s happening?</h2>
<p>In May 2024 I will be putting scRNA-tools into what I am calling “low-maintenance mode”. Basically, this means I will stop actively seeking out new tools to add to the database and updates for existing tools. I will still make changes based on <a href="https://www.scrna-tools.org/submit">user submissions</a> and the existing automated checks but the rate of updates will significantly decrease. Upgrades to the database, website and other code are unlikely to happen unless things break.</p>
</section>
<section id="thats-sad-why-now" class="level2">
<h2 class="anchored" data-anchor-id="thats-sad-why-now">That’s sad 😿, why now?</h2>
<p>I’m moving on to the next step in my career and can no longer justify the 2-3 hours per week it takes to review new papers, curate them and add tools to the database. Transitioning to low-maintenance mode means less frequent updates and should reduce the commitment to something I can keep up with, similar to the software packages I maintain. I would love to keep scRNA-tools running as it has but the continuous, ongoing nature of the curation makes it difficult when you can no longer justify it as “work”, and this will only increase as the field continues to grow.</p>
<p>I also have to acknowledge that I haven’t developed scRNA-tools as I would have liked over the last few years. There are several new categories and features I wanted to add but, as the typical academic says, I never found the time.</p>
</section>
<section id="maybe-someone-else-could-take-over" class="level2">
<h2 class="anchored" data-anchor-id="maybe-someone-else-could-take-over">Maybe someone else could take over?</h2>
<p>I always wanted scRNA-tools to be a community project so I am definitely open to this but I would need to be convinced that whoever takes over would be able to commit to actively maintaining the project. In the past I have had people reach out to contribute, and even recruited a team to help with curation and expanding the database, but that always petered out after a few weeks or months. I totally don’t blame anyone for that, there is very little for a PhD student to gain from contributing to a project like this, but it does make me cautious about handing over the project to someone who may not end up putting more time into it than I can.</p>
</section>
<section id="last-thoughts" class="level2">
<h2 class="anchored" data-anchor-id="last-thoughts">Last thoughts</h2>
<p>scRNA-tools has helped record the first years of the single-cell genomics revolution and has been a big part of my work life for the last eight or so years. While I think it is a valuable resource, it is also limited, especially as the field expands into other modalities. It will be sad to see it not updated as frequently but it’s time for us both to move on.</p>


</section>

 ]]></description>
  <category>scRNA-tools</category>
  <category>database</category>
  <guid>https://lazappi.id.au/posts/2024-03-04-scRNAtools-low-maintenance/</guid>
  <pubDate>Sun, 03 Mar 2024 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Bioconductor 3.12 wrap-up</title>
  <link>https://lazappi.id.au/posts/2020-10-30-bioconductor-3-12-wrap-up/</link>
  <description><![CDATA[ 





<p>The Bioconductor 3.12 release was this week. Here is my wrap-up of new packages and updates. These are only the things I found interesting based on the release notes and they don’t come with any particular endorsement. If there is something else you are looking for have a look at the release notes <a href="https://bioconductor.org/news/bioc_3_12_release/" title="Bioconductor 3.12 release notes">here</a>.</p>
<section id="my-packages" class="level1">
<h1>My packages</h1>
<section id="splattersplatter" class="level2">
<h2 class="anchored" data-anchor-id="splattersplatter"><a href="https://bioconductor.org/packages/release/bioc/html/splatter.html" title="splatter"><strong>{splatter}</strong></a></h2>
<ul>
<li>Add the splatPop simulation. This is an extension to the splat simulation contributed by Christina Azodi and Davis McCarthy that adds population effects. It allows you to specify relatedness between individuals and generate cell-type specific eQTL effects.</li>
<li>Add a batch.rmEffect parameter to the Splat simulation. This allows generation of a paired simulation without any batch effects.</li>
<li>Add a new minimiseSCE function which can be used to remove unneeded information from simulation output (or any SingleCellExperiment)</li>
<li>All simulations now return sparse assay matrices by default when they would be smaller than the equivalent dense matrix. This is controlled by a new sparsify argument.</li>
<li>Users will now be automatically prompted to install packages if they try to use a simulation for which the suggested dependencies are not available</li>
</ul>
</section>
<section id="zellkonverterzellkonverter" class="level2">
<h2 class="anchored" data-anchor-id="zellkonverterzellkonverter"><a href="https://bioconductor.org/packages/release/bioc/html/zellkonverter.html" title="zellkonverter"><strong>{zellkonverter}</strong></a></h2>
<p>This is a new package (with help from Aaron Lun) that contains methods to convert between SingleCellExperiment and Python AnnData objects.</p>
</section>
</section>
<section id="new-packages" class="level1">
<h1>New packages</h1>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/ADImpute.html" title="ADImpute"><strong>{ADImpute}</strong></a> - dropout imputation using information from gene regulatory networks</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/aggregateBioVar.html" title="aggregateBioVar"><strong>{aggregateBioVar}</strong></a> - provides tools for aggregating SummarizedExperiment objects at the subject level</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/BayesSpace.html" title="BayesSpace"><strong>{BayesSpace}</strong></a> - tools for clustering and enhancing spatial gene expression</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/BiocIO.html" title="BiocIO"><strong>{BiocIO}</strong></a> - generics for importing and exporting biological data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/biocthis.html" title="biocthis"><strong>{biocthis}</strong></a> - automate setting up packages for Bioconductor</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/bluster.html" title="bluster"><strong>{bluster}</strong></a> - wraps common clustering algorithms for Bioconductor</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/corral.html" title="corral"><strong>{corral}</strong></a> - correspondence analysis for single-cell data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/densvis.html" title="densvis"><strong>{densvis}</strong></a> - implements the density-preserving modifications to t-SNE and UMAP</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/escape.html" title="escape"><strong>{escape}</strong></a> - bridging package for scRNA-seq gene set enrichment</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/ExperimentSubset.html" title="ExperimentSubset"><strong>{ExperimentSubset}</strong></a> - interface for accessing subsets of SummarizedExperiment objects</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/Herper.html" title="Herper"><strong>{Herper}</strong></a> - interface for managing conda environments</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/ILoReg.html" title="ILoReg"><strong>{ILoReg}</strong></a> - high-resolution scRNA-seq population identification</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/Informeasure.html" title="Informeasure"><strong>{Informeasure}</strong></a> - implementation of information measures such as mutual information</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/Nebulosa.html" title="Nebulosa"><strong>{Nebulosa}</strong></a> - visualisation of scRNA-seq using gene-weighted density estimation</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/NewWave.html" title="NewWave"><strong>{NewWave}</strong></a> - dimensionality reduction and batch correction for scRNA-seq</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/pipeComp.html" title="pipeComp"><strong>{pipeComp}</strong></a> - simple framework for comparing pipelines</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/recount3.html" title="recount3"><strong>{recount3}</strong></a> - access data from recount3</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scCB2.html" title="scCB2"><strong>{scCB2}</strong></a> - extends the EmptyDrops method for identifying real cells by testing clusters</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scDataviz.html" title="scDataviz"><strong>{scDataviz}</strong></a> - functions for plotting single-cell data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scuttle.html" title="scuttle"><strong>{scuttle}</strong></a> - basic functions for single-cell analysis (most previously in <strong>{scater}</strong>)</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/snifter.html" title="snifter"><strong>{snifter}</strong></a> - R wrapper for the Python openTSNE library</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SpatialExperiment.html" title="SpatialExperiment"><strong>{SpatialExperiment}</strong></a> - S4 class for storing spatial experiments</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SPsimSeq.html" title="SPsimSeq"><strong>{SPsimSeq}</strong></a> - Semi-parametric RNA-seq simulation</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/velociraptor.html" title="velociraptor"><strong>{velociraptor}</strong></a> - R wrapper for the scVelo Python package</li>
</ul>
</section>
<section id="updates" class="level1">
<h1>Updates</h1>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/BASiCS.html" title="BASiCS"><strong>{BASiCS}</strong></a> - many updates and improvements</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/basilisk.html" title="basilisk"><strong>{basilisk}</strong></a> - support for setting conda channels and safer environment construction</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/DESeq2.html" title="DESeq2"><strong>{DESeq2}</strong></a> - overhaul of dispersion estimation allowing use of the <strong>{glmGamPoi}</strong> package for single-cell data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/DropletUtils.html" title="DropletUtils"><strong>{DropletUtils}</strong></a> - functions for handling ambient counts</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/edgeR.html" title="edgeR"><strong>{edgeR}</strong></a> - support for SummarizedExperiment objects and improvements to the limma voom-lmFit pipeline</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/limma.html" title="limma"><strong>{limma}</strong></a> - improvements to some fitting functions</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scran.html" title="scran"><strong>{scran}</strong></a> - various new function arguments, some functions moved to other packages</li>
</ul>


</section>

 ]]></description>
  <category>bioconductor</category>
  <category>R</category>
  <guid>https://lazappi.id.au/posts/2020-10-30-bioconductor-3-12-wrap-up/</guid>
  <pubDate>Thu, 29 Oct 2020 23:00:00 GMT</pubDate>
</item>
<item>
  <title>triple j’s Requestival</title>
  <link>https://lazappi.id.au/posts/2020-07-11-requestival/</link>
  <description><![CDATA[ 





<section id="intro" class="level1">
<h1>Intro</h1>
<p>From 25 to 31 May triple j ran an event they called the <a href="https://www.abc.net.au/triplej/the-latest/what-is-requestival-and-how-can-you-get-involved/12254122" title="What is requestival?">“Requestival”</a>. For anyone not from Australia, <a href="https://www.abc.net.au/triplej/" title="triple j website">triple j</a> is a national public radio station with a mandate to target a youth audience and promote alternative and Australian music. For many Australians, particularly those that grew up in regional areas, triple j is responsible for providing their first exposure to music outside the current top 50 or golden oldies. The idea behind the Requestival was fairly simple. To make up for the cancellation of many music festivals due to the coronavirus triple j would hold their own virtual festival with the twist being that every song they played would be requested by listeners. The second twist was that they would play any request, not just the kind of new alternative they usually play. And they did mean anything including Beethoven, TV show themes and Taylor Swift who all got some air time during the week.</p>
<p>I thought this would be a cool dataset to have a play around with. I’ll show some of the highlights here but have a look at the <a href="https://jdblischak.github.io/workflowr/index.html" title="workflowr website"><strong>{workflowr}</strong></a> <a href="https://lazappi.github.io/requestival/" title="Requestival analysis website">analysis website</a> for more details or check out the code on <a href="https://github.com/lazappi/requestival" title="Requestival analysis GitHub repository">GitHub</a>.</p>
</section>
<section id="getting-the-data" class="level1">
<h1>Getting the data</h1>
<p>triple j has a recently played page on their website that has a live stream of the songs they are playing. This page also has a search function that lets you find which song was played at a particular time. I will note that there may be some errors in this list; there was at least one time I heard a song played but saw something else get added to the list. To get the raw data I simply searched for each day of the Requestival, clicked the “Show more” button until all the songs for that day were shown and then downloaded the HTML page. It’s probably possible to do this all programmatically but it was only a few pages so it didn’t seem worth the effort. Thankfully the HTML was fairly simple and neat so it was relatively easy to extract the information I wanted using <a href="https://rvest.tidyverse.org/" title="rvest website"><strong>{rvest}</strong></a>. What I ended up with was a <code>tibble</code> with times, song name, artist and release and links to Spotify, YouTube and the triple j Unearthed website. There were 27 songs that didn’t have release information on the website for some reason. See the <a href="https://lazappi.github.io/requestival/01-scraping.html" title="Analysis website - scraping">scraping page</a> of the analysis website for more details of this part.</p>
<p>There was a bit more tidying I wanted to do before any analysis. This included things like converting the file names into days, converting the time strings to <code>hms</code> objects in the correct timezone and replacing Unicode characters. The Spotify and YouTube links included queries that might be useful to have so I also extracted those into new columns and created a column to indicate if a song was from Unearthed. The final tidying step was to select songs played between 6 am and 9 pm as these were the official hours of the Requestival. See the details <a href="https://lazappi.github.io/requestival/02-tidying.html" title="Analysis website - tidying">here</a>.</p>
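<p>To give a feel for these tidying steps, here is a minimal sketch in Python using only the standard library. It is not the code from the analysis (that was done in R, see the link above). The row fields, the time format and the <code>example.com</code> links with a <code>q</code> parameter are all assumptions made up for illustration, and timezone handling is left out.</p>
<pre class="sourceCode python"><code class="sourceCode python">from datetime import datetime
from urllib.parse import parse_qs, urlparse

# Hypothetical scraped rows; the real column names and link formats may differ
plays = [
    {"time": "5:58am", "song": "Song A", "spotify": "https://example.com/search?q=song+a+artist+a"},
    {"time": "6:05am", "song": "Song B", "spotify": "https://example.com/search?q=song+b+artist+b"},
    {"time": "8:59pm", "song": "Song C", "spotify": "https://example.com/search"},
    {"time": "9:10pm", "song": "Song D", "spotify": "https://example.com/search?q=song+d+artist+d"},
]

# Official Requestival hours: 6 am up to (but not including) 9 pm
OFFICIAL_HOURS = range(6, 21)


def parse_time(text):
    """Convert a 12-hour string such as '6:05am' into a datetime.time."""
    return datetime.strptime(text.strip(), "%I:%M%p").time()


def extract_query(link):
    """Return the 'q' parameter from a link, or None when it is missing."""
    values = parse_qs(urlparse(link).query).get("q")
    return values[0] if values else None


def tidy(rows):
    """Parse times, pull out link queries and keep plays in official hours."""
    tidied = []
    for row in rows:
        played_at = parse_time(row["time"])
        if played_at.hour in OFFICIAL_HOURS:
            tidied.append({
                "time": played_at,
                "song": row["song"],
                "spotify_query": extract_query(row["spotify"]),
            })
    return tidied


for row in tidy(plays):
    print(row["time"], row["song"], row["spotify_query"])
</code></pre>
<p>With these made-up rows only the two songs played inside the official hours are kept, and the one whose link has no query ends up with a missing value rather than an error.</p>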
<p><img src="https://lazappi.id.au/posts/2020-07-11-requestival/play_times.png" class="img-fluid"></p>
<p>After all this I ended up with a dataset with 1187 songs from 896 artists off 1049 releases.</p>
</section>
<section id="augmentation" class="level1">
<h1>Augmentation</h1>
<p>There isn’t much analysis you can do on a simple list of songs so I wanted to augment the data in some way. I have seen some cool analysis using information from Spotify before so I thought that would be a good source to try. Despite the <a href="https://www.rcharlie.com/spotifyr/" title="spotifyr website"><strong>{spotifyr}</strong></a> package providing a nice interface to the Spotify API this wasn’t quite as easy as I had hoped. I initially wanted to use the queries I had extracted from the Spotify links but these weren’t in the format that <strong>{spotifyr}</strong> needs so I had to construct them manually. I also ran into the problem that the combination of song, artist and release was both too specific and not specific enough. For some songs I got no results when I searched for all this information while for other songs I got many results. These were often duplicates due to releases with slight differences in different countries etc. but sometimes they were unrelated songs that just happened to have enough key words in common.</p>
<p>In the end I searched first for song, artist and release and took the first result if there was one. If not, I searched for just song and artist. The only other constraint I had was that the album release date had to have a precision of “day” (an exact date, not just a month/year). I found that this was able to rule out some bad matches. In the end I was able to find matches for 994 songs but these aren’t perfect. There are probably some incorrect matches and there are a few I have seen where the song was correct but the best version wasn’t selected. For example, the match for The Mamas &amp; The Papas “California Dreamin’” had an album date from 2020 even though the album “If You Can Believe Your Eyes And Ears” was first released in 1966. This is the reality of working with real world data I guess and the majority of matches should be close enough to still be useful. See the augmentation page if you are interested in all the details.</p>
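<p>The two-step search described above could be sketched roughly like this. This is only an illustration and not the code from the analysis: <code>match_song()</code>, <code>search_fn</code> and <code>toy_search()</code> are hypothetical names, the toy catalogue is made up, and applying the “day” precision filter before taking the first result is an assumption about the order of the steps.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical sketch of the two-step matching strategy.
# `search_fn` stands in for a real Spotify search call (an assumption):
# it takes a query string and returns a data frame of candidate tracks
# with a `release_date_precision` column (possibly with zero rows).
match_song &lt;- function(song, artist, release, search_fn) {
    queries &lt;- c(
        paste(song, artist, release), # Most specific query first
        paste(song, artist)           # Fallback without the release
    )
    for (query in queries) {
        results &lt;- search_fn(query)
        # Keep only results where the release date is an exact day
        is_day  &lt;- results$release_date_precision == "day"
        results &lt;- results[is_day, , drop = FALSE]
        if (nrow(results) &gt; 0) {
            return(results[1, , drop = FALSE])
        }
    }
    NULL # No acceptable match for either query
}

# Toy search function with a made-up catalogue, used only for demonstration
toy_search &lt;- function(query) {
    catalogue &lt;- data.frame(
        query = c("Song A Artist A", "Song B Artist B Release B"),
        track = c("Song A", "Song B"),
        release_date_precision = c("day", "year"),
        stringsAsFactors = FALSE
    )
    catalogue[catalogue$query == query, , drop = FALSE]
}

# Falls back to the song and artist query
match_a &lt;- match_song("Song A", "Artist A", "Release A", toy_search)
# Only candidate has "year" precision, so no match is returned
match_b &lt;- match_song("Song B", "Artist B", "Release B", toy_search)</code></pre></div>
<p>Passing the search function in as an argument keeps the sketch runnable without API credentials. In practice it would wrap whatever call is used to query Spotify.</p>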
<p>Once I had the track IDs for these songs it was relatively simple to extract more information about them from Spotify including the album release date, song duration, whether it is marked as explicit and a Spotify “popularity” score. Spotify also has more musical information about each track. There are some basic things like whether the track is in a major key, the loudness and tempo but also more complex scores between zero and one for “valence”, “energy”, “danceability”, “speechiness”, “acousticness” and “liveness”. See the <a href="https://developer.spotify.com/documentation/web-api/reference/tracks/get-several-audio-features/" title="Spotify audio features">Spotify page</a> for more details but basically these try to capture the general feel of each song.</p>
</section>
<section id="analysis" class="level1">
<h1>Analysis</h1>
<section id="exploration" class="level2">
<h2 class="anchored" data-anchor-id="exploration">Exploration</h2>
<p>The first kind of analysis I wanted to do was some basic exploration of the final dataset. What I did was plot a histogram or barplot of each column and pick out the top and bottom five scoring songs. You can see all the details on the <a href="https://lazappi.github.io/requestival/04-exploration.html" title="Analysis website - exploration">exploration page</a> but I’ll just list a few highlights here.</p>
<ul>
<li>There were 43 songs played more than once (some of these are probably errors)</li>
<li>The most played artists were BENEE, Childish Gambino, CHVRCHES, DMA’s and The Chats with five plays each. There were 209 artists with more than one play.</li>
<li>The most played release was “Live At The Wireless” with 32 plays. These are songs that triple j have recorded at festivals, concerts etc. The “Like A Version” segment from triple j also contributed nine plays.</li>
<li>312 (26%) of the played songs are available on Unearthed.</li>
<li>Most of the songs are fairly recent but there are some older ones (and probably a few where the dates are wrong)</li>
</ul>
<p><img src="https://lazappi.id.au/posts/2020-07-11-requestival/album_date.png" class="img-fluid"></p>
<ul>
<li>The longest song played was “Fools Gold” by The Stone Roses at almost 10 minutes long (9:54) and the shortest was “Counting Worms” by Knocked Loose at just over a minute (1:12)</li>
<li>The most popular songs played (according to Spotify) were “goosebumps” and “HIGHEST IN THE ROOM” by Travis Scott and “WHATS POPPIN” by Jack Harlow</li>
<li>The loudest song was “Ask For The Anthem” by Ocean Grove and the quietest was “What’s Up” by 4 Non Blondes</li>
<li>The fastest song was “93 ’Til Infinity” by Souls of Mischief and the slowest was “Soon” by Angie McMahon</li>
<li>The song with highest “valence” was “September” by Earth, Wind &amp; Fire and the lowest was “Raining Blood” by Slayer</li>
<li>The song with highest “energy” was “Raining Blood” by Slayer and the lowest was “In Disguise” by Ashe</li>
<li>The song with highest “danceability” was “Wash &amp; Set” by Leikeli47 and the lowest was “I Still Dream About You” by The Smith Street Band</li>
<li>The song with highest “speechiness” was “RNP” by YBN Cordae and the lowest was “Strangers” by Tia Gostelow</li>
<li>The song with highest “acousticness” was “Punching In A Dream” by The Naked And Famous<sup>1</sup> and the lowest was “Go” by Pearl Jam</li>
<li>The song with highest “liveness” was “Formation” by Beyoncé<sup>2</sup> and the lowest was “Fit But You Know It” by The Streets</li>
</ul>
</section>
<section id="embedding" class="level2">
<h2 class="anchored" data-anchor-id="embedding">Embedding</h2>
<p>Now that I had looked at each variable individually I wanted to see what the dataset looked like as a whole. The first thing you always do with a new dataset is a PCA, so for the songs with Spotify data I ran one on the duration, loudness, tempo, valence, energy, danceability, speechiness, acousticness and liveness. The scatter plots don’t look great but the loadings are interesting.</p>
<p><img src="https://lazappi.id.au/posts/2020-07-11-requestival/pca_loadings.png" class="img-fluid"></p>
<p>For example, you can see that PC1 separates loudness and energy from acousticness and PC6 separates speechiness from valence and tempo.</p>
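<p>As a rough illustration of where loadings like these come from, here is a minimal sketch using base R’s <code>prcomp()</code>. The feature values are simulated (they are not the Requestival data) and this is not necessarily how the original analysis was run.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Simulated stand-in for a few Spotify features (made-up values)
set.seed(1)
n &lt;- 100
energy &lt;- runif(n)
features &lt;- data.frame(
    energy       = energy,
    # Loudness rises with energy, acousticness falls with it
    loudness     = -60 + 55 * energy + rnorm(n, sd = 3),
    acousticness = pmin(pmax(1 - energy + rnorm(n, sd = 0.1), 0), 1),
    # Tempo is simulated independently of the other features
    tempo        = rnorm(n, mean = 120, sd = 25)
)

# Centre and scale because the variables are on very different ranges
pca &lt;- prcomp(features, center = TRUE, scale. = TRUE)

# Loadings: one row per variable, one column per principal component
loadings &lt;- round(pca$rotation, 2)
loadings</code></pre></div>
<p>In the simulated data energy and loudness have the same sign on PC1 and acousticness has the opposite sign, which is the kind of pattern the loadings plot above shows. The sign of a whole component is arbitrary, so only the relative signs within a column are meaningful.</p>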
<p>I then did a t-SNE using these principal components. I had hoped that this information might be enough to see some clear separation between genres or styles but this isn’t really the case. There are a few variables which are restricted to one part of the space, for example liveness:</p>
<p><img src="https://lazappi.id.au/posts/2020-07-11-requestival/tsne_liveness.png" class="img-fluid"></p>
<p>But most of the other variables just show a general trend like loudness.</p>
<p><img src="https://lazappi.id.au/posts/2020-07-11-requestival/tsne_loudness.png" class="img-fluid"></p>
<p>See the <a href="https://lazappi.github.io/requestival/05-embedding.html" title="Analysis website - embedding">embedding page</a> if you would like to see what all the other variables look like.</p>
</section>
</section>
<section id="outro" class="level1">
<h1>Outro</h1>
<p>This is really only an exploratory analysis and there are lots more things you could do: clustering to group songs, looking at correlations, text analysis etc. I had hoped to find some of the really unusual things that were played (like the Pokemon theme song) but I don’t think they are musically different enough for these characteristics to pull them out. In the end I would call this project a partial success. I didn’t find anything super interesting but I did learn some things and it was fun to play around with. The dataset is available on <a href="https://github.com/lazappi/requestival" title="Requestival analysis GitHub repository">GitHub</a> and I would be interested to see if you can pull anything else out of it. Maybe I will return to it and do something more detailed in the future.</p>


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>This got matched to the “Stripped” version on Spotify↩︎</p></li>
<li id="fn2"><p>This got matched to the version from “Homecoming: The Live Album”↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>analysis</category>
  <category>music</category>
  <guid>https://lazappi.id.au/posts/2020-07-11-requestival/</guid>
  <pubDate>Fri, 10 Jul 2020 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Back to the SCE-verse!</title>
  <link>https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/</link>
  <description><![CDATA[ 





<p>A few weeks ago I did a short <a href="../../post/2020-04-29-bioconductor-3-11-wrap-up" title="Bioconductor 3.11 wrap up">wrap up</a> of the latest <a href="https://bioconductor.org/news/bioc_3_11_release/" title="Bioc 3.11 news">Bioconductor 3.11 release</a>. The <a href="../../post/2018-05-04-bioconductor-3-7-wrap-up" title="Bioconductor 3.7 wrap up">last time I did that</a> (for the 3.7 release in 2018) I <a href="../../post/2018-05-20-exploring-the-sce-verse" title="Exploring the SCE-verse">followed it up with a post</a> looking at packages which depend on the <a href="https://bioconductor.org/packages/SingleCellExperiment/" title="SingleCellExperiment package"><strong>{SingleCellExperiment}</strong></a> package. A lot has changed in the scRNA-seq world over the last two years and we have grown from around 200 analysis tools to <a href="https://www.scrna-tools.org" title="scRNA-tools analysis page">almost 650</a>. Given that growth I thought it would be good to repeat that analysis and see what the SCE-verse looks like in 2020.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"BiocPkgTools"</span>)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tidygraph"</span>)</span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ggraph"</span>)</span>
<span id="cb1-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tidyverse"</span>)</span></code></pre></div></div>
<section id="getting-package-network" class="level1">
<h1>Getting package network</h1>
<p><a href="https://bioconductor.org/packages/BiocPkgTools/" title="BiocPkgTools"><strong>{BiocPkgTools}</strong></a> is now a fully-fledged Bioconductor package and has some new functions that make it easy to get information about the packages connected to <strong>{SingleCellExperiment}</strong>.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Get Bioconductor package information</span></span>
<span id="cb2-2">bioc_pkgs <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">biocPkgList</span>()</span>
<span id="cb2-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Get Bioconductor dependencies</span></span>
<span id="cb2-4">all_bioc_deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">buildPkgDependencyDataFrame</span>()</span>
<span id="cb2-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Select reverse dependencies for SingleCellExperiment</span></span>
<span id="cb2-6">bioc_revdeps  <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(all_bioc_deps, dependency <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SingleCellExperiment"</span>)</span>
<span id="cb2-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Select dependencies of the reverse dependencies</span></span>
<span id="cb2-8">bioc_edges <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bioc_revdeps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(</span>
<span id="cb2-10">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(all_bioc_deps, Package <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> bioc_revdeps<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Package)</span>
<span id="cb2-11">    )</span></code></pre></div></div>
<p>The information from <strong>{BiocPkgTools}</strong> includes CRAN packages that are dependencies of Bioconductor packages but doesn’t include any CRAN packages that depend on <strong>{SingleCellExperiment}</strong>. To get those we will use a tidier version of the function in the previous blog post. This function just uses <strong>{purrr}</strong> to loop over the different types of dependencies and calls the <code>tools::package_dependencies()</code> function.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">get_cran_deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(pkgs, db, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">reverse =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>,</span>
<span id="cb3-2">                          <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">types =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Depends"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Imports"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Suggests"</span>)) {</span>
<span id="cb3-3"></span>
<span id="cb3-4">    deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> purrr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map_dfr</span>(types, <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(.type) {</span>
<span id="cb3-5">        deps_list <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> tools<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">package_dependencies</span>(</span>
<span id="cb3-6">            pkgs, db,</span>
<span id="cb3-7">            <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">which   =</span> .type,</span>
<span id="cb3-8">            <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">reverse =</span> reverse</span>
<span id="cb3-9">        )</span>
<span id="cb3-10">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tibble</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Package =</span> pkgs, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">dependency =</span> deps_list, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edgetype =</span> .type)</span>
<span id="cb3-11">    }) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-12">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unnest</span>(dependency)</span>
<span id="cb3-13"></span>
<span id="cb3-14">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (reverse) {</span>
<span id="cb3-15">        deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(</span>
<span id="cb3-16">                deps,</span>
<span id="cb3-17">                <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Package =</span> dependency,</span>
<span id="cb3-18">                <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">dependency =</span> Package,</span>
<span id="cb3-19">                edgetype</span>
<span id="cb3-20">            )</span>
<span id="cb3-21">    }</span>
<span id="cb3-22"></span>
<span id="cb3-23">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(deps)</span>
<span id="cb3-24">}</span>
<span id="cb3-25"></span>
<span id="cb3-26">db <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">available.packages</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repos =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"http://cran.r-project.org"</span>)</span>
<span id="cb3-27">cran_revdeps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_cran_deps</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SingleCellExperiment"</span>, db, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">reverse =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb3-28">cran_edges <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> cran_revdeps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-29">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(</span>
<span id="cb3-30">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_cran_deps</span>(cran_revdeps<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Package, db)</span>
<span id="cb3-31">    )</span></code></pre></div></div>
<p>Now we can combine the networks from Bioconductor and CRAN and create a graph structure.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">revdeps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bioc_revdeps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(cran_revdeps) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">distinct</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(</span>
<span id="cb4-5">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Repo =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">if_else</span>(Package <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> bioc_pkgs<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Package, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bioconductor"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CRAN"</span>)</span>
<span id="cb4-6">    )</span>
<span id="cb4-7"></span>
<span id="cb4-8">nodes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> revdeps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(Package, Repo) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_row</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SingleCellExperiment"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Repo =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bioconductor"</span>)</span>
<span id="cb4-11"></span>
<span id="cb4-12">edges <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bioc_edges <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(cran_edges) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(dependency <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SingleCellExperiment"</span>, revdeps<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Package)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-15">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">distinct</span>()</span>
<span id="cb4-16"></span>
<span id="cb4-17">graph <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tbl_graph</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nodes =</span> nodes, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edges =</span> edges)</span></code></pre></div></div>
<p>We now have a graph that contains <strong>{SingleCellExperiment}</strong> and the 77 packages that depend on it. We also have edges between the reverse dependencies but have removed edges to other packages.</p>
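<p>To make the edge filtering a bit more concrete, here is a small base R sketch of the same idea on a toy dependency table. The package names <code>pkgA</code>, <code>pkgB</code> and <code>pkgC</code> are made up for illustration and the table is not real Bioconductor or CRAN data.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Toy dependency table (hypothetical packages, for illustration only)
all_deps &lt;- data.frame(
    Package    = c("pkgA", "pkgA", "pkgB", "pkgB", "pkgC"),
    dependency = c("SingleCellExperiment", "pkgB", "SingleCellExperiment",
                   "ggplot2", "pkgA"),
    stringsAsFactors = FALSE
)

# Packages that depend directly on SingleCellExperiment
toy_revdeps &lt;- all_deps$Package[all_deps$dependency == "SingleCellExperiment"]

# Keep edges that start at a reverse dependency and end inside the network
keep &lt;- all_deps$Package %in% toy_revdeps &amp;
    all_deps$dependency %in% c("SingleCellExperiment", toy_revdeps)
toy_edges &lt;- all_deps[keep, ]
toy_edges</code></pre></div>
<p>Here the edge from <code>pkgB</code> to <strong>{ggplot2}</strong> is dropped because it points outside the network, and the edge from <code>pkgC</code> is dropped because <code>pkgC</code> does not depend directly on <strong>{SingleCellExperiment}</strong>.</p>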
</section>
<section id="what-uses-singlecellexperiment" class="level1">
<h1>What uses <strong>{SingleCellExperiment}</strong>?</h1>
<section id="where-are-the-packages-from" class="level2">
<h2 class="anchored" data-anchor-id="where-are-the-packages-from">Where are the packages from?</h2>
<p>Before we look at the relationships between packages let’s look at the reverse dependencies themselves. We expect that most of the dependencies will be other Bioconductor packages but are there any in CRAN?</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(revdeps, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> Repo, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> Repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb5-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_bar</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb5-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb5-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/repos-1.png" class="img-fluid"></p>
<p>There are only 2 packages from CRAN, which are the same two we saw last time. These are <a href="https://CRAN.R-project.org/package=Seurat" title="Seurat"><strong>{Seurat}</strong></a>, which contains the other major R object for scRNA-seq data and includes functions for converting from <strong>{SingleCellExperiment}</strong> objects, and <a href="https://CRAN.R-project.org/package=clustree" title="clustree"><strong>{clustree}</strong></a>, my package for visualising clustering across resolutions which includes a <strong>{SingleCellExperiment}</strong> interface.</p>
</section>
<section id="what-dependency-do-they-have" class="level2">
<h2 class="anchored" data-anchor-id="what-dependency-do-they-have">What dependency do they have?</h2>
<p>What about the types of dependencies?</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">edges <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb6-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(dependency <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SingleCellExperiment"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb6-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> edgetype, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> edgetype)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_bar</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/dep-types-1.png" class="img-fluid"></p>
<p>Most of the packages either “import” or “depend” on <strong>{SingleCellExperiment}</strong>. This is unsurprising given that it is a core data structure. I suspect most of the “suggests” packages work with several data structures but we would have to check this and see.</p>
</section>
<section id="what-do-they-do" class="level2">
<h2 class="anchored" data-anchor-id="what-do-they-do">What do they do?</h2>
<p>All packages in Bioconductor are annotated with “biocViews”. These are a set of labels designed to show what a package can be used for. They provide a convenient way to get an overview of what the packages in the SCE-verse do.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">bioc_views <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bioc_pkgs <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(Package <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> revdeps<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Package) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(biocViews) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unnest</span>(biocViews) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">biocViews =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fct_lump_n</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fct_infreq</span>(biocViews), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>))</span>
<span id="cb7-6"></span>
<span id="cb7-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(bioc_views, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fct_rev</span>(biocViews))) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_bar</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_flip</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggtitle</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Most common biocViews"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ylab</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Number of packages"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axis.title.y =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">element_blank</span>())</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/biocViews-1.png" class="img-fluid"></p>
<p>Many of the most common biocViews are fairly general terms but there is a set that stands out as being specific to scRNA-seq data including “SingleCell”, “GeneExpression”, “RNASeq” and “Transcriptomics”. Further down the list we see terms related to specific analysis tasks (“Clustering”, “Visualization”, “DifferentialExpression”, “DimensionReduction”, “Normalization” etc.). Many of these are similar to the categories we came up with to group tools on <a href="https://www.scrna-tools.org" title="scRNA-tools analysis page">scRNA-tools</a>. It would be interesting to look at the differences in ranking but I suspect there are enough differences in how the terms are used that it would be difficult to compare them.</p>
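<p>To make the lumping step in the code above a little more concrete, here is a minimal sketch on a made-up vector of labels. The labels and their counts are invented for illustration (they are not real biocViews data) and <code>toy_views</code> is a hypothetical name. It assumes the <strong>{forcats}</strong> package is installed.</p>
<div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(forcats)

# Hypothetical toy labels, NOT real biocViews counts
toy_views &lt;- factor(c(
    "SingleCell", "SingleCell", "SingleCell",
    "RNASeq", "RNASeq",
    "Clustering",
    "Software"
))

# Order the levels from most to least frequent, then keep the two most
# common levels and collapse everything else into "Other"
lumped &lt;- fct_lump_n(fct_infreq(toy_views), n = 2)

# Count how many labels fall into each remaining level
table(lumped)</code></pre></div>
<p>In this toy case the two rarest labels end up in “Other”. The plot above does the same thing but keeps the 30 most common terms.</p>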
</section>
</section>
<section id="relationships-between-packages" class="level1">
<h1>Relationships between packages</h1>
<p>Now that we have a bit of an overview of what packages there are, let’s see how they relate to each other. First let’s plot all the packages in our graph. Remember that everything here depends on <strong>{SingleCellExperiment}</strong>, so the graph shows the links between these packages but not their dependencies on packages outside this set.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"fr"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_fan</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> edgetype),</span>
<span id="cb8-3">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)),</span>
<span id="cb8-4">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> Repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_text</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> Package, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> Repo), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repel =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Set1"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_graph</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/graph-all-1.png" class="img-fluid"></p>
<p>A lot of these packages have no other dependencies in this community except for <strong>{SingleCellExperiment}</strong>. While I’m sure that they are very useful they aren’t very interesting for this analysis so let’s remove them by excluding nodes with a single edge.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">graph_deg2 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> graph <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">activate</span>(nodes) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Degree =</span> igraph<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">degree</span>(graph)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(Degree <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb9-5"></span>
<span id="cb9-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph_deg2, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"fr"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_fan</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> edgetype),</span>
<span id="cb9-8">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)),</span>
<span id="cb9-9">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> Repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_text</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> Package, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> Repo), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repel =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Set1"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_graph</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/graph-deg2-1.png" class="img-fluid"></p>
<p>That’s better but it’s still pretty crowded. Let’s see if we can pick out the most important nodes using a centrality measure.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">graph_central <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> graph_deg2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">activate</span>(nodes) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Centrality =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">centrality_authority</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(Centrality <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>)</span>
<span id="cb10-5"></span>
<span id="cb10-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph_central, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"fr"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_fan</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> edgetype),</span>
<span id="cb10-8">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)),</span>
<span id="cb10-9">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> Repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_text</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> Package, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> Repo), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repel =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Set1"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_graph</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/graph-central-1.png" class="img-fluid"></p>
<p>No surprises there! This has picked out what are probably the most used and influential R packages for scRNA-seq analysis. These are the central Bioconductor packages of <strong>{SingleCellExperiment}</strong> (object), <strong>{scater}</strong> (quality control and visualisation) and <strong>{scran}</strong> (normalisation and downstream analysis). They are joined by <strong>{Seurat}</strong> which is perhaps the most complete R scRNA-seq analysis package but is hosted on CRAN and uses its own object. Its centrality to this graph suggests that many Bioconductor packages provide some kind of support for <strong>{Seurat}</strong> objects.</p>
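<p>If the authority score is unfamiliar, this minimal sketch shows how it behaves on a tiny made-up graph. The package names (<code>pkgA</code>, <code>pkgB</code>, <code>pkgC</code>, <code>core</code>) are hypothetical, and I am assuming here that an edge points from a package to the package it depends on, which may not match how the real graph was built. It needs <strong>{tidygraph}</strong> and <strong>{dplyr}</strong>.</p>
<div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(tidygraph)
library(dplyr)

# Hypothetical toy dependency graph (assumption: edges point from a
# package to the package it depends on)
toy_edges &lt;- data.frame(
    from = c("pkgA", "pkgB", "pkgC", "pkgC"),
    to   = c("core", "core", "core", "pkgA")
)

# Build a directed graph and score each node with the same centrality
# function used in the analysis above
toy_graph &lt;- as_tbl_graph(toy_edges, directed = TRUE) %&gt;%
    activate(nodes) %&gt;%
    mutate(Centrality = centrality_authority())

# Pull out the node table and rank nodes by their score
toy_graph %&gt;%
    as_tibble() %&gt;%
    arrange(desc(Centrality))</code></pre></div>
<p>In this toy graph <code>core</code> ranks highest because every other package points to it, while packages that nothing points to get a score of zero. Filtering on a threshold, as above, then keeps only the nodes that many others link to.</p>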
<p>Let’s zoom out a little bit by relaxing the centrality threshold and see what else we find.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">graph_central <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> graph_deg2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">activate</span>(nodes) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Centrality =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">centrality_authority</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(Centrality <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span>
<span id="cb11-5"></span>
<span id="cb11-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph_central, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"fr"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_fan</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> edgetype),</span>
<span id="cb11-8">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mm"</span>)),</span>
<span id="cb11-9">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mm"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> Repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_text</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> Package, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> Repo), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repel =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Set1"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_graph</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/graph-central2-1.png" class="img-fluid"></p>
<p>This has added a few more central packages: <strong>{iSEE}</strong> which is a Shiny app for interacting with <strong>{SingleCellExperiment}</strong> objects, <strong>{MAST}</strong> for differential expression testing, <strong>{zinbwave}</strong> for dimensionality reduction and integration and <strong>{destiny}</strong> for creating diffusion maps.</p>
</section>
<section id="conclusion" class="level1">
<h1>Conclusion</h1>
<p>It is great to see the expansion of the SCE-verse! The object has been well taken up by the community and I think there is now a better understanding of the value of using standard objects, both in increased interoperability between packages and in the time and effort saved during development. I had hoped to see more connections between packages though. Apart from a few central packages there aren’t that many dependencies within the SCE-verse, which suggests there could be a lot of duplicated functionality across packages. Now that we have a common object to work with, perhaps the next step forward in the development of the ecosystem is to centralise and reuse common functions so that developers can focus on innovative new methods? I have been fairly selective in this analysis though, so it is possible I have missed some of this which is happening already.</p>
<p>It would (still) be good to repeat this for other major objects. Perhaps I will get to that in a future post.</p>


</section>

 ]]></description>
  <category>bioconductor</category>
  <category>R</category>
  <category>scrna-seq</category>
  <category>SingleCellExperiment</category>
  <category>analysis</category>
  <guid>https://lazappi.id.au/posts/2020-05-12-back-to-the-sce-verse/</guid>
  <pubDate>Mon, 11 May 2020 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Bioconductor 3.11 wrap-up</title>
  <link>https://lazappi.id.au/posts/2020-04-29-bioconductor-3-11-wrap-up/</link>
  <description><![CDATA[ 





<p>The Bioconductor 3.11 release was yesterday. Here is my wrap-up of new packages and updates. These are only the things I found interesting based on the release notes, and they don’t come with any particular endorsement. If there is something else you are looking for, have a look at the release notes <a href="https://bioconductor.org/news/bioc_3_11_release/" title="Bioc 3.11 news">here</a>.</p>
<section id="new-packages" class="level1">
<h1>New packages</h1>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/basilisk.html" title="basilisk"><strong>{basilisk}</strong></a> - installs a self-contained Python instance that is maintained by the R installation</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/BiocDockerManager.html" title="BiocDockerManager"><strong>{BiocDockerManager}</strong></a> - management of Bioconductor Docker images</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/CiteFuse.html" title="CiteFuse"><strong>{CiteFuse}</strong></a> - suite of methods for working with CITE-seq data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/clustifyr.html" title="clustifyr"><strong>{clustifyr}</strong></a> - classify cells in scRNA-seq data using external references</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/cmapR.html" title="cmapR"><strong>{cmapR}</strong></a> - interface for the Broad Institute Connectivity Map resource of gene perturbation expression profiles</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/ctgGEM.html" title="ctgGEM"><strong>{ctgGEM}</strong></a> - streamlines building of cell-state hierarchies from single-cell gene expression data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/distinct.html" title="distinct"><strong>{distinct}</strong></a> - differential testing of distributions</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/dittoSeq.html" title="dittoSeq"><strong>{dittoSeq}</strong></a> - user friendly visualisation of bulk and single-cell RNA-seq data including consideration of colour blindness</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/Dune.html" title="Dune"><strong>{Dune}</strong></a> - merges pairs of clusters to increase ARI and improve reproducibility</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/frenchFISH.html" title="frenchFISH"><strong>{frenchFISH}</strong></a> - Poisson models for DNA copy number from FISH data (<em>bonus points for the excellent name</em>)</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/GeneTonic.html" title="GeneTonic"><strong>{GeneTonic}</strong></a> - Shiny app for looking at enrichment results from expression data (<em>another great name</em>)</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/HIPPO.html" title="HIPPO"><strong>{HIPPO}</strong></a> - scRNA-seq feature selection and clustering</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/iSEEu.html" title="iSEEu"><strong>{iSEEu}</strong></a> - extensions for the <a href="https://bioconductor.org/packages/release/bioc/html/iSEE.html" title="iSEE"><strong>{iSEE}</strong></a> Shiny app</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/mitch.html" title="mitch"><strong>{mitch}</strong></a> - multi-contrast enrichment analysis</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/peco.html" title="peco"><strong>{peco}</strong></a> - predicting cell-cycle progression using scRNA-seq data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scClassify.html" title="scClassify"><strong>{scClassify}</strong></a> - multi-scale classification of scRNA-seq data using cell hierarchies</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scHOT.html" title="scHOT"><strong>{scHOT}</strong></a> - testing changes in higher-order structure of gene expression such as across (pseudo) time or space</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scry.html" title="scry"><strong>{scry}</strong></a> - count-based feature selection and dimensionality reduction for small count data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scTHI.html" title="scTHI"><strong>{scTHI}</strong></a> - identify active ligand-receptor pairs in scRNA-seq data</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SingleCellSignalR.html" title="SingleCellSignalR"><strong>{SingleCellSignalR}</strong></a> - clustering of scRNA-seq data and inference of cell-cell interactions</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/sparseMatrixStats.html" title="sparseMatrixStats"><strong>{sparseMatrixStats}</strong></a> - high-performance functions for row and column operations on sparse matrices, inspired by <strong>{matrixStats}</strong></li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/tidybulk.html" title="tidybulk"><strong>{tidybulk}</strong></a> - tidy wrappers for bulk RNA-seq analysis</li>
</ul>
</section>
<section id="updates" class="level1">
<h1>Updates</h1>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/BASiCS.html" title="BASiCS"><strong>{BASiCS}</strong></a> - many updates and release of version 2</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/batchelor.html" title="batchelor"><strong>{batchelor}</strong></a> - support of arbitrary design matrices and extension of MNN integration to cluster centroids</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/DropletUtils.html" title="DropletUtils"><strong>{DropletUtils}</strong></a> - downsampling of batches, writing to 10x format, removal of chimeric reads, demultiplexing of cell hashing experiments</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/edgeR.html" title="edgeR"><strong>{edgeR}</strong></a> - integration of the limma voom-lmFit pipeline, support for <a href="https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html" title="SummarizedExperiment"><strong>{SummarizedExperiment}</strong></a> objects</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/iSEE.html" title="iSEE"><strong>{iSEE}</strong></a> - support of extensions such as those in <a href="https://bioconductor.org/packages/release/bioc/html/iSEEu.html" title="iSEEu"><strong>{iSEEu}</strong></a></li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/limma.html" title="limma"><strong>{limma}</strong></a> - <code>changeLog()</code> function can be used with any package, improved treatment of NA values</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/missMethyl.html" title="missMethyl"><strong>{missMethyl}</strong></a> - bug fixes and <code>fract.counts</code> argument</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/PCAtools.html" title="PCAtools"><strong>{PCAtools}</strong></a> - functions for choosing the ideal number of components to retain</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scater.html" title="scater"><strong>{scater}</strong></a> - multi-feature set UMAP and various new arguments</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scran.html" title="scran"><strong>{scran}</strong></a> - new functions for sub-clustering and cluster bootstrapping, wrappers for graph-based clustering, updates to marker identification functions</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html" title="SummarizedExperiment"><strong>{SummarizedExperiment}</strong></a> - support for assays with more than four dimensions, changes to assay getters and setters to check dimnames</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/tximeta.html" title="tximeta"><strong>{tximeta}</strong></a> - functions for splitting <a href="https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html" title="SummarizedExperiment"><strong>{SummarizedExperiment}</strong></a> objects and conversion to DGEList</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/tximport.html" title="tximport"><strong>{tximport}</strong></a> - improved support for importing Alevin output</li>
</ul>
</section>
<section id="workflow-packages" class="level1">
<h1>Workflow packages</h1>
<ul>
<li><a href="https://bioconductor.org/packages/release/workflows/html/fluentGenomics.html" title="fluentGenomics"><strong>{fluentGenomics}</strong></a> - extended workflow using the <strong>{plyranges}</strong> and <a href="https://bioconductor.org/packages/release/bioc/html/tximeta.html" title="tximeta"><strong>{tximeta}</strong></a> packages for fluent genomic analysis</li>
</ul>


</section>

 ]]></description>
  <category>bioconductor</category>
  <category>R</category>
  <guid>https://lazappi.id.au/posts/2020-04-29-bioconductor-3-11-wrap-up/</guid>
  <pubDate>Tue, 28 Apr 2020 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Caching blogdown posts</title>
  <link>https://lazappi.id.au/posts/2020-04-09-caching-blogdown/</link>
  <description><![CDATA[ 





<p>This website is built using <a href="https://bookdown.org/yihui/blogdown/" title="blogdown website"><strong>blogdown</strong></a>, a great package that lets you easily turn <a href="https://rmarkdown.rstudio.com/" title="R Markdown website">R Markdown</a> documents into a <a href="https://gohugo.io/" title="Hugo website"><strong>Hugo</strong></a> blog. While a normal Markdown blog can include code, a <strong>blogdown</strong> blog runs that code and includes the output. One thing that <strong>blogdown</strong> does which isn’t necessarily desirable is re-knit every R Markdown document whenever the site is built.<sup>1</sup> This can slow down the build process but it can also result in changes to the content of a post. For example, imagine a post that scrapes some data from the internet. If that code is run a month or a year from now that data could have changed in a way that affects the meaning of the post. Perhaps a more likely scenario is changes to package functionality which change results or stop code working altogether. This post describes the build process I have come up with to try and avoid this happening.</p>
<section id="blogdown-file-formats" class="level1">
<h1><strong>blogdown</strong> file formats</h1>
<p>I mentioned that <strong>blogdown</strong> works with R Markdown files but it actually handles three different file types which are treated in different ways (see <a href="https://bookdown.org/yihui/blogdown/output-format.html" title="blogdown file formats">here</a> for more details):</p>
<ol type="1">
<li><code>.Rmd</code> - R Markdown files that are rendered directly to <code>.html</code> by <strong>blogdown</strong> and friends (including <a href="https://pandoc.org/" title="Pandoc website"><strong>Pandoc</strong></a>)</li>
<li><code>.rmarkdown</code> - R Markdown files that are knitted to <code>.markdown</code> files by <strong>blogdown</strong> and then rendered to <code>.html</code> by <strong>Hugo</strong>.</li>
<li><code>.md</code> - Standard Markdown files which are ignored by <strong>blogdown</strong> and rendered by <strong>Hugo</strong>.</li>
</ol>
<p>The <code>.Rmd</code> workflow is usually recommended and, because it makes use of <strong>Pandoc</strong>, it enables several features including citations which are useful for an academic blog. However it also comes with the (potential) problem of re-running code mentioned above. What I would like to have is something like the <code>.rmarkdown</code> workflow but where the intermediate Markdown file is still rendered to <code>.html</code> using <strong>Pandoc</strong> instead of <strong>Hugo</strong>.</p>
</section>
<section id="the-blogdown-build-process" class="level1">
<h1>The <strong>blogdown</strong> build process</h1>
<p>Before we try and modify it let’s have a look at how the standard <strong>blogdown</strong> build process works. To build the website we use the <a href="https://github.com/rstudio/blogdown/blob/86ea620d6dfbe0f745ad89dc131b0dc6662e572c/R/render.R#L36" title="blogdown::build_site() function"><code>blogdown::build_site()</code></a> function. This takes a <code>local</code> argument which sets whether the site is being viewed locally or not as well as a <code>method</code> argument (which we will get to later). This is (briefly) what happens when you call <code>build_site()</code>:</p>
<ol type="1">
<li>Checks arguments and gets a list of files to build</li>
<li>Calls the <a href="https://github.com/rstudio/blogdown/blob/86ea620d6dfbe0f745ad89dc131b0dc6662e572c/R/render.R#L68" title="blogdown:::build_rmds() function"><code>blogdown:::build_rmds()</code></a> function
<ol type="1">
<li>This function copies by-product files (such as plot output) from where they have been stored to the build directory</li>
<li>Each file is passed to the <a href="https://github.com/rstudio/blogdown/blob/86ea620d6dfbe0f745ad89dc131b0dc6662e572c/R/render.R#L115" title="blogdown:::render_page() function"><code>blogdown:::render_page()</code></a> function
<ol type="1">
<li>This function is a wrapper which calls the <a href="https://github.com/rstudio/blogdown/blob/86ea620d6dfbe0f745ad89dc131b0dc6662e572c/inst/scripts/render_page.R" title="render_page.R script"><code>render_page.R</code></a> script</li>
<li>The script creates a new local environment (I assume there is a good reason to do this)</li>
<li>The file is rendered in the new environment (with some post-processing if the output is Markdown)</li>
</ol></li>
<li>After rendering (if the output is Markdown) the YAML frontmatter is copied to the output file</li>
</ol></li>
<li>By-products are moved back to their storage locations</li>
<li><strong>Hugo</strong> is called to build the website</li>
</ol>
</section>
<section id="my-modifications" class="level1">
<h1>My modifications</h1>
<p>I mentioned earlier that <code>blogdown::build_site()</code> has a <code>method</code> argument. This can take values of <code>"html"</code> which is the default process I have just described or <code>"custom"</code> which replaces this process by running an <code>R/build.R</code> script which can do whatever you like.<sup>2</sup> I have created a custom build script which is very similar to the <strong>blogdown</strong> functions with a few modifications. It is also inspired by <a href="https://yutani.rbind.io/post/2017-10-25-blogdown-custom/" title="How Not To Knit All Rmd Files With Blogdown">this post</a>, although that method caches <code>.md</code> files which are rendered by <strong>Hugo</strong> rather than <strong>Pandoc</strong>. I’ll include some snippets below but the full script is <a href="https://github.com/lazappi/lazappi_blog/blob/0efce159942167b35649c4dfb0f7b832fab9a137/R/build.R" title="Custom build.R script">here</a> if you are interested.</p>
<section id="only-render-some-.rmd-files" class="level2">
<h2 class="anchored" data-anchor-id="only-render-some-.rmd-files">Only render some <code>.Rmd</code> files</h2>
<p>When finding files to render the script also checks to see if there is a <code>.md.cached</code> file in the same directory and that it is newer than the <code>.Rmd</code> file.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">rmd_files <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> blogdown<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list_rmds</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>, <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">message</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Found "</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(rmd_files), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">" R Markdown files"</span>)</span>
<span id="cb1-3">md_files <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sub</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\\</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">.Rmd$"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".md.cached"</span>, rmd_files)</span>
<span id="cb1-4"></span>
<span id="cb1-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Only knit Rmd files if...</span></span>
<span id="cb1-6">to_render <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">file.exists</span>(md_files) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span>             <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># md file does not exist OR</span></span>
<span id="cb1-7">    utils<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">file_test</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"-ot"</span>, md_files, rmd_files)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># it is older than the Rmd</span></span></code></pre></div></div>
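<p>To see how this check behaves, here is a minimal, self-contained sketch that uses made-up files in a temporary directory rather than the real <code>content</code> folder. The file names (<code>fresh.Rmd</code>, <code>stale.Rmd</code>, <code>new.Rmd</code>) and the timestamps are assumptions for illustration only and are not part of the actual build script.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical demo files in a temporary directory (not the real site)
tmp &lt;- tempfile("cache-demo")
dir.create(tmp)
rmd_files &lt;- file.path(tmp, c("fresh.Rmd", "stale.Rmd", "new.Rmd"))
file.create(rmd_files)
md_files &lt;- sub("\\.Rmd$", ".md.cached", rmd_files)

# fresh: cache is newer than the Rmd
# stale: cache is older than the Rmd
# new:   no cache file exists at all
file.create(md_files[1:2])
now &lt;- Sys.time()
Sys.setFileTime(rmd_files, now)
Sys.setFileTime(md_files[1], now + 60)
Sys.setFileTime(md_files[2], now - 60)

# Same test as in the build script
to_render &lt;- !file.exists(md_files) |
    utils::file_test("-ot", md_files, rmd_files)
setNames(to_render, basename(rmd_files))</code></pre></div>
<p>Only <code>fresh.Rmd</code> should be skipped here. Note that <code>file_test("-ot", ...)</code> gives <code>FALSE</code> when the cached file is missing, which is why the separate <code>file.exists()</code> check is needed.</p>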
<p>If the <code>.md.cached</code> file exists (and is newer) it is rendered instead of the <code>.Rmd</code> file.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">message</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Rendering "</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>to_render), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">" cached Markdown files..."</span>)</span>
<span id="cb2-2"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (md <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> md_files[<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>to_render]) {</span>
<span id="cb2-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">message</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Rendering "</span>, md, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>)</span>
<span id="cb2-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">render_md</span>(md, base)</span>
<span id="cb2-5">}</span></code></pre></div></div>
<p>Otherwise the <code>.Rmd</code> file is rendered when required.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">message</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Rendering "</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(to_render), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">" R Markdown files..."</span>)</span>
<span id="cb3-2"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (rmd <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> rmd_files[to_render]) {</span>
<span id="cb3-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">message</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Rendering "</span>, rmd, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>)</span>
<span id="cb3-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">render_rmd</span>(rmd, base)</span>
<span id="cb3-5">}</span></code></pre></div></div>
<p>One thing I found is important during the rendering process is that the YAML frontmatter is prepended to the output HTML file. I’m not entirely sure why but if you don’t do this the files aren’t included in the website properly by <strong>Hugo</strong>.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">blogdown<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">prepend_yaml</span>(md, out, x, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">callback =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(s) {</span>
<span id="cb4-2">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">getOption</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blogdown.draft.output"</span>, <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)) {</span>
<span id="cb4-3">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(s)</span>
<span id="cb4-4">    }</span>
<span id="cb4-5">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(s) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">||</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">grep</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"^draft: "</span>, s)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>) {</span>
<span id="cb4-6">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(s)</span>
<span id="cb4-7">    }</span>
<span id="cb4-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">append</span>(s, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"draft: yes"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-9">})</span></code></pre></div></div>
</section>
<section id="keeping-intermediate-markdown-files" class="level2">
<h2 class="anchored" data-anchor-id="keeping-intermediate-markdown-files">Keeping intermediate Markdown files</h2>
<p>In theory it should be possible to keep the intermediate Markdown file simply by setting <code>keep_md: true</code> in the document YAML frontmatter (or a central <code>_output.yml</code> file). Unfortunately that argument currently isn’t passed on in a way that works (see issue <a href="https://github.com/rstudio/blogdown/issues/445" title="keep_md GitHub issue">here</a>). This means that we also need to create a <a href="https://github.com/lazappi/lazappi_blog/blob/0efce159942167b35649c4dfb0f7b832fab9a137/R/render_page.R" title="Custom render_page.R script">custom <code>render_page.R</code></a> script. This script makes sure that the <code>keep_md</code> option is set when rendering <code>.Rmd</code> files.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">output_format <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rmarkdown<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">resolve_output_format</span>(input)</span>
<span id="cb5-2">output_format<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>keep_md <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span></code></pre></div></div>
<p>The other thing we do is rename the kept intermediate Markdown files. If we left them with the <code>.md</code> extension they would be rendered by <strong>Hugo</strong> after running the build script. I chose to name them <code>.md.cached</code> but they could have any extension.</p>
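<p>The renaming step can be sketched in a few lines of base R. This is only an illustration using a hypothetical <code>post.md</code> file in a temporary directory, not the code from my actual script.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical intermediate Markdown file kept after knitting
tmp &lt;- tempfile("rename-demo")
dir.create(tmp)
md &lt;- file.path(tmp, "post.md")
writeLines("# A knitted post", md)

# Add the .cached suffix so Hugo no longer treats it as a Markdown page
cached &lt;- paste0(md, ".cached")
file.rename(md, cached)

# The original .md file is gone and the cached copy exists
file.exists(c(md, cached))</code></pre></div>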
</section>
</section>
<section id="wrapping-up" class="level1">
<h1>Wrapping up</h1>
<p>These scripts are now being used to build this blog. They seem to work 🤞 but I haven’t tested them extensively and I expect there will be some issues if I try some posts with more complex code in them (for example I’m not sure what will happen if I try and include an HTML widget). I’m still not certain this is the best approach but I have learnt a lot about how <strong>blogdown</strong> works (although there is still a lot I don’t understand 😸).</p>


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>I’m not entirely sure this is correct and based on some comments from Yihui it might be possible to avoid this happening in a standard way but I have seen enough similar questions that it seems other people have run into the same problem.↩︎</p></li>
<li id="fn2"><p>It is actually slightly more complicated than that. When <code>method = "html"</code> the <code>R/build.R</code> script is run after the normal process (if it exists) and can be used to do various things.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>blogdown</category>
  <category>R</category>
  <guid>https://lazappi.id.au/posts/2020-04-09-caching-blogdown/</guid>
  <pubDate>Thu, 09 Apr 2020 11:41:09 GMT</pubDate>
</item>
<item>
  <title>Exploring the SCE-verse</title>
  <link>https://lazappi.id.au/posts/2018-05-20-exploring-the-sce-verse/</link>
  <description><![CDATA[ 





<p>Over the last few years the number of methods for analysing scRNA-seq has exploded and there are now well over 200 software tools available. Each of these tools needs to make a choice about how it stores and represents the data used during its analysis. One attempt to standardise the data structures that are used is the <a href="https://www.bioconductor.org/packages/SingleCellExperiment" title="SingleCellExperiment">SingleCellExperiment</a> package created by Davide Risso and Aaron Lun, with help from Keegan Korthauer. This package became publicly available as part of the Bioconductor 3.6 release in October 2017. Since we have recently had another Bioconductor release I thought I would have a look at the community of tools that has been developed around SingleCellExperiment.</p>
<section id="what-is-a-singlecellexperiment" class="level1">
<h1>What is a SingleCellExperiment?</h1>
<p>Before we have a look at what packages use the SingleCellExperiment it’s probably useful to briefly discuss what it is. The SingleCellExperiment object is an extension of the older <a href="https://www.bioconductor.org/packages/SummarizedExperiment" title="SummarizedExperiment">SummarizedExperiment</a> object. This is an S4 class developed for use in Bioconductor packages with the main parts being a central set of matrix “assays” along with tables providing extra information about the rows and columns. There is also a metadata slot which is a list containing any other information related to the experiment.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lazappi.id.au/posts/2018-05-20-exploring-the-sce-verse/summarized_experiment.png" class="img-fluid figure-img"></p>
<figcaption>Diagram of a SummarizedExperiment object</figcaption>
</figure>
</div>
<p>One of the key benefits of using a structure like the SummarizedExperiment is that all the data related to an analysis is held in one spot. This makes it easier to pass things between functions or output results as well as reducing the possibilities for mismatches. The SingleCellExperiment adds some extra features that are useful for scRNA-seq analysis including:</p>
<ul>
<li>Slots for holding:
<ul>
<li>Dimensionality reductions</li>
<li>Spike-in information</li>
<li>Size factors for normalisation</li>
</ul></li>
<li>Convenient access for named assays - counts, normcounts, cpm etc.</li>
</ul>
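<p>As a rough sketch of what this looks like in practice, the example below builds a tiny SingleCellExperiment from simulated counts and fills a couple of these slots. The gene and cell names, the Poisson counts and the library-size based size factors are all made up for illustration, and it assumes the SingleCellExperiment package is installed from Bioconductor.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(SingleCellExperiment)

# Simulated counts: 10 hypothetical genes by 5 hypothetical cells
set.seed(1)
counts &lt;- matrix(
    rpois(50, lambda = 5), nrow = 10,
    dimnames = list(paste0("gene", 1:10), paste0("cell", 1:5))
)
sce &lt;- SingleCellExperiment(assays = list(counts = counts))

# Store a dimensionality reduction (one row per cell)
reducedDim(sce, "PCA") &lt;- prcomp(t(counts), rank. = 2)$x

# Store simple library-size based size factors
sizeFactors(sce) &lt;- colSums(counts) / mean(colSums(counts))

# Convenient access to the named counts assay
dim(counts(sce))
reducedDimNames(sce)</code></pre></div>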
<p>The idea is then that the SingleCellExperiment can be used by a range of packages to store data during analysis and extended when required. Let’s have a look at what those packages are currently by seeing what depends on SingleCellExperiment.</p>
</section>
<section id="getting-package-information" class="level1">
<h1>Getting package information</h1>
<p>First let’s load the packages we need for this analysis. <a href="https://github.com/seandavi/BiocPkgTools">BiocPkgTools</a> will need to be installed from GitHub but the rest are on CRAN.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"BiocPkgTools"</span>) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># https://github.com/seandavi/BiocPkgTools</span></span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tidygraph"</span>)</span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ggraph"</span>)</span>
<span id="cb1-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tidyverse"</span>)</span></code></pre></div></div>
<p>We can use Sean Davis’ BiocPkgTools package to get information about Bioconductor packages, but then we need to do some filtering to get the information we want. It turns out I needed to do this multiple times so here is a function I wrote to make it a bit easier. It takes the database of information about all Bioconductor packages, the name of a package we are interested in and a flag indicating if we want normal or reverse dependencies.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">get_bioc_deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(bpi, pkg, reverse) {</span>
<span id="cb2-2">    deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bpi <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-3">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(Package <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> pkg)</span>
<span id="cb2-4"></span>
<span id="cb2-5">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (reverse) {</span>
<span id="cb2-6">        deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-7">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">depends =</span> dependsOnMe, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">imports =</span> importsMe,</span>
<span id="cb2-8">                   <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">suggests =</span> suggestsMe)</span>
<span id="cb2-9">    } <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> {</span>
<span id="cb2-10">        deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-11">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">depends =</span> Depends, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">imports =</span> Imports,</span>
<span id="cb2-12">                   <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">suggests =</span> Suggests)</span>
<span id="cb2-13">    }</span>
<span id="cb2-14"></span>
<span id="cb2-15">    deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-16">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gather</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">key =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"type"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">value =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"package"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-17">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">separate_rows</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-18">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is.na</span>(package))</span>
<span id="cb2-19"></span>
<span id="cb2-20">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (reverse) {</span>
<span id="cb2-21">        deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-22">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package2 =</span> pkg) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-23">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rename</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package1 =</span> package)</span>
<span id="cb2-24">    } <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> {</span>
<span id="cb2-25">        deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-26">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package1 =</span> pkg) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb2-27">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rename</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package2 =</span> package)</span>
<span id="cb2-28">    }</span>
<span id="cb2-29"></span>
<span id="cb2-30">    deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(package1, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">uses =</span> type, package2)</span>
<span id="cb2-31">}</span></code></pre></div></div>
<p>Using it to search for the reverse dependencies of SingleCellExperiment returns a data frame listing the Bioconductor packages that use SingleCellExperiment and the type of relationship for each.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">bpi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">getBiocPkgList</span>()</span>
<span id="cb3-2">bioc_revdeps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_bioc_deps</span>(bpi, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SingleCellExperiment"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">reverse =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb3-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(bioc_revdeps)</span></code></pre></div></div>
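<p>To make the shape of that result concrete, here is a minimal sketch of the same reshaping steps on made-up data. The package names (<code>pkgA</code> to <code>pkgD</code>) and the one-row <code>toy</code> table are hypothetical, and the sketch assumes each dependency field is a comma-separated string, which may not match exactly what <code>getBiocPkgList()</code> returns.</p>

```r
library(dplyr)
library(tidyr)

# Hypothetical one-row table standing in for the reverse-dependency
# columns of a single package ("pkgA") after filtering and renaming
toy <- tibble(
    depends  = "pkgB, pkgC",
    imports  = "pkgD",
    suggests = NA_character_
)

toy_revdeps <- toy %>%
    gather(key = "type", value = "package") %>%   # one row per dependency type
    separate_rows(package, sep = ",\\s*") %>%     # one row per package name
    filter(!is.na(package)) %>%                   # drop types with no packages
    mutate(package2 = "pkgA") %>%                 # the package we searched for
    select(package1 = package, uses = type, package2)

toy_revdeps
```

<p>Each row reads as “package1 uses package2”, with the <code>uses</code> column recording whether the relationship is a depends, imports or suggests.</p>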
<p>We can do something similar for CRAN packages using the <code>tools::package_dependencies()</code> function.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">get_cran_deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(pkg, db, reverse) {</span>
<span id="cb4-2"></span>
<span id="cb4-3">    types <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Depends"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Imports"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Suggests"</span>)</span>
<span id="cb4-4"></span>
<span id="cb4-5">    deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sapply</span>(types, <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(type) {</span>
<span id="cb4-6">        deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> tools<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">package_dependencies</span>(pkg, db, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">which =</span> type,</span>
<span id="cb4-7">                                            <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">reverse =</span> reverse)</span>
<span id="cb4-8">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> type, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste</span>(deps[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]], <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">collapse =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">", "</span>))</span>
<span id="cb4-9">    })</span>
<span id="cb4-10"></span>
<span id="cb4-11">    deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-12">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">t</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-13">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_tibble</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-14">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tolower</span>(type)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-15">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(package <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-16">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">separate_rows</span>(package)</span>
<span id="cb4-17"></span>
<span id="cb4-18">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">nrow</span>(deps) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>) {</span>
<span id="cb4-19">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tibble</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package1 =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">character</span>(), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">uses =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">character</span>(),</span>
<span id="cb4-20">                      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package2 =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">character</span>()))</span>
<span id="cb4-21">    }</span>
<span id="cb4-22"></span>
<span id="cb4-23">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (reverse) {</span>
<span id="cb4-24">        deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-25">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package2 =</span> pkg) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-26">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rename</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package1 =</span> package)</span>
<span id="cb4-27">    } <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> {</span>
<span id="cb4-28">        deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-29">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package1 =</span> pkg) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-30">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rename</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package2 =</span> package)</span>
<span id="cb4-31">    }</span>
<span id="cb4-32"></span>
<span id="cb4-33">    deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(package1, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">uses =</span> type, package2)</span>
<span id="cb4-34">}</span>
<span id="cb4-35">db <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">available.packages</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repos =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"http://cran.r-project.org"</span>)</span>
<span id="cb4-36">cran_revdeps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_cran_deps</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SingleCellExperiment"</span>, db, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">reverse =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb4-37"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(cran_revdeps)</span></code></pre></div></div>
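<p>If you want to see what <code>tools::package_dependencies()</code> returns without querying CRAN, a small hand-made database is enough. The sketch below is illustrative only: <code>toy_db</code> and the package names in it are invented, and it assumes a character matrix with the same column names as the output of <code>available.packages()</code> is an acceptable <code>db</code>.</p>

```r
# Hypothetical package database; a real one comes from available.packages()
toy_db <- rbind(
    c(Package = "pkgA", Depends = NA, Imports = NA,
      LinkingTo = NA, Suggests = NA, Enhances = NA),
    c(Package = "pkgB", Depends = "pkgA", Imports = NA,
      LinkingTo = NA, Suggests = NA, Enhances = NA),
    c(Package = "pkgC", Depends = NA, Imports = "pkgA, pkgB",
      LinkingTo = NA, Suggests = NA, Enhances = NA),
    c(Package = "pkgD", Depends = NA, Imports = NA,
      LinkingTo = NA, Suggests = "pkgA", Enhances = NA)
)

# Query one dependency type at a time, as get_cran_deps() does
types <- c(Depends = "Depends", Imports = "Imports", Suggests = "Suggests")
revdeps_by_type <- lapply(types, function(type) {
    res <- tools::package_dependencies("pkgA", db = toy_db,
                                       which = type, reverse = TRUE)
    res[["pkgA"]]  # the result is a list named by the queried package
})

revdeps_by_type
```

<p>Querying each type separately is what lets us keep track of the kind of relationship, because a single call with several types returns one combined vector per package.</p>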
</section>
<section id="what-uses-singlecellexperiment" class="level1">
<h1>What uses SingleCellExperiment?</h1>
<p>We now have two tables showing us which Bioconductor and CRAN packages make use of SingleCellExperiment. Tables can be fairly boring to look at, though, so let’s use the relationships to construct a graph using <a href="https://cran.r-project.org/package=tidygraph" title="tidygraph">tidygraph</a>.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">nodes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bioc_revdeps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(cran_revdeps) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>uses) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gather</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">key =</span> id, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">value =</span> package) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>id) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">distinct</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repo =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">if_else</span>(package <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> bpi<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Package, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bioconductor"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CRAN"</span>))</span>
<span id="cb5-8">edges <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bioc_revdeps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(cran_revdeps) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb5-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rename</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">from =</span> package1, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">to =</span> package2)</span>
<span id="cb5-11">graph <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tbl_graph</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nodes =</span> nodes, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edges =</span> edges)</span></code></pre></div></div>
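<p>The graph is built from two tables: one row per package in <code>nodes</code> and one row per relationship in <code>edges</code>, with the <code>from</code> and <code>to</code> columns holding package names. Here is a minimal sketch of the same construction on made-up data. The tables and package names are hypothetical, and it assumes an installed tidygraph version that has the <code>node_key</code> argument, used here to say explicitly which node column the edge names refer to.</p>

```r
library(tidygraph)

# Hypothetical node table: one row per package
toy_nodes <- data.frame(
    package = c("pkgA", "pkgB", "pkgC"),
    repo    = c("Bioconductor", "Bioconductor", "CRAN")
)

# Hypothetical edge table: "from" uses "to"
toy_edges <- data.frame(
    from = c("pkgB", "pkgC"),
    to   = c("pkgA", "pkgA"),
    uses = c("imports", "suggests")
)

# Match the character from/to values against the "package" column
toy_graph <- tbl_graph(nodes = toy_nodes, edges = toy_edges,
                       node_key = "package")

toy_graph
```

<p>Any extra columns, such as <code>repo</code> and <code>uses</code>, are carried along as node and edge attributes, which is what makes them available for colouring when plotting.</p>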
<p>We can now visualise the relationships using <a href="https://cran.r-project.org/package=ggraph" title="ggraph">ggraph</a>:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"fr"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_fan</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> uses),</span>
<span id="cb6-3">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)),</span>
<span id="cb6-4">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_text</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> package, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> repo), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repel =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Set1"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_graph</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2018-05-20-exploring-the-sce-verse/plot-graph-1.png" class="img-fluid"></p>
<p>This doesn’t tell us a lot we didn’t already know but it does allow us to see everything in one place. We can see that there are only a couple of CRAN packages, which is unsurprising given that SingleCellExperiment is part of Bioconductor, and that most packages either “import” or “depend” on SingleCellExperiment.</p>
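<p>The split between relationship types can also be checked numerically by tallying the edge table. This is only a sketch on a made-up edge list, so the counts are not real results. The <code>toy_uses</code> table is hypothetical and the real <code>edges</code> table from above would take its place.</p>

```r
library(dplyr)

# Hypothetical edge list in the same package1/uses/package2 format
toy_uses <- tibble(
    package1 = c("pkgB", "pkgC", "pkgD", "pkgE"),
    uses     = c("imports", "imports", "depends", "suggests"),
    package2 = "pkgA"
)

# Number of relationships of each type, most common first
uses_counts <- toy_uses %>%
    count(uses, sort = TRUE)

uses_counts
```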
<p>What about the relationships between the packages that depend on SingleCellExperiment? Are there communities of related scRNA-seq analysis tools?</p>
</section>
<section id="adding-an-extra-hop" class="level1">
<h1>Adding an extra hop</h1>
<p>We can reuse the functions we wrote earlier to get the dependencies (and reverse dependencies) of our list of scRNA-seq packages. We will also do a little bit of extra processing to tidy up some of the results.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">more_deps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map2</span>(nodes<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>package, nodes<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>repo, <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x, y) {</span>
<span id="cb7-2">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bioconductor"</span>) {</span>
<span id="cb7-3">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_bioc_deps</span>(bpi, x, <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb7-4">    } <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> {</span>
<span id="cb7-5">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_cran_deps</span>(x, db, <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb7-6">    }</span>
<span id="cb7-7">}) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package2 =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str_remove</span>(package2, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">" ?</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\\</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">(</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\\</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">D+[0-9</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\\</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">.]+</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\\</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">)"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb7-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(package2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"R"</span>)</span>
<span id="cb7-11">more_revdeps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map2</span>(nodes<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>package, nodes<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>repo, <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x, y) {</span>
<span id="cb7-12">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bioconductor"</span>) {</span>
<span id="cb7-13">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_bioc_deps</span>(bpi, x, <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb7-14">    } <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> {</span>
<span id="cb7-15">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_cran_deps</span>(x, db, <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb7-16">    }</span>
<span id="cb7-17">}) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>()</span></code></pre></div></div>
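<p>The extra processing is mostly about version requirements: dependency fields can contain entries like <code>R (>= 3.5.0)</code>, so the regular expression strips the bracketed version and the filter then drops <code>R</code> itself. A small sketch of what the pattern does, using made-up example strings rather than values from the real tables:</p>

```r
library(stringr)

# Same pattern as above: optional space, "(", non-digits, version, ")"
version_pattern <- " ?\\(\\D+[0-9\\.]+\\)"

# Hypothetical dependency strings
raw_deps <- c("R (>= 3.5.0)", "methods", "S4Vectors (>= 0.17.25)")

# Strip the version requirement, leaving only the package name
cleaned <- str_remove(raw_deps, version_pattern)

# Drop R itself, which is not a package
cleaned[cleaned != "R"]
```

<p>Note that the pattern only allows digits and dots in the version, so a requirement containing a hyphen (for example a hypothetical <code>(>= 1.2-3)</code>) would not match and would be left in place.</p>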
<p>Let’s build another graph and plot what we get. As you can see it’s a bit crowded so I have left off the package labels.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">nodes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> more_deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(more_revdeps) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>uses) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gather</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">key =</span> id, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">value =</span> package) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>id) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">distinct</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repo =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">if_else</span>(package <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> bpi<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Package, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bioconductor"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CRAN"</span>))</span>
<span id="cb8-8">edges <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> more_deps <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(more_revdeps) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rename</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">from =</span> package1, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">to =</span> package2) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb8-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">distinct</span>()</span>
<span id="cb8-12">graph <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tbl_graph</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nodes =</span> nodes, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edges =</span> edges)</span>
<span id="cb8-13"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"fr"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_fan</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> uses),</span>
<span id="cb8-15">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)),</span>
<span id="cb8-16">                  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-17">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-18">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#geom_node_text(aes(label = package, colour = repo), repel = TRUE) +</span></span>
<span id="cb8-19">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Set1"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-20">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-21">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_graph</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2018-05-20-exploring-the-sce-verse/plot-more-1.png" class="img-fluid"></p>
<p>Our graph has a lot more information now, but is probably too complex to tell us anything useful. In particular, we can see there are a lot of nodes around the edges that have one package depending on them. Let’s get rid of those by removing sink nodes (nodes with no outgoing edges).</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">graph <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tbl_graph</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nodes =</span> nodes, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edges =</span> edges) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">activate</span>(nodes) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb9-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">node_is_sink</span>())</span>
<span id="cb9-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"fr"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_fan</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> uses),</span>
<span id="cb9-6">                   <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)),</span>
<span id="cb9-7">                   <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_text</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> package, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> repo), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,</span>
<span id="cb9-10">                   <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repel =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Set1"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_graph</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2018-05-20-exploring-the-sce-verse/remove-sinks-1.png" class="img-fluid"></p>
<p>That’s much better! We can now see some of the structure between our packages. There are quite a few packages that rely on SingleCellExperiment but nothing else. Removing source nodes as well will tidy this up a bit more.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">graph <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tbl_graph</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nodes =</span> nodes, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edges =</span> edges) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">activate</span>(nodes) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">node_is_sink</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">node_is_source</span>())</span>
<span id="cb10-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"fr"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_fan</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> uses),</span>
<span id="cb10-7">                   <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)),</span>
<span id="cb10-8">                   <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> repo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_text</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> package, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> repo),</span>
<span id="cb10-11">                   <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">repel =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Set1"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_color_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_graph</span>()</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2018-05-20-exploring-the-sce-verse/remove-sources-1.png" class="img-fluid"></p>
<p>Now we can see the core SingleCellExperiment package network. Apart from SingleCellExperiment itself there are three main packages: scater, scran and splatter. Scater and scran are two low-level scRNA-seq analysis packages, with scater providing functions for tasks such as visualisation and filtering and scran focusing more on normalisation and removal of batch effects. It is unsurprising that these packages show up, as Aaron Lun is heavily involved in the development of both of them as well as SingleCellExperiment. Splatter is a bit of a different case as it suggests many packages that provide the core functions for its simulations but isn’t used by any other analysis packages. There are a few other influential packages on the periphery of this network, particularly monocle and Seurat.</p>
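<p>To make the sink and source filtering above concrete, here is a minimal sketch on a toy graph. The package names (<code>pkgA</code>, <code>pkgB</code>, <code>core</code>, <code>leaf</code>) and the <code>toy_*</code> objects are made up for illustration and are not part of the analysis; an edge from A to B is assumed to mean “A uses B”.</p>
<div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
library(tidygraph)

# Hypothetical toy dependency graph (not real packages)
toy_nodes &lt;- tibble(name = c("pkgA", "pkgB", "core", "leaf"))
toy_edges &lt;- tibble(from = c("pkgA", "pkgB", "pkgB", "core"),
                    to   = c("core", "core", "pkgA", "leaf"))
toy_graph &lt;- tbl_graph(nodes = toy_nodes, edges = toy_edges)

# Flag sinks (no outgoing edges) and sources (no incoming edges)
flags &lt;- toy_graph %&gt;%
    activate(nodes) %&gt;%
    mutate(is_sink = node_is_sink(), is_source = node_is_source()) %&gt;%
    as_tibble()

# Same two-step filter as in the post: drop sinks, then drop sources
trimmed &lt;- toy_graph %&gt;%
    activate(nodes) %&gt;%
    filter(!node_is_sink()) %&gt;%
    filter(!node_is_source())

remaining &lt;- trimmed %&gt;% activate(nodes) %&gt;% pull(name)
print(flags)
print(remaining)</code></pre></div>
<p>On this toy graph <code>leaf</code> is the only sink and <code>pkgB</code> the only source, so only <code>pkgA</code> and <code>core</code> should remain. Each filter is applied once rather than repeatedly: <code>core</code> becomes a sink after <code>leaf</code> is removed but is still kept.</p>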
</section>
<section id="what-do-these-packages-do" class="level1">
<h1>What do these packages do?</h1>
<p>We have had a look at how packages that use SingleCellExperiment are related but what do they actually do? Bioconductor categorises packages using “biocViews”, tags that describe software in various ways. Let’s summarise those for the packages that use SingleCellExperiment.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">plot_data <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bpi <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(Package <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> bioc_revdeps<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>package1) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(Package, biocViews) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">separate_rows</span>(biocViews) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(biocViews) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarise</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">count =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>count) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prop =</span> count <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">biocViews =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(biocViews, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">level =</span> biocViews))</span>
<span id="cb11-10"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(plot_data, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> biocViews, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> prop)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_col</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> scales<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>percent) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggtitle</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"biocViews for packages that use SingleCellExperiment"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-15">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axis.text.x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">element_text</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">angle =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">hjust =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">vjust =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>),</span>
<span id="cb11-16">          <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axis.title =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">element_blank</span>(),</span>
<span id="cb11-17">          <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">panel.grid =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">element_blank</span>())</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2018-05-20-exploring-the-sce-verse/biocViews-1.png" class="img-fluid"></p>
<p>Unsurprisingly the top few categories (“SingleCell”, “GeneExpression”, “RNASeq”, “Transcriptomics”, “Sequencing”) are related to scRNA-seq data in general, along with the “Software” tag. I actually find it a bit surprising that these aren’t more common with only <!-- `r round(plot_data[plot_data$biocViews == "SingleCell", "prop"] * 100)` --> 66 percent of packages labelled as “SingleCell”. After these we have some of the most common scRNA-seq analysis tasks, “Clustering”, “Visualization”, “DifferentialExpression” and “DimensionReduction”. This is similar to what we see in the scRNA-tools database, but obviously from a much smaller sample.</p>
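<p>As a small illustration of the tally above, here is a sketch on a made-up table. The <code>toy_bpi</code> data, its package names and the <code>n_pkgs</code> helper are assumptions for the example only. The denominator in this sketch is the number of distinct packages, stored before the tags are summarised, which is one way to get the share of packages carrying each tag.</p>
<div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
library(tidyr)

# Hypothetical stand-in for the Bioconductor package table
toy_bpi &lt;- tibble(
    Package   = c("pkgA", "pkgB", "pkgC"),
    biocViews = c("SingleCell, Clustering", "SingleCell, Software", "Software")
)

# Number of packages, stored before the table is collapsed to one row per tag
n_pkgs &lt;- n_distinct(toy_bpi$Package)

toy_views &lt;- toy_bpi %&gt;%
    separate_rows(biocViews, sep = ",\\s*") %&gt;%   # one row per package/tag pair
    count(biocViews, name = "count") %&gt;%          # packages per tag
    mutate(prop = count / n_pkgs) %&gt;%             # share of packages with the tag
    arrange(desc(count), biocViews)

print(toy_views)</code></pre></div>
<p>With three toy packages, a tag used by two of them gets a proportion of two thirds, whatever the number of distinct tags in the table.</p>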
<p>We can’t do the same thing for the CRAN packages, but as there are only a couple of these we can just describe them. Seurat is perhaps the most complete R scRNA-seq analysis package, covering most steps in a standard workflow. Its connection to SingleCellExperiment is through functions that have recently been added to convert to/from its own object. Clustree is a package for visualising clustering results in general and suggests SingleCellExperiment to provide a convenience function for people working with scRNA-seq data.</p>
</section>
<section id="where-to-from-here" class="level1">
<h1>Where to from here?</h1>
<p>It’s only been about six months since SingleCellExperiment joined the Bioconductor release but we are already seeing a community of packages growing up around it. Hopefully we see this continue and there is a new batch of packages using it in the next Bioconductor release. For anyone who is working on a scRNA-seq package I strongly encourage you to consider basing it around SingleCellExperiment. It can take some time to get your head around how it works but the infrastructure it provides will save you a lot of time in the long run. It also makes things a lot easier for your users, who won’t have to learn a new data structure to use your package and can make use of a range of packages without having to convert between objects. If you don’t want to be locked into the Bioconductor ecosystem think about using the <a href="https://cran.r-project.org/package=Seurat" title="Seurat">Seurat</a> object instead or if you work in Python consider the <a href="https://anndata.readthedocs.io/en/latest/" title="anndata">anndata</a> object. There is also the <a href="http://loompy.org/" title="loom">loom</a> format which has both R and Python interfaces. Whichever standard works for you, everyone will be better off if the community can make use of a small number of data structures rather than each package using its own.</p>


</section>

 ]]></description>
  <category>bioconductor</category>
  <category>R</category>
  <category>scrna-seq</category>
  <category>SingleCellExperiment</category>
  <category>analysis</category>
  <guid>https://lazappi.id.au/posts/2018-05-20-exploring-the-sce-verse/</guid>
  <pubDate>Fri, 08 Jun 2018 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Bioconductor 3.7 wrap-up</title>
  <link>https://lazappi.id.au/posts/2018-05-04-bioconductor-3-7-wrap-up/</link>
  <description><![CDATA[ 





<p>The Bioconductor 3.7 release was announced this week. I thought I would have a look through the new packages and changes to existing packages and point out some of my highlights. The descriptions below are my summaries; if you want to see more detail you can read the full release notes <a href="https://bioconductor.org/news/bioc_3_7_release/" title="Bioc 3.7 news">here</a>.</p>
<section id="single-cell-rna-seq" class="level1">
<h1>Single-cell RNA-seq</h1>
<p>My interest is in single-cell RNA-seq analysis, so I am going to start off with packages related to this.</p>
<section id="new-packages" class="level2">
<h2 class="anchored" data-anchor-id="new-packages">New packages</h2>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/BEARscc.html" title="BEARscc"><strong>BEARscc</strong></a> - noise estimation tool to assess scRNA-seq clusters</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/ccfindR.html" title="ccfindR"><strong>ccfindR</strong></a> - collection of tools for cancer scRNA-seq analysis, including meta-gene identification and trees of cell clusters</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/DEsingle.html" title="DEsingle"><strong>DEsingle</strong></a> - detects three types of differential expression between two groups of cells: differential expression status, differential expression abundance and general differential expression</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/DropletUtils.html" title="DropletUtils"><strong>DropletUtils</strong></a> - utility functions for handling data from droplet technologies like the 10x Chromium</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/iSEE.html" title="iSEE"><strong>iSEE</strong></a> - Interactive SummarizedExperiment Explorer, Shiny-based GUI for exploring data in SummarizedExperiment objects, with special attention given to SingleCellExperiment</li>
</ul>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lazappi.id.au/posts/2018-05-04-bioconductor-3-7-wrap-up/iSEE.png" class="img-fluid figure-img"></p>
<figcaption>Example iSEE window</figcaption>
</figure>
</div>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/LineagePulse.html" title="LineagePulse"><strong>LineagePulse</strong></a> - differential expression and expression model fitting package for scRNA-seq, accounting for batch effects, dropout and sequencing depth</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/MetaNeighbor.html" title="MetaNeighbor"><strong>MetaNeighbor</strong></a> - quantifies cell type replicability across datasets</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/netSmooth.html" title="netSmooth"><strong>netSmooth</strong></a> - imputation of scRNA-seq data using biological networks</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scFeatureFilter.html" title="scFeatureFilter"><strong>scFeatureFilter</strong></a> - correlation based method for removing genes affected by systematic noise</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/singleCellTK.html" title="singleCellTK"><strong>singleCellTK</strong></a> - Shiny-based interactive scRNA-seq analysis toolkit</li>
<li><a href="https://bioconductor.org/packages/release/data/experiment/html/TENxBrainData.html" title="TENxBrainData"><strong>TENxBrainData</strong></a> - scRNA-seq data from 1.3 million mouse brain cells</li>
<li><a href="https://bioconductor.org/packages/release/workflows/html/simpleSingleCell.html" title="simpleSingleCell"><strong>simpleSingleCell</strong></a> - workflow implementing low-level scRNA-seq analysis using scran, scater and other Bioconductor packages</li>
</ul>
</section>
<section id="updates" class="level2">
<h2 class="anchored" data-anchor-id="updates">Updates</h2>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/AUCell.html" title="AUCell"><strong>AUCell</strong></a> - new Shiny app and plotting functions, support for sparse matrices</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/clusterExperiment.html" title="clusterExperiment"><strong>clusterExperiment</strong></a> - support for hdf5 files and SingleCellExperiment objects</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/monocle.html" title="monocle"><strong>monocle</strong></a> - changes to clustering algorithms</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scater.html" title="scater"><strong>scater</strong></a> - changes to <code>calculateQCMetrics()</code> and plotting functions, some functionality moved to new packages</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scDD.html" title="scDD"><strong>scDD</strong></a> - proportion of zeros test now uses the Wald test instead of likelihood ratio, performance improvements</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/scran.html" title="scran"><strong>scran</strong></a> - various bug fixes, improvements and new arguments</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html" title="SingleCellExperiment"><strong>SingleCellExperiment</strong></a> - new functions for clearing and setting information</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/splatter.html" title="splatter"><strong>splatter</strong></a> - new options for Splat simulation library size and dropout parameters, new SparseDC simulation, improved print output</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/zinbwave.html" title="zinbwave"><strong>zinbwave</strong></a> - now uses <code>counts</code> assay by default, users can specify which assay to use, computational weights now saved as an assay, improved documentation</li>
</ul>
</section>
</section>
<section id="other-areas" class="level1">
<h1>Other areas</h1>
<section id="new-packages-1" class="level2">
<h2 class="anchored" data-anchor-id="new-packages-1">New packages</h2>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/enrichplot.html" title="enrichplot"><strong>enrichplot</strong></a> - ggplot2 based functions for visualising gene-set enrichment results</li>
</ul>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lazappi.id.au/posts/2018-05-04-bioconductor-3-7-wrap-up/enrichplot.png" class="img-fluid figure-img"></p>
<figcaption>Network plot from enrichplot</figcaption>
</figure>
</div>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/GARS.html" title="GARS"><strong>GARS</strong></a> - feature selection for high-dimensional datasets using genetic algorithms</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/plyranges.html" title="plyranges"><strong>plyranges</strong></a> - dplyr-like interface for Ranges and GenomicRanges objects</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/PowerExplorer.html" title="PowerExplorer"><strong>PowerExplorer</strong></a> - simulation based power calculations</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/singscore.html" title="singscore"><strong>singscore</strong></a> - rank-based single-sample gene set scoring method</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SummarizedBenchmark.html" title="SummarizedBenchmark"><strong>SummarizedBenchmark</strong></a> - BenchDesign and SummarizedBenchmark classes for building, executing and evaluating software benchmark experiments</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/vidger.html" title="vidger"><strong>vidger</strong></a> - functions for visualising differential expression results from Cuffdiff, DESeq2 and edgeR</li>
</ul>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lazappi.id.au/posts/2018-05-04-bioconductor-3-7-wrap-up/vidger.png" class="img-fluid figure-img"></p>
<figcaption>Example plot from vidger</figcaption>
</figure>
</div>
<ul>
<li><a href="https://bioconductor.org/packages/release/workflows/html/BiocMetaWorkflow.html" title="BiocMetaWorkflow"><strong>BiocMetaWorkflow</strong></a> - workflow describing how to use BiocWorkflowTools to submit a single R Markdown document to both Bioconductor and F1000Research</li>
</ul>
</section>
<section id="updates-1" class="level2">
<h2 class="anchored" data-anchor-id="updates-1">Updates</h2>
<ul>
<li><a href="https://bioconductor.org/packages/release/bioc/html/DESeq2.html" title="DESeq2"><strong>DESeq2</strong></a> - performance improvements and deprecation of designs without replicates</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/edgeR.html" title="edgeR"><strong>edgeR</strong></a> - new <code>read10X()</code>, <code>nearestTSS()</code>, <code>nearestReftoX()</code>, <code>modelMatrixMeth()</code> and <code>filterByExpr()</code> functions</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html" title="GenomicRanges"><strong>GenomicRanges</strong></a> - GenomicRanges is now a list subclass, performance improvements</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/minfi.html" title="minfi"><strong>minfi</strong></a> - preliminary support for DelayedArray minfi objects</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html" title="SummarizedExperiment"><strong>SummarizedExperiment</strong></a> - new <code>subset</code> method</li>
<li><a href="https://bioconductor.org/packages/release/bioc/html/tximport.html" title="tximport"><strong>tximport</strong></a> - support for StringTie output</li>
</ul>


</section>
</section>

 ]]></description>
  <category>bioconductor</category>
  <category>R</category>
  <category>scrna-seq</category>
  <guid>https://lazappi.id.au/posts/2018-05-04-bioconductor-3-7-wrap-up/</guid>
  <pubDate>Thu, 03 May 2018 22:00:00 GMT</pubDate>
</item>
<item>
  <title>My AFL-Elo model</title>
  <link>https://lazappi.id.au/posts/2018-04-21-my-afl-elo-model/</link>
  <description><![CDATA[ 





<!-- ```{r knitr, include = FALSE}
knitr::opts_chunk$set(autodep        = TRUE,
                      cache          = FALSE,
                      cache.comments = TRUE,
                      echo           = FALSE,
                      error          = FALSE,
                      fig.align      = "center",
                      fig.width      = 10,
                      fig.height     = 8,
                      message        = FALSE,
                      warning        = FALSE)
``` -->
<!-- ```{r libraries}
library("knitr")
library("here")
library("formattable")
library("tidyverse")
``` -->
<p>Over the last few years I have followed a lot of the work done by <a href="http://fivethirtyeight.com/" title="FiveThirtyEight">FiveThirtyEight</a>, particularly their attempts to model and predict sport. More recently I have discovered there is a community of people trying to do similar things for the AFL, including <a href="https://thearcfooty.com/" title="The Arc">The Arc</a>, <a href="https://squiggle.com.au/" title="Squiggle">Squiggle</a>, <a href="http://www.matterofstats.com/" title="Matter of Stats">Matter of Stats</a> and <a href="http://www.hpnfooty.com/" title="Hurling People Now">Hurling People Now</a>.</p>
<p>Many of these modelling projects are based around the <a href="https://en.wikipedia.org/wiki/Elo_rating_system" title="Elo system">Elo system</a>. If you haven’t heard of it before, it is a rating system originally designed for chess by a Hungarian physicist. In the simplest form each player (or team) is assigned a rating. When a match is played you can estimate a win probability based on the difference between the ratings. The ratings are then adjusted based on the result in such a way that unexpected results cause bigger changes than those that are closer to what was predicted. This model is relatively naive and simple to implement: no knowledge of the players or teams themselves is required, just the results of matches. Despite that, it can still produce good predictions.</p>
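<p>As a rough illustration of that idea, here is a minimal sketch of the standard chess form of Elo, written in Python rather than taken from any of the AFL models. The function names, the 400-point scale and the k value of 32 are the usual textbook choices and are assumptions here; the AFL versions also account for margins and home ground advantage, which this sketch ignores.</p>

```python
def win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score for A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update_ratings(rating_a: float, rating_b: float, result_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b).

    result_a is 1 for a win by A, 0.5 for a draw and 0 for a loss.
    The textbook values scale=400 and k=32 are assumptions, not AFL settings.
    """
    expected_a = win_probability(rating_a, rating_b)
    change = k * (result_a - expected_a)
    # Whatever A gains, B loses, so the total rating in the system is unchanged.
    return rating_a + change, rating_b - change


favourite, underdog = 1600.0, 1400.0
print(win_probability(favourite, underdog))      # favourite's chance of winning
print(update_ratings(favourite, underdog, 1.0))  # expected result: small change
print(update_ratings(favourite, underdog, 0.0))  # upset: much larger change
```

<p>Because the change is proportional to the gap between the result and the expected score, an upset shifts the ratings by much more than a result that was already likely.</p>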
<p>Given this I thought it would be a good place to start. My version of the model is closely based on the one described by The Arc <a href="https://thearcfooty.com/2016/12/29/introducing-the-arcs-ratings-system/" title="The Arc model">here</a>. There were a few different things I wanted to try but (as always) everything took longer than I planned, so what I have done in the end is very similar. The one area where I have done things differently is the process used to select the parameters of the model. This part wasn’t really described in the post on The Arc so I was left to my own devices. Here are brief descriptions of the parameters, but if you are interested I suggest you check out the outline of the model on The Arc which has a lot more detail:</p>
<ul>
<li><strong>New team rating</strong> - The starting rating for new teams that enter the competition (Gold Coast and GWS). Original teams start with a rating of 1500.</li>
<li><strong>New season adjustment</strong> - Amount to regress to the mean at the beginning of a new season</li>
<li><strong>HGA alpha</strong> - Weighting given to travel distance when calculating home ground advantage (HGA)</li>
<li><strong>HGA beta</strong> - Weighting given to ground experience when calculating HGA</li>
<li><strong>p</strong> - Controls how win probabilities are converted to margins</li>
<li><strong>k</strong> - Controls how differences between predicted and actual results affect ratings. Greater values cause greater changes, meaning the model reacts quicker to what has happened but also that it is more unstable. In many ways this is the critical parameter for the Elo model. For this version of the model we use three different values:
<ul>
<li><strong>Early</strong> - Used for the first five rounds of the regular season</li>
<li><strong>Normal</strong> - Used for the remainder of the regular season</li>
<li><strong>Finals</strong> - Used for finals matches</li>
</ul></li>
</ul>
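<p>To make two of these parameters concrete, here is a small Python sketch of how the three k values and the new season adjustment could be applied. It is not code from the model: the function names are hypothetical, the numbers in the usage lines are illustrative, and the linear pull towards 1500 is an assumed form of "regress to the mean".</p>

```python
MEAN_RATING = 1500.0  # starting rating of the original teams, as listed above


def choose_k(round_number: int, is_final: bool,
             k_early: float, k_normal: float, k_finals: float) -> float:
    """Pick the k value for a match from the stage of the season."""
    if is_final:
        return k_finals
    if round_number in range(1, 6):  # first five rounds of the regular season
        return k_early
    return k_normal


def new_season_rating(rating: float, adjustment: float) -> float:
    """Assumed form: move a rating a fraction `adjustment` of the way to the mean."""
    return rating + adjustment * (MEAN_RATING - rating)


# Illustrative values only
print(choose_k(3, False, k_early=92, k_normal=62, k_finals=33))
print(choose_k(12, False, k_early=92, k_normal=62, k_finals=33))
print(new_season_rating(1600.0, 0.33))
```

<p>With an adjustment of 0 ratings carry over unchanged between seasons, while an adjustment of 1 resets every team to the mean.</p>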
<p>To select these parameters I chose to use a <a href="https://en.wikipedia.org/wiki/Genetic_algorithm" title="Genetic optimisation">genetic optimisation algorithm</a>, partly because it is potentially able to explore a wider parameter space, but also because I think they are cool. To do this we need a measure of fitness to aim for. For sport predictions there are generally two things we want to know: who is going to win and by how much. These are often best estimated using different sets of parameters. For this reason I ran the optimisation procedure three times: once optimising for win prediction accuracy, once optimising for the mean absolute error in predicting the margin and once for a 50/50 balance between the two. Each optimisation procedure was run for 100 generations with 100 individuals in each generation, training the model on all AFL games from 1997 to 2016 and assessing performance on the games from 2000 to 2016. This leaves the 2017 season as a validation set to check the selected parameters. Here are the best performing parameter sets from each of the optimisations compared to the default parameters based on The Arc:</p>
<!-- ```{r opt-summary}
opt_summ <- read_tsv(here("static/data/afl2018/optimisation_summary.tsv"),
                     col_types = cols(
                         .default = col_double(),
                         Version = col_character()
                     ))
opt_summ %>%
    mutate(new_team_rating = round(new_team_rating),
           new_season_adjustment = round(new_season_adjustment, 2),
           hga_alpha = round(hga_alpha, 2),
           hga_beta = round(hga_beta, 2),
           pred_p = round(pred_p, 4),
           adjust_k_early = round(adjust_k_early),
           adjust_k_normal = round(adjust_k_normal),
           adjust_k_finals = round(adjust_k_finals),
           Margin2016 = round(Margin2016, 2),
           Predict2016 = round(Predict2016, 2),
           Margin2017 = round(Margin2017, 2),
           Predict2017 = round(Predict2017, 2)) %>%
    mutate_all(as.character) %>%
    select(-Version) %>%
    rename(NewTeamRating = new_team_rating,
           NewSeasonAdjustment = new_season_adjustment,
           HGA_Alpha = hga_alpha,
           HGA_Beta = hga_beta,
           p = pred_p,
           k_Early = adjust_k_early,
           k_Normal = adjust_k_normal,
           k_Finals = adjust_k_finals) %>%
    t() %>%
    data.frame() %>%
    rename(Default = X1, Margin = X2, Balanced = X3, Prediction = X4) %>%
    kable()
``` -->
<table class="table">
<thead>
<tr class="header">
<th>
</th>
<th align="left">
Default
</th>
<th align="left">
Margin
</th>
<th align="left">
Balanced
</th>
<th align="left">
Prediction
</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>
NewTeamRating
</td>
<td align="left">
1090
</td>
<td align="left">
1292
</td>
<td align="left">
1284
</td>
<td align="left">
1106
</td>
</tr>
<tr class="even">
<td>
NewSeasonAdjustment
</td>
<td align="left">
0.1
</td>
<td align="left">
0.33
</td>
<td align="left">
0.39
</td>
<td align="left">
0.54
</td>
</tr>
<tr class="odd">
<td>
HGA_Alpha
</td>
<td align="left">
6
</td>
<td align="left">
1.33
</td>
<td align="left">
2.89
</td>
<td align="left">
2.05
</td>
</tr>
<tr class="even">
<td>
HGA_Beta
</td>
<td align="left">
15
</td>
<td align="left">
12.89
</td>
<td align="left">
2.1
</td>
<td align="left">
5.68
</td>
</tr>
<tr class="odd">
<td>
p
</td>
<td align="left">
0.0464
</td>
<td align="left">
0.027
</td>
<td align="left">
0.0204
</td>
<td align="left">
0.078
</td>
</tr>
<tr class="even">
<td>
k_Early
</td>
<td align="left">
82
</td>
<td align="left">
92
</td>
<td align="left">
92
</td>
<td align="left">
55
</td>
</tr>
<tr class="odd">
<td>
k_Normal
</td>
<td align="left">
62
</td>
<td align="left">
62
</td>
<td align="left">
42
</td>
<td align="left">
38
</td>
</tr>
<tr class="even">
<td>
k_Finals
</td>
<td align="left">
72
</td>
<td align="left">
33
</td>
<td align="left">
80
</td>
<td align="left">
43
</td>
</tr>
<tr class="odd">
<td>
Margin2016
</td>
<td align="left">
29.9
</td>
<td align="left">
29.82
</td>
<td align="left">
29.76
</td>
<td align="left">
32.51
</td>
</tr>
<tr class="even">
<td>
Predict2016
</td>
<td align="left">
0.68
</td>
<td align="left">
0.68
</td>
<td align="left">
0.68
</td>
<td align="left">
0.69
</td>
</tr>
<tr class="odd">
<td>
Margin2017
</td>
<td align="left">
29.09
</td>
<td align="left">
28.94
</td>
<td align="left">
29.23
</td>
<td align="left">
30.18
</td>
</tr>
<tr class="even">
<td>
Predict2017
</td>
<td align="left">
0.61
</td>
<td align="left">
0.63
</td>
<td align="left">
0.61
</td>
<td align="left">
0.62
</td>
</tr>
</tbody>
</table>
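<p>The Margin and Predict rows in the table appear to correspond to the two fitness measures described earlier: the mean absolute error of the predicted margin and the proportion of winners tipped correctly. A minimal Python sketch of both is below. The function names are hypothetical, margins are assumed to be from the home team's point of view, and draws are not treated specially.</p>

```python
def prediction_accuracy(predicted_margins, actual_margins):
    """Fraction of games where the predicted winner matches the actual winner.

    A positive margin means a home win. Draws (a margin of exactly 0) are
    not handled specially in this sketch.
    """
    pairs = list(zip(predicted_margins, actual_margins))
    correct = sum(1 for predicted, actual in pairs if (predicted > 0) == (actual > 0))
    return correct / len(pairs)


def mean_absolute_error(predicted_margins, actual_margins):
    """Average size of the miss, in points, between predicted and actual margins."""
    pairs = list(zip(predicted_margins, actual_margins))
    return sum(abs(predicted - actual) for predicted, actual in pairs) / len(pairs)


# Made-up margins for four games, purely for illustration
predicted = [12, -5, 20, 3]
actual = [30, 10, -4, 8]
print(prediction_accuracy(predicted, actual))  # 0.5
print(mean_absolute_error(predicted, actual))  # 15.5
```

<p>Higher is better for accuracy and lower is better for the margin error, so any combined fitness has to put the two on a common scale first. How the 50/50 balance was scaled is not described here, so it is left out of the sketch.</p>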
<p>Based on the 2017 results I decided to go with the Margin model. Despite being optimised for margin accuracy it also performed the best at predicting results in 2017. This might suggest that the optimisation procedure is not ideal, but that is a problem for another day… Encouragingly, all three of my models outperform the defaults, which suggests that the results will be somewhere in the range of The Arc, and I am more than happy with that.</p>
<p>If you are interested in how I have done things, I have made an <a href="https://github.com/lazappi/aflelo" title="aflelo"><code>aflelo</code></a> R package, which you can install from GitHub, and my analysis and predictions for each round will be available <a href="https://github.com/lazappi/afl-2018" title="afl2018">here</a>.</p>
<p>Now that I have a model I can use it to make predictions about the 2018 season!</p>
<section id="round-5" class="level1">
<h1>Round 5</h1>
<section id="summary" class="level2">
<h2 class="anchored" data-anchor-id="summary">Summary</h2>
<!-- ```{r summ_table}
summ_table <- read_rds(here("static/data/afl2018/R5/summary_table.Rds"))
summ_table
``` -->
<table class="table table-condensed">
<thead>
<tr>
<th style="text-align:right;">
Team
</th>
<th style="text-align:right;">
Rating
</th>
<th style="text-align:right;">
Change
</th>
<th style="text-align:right;">
Points
</th>
<th style="text-align:right;">
Percentage
</th>
<th style="text-align:right;">
ProjRating
</th>
<th style="text-align:right;">
ProjPoints
</th>
<th style="text-align:right;">
Top2
</th>
<th style="text-align:right;">
Top4
</th>
<th style="text-align:right;">
Top8
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:right;">
Richmond
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #add8e6">1570</span>
</td>
<td style="text-align:right;">
<span style="color: green">19</span>
</td>
<td style="text-align:right;">
12
</td>
<td style="text-align:right;">
130
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #add8e6">1555</span>
</td>
<td style="text-align:right;">
56.5
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb6c1">28.2</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb6c1">49.8</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb6c1">78.9</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Sydney
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b1d9e7">1561</span>
</td>
<td style="text-align:right;">
<span style="color: red">-6</span>
</td>
<td style="text-align:right;">
12
</td>
<td style="text-align:right;">
108
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b2dae7">1546</span>
</td>
<td style="text-align:right;">
56.9
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb6c1">27.9</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb6c1">49.4</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb6c1">79.0</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
GW Sydney
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b2dae7">1558</span>
</td>
<td style="text-align:right;">
<span style="color: green">0</span>
</td>
<td style="text-align:right;">
12
</td>
<td style="text-align:right;">
140
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b3dbe7">1544</span>
</td>
<td style="text-align:right;">
56.3
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb7c2">27.5</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb7c2">48.7</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb6c1">79.4</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
West Coast
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b3dbe7">1556</span>
</td>
<td style="text-align:right;">
<span style="color: green">21</span>
</td>
<td style="text-align:right;">
12
</td>
<td style="text-align:right;">
136
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b2dae7">1545</span>
</td>
<td style="text-align:right;">
54.7
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffc3cc">23.0</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffc0c9">42.7</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffbac4">74.9</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Hawthorn
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b4dbe8">1553</span>
</td>
<td style="text-align:right;">
<span style="color: green">34</span>
</td>
<td style="text-align:right;">
12
</td>
<td style="text-align:right;">
127
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b6dce8">1539</span>
</td>
<td style="text-align:right;">
56.3
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb9c3">26.9</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb8c2">48.2</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffb6c1">78.7</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Adelaide
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b7dde9">1546</span>
</td>
<td style="text-align:right;">
<span style="color: red">-44</span>
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
107
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b8dde9">1535</span>
</td>
<td style="text-align:right;">
50.7
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffdce1">13.3</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffd3da">29.7</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffc5cd">63.1</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Geelong
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #b9dee9">1542</span>
</td>
<td style="text-align:right;">
<span style="color: green">10</span>
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
109
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #badeea">1532</span>
</td>
<td style="text-align:right;">
50.0
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffdfe4">12.0</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffd6dc">27.6</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffc6cf">61.3</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Collingwood
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #bfe0eb">1530</span>
</td>
<td style="text-align:right;">
<span style="color: green">44</span>
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
107
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #c0e1ec">1521</span>
</td>
<td style="text-align:right;">
49.6
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffe2e6">11.1</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffd7dd">26.8</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffc8d1">59.1</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Port Adelaide
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #bfe0eb">1529</span>
</td>
<td style="text-align:right;">
<span style="color: red">-18</span>
</td>
<td style="text-align:right;">
12
</td>
<td style="text-align:right;">
117
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #c0e1eb">1522</span>
</td>
<td style="text-align:right;">
51.9
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffd7dd">15.4</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffcfd6">32.5</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffc3cc">64.6</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
North Melbourne
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #cee7f0">1497</span>
</td>
<td style="text-align:right;">
<span style="color: green">28</span>
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
134
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #cde7f0">1499</span>
</td>
<td style="text-align:right;">
43.2
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fff2f4">4.7</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffecef">12.6</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffdce1">38.4</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Essendon
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #cfe8f0">1494</span>
</td>
<td style="text-align:right;">
<span style="color: green">18</span>
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
99
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #cde7f0">1499</span>
</td>
<td style="text-align:right;">
40.7
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fff7f8">3.0</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fff2f4">8.5</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffe5e9">28.2</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Melbourne
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #d0e8f0">1493</span>
</td>
<td style="text-align:right;">
<span style="color: red">-34</span>
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
98
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #d0e8f0">1495</span>
</td>
<td style="text-align:right;">
43.1
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fff5f6">3.8</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffeef1">11.3</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffdee3">35.5</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Western Bulldogs
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ddeff4">1463</span>
</td>
<td style="text-align:right;">
<span style="color: green">6</span>
</td>
<td style="text-align:right;">
4
</td>
<td style="text-align:right;">
72
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #deeff5">1470</span>
</td>
<td style="text-align:right;">
36.0
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffdfd">0.7</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffafb">3.2</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fff0f3">16.3</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Fremantle
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #e0f0f5">1458</span>
</td>
<td style="text-align:right;">
<span style="color: green">0</span>
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
89
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #e1f0f5">1466</span>
</td>
<td style="text-align:right;">
39.3
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffafb">1.8</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fff6f7">5.9</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffeaed">23.3</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
St Kilda
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #e9f4f8">1438</span>
</td>
<td style="text-align:right;">
<span style="color: red">-10</span>
</td>
<td style="text-align:right;">
4
</td>
<td style="text-align:right;">
68
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #e6f3f7">1456</span>
</td>
<td style="text-align:right;">
29.4
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffefe">0.2</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffdfd">1.1</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffafa">6.4</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Gold Coast
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #f3f9fb">1415</span>
</td>
<td style="text-align:right;">
<span style="color: red">-21</span>
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
83
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #f4f9fb">1433</span>
</td>
<td style="text-align:right;">
32.7
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffdfe">0.4</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffcfd">1.6</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fff7f8">9.1</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Brisbane
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #f7fbfc">1407</span>
</td>
<td style="text-align:right;">
<span style="color: red">-19</span>
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
64
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #f8fbfd">1426</span>
</td>
<td style="text-align:right;">
24.3
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffefe">0.1</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffefe">0.3</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #fffdfd">2.7</span>
</td>
</tr>
<tr>
<td style="text-align:right;">
Carlton
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffffff">1390</span>
</td>
<td style="text-align:right;">
<span style="color: red">-28</span>
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
61
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffffff">1415</span>
</td>
<td style="text-align:right;">
20.5
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffffff">0.0</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffffff">0.1</span>
</td>
<td style="text-align:right;">
<span style="display: block; padding: 0 4px; border-radius: 4px; background-color: #ffffff">1.2</span>
</td>
</tr>
</tbody>
</table>
</section>
<section id="predictions" class="level2">
<h2 class="anchored" data-anchor-id="predictions">Predictions</h2>
<p><img src="https://lazappi.id.au/posts/2018-04-21-my-afl-elo-model/predictions.png" class="img-fluid"></p>
</section>
<section id="projections" class="level2">
<h2 class="anchored" data-anchor-id="projections">Projections</h2>
<section id="ladder" class="level3">
<h3 class="anchored" data-anchor-id="ladder">Ladder</h3>
<p><img src="https://lazappi.id.au/posts/2018-04-21-my-afl-elo-model/ladder.png" class="img-fluid"></p>
</section>
<section id="premiership-points" class="level3">
<h3 class="anchored" data-anchor-id="premiership-points">Premiership points</h3>
<p><img src="https://lazappi.id.au/posts/2018-04-21-my-afl-elo-model/points.png" class="img-fluid"></p>
</section>
</section>
<section id="history" class="level2">
<h2 class="anchored" data-anchor-id="history">History</h2>
<p><img src="https://lazappi.id.au/posts/2018-04-21-my-afl-elo-model/history.png" class="img-fluid"></p>


</section>
</section>

 ]]></description>
  <category>afl</category>
  <category>elo</category>
  <category>sport</category>
  <guid>https://lazappi.id.au/posts/2018-04-21-my-afl-elo-model/</guid>
  <pubDate>Fri, 20 Apr 2018 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Welcome to Blogdown!</title>
  <link>https://lazappi.id.au/posts/2018-03-24-welcome-to-blogdown/</link>
  <description><![CDATA[ 





<section id="welcome-to-blogdown" class="level1">
<h1>Welcome to Blogdown!</h1>
<p>I’ve just finished migrating this blog from Ghost to Blogdown. Ghost was great in a lot of ways, but Blogdown adds the ability to write in RMarkdown, which I’m hoping will encourage me to post more often, particularly on things that include some code or analysis.</p>
<p>Thanks to everyone who has written about their experiences with Blogdown. I didn’t keep track of the resources I found useful so I can’t point them out but they definitely made the process easier.</p>
<p>See you around!</p>


</section>

 ]]></description>
  <category>blogdown</category>
  <guid>https://lazappi.id.au/posts/2018-03-24-welcome-to-blogdown/</guid>
  <pubDate>Fri, 23 Mar 2018 23:00:00 GMT</pubDate>
  <media:content url="https://lazappi.id.au/posts/2018-03-24-welcome-to-blogdown/featured.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Joining the Dots Twitter analysis</title>
  <link>https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/</link>
  <description><![CDATA[ 





<p>Today I attended the <a href="https://joiningthedots.github.io/">Joining the Dots</a> visualisation symposium. You can see the slides for my talk about clustering trees <a href="https://speakerdeck.com/lazappi/building-a-clustering-tree">here</a>. It was a great event and I hope we see more meetings like this in the future. Here is an analysis of the Twitter activity on the <a href="https://twitter.com/search?q=%23jtdwehi&amp;src=typd">#jtdwehi</a> hashtag, thanks to code from <a href="https://nsaunders.wordpress.com">Neil Saunders</a>. The code is available on <a href="https://github.com/lazappi/jtdwehi-twitter">GitHub</a>.</p>
<section id="introduction" class="level1">
<h1>Introduction</h1>
<p>An analysis of tweets from the Joining the Dots symposium. 1237 tweets were collected using the <code>rtweet</code> R package:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">jtdwehi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">search_tweets</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#jtdwehi"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">saveRDS</span>(jtdwehi, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"data/jtdwehi.Rds"</span>)</span></code></pre></div></div>
<section id="search-all-the-hashtags" class="level2">
<h2 class="anchored" data-anchor-id="search-all-the-hashtags">Search all the hashtags!</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/hashtags-1.png" class="img-fluid"></p>
</section>
</section>
<section id="timeline" class="level1">
<h1>Timeline</h1>
<section id="tweets-by-day" class="level2">
<h2 class="anchored" data-anchor-id="tweets-by-day">Tweets by day</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/tweets-by-day-1.png" class="img-fluid"></p>
</section>
<section id="tweets-by-day-and-time" class="level2">
<h2 class="anchored" data-anchor-id="tweets-by-day-and-time">Tweets by day and time</h2>
<p>Filtered for dates July 21-26, Prague time.</p>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/tweets-by-day-hour-1.png" class="img-fluid"></p>
</section>
</section>
<section id="users" class="level1">
<h1>Users</h1>
<section id="top-tweeters" class="level2">
<h2 class="anchored" data-anchor-id="top-tweeters">Top tweeters</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/tweets-top-users-1.png" class="img-fluid"></p>
</section>
<section id="sources" class="level2">
<h2 class="anchored" data-anchor-id="sources">Sources</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/tweets-top-sources-1.png" class="img-fluid"></p>
</section>
</section>
<section id="networks" class="level1">
<h1>Networks</h1>
<section id="replies" class="level2">
<h2 class="anchored" data-anchor-id="replies">Replies</h2>
<p>The “replies network”, made up of users who reply directly to one another, coloured by PageRank.</p>
<p>Better to view the original PNG file in the <code>data</code> directory.</p>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/jtdwehi_replies.png" class="img-fluid"></p>
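<p>For readers who want to try something similar, here is a minimal sketch of scoring users in a reply network by PageRank with <code>igraph</code>. The <code>replies</code> data frame and its user names are made up for illustration, and this is not necessarily how the figure above was produced.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(igraph)

# Hypothetical reply edges: `from` replied to `to` (illustrative data only)
replies &lt;- data.frame(
    from = c("user_a", "user_b", "user_c", "user_d"),
    to   = c("user_b", "user_c", "user_a", "user_a")
)

# Build a directed graph from the edge list
g &lt;- graph_from_data_frame(replies, directed = TRUE)

# page_rank() returns a list; the per-vertex scores are in $vector
V(g)$page_rank &lt;- page_rank(g)$vector

# Users ordered from highest to lowest PageRank
sort(V(g)$page_rank, decreasing = TRUE)</code></pre></div>
<p>The stored <code>page_rank</code> vertex attribute could then be mapped to node colour when plotting the graph.</p>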
</section>
<section id="mentions" class="level2">
<h2 class="anchored" data-anchor-id="mentions">Mentions</h2>
<p>The “mentions network”, where users mention other users in their tweets. Filtered for k-core &gt;= 4 and coloured by modularity class.</p>
<p>Better to view the original PNG file in the <code>data</code> directory.</p>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/jtdwehi_mentions-1.png" class="img-fluid"></p>
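<p>As a rough illustration of what a k-core filter does, the sketch below keeps only vertices with coreness of at least 4 using <code>igraph</code>. The toy graph is invented for the example, and the figure above may have been filtered with a different tool.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(igraph)

# Toy graph (illustrative only): five fully connected vertices
# plus one extra vertex attached to vertex 5 by a single edge
g &lt;- make_full_graph(5)
g &lt;- add_vertices(g, 1)
g &lt;- add_edges(g, c(5, 6))

# coreness() gives, for each vertex, the largest k such that the
# vertex belongs to the k-core of the graph
core &lt;- coreness(g)

# Keep only the vertices in the 4-core
g_core &lt;- induced_subgraph(g, which(core &gt;= 4))

# Compare the number of vertices before and after filtering
vcount(g)
vcount(g_core)</code></pre></div>
<p>Filtering like this drops loosely connected users, which usually makes a dense mentions network easier to read.</p>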
</section>
</section>
<section id="retweets" class="level1">
<h1>Retweets</h1>
<section id="retweet-proportion" class="level2">
<h2 class="anchored" data-anchor-id="retweet-proportion">Retweet proportion</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/is-retweet-1.png" class="img-fluid"></p>
</section>
<section id="retweet-count" class="level2">
<h2 class="anchored" data-anchor-id="retweet-count">Retweet count</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/retweet-count-1.png" class="img-fluid"></p>
</section>
<section id="top-retweets" class="level2">
<h2 class="anchored" data-anchor-id="top-retweets">Top retweets</h2>
<table>
<thead>
<tr>
<th style="text-align:left;">
screen_name
</th>
<th style="text-align:left;">
text
</th>
<th style="text-align:right;">
retweet_count
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
<em>lazappi</em>
</td>
<td style="text-align:left;">
Slides from my #jtdwehi talk today about building a clustering tree https://t.co/lwTztVstOC
</td>
<td style="text-align:right;">
12
</td>
</tr>
<tr>
<td style="text-align:left;">
<em>lazappi</em>
</td>
<td style="text-align:left;">
.@bestqualitycrab Visualising creative research (more creatively) #jtdwehi #sketchnotes https://t.co/DXhk1u22nf
</td>
<td style="text-align:right;">
10
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
.@claresloggett’s tips on where to start with data viz in Python #jtdwehi https://t.co/jN626uOAqd
</td>
<td style="text-align:right;">
10
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
Visualising grant recipients: Davids most funded but Richards get more money #jtdwehi https://t.co/iPImbK4paf
</td>
<td style="text-align:right;">
9
</td>
</tr>
<tr>
<td style="text-align:left;">
mikejonesmelb
</td>
<td style="text-align:left;">
Really valuable point from <span class="citation" data-cites="KathyReid">@KathyReid</span>: sometimes #dataviz decisions affected by need to consider political priorities and buy-in #jtdwehi
</td>
<td style="text-align:right;">
9
</td>
</tr>
<tr>
<td style="text-align:left;">
gravitron
</td>
<td style="text-align:left;">
<span class="citation" data-cites="bestqualitycrab">@bestqualitycrab</span> demoing dataviz: ask the tricky Q’s not the obvious. Consider the felt not just the instrumental.… https://t.co/ca1zCn4oSO
</td>
<td style="text-align:right;">
8
</td>
</tr>
<tr>
<td style="text-align:left;">
mikejonesmelb
</td>
<td style="text-align:left;">
More on the Transport Network Strategic Investment Tool (TraNSIT) here https://t.co/z5v827bfjd <span class="citation" data-cites="Xavier_Ho">@Xavier_Ho</span> #jtdwehi
</td>
<td style="text-align:right;">
8
</td>
</tr>
<tr>
<td style="text-align:left;">
mikejonesmelb
</td>
<td style="text-align:left;">
To visualise data is to encode it; how can we decode it? So Isabelle created Tracey McTraceface https://t.co/4YoxS4T6OS #jtdwehi
</td>
<td style="text-align:right;">
7
</td>
</tr>
<tr>
<td style="text-align:left;">
oldmateo
</td>
<td style="text-align:left;">
:: "Research publishing methods stuck in the Stone Age" :: Brendan Ansell on balancing completeness and salience i… https://t.co/7WVV2Ni31U
</td>
<td style="text-align:right;">
7
</td>
</tr>
<tr>
<td style="text-align:left;">
gravitron
</td>
<td style="text-align:left;">
<span class="citation" data-cites="bestqualitycrab">@bestqualitycrab</span> leading a chorus of Slipping Away. Just your run of the mill dataviz conference. #JoiningTheDots… https://t.co/6oxUMXZfpm
</td>
<td style="text-align:right;">
7
</td>
</tr>
</tbody>
</table>
</section>
</section>
<section id="favourites" class="level1">
<h1>Favourites</h1>
<section id="favourite-proportion" class="level2">
<h2 class="anchored" data-anchor-id="favourite-proportion">Favourite proportion</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/has-favorite-1.png" class="img-fluid"></p>
</section>
<section id="favourite-count" class="level2">
<h2 class="anchored" data-anchor-id="favourite-count">Favourite count</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/favorite-count-1.png" class="img-fluid"></p>
</section>
<section id="top-favourites" class="level2">
<h2 class="anchored" data-anchor-id="top-favourites">Top favourites</h2>
<table>
<thead>
<tr>
<th style="text-align:left;">
screen_name
</th>
<th style="text-align:left;">
text
</th>
<th style="text-align:right;">
favorite_count
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
<em>lazappi</em>
</td>
<td style="text-align:left;">
Slides from my #jtdwehi talk today about building a clustering tree https://t.co/lwTztVstOC
</td>
<td style="text-align:right;">
19
</td>
</tr>
<tr>
<td style="text-align:left;">
Xavier_Ho
</td>
<td style="text-align:left;">
People are flowing back in #jtdwehi https://t.co/t4aU8WXoX9
</td>
<td style="text-align:right;">
16
</td>
</tr>
<tr>
<td style="text-align:left;">
WEHI_research
</td>
<td style="text-align:left;">
Welcome to delegates attending today’s symposium Joining the Dots: The Art and Science of Data Visualisation! #jtdwehi #dataviz
</td>
<td style="text-align:right;">
16
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
Visualising grant recipients: Davids most funded but Richards get more money #jtdwehi https://t.co/iPImbK4paf
</td>
<td style="text-align:right;">
12
</td>
</tr>
<tr>
<td style="text-align:left;">
<em>lazappi</em>
</td>
<td style="text-align:left;">
.@bestqualitycrab Visualising creative research (more creatively) #jtdwehi #sketchnotes https://t.co/DXhk1u22nf
</td>
<td style="text-align:right;">
11
</td>
</tr>
<tr>
<td style="text-align:left;">
robbie_bonelli
</td>
<td style="text-align:left;">
So inspired by the talk given by <span class="citation" data-cites="bestqualitycrab">@bestqualitycrab</span> on the problem of #genderequality and how #dataviz can help us! Thanks Deb! #jtdwehi
</td>
<td style="text-align:right;">
11
</td>
</tr>
<tr>
<td style="text-align:left;">
KathyReid
</td>
<td style="text-align:left;">
The incredible <span class="citation" data-cites="bestqualitycrab">@bestqualitycrab</span> keynoting #jtdwehi https://t.co/mLgKdVt4IX
</td>
<td style="text-align:right;">
11
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
.@claresloggett’s tips on where to start with data viz in Python #jtdwehi https://t.co/jN626uOAqd
</td>
<td style="text-align:right;">
11
</td>
</tr>
<tr>
<td style="text-align:left;">
peterneish
</td>
<td style="text-align:left;">
Building a clustering tree https://t.co/KDgdRfBejZ #jtdwehi
</td>
<td style="text-align:right;">
11
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
Representing Greek films via olive trees (which are are actually Markov chains) #jtdwehi https://t.co/SB2CG4oH8D
</td>
<td style="text-align:right;">
10
</td>
</tr>
</tbody>
</table>
</section>
</section>
<section id="quotes" class="level1">
<h1>Quotes</h1>
<section id="quote-proportion" class="level2">
<h2 class="anchored" data-anchor-id="quote-proportion">Quote proportion</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/is-quote-1.png" class="img-fluid"></p>
</section>
<section id="quote-count" class="level2">
<h2 class="anchored" data-anchor-id="quote-count">Quote count</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/quotes-count-1.png" class="img-fluid"></p>
</section>
<section id="top-quotes" class="level2">
<h2 class="anchored" data-anchor-id="top-quotes">Top quotes</h2>
<table>
<thead>
<tr>
<th style="text-align:left;">
screen_name
</th>
<th style="text-align:left;">
text
</th>
<th style="text-align:right;">
quote_count
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
peterneish
</td>
<td style="text-align:left;">
Would love to see some taxonomic data plotted like this. #jtdwehi https://t.co/EbBL872fum
</td>
<td style="text-align:right;">
5
</td>
</tr>
<tr>
<td style="text-align:left;">
Xavier_Ho
</td>
<td style="text-align:left;">
overlaying clusters: the datavis movie #jtdwehi https://t.co/KA5ovvvW6r
</td>
<td style="text-align:right;">
5
</td>
</tr>
<tr>
<td style="text-align:left;">
frostickle
</td>
<td style="text-align:left;">
<p>Where can people go from here, to take advantage of things they’ve learnt at #jtdwehi? <span class="citation" data-cites="ResPlat">@ResPlat</span>? <span class="citation" data-cites="OKFNau">@OKFNau</span>?</p>
#dataviz https://t.co/TM6ngns9RS
</td>
<td style="text-align:right;">
4
</td>
</tr>
<tr>
<td style="text-align:left;">
rowlandm
</td>
<td style="text-align:left;">
The money shot from <span class="citation" data-cites="_lazappi_">@_lazappi_</span> ! #jtdwehi https://t.co/nqynLrC7Vg
</td>
<td style="text-align:right;">
3
</td>
</tr>
<tr>
<td style="text-align:left;">
Xavier_Ho
</td>
<td style="text-align:left;">
Slide here: https://t.co/o2E59HHoZE #jtdwehi https://t.co/L98WV1tXgu
</td>
<td style="text-align:right;">
3
</td>
</tr>
<tr>
<td style="text-align:left;">
karinv
</td>
<td style="text-align:left;">
Thanks to <span class="citation" data-cites="FCTweedie">@FCTweedie</span> and <span class="citation" data-cites="rubin_af">@rubin_af</span> for a great day of #dataviz! #jtdwehi https://t.co/Hti5FQtMGz
</td>
<td style="text-align:right;">
3
</td>
</tr>
<tr>
<td style="text-align:left;">
rowlandm
</td>
<td style="text-align:left;">
LImited funding … sounds like research! #jtdwehi https://t.co/gZwllFhtRe
</td>
<td style="text-align:right;">
2
</td>
</tr>
<tr>
<td style="text-align:left;">
peterneish
</td>
<td style="text-align:left;">
Fascinating insights into the life sciences #jtdwehi https://t.co/LpRwfP00ns
</td>
<td style="text-align:right;">
2
</td>
</tr>
<tr>
<td style="text-align:left;">
karinv
</td>
<td style="text-align:left;">
Adding the correct hashtag! (sorry folks) #jtdwehi https://t.co/PoGZe8k1k8
</td>
<td style="text-align:right;">
2
</td>
</tr>
<tr>
<td style="text-align:left;">
robbie_bonelli
</td>
<td style="text-align:left;">
Depressing and motivating! #jtdwehi https://t.co/YCGB1ibYkw
</td>
<td style="text-align:right;">
2
</td>
</tr>
</tbody>
</table>
</section>
</section>
<section id="media" class="level1">
<h1>Media</h1>
<section id="media-count" class="level2">
<h2 class="anchored" data-anchor-id="media-count">Media count</h2>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/has-media-1.png" class="img-fluid"></p>
</section>
<section id="top-media" class="level2">
<h2 class="anchored" data-anchor-id="top-media">Top media</h2>
<table>
<thead>
<tr>
<th style="text-align:left;">
screen_name
</th>
<th style="text-align:left;">
text
</th>
<th style="text-align:right;">
favorite_count
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
Xavier_Ho
</td>
<td style="text-align:left;">
People are flowing back in #jtdwehi https://t.co/t4aU8WXoX9
</td>
<td style="text-align:right;">
16
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
Visualising grant recipients: Davids most funded but Richards get more money #jtdwehi https://t.co/iPImbK4paf
</td>
<td style="text-align:right;">
12
</td>
</tr>
<tr>
<td style="text-align:left;">
<em>lazappi</em>
</td>
<td style="text-align:left;">
.@bestqualitycrab Visualising creative research (more creatively) #jtdwehi #sketchnotes https://t.co/DXhk1u22nf
</td>
<td style="text-align:right;">
11
</td>
</tr>
<tr>
<td style="text-align:left;">
KathyReid
</td>
<td style="text-align:left;">
The incredible <span class="citation" data-cites="bestqualitycrab">@bestqualitycrab</span> keynoting #jtdwehi https://t.co/mLgKdVt4IX
</td>
<td style="text-align:right;">
11
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
.@claresloggett’s tips on where to start with data viz in Python #jtdwehi https://t.co/jN626uOAqd
</td>
<td style="text-align:right;">
11
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
Representing Greek films via olive trees (which are are actually Markov chains) #jtdwehi https://t.co/SB2CG4oH8D
</td>
<td style="text-align:right;">
10
</td>
</tr>
<tr>
<td style="text-align:left;">
frostickle
</td>
<td style="text-align:left;">
<p>Now <span class="citation" data-cites="Xavier_Ho">@Xavier_Ho</span> from the <span class="citation" data-cites="CSIROnews">@CSIROnews</span> is talking about Visualising the Australian Transport Network</p>
#jtdwehi #dataviz https://t.co/DcvXYmD45F
</td>
<td style="text-align:right;">
10
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
Getting underway for #jtdwehi with acknowledgement of country from <span class="citation" data-cites="WEHI_research">@WEHI_research</span>’s director https://t.co/oNcnu5wtd9
</td>
<td style="text-align:right;">
10
</td>
</tr>
<tr>
<td style="text-align:left;">
FCTweedie
</td>
<td style="text-align:left;">
Patriarchy looks like this! What happens when we can describe the shape of injustice #jtdwehi https://t.co/8A7EhnFmt5
</td>
<td style="text-align:right;">
9
</td>
</tr>
<tr>
<td style="text-align:left;">
gravitron
</td>
<td style="text-align:left;">
Best URL of the day goes to <span class="citation" data-cites="Isa_Kiko">@Isa_Kiko</span>’s https://t.co/kapY0Aeacy A great looking tool! #JoiningTheDots #jtdwehi https://t.co/gal2v1PUJY
</td>
<td style="text-align:right;">
7
</td>
</tr>
</tbody>
</table>
<section id="most-liked-media-image" class="level3">
<h3 class="anchored" data-anchor-id="most-liked-media-image">Most liked media image</h3>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/most_liked_media.jpg" class="img-fluid"></p>
</section>
</section>
</section>
<section id="tweet-text" class="level1">
<h1>Tweet text</h1>
<p>The top 100 words used three or more times.</p>
<p><img src="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/count-words-1.png" class="img-fluid"></p>


</section>

 ]]></description>
  <category>conference</category>
  <category>visualisation</category>
  <category>twitter</category>
  <guid>https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/</guid>
  <pubDate>Thu, 17 Aug 2017 22:00:00 GMT</pubDate>
  <media:content url="https://lazappi.id.au/posts/2017-08-18-joining-the-dots-twitter-analysis/featured.png" medium="image" type="image/png" height="108" width="144"/>
</item>
<item>
  <title>Building a clustering tree</title>
  <link>https://lazappi.id.au/posts/2017-07-19-building-a-clustering-tree/</link>
  <description><![CDATA[ 





<p>For my PhD I am working on methods for analysing single-cell RNA-sequencing (scRNA-seq) data which measure the expression of genes in individual cells. One of the most common analyses done on this type of data is to cluster the cells, often in an attempt to find out what cell types are present in a sample.</p>
<p>In a recent seminar I showed some images of what I am calling a “clustering tree” (you can see the slides <a href="https://speakerdeck.com/lazappi/wehi-bioinformatics-seminar">here</a> if you are interested). This is a visualisation I came up with to show the relationship between clusterings as the number of clusters is increased. A few people asked how I had made it so here is a short example.</p>
<section id="setup" class="level2">
<h2 class="anchored" data-anchor-id="setup">Setup</h2>
<p>First we need to load the libraries we are going to use:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Simulation</span></span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(splatter)</span>
<span id="cb1-3"></span>
<span id="cb1-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Clustering</span></span>
<span id="cb1-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(Seurat) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Installed from https://github.com/satijalab/seurat</span></span>
<span id="cb1-6"></span>
<span id="cb1-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Graphs</span></span>
<span id="cb1-8"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(igraph)</span>
<span id="cb1-9"></span>
<span id="cb1-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Plotting</span></span>
<span id="cb1-11"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggraph)</span>
<span id="cb1-12"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(viridis)</span>
<span id="cb1-13"></span>
<span id="cb1-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Data manipulation</span></span>
<span id="cb1-15"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyverse)</span></code></pre></div></div>
<p>For this example I am going to simulate some scRNA-seq data with eight different groups with different numbers of cells using <a href="https://bioconductor.org/packages/splatter"><code>Splatter</code></a>.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">sim <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">splatSimulateGroups</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">groupCells =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">80</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">60</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>),</span>
<span id="cb2-2">                            <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">seed =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">verbose =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span></code></pre></div></div>
<p>Let’s take a quick look at this to see if it is anything like we would expect:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plotTSNE</span>(sim, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour_by =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Group"</span>)</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2017-07-19-building-a-clustering-tree/sim-tSNE-1.png" class="img-fluid"></p>
<p>Here we can see the different groups, so there should be something for our clustering analysis to find.</p>
<p>The clustering package we are going to use is <a href="http://satijalab.org/seurat/"><code>Seurat</code></a>, which uses its own object to store the data. Here is a small function I have written to convert from the <code>SCESet</code> object produced by <code>Splatter</code> to the <code>Seurat</code> object required by <code>Seurat</code>.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">SCESetToSeurat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(sce) {</span>
<span id="cb4-2">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is</span>(sce,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'SCESet'</span>)) {</span>
<span id="cb4-3">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stop</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sce must be an SCESet object"</span>)</span>
<span id="cb4-4">    }</span>
<span id="cb4-5"></span>
<span id="cb4-6">    counts <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> scater<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">counts</span>(sce)</span>
<span id="cb4-7"></span>
<span id="cb4-8">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is.null</span>(counts)) {</span>
<span id="cb4-9">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stop</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sce must contain counts to convert to Seurat"</span>)</span>
<span id="cb4-10">    }</span>
<span id="cb4-11"></span>
<span id="cb4-12">    seurat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"seurat"</span>,</span>
<span id="cb4-13">                    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">raw.data =</span> counts,</span>
<span id="cb4-14">                    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">is.expr =</span> sce<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>lowerDetectionLimit,</span>
<span id="cb4-15">                    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data.info =</span> Biobase<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pData</span>(sce),</span>
<span id="cb4-16">                    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">cell.names =</span> Biobase<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sampleNames</span>(sce))</span>
<span id="cb4-17"></span>
<span id="cb4-18">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(seurat)</span>
<span id="cb4-19">}</span></code></pre></div></div>
<p>Now to convert the dataset.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">seurat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">SCESetToSeurat</span>(sim)</span></code></pre></div></div>
</section>
<section id="clustering" class="level2">
<h2 class="anchored" data-anchor-id="clustering">Clustering</h2>
<p>We now have a dataset in the format required by <code>Seurat</code>. Before we do any clustering we need to run through some setup steps. I’m not going to explain what they are doing here; if you want to know the details, refer to the <code>Seurat</code> <a href="http://satijalab.org/seurat/pbmc-tutorial.html">tutorials</a>.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">seurat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Setup</span>(seurat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">project =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Example"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">meta.data =</span> seurat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>data.info)</span>
<span id="cb6-2">seurat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">MeanVarPlot</span>(seurat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fxn.x =</span> expMean, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fxn.y =</span> logVarDivMean,</span>
<span id="cb6-3">                        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x.low.cutoff =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x.high.cutoff =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,</span>
<span id="cb6-4">                        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y.cutoff =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">do.contour =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb6-5"></span>
<span id="cb6-6">seurat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">PCA</span>(seurat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pc.genes =</span> seurat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>var.genes, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">do.print =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span></code></pre></div></div>
<p>Now we can do the clustering. The parameter we are interested in is the <code>resolution</code> parameter, which controls how many clusters <code>Seurat</code> returns (higher values give more clusters). I start by setting <code>resolution = 0</code>. This will create a single cluster containing all cells that will serve as the root of our tree. We also ask <code>Seurat</code> to store some of the intermediate calculations so we don’t have to do them again when we cluster with different resolutions:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">seurat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">FindClusters</span>(seurat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pc.use =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resolution =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">algorithm =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,</span>
<span id="cb7-2">                        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">print.output =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">save.SNN =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span></code></pre></div></div>
<p>We can now loop over a range of resolutions that we are interested in. I have only tried a few values here but if this was a real dataset you might want to try some more.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (res <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.9</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.2</span>)) {</span>
<span id="cb8-2">    seurat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">FindClusters</span>(seurat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resolution =</span> res, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">algorithm =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,</span>
<span id="cb8-3">                            <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">print.output =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb8-4">}</span></code></pre></div></div>
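<p>If you try a wider range of resolutions, it can help to keep the values in a vector so that the resulting columns can be put in numeric order later. Here is a minimal sketch in base R, assuming the <code>res.&lt;value&gt;</code> column naming that shows up in the results further down. The <code>resolutions</code>, <code>res.cols</code> and <code>res.values</code> names are made up for illustration and nothing here calls <code>Seurat</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Illustrative resolution values, deliberately out of order
resolutions &lt;- c(10, 0.5, 2, 0)

# Column names following the "res.&lt;value&gt;" pattern (an assumption)
res.cols &lt;- paste0("res.", resolutions)

# Strip the literal prefix and order by the numeric value,
# because sorting the names as strings would put "res.10" before "res.2"
res.values &lt;- as.numeric(sub("^res\\.", "", res.cols))
res.ordered &lt;- res.cols[order(res.values)]

res.ordered</code></pre></div>
<p>For the handful of values used in this post an ordinary string sort gives the same order, so this only matters once resolutions of 10 or more are mixed in.</p>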
</section>
<section id="get-results" class="level2">
<h2 class="anchored" data-anchor-id="get-results">Get results</h2>
<p><code>Seurat</code> stores the cluster labels in the <code>data.info</code> slot in columns starting with <code>res.</code>. This is the only part we are interested in so let’s extract just those columns.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">clusterings <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> seurat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>data.info <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">contains</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"res."</span>))</span>
<span id="cb9-2"></span>
<span id="cb9-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(clusterings)</span>
<span id="cb9-4"></span>
<span id="cb9-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##       res.0 res.0.3 res.0.6 res.0.9 res.1.2</span></span>
<span id="cb9-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Cell1     0       1       0       0       0</span></span>
<span id="cb9-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Cell2     0       1       0       0       0</span></span>
<span id="cb9-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Cell3     0       1       0       0       0</span></span>
<span id="cb9-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Cell4     0       1       0       0       0</span></span>
<span id="cb9-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Cell5     0       1       0       0       0</span></span>
<span id="cb9-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Cell6     0       1       0       0       0</span></span></code></pre></div></div>
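<p>As a rough sketch of the same selection step in base R, the <code>res.</code> columns can also be picked out with an anchored pattern, which only matches names that really start with that prefix. The <code>meta</code> data frame and its columns are a made-up stand-in for the <code>data.info</code> slot, not real <code>Seurat</code> output.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical metadata table standing in for seurat@data.info
meta &lt;- data.frame(
    Group      = c("Group1", "Group1", "Group2", "Group2"),
    res.0      = c(0, 0, 0, 0),
    res.0.3    = c(1, 1, 0, 0),
    ores.score = c(0.1, 0.5, 0.2, 0.9),  # contains "res." but is not a clustering
    row.names  = paste0("Cell", 1:4)
)

# Keep only columns whose names start with the literal prefix "res."
is.res &lt;- grepl("^res\\.", colnames(meta))
toy.clusterings &lt;- meta[, is.res, drop = FALSE]

colnames(toy.clusterings)</code></pre></div>
<p>Here only <code>res.0</code> and <code>res.0.3</code> are kept. A match on names that merely contain <code>res.</code> would also pick up the <code>ores.score</code> column.</p>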
<p>We now know which cluster each cell was assigned to at each resolution but to build the tree we need some more information. This next function looks at two neighbouring resolutions and works out how many cells moved from a cluster in the lower resolution to each cluster in the higher resolution. These transitions are going to form the edges of our tree.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">getEdges <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(clusterings) {</span>
<span id="cb10-2"></span>
<span id="cb10-3">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Loop over the different resolutions</span></span>
<span id="cb10-4">    transitions <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ncol</span>(clusterings) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(i) {</span>
<span id="cb10-5"></span>
<span id="cb10-6">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Extract two neighbouring clusterings</span></span>
<span id="cb10-7">        from.res <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sort</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">colnames</span>(clusterings))[i]</span>
<span id="cb10-8">        to.res <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sort</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">colnames</span>(clusterings))[i <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb10-9"></span>
<span id="cb10-10">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Get the cluster names</span></span>
<span id="cb10-11">        from.clusters <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sort</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unique</span>(clusterings[, from.res]))</span>
<span id="cb10-12">        to.clusters <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sort</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unique</span>(clusterings[, to.res]))</span>
<span id="cb10-13"></span>
<span id="cb10-14">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Get all possible combinations</span></span>
<span id="cb10-15">        trans.df <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expand.grid</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FromClust =</span> from.clusters,</span>
<span id="cb10-16">                                <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ToClust =</span> to.clusters)</span>
<span id="cb10-17"></span>
<span id="cb10-18">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Loop over the possible transitions</span></span>
<span id="cb10-19">        trans <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">apply</span>(trans.df, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) {</span>
<span id="cb10-20">            from.clust <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> x[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb10-21">            to.clust <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> x[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]</span>
<span id="cb10-22"></span>
<span id="cb10-23">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Find the cells from those clusters</span></span>
<span id="cb10-24">            is.from <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> clusterings[, from.res] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> from.clust</span>
<span id="cb10-25">            is.to <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> clusterings[, to.res] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> to.clust</span>
<span id="cb10-26"></span>
<span id="cb10-27">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Count them up</span></span>
<span id="cb10-28">            trans.count <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(is.from <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&amp;</span> is.to)</span>
<span id="cb10-29"></span>
<span id="cb10-30">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Get the sizes of the two clusters</span></span>
<span id="cb10-31">            from.size <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(is.from)</span>
<span id="cb10-32">            to.size <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(is.to)</span>
<span id="cb10-33"></span>
<span id="cb10-34">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Get the proportions of cells moving along this edge</span></span>
<span id="cb10-35">            trans.prop.from <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> trans.count <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> from.size</span>
<span id="cb10-36">            trans.prop.to <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> trans.count <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> to.size</span>
<span id="cb10-37"></span>
<span id="cb10-38">            <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(trans.count, trans.prop.from, trans.prop.to))</span>
<span id="cb10-39">        })</span>
<span id="cb10-40"></span>
<span id="cb10-41">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Tidy up the results</span></span>
<span id="cb10-42">        trans.df<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>FromRes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.numeric</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gsub</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"res."</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>, from.res))</span>
<span id="cb10-43">        trans.df<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>ToRes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.numeric</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gsub</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"res."</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>, to.res))</span>
<span id="cb10-44">        trans.df<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>TransCount <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> trans[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, ]</span>
<span id="cb10-45">        trans.df<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>TransPropFrom <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> trans[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, ]</span>
<span id="cb10-46">        trans.df<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>TransPropTo <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> trans[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, ]</span>
<span id="cb10-47"></span>
<span id="cb10-48">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(trans.df)</span>
<span id="cb10-49">    })</span>
<span id="cb10-50"></span>
<span id="cb10-51">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Bind the results from the different resolutions together</span></span>
<span id="cb10-52">    transitions <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">do.call</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rbind"</span>, transitions)</span>
<span id="cb10-53"></span>
<span id="cb10-54">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Tidy everything up</span></span>
<span id="cb10-55">    levs <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sort</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.numeric</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">levels</span>(transitions<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>ToClust)))</span>
<span id="cb10-56">    transitions <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> transitions <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-57">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FromClust =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(FromClust,</span>
<span id="cb10-58">                                    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">levels =</span> levs))  <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-59">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ToClust =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(ToClust, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">levels =</span> levs))</span>
<span id="cb10-60"></span>
<span id="cb10-61">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(transitions)</span>
<span id="cb10-62">}</span>
<span id="cb10-63"></span>
<span id="cb10-64">edges <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">getEdges</span>(clusterings)</span>
<span id="cb10-65"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(edges)</span>
<span id="cb10-66"></span>
<span id="cb10-67"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##   FromClust ToClust FromRes ToRes TransCount TransPropFrom TransPropTo</span></span>
<span id="cb10-68"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 1         0       0     0.0   0.3        135     0.3600000           1</span></span>
<span id="cb10-69"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 2         0       1     0.0   0.3        100     0.2666667           1</span></span>
<span id="cb10-70"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 3         0       2     0.0   0.3         60     0.1600000           1</span></span>
<span id="cb10-71"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 4         0       3     0.0   0.3         50     0.1333333           1</span></span>
<span id="cb10-72"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 5         0       4     0.0   0.3         30     0.0800000           1</span></span>
<span id="cb10-73"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 6         0       0     0.3   0.6          0     0.0000000           0</span></span></code></pre></div></div>
<p>Some of these columns are pretty obvious but the last three could do with an explanation. <code>TransCount</code> is the number of cells that move along this edge. <code>TransPropFrom</code> is the proportion of the cells in the lower resolution cluster that have made this transition and <code>TransPropTo</code> is the proportion of cells in the higher resolution cluster that came from this edge.</p>
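<p>A small worked example may make these three values more concrete. This is only a sketch in base R on made-up labels for six cells (the <code>from</code> and <code>to</code> vectors are illustrative, not taken from the simulation). It uses <code>table()</code> and <code>prop.table()</code> rather than the <code>getEdges</code> function, but the quantities are the same.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Toy cluster labels for six cells at two neighbouring resolutions
from &lt;- c(0, 0, 0, 0, 1, 1)  # lower resolution
to   &lt;- c(0, 0, 1, 1, 2, 2)  # higher resolution

# TransCount: number of cells shared by each pair of clusters
counts &lt;- table(FromClust = from, ToClust = to)

# TransPropFrom: divide each row by the size of the "from" cluster
prop.from &lt;- prop.table(counts, margin = 1)

# TransPropTo: divide each column by the size of the "to" cluster
prop.to &lt;- prop.table(counts, margin = 2)

counts["0", "1"]     # 2 cells move from cluster 0 to cluster 1
prop.from["0", "1"]  # 0.5, half of cluster 0 takes this edge
prop.to["0", "1"]    # 1, all of cluster 1 came from cluster 0</code></pre></div>
<p>Pairs of clusters that share no cells, such as cluster 1 to cluster 0 here, get a count of zero, just like the last row of the output above.</p>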
<p>Getting the information about the nodes of the tree is easier as these just represent the clusters. This function summarises the cluster information and converts it to long format.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">getNodes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(clusterings) {</span>
<span id="cb11-2">    nodes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> clusterings <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-3">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gather</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">key =</span> Res, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">value =</span> Cluster) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-4">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(Res, Cluster) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-5">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summarise</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Size =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-6">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ungroup</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-7">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Res =</span> stringr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str_replace</span>(Res, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"res."</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-8">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Res =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.numeric</span>(Res), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Cluster =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.numeric</span>(Cluster)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-9">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Node =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste0</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"R"</span>, Res, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C"</span>, Cluster)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb11-10">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(Node, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">everything</span>())</span>
<span id="cb11-11">}</span>
<span id="cb11-12"></span>
<span id="cb11-13">nodes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">getNodes</span>(clusterings)</span>
<span id="cb11-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(nodes)</span>
<span id="cb11-15"></span>
<span id="cb11-16"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## # A tibble: 6 x 4</span></span>
<span id="cb11-17"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##     Node   Res Cluster  Size</span></span>
<span id="cb11-18"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##    &lt;chr&gt; &lt;dbl&gt;   &lt;dbl&gt; &lt;int&gt;</span></span>
<span id="cb11-19"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 1   R0C0   0.0       0   375</span></span>
<span id="cb11-20"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 2 R0.3C0   0.3       0   135</span></span>
<span id="cb11-21"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 3 R0.3C1   0.3       1   100</span></span>
<span id="cb11-22"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 4 R0.3C2   0.3       2    60</span></span>
<span id="cb11-23"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 5 R0.3C3   0.3       3    50</span></span>
<span id="cb11-24"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## 6 R0.3C4   0.3       4    30</span></span></code></pre></div></div>
<p>Each node needs a unique ID which I have made by combining the resolution and cluster number. We also record the number of cells in each cluster.</p>
<p>Now we can build the graph we will use as the starting point for our plot. Some of the possible edges between clusters will have no cells travelling along them so we filter them out. We also remove edges that account for only a small proportion (2% or less) of the cells in the higher resolution cluster.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">graph <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> edges <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb12-2">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Remove edges without any cells...</span></span>
<span id="cb12-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(TransCount <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb12-4">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># ...or making up only a small proportion of the new cluster</span></span>
<span id="cb12-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(TransPropTo <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.02</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb12-6">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Rename the nodes</span></span>
<span id="cb12-7">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FromNode =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste0</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"R"</span>, FromRes, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C"</span>, FromClust)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb12-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ToNode =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste0</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"R"</span>, ToRes, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C"</span>, ToClust)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb12-9">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Reorder columns</span></span>
<span id="cb12-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(FromNode, ToNode, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">everything</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb12-11">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Build a graph using igraph</span></span>
<span id="cb12-12">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">graph_from_data_frame</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">vertices =</span> nodes)</span>
<span id="cb12-13"></span>
<span id="cb12-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>(graph)</span>
<span id="cb12-15"></span>
<span id="cb12-16"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## IGRAPH b1b93c3 DN-- 23 23 --</span></span>
<span id="cb12-17"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## + attr: name (v/c), Res (v/n), Cluster (v/n), Size (v/n),</span></span>
<span id="cb12-18"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## | FromClust (e/c), ToClust (e/c), FromRes (e/n), ToRes (e/n),</span></span>
<span id="cb12-19"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## | TransCount (e/n), TransPropFrom (e/n), TransPropTo (e/n)</span></span>
<span id="cb12-20"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## + edges from b1b93c3 (vertex names):</span></span>
<span id="cb12-21"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  [1] R0C0  -&gt;R0.3C0 R0C0  -&gt;R0.3C1 R0C0  -&gt;R0.3C2 R0C0  -&gt;R0.3C3</span></span>
<span id="cb12-22"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  [5] R0C0  -&gt;R0.3C4 R0.3C1-&gt;R0.6C0 R0.3C0-&gt;R0.6C1 R0.3C4-&gt;R0.6C1</span></span>
<span id="cb12-23"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  [9] R0.3C0-&gt;R0.6C2 R0.3C2-&gt;R0.6C3 R0.3C3-&gt;R0.6C4 R0.6C0-&gt;R0.9C0</span></span>
<span id="cb12-24"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## [13] R0.6C2-&gt;R0.9C1 R0.6C3-&gt;R0.9C2 R0.6C1-&gt;R0.9C3 R0.6C4-&gt;R0.9C4</span></span>
<span id="cb12-25"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## [17] R0.6C1-&gt;R0.9C5 R0.9C0-&gt;R1.2C0 R0.9C1-&gt;R1.2C1 R0.9C2-&gt;R1.2C2</span></span>
<span id="cb12-26"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## [21] R0.9C3-&gt;R1.2C3 R0.9C4-&gt;R1.2C4 R0.9C5-&gt;R1.2C5</span></span></code></pre></div></div>
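<p>Because edges are filtered rather than forced into a strict hierarchy, the result is not guaranteed to be a true tree: a cluster can receive cells from more than one parent. One way to check for this is to count the incoming edges for each node, as in the sketch below. It uses a toy graph built from four of the edges printed above rather than the full <code>graph</code> object, and the names <code>toy_edges</code>, <code>toy_graph</code> and <code>in_degree</code> are my own.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(igraph)

# A small subset of the edges printed above, using the same node names
toy_edges &lt;- data.frame(
    FromNode = c("R0C0", "R0C0", "R0.3C0", "R0.3C4"),
    ToNode   = c("R0.3C0", "R0.3C4", "R0.6C1", "R0.6C1")
)
toy_graph &lt;- graph_from_data_frame(toy_edges)

# Count the incoming edges for each node
in_degree &lt;- degree(toy_graph, mode = "in")

# Nodes with more than one parent are places where clusters merge
merged &lt;- names(in_degree[in_degree &gt; 1])
merged</code></pre></div></div>
<p>For this subset the only node with more than one incoming edge is <code>R0.6C1</code>, which receives cells from both <code>R0.3C0</code> and <code>R0.3C4</code>. The same <code>degree()</code> call should work on the full graph.</p>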
</section>
<section id="plot-the-tree" class="level2">
<h2 class="anchored" data-anchor-id="plot-the-tree">Plot the tree</h2>
<p>The last step is to pass our graph to the <a href="https://github.com/thomasp85/ggraph"><code>ggraph</code></a> library for plotting.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Plot our graph using the `tree` layout</span></span>
<span id="cb13-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(graph, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">layout =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tree"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-3">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Plot the edges, colour is the number of cells and transparency is the</span></span>
<span id="cb13-4">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># proportion contribution to the new cluster</span></span>
<span id="cb13-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_link</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">arrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrow</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'mm'</span>)),</span>
<span id="cb13-6">                    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_cap =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">circle</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3.5</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mm"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edge_width =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb13-7">                    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(TransCount), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">alpha =</span> TransPropTo)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-8">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Plot the nodes, size is the number of cells</span></span>
<span id="cb13-9">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(Res),</span>
<span id="cb13-10">                        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> Size)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_text</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> Cluster), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-12">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Adjust the scales</span></span>
<span id="cb13-13">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_size</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">range =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_colour_gradientn</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colours =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">viridis</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-15">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Add legend labels</span></span>
<span id="cb13-16">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">guides</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">guide_legend</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Cluster Size"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"top"</span>),</span>
<span id="cb13-17">            <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">guide_legend</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Clustering Resolution"</span>,</span>
<span id="cb13-18">                                    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"top"</span>),</span>
<span id="cb13-19">            <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edge_colour =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">guide_edge_colorbar</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Cell Count (log)"</span>,</span>
<span id="cb13-20">                                                <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"top"</span>),</span>
<span id="cb13-21">            <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">edge_alpha =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">guide_legend</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Cluster Prop"</span>,</span>
<span id="cb13-22">                                        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"top"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nrow =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-23">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Remove the axes as they don't really mean anything</span></span>
<span id="cb13-24">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_void</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-25">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bottom"</span>)</span></code></pre></div></div>
<p><img src="https://lazappi.id.au/posts/2017-07-19-building-a-clustering-tree/plot-tree-1.png" class="img-fluid"></p>
<p>And here is the result! We can see that <code>Seurat</code> finds three of the clusters easily and that these don’t change as the resolution increases. A fourth group contains most of the cells and is sub-divided as we increase resolution. Interestingly, at the lowest resolution there is a small cluster which is then absorbed into one of the other branches.</p>
<p>This tree is cleaner and has fewer branches than we would be likely to see with a real dataset, but the process to create it would be the same. I have used <code>Seurat</code> as the clustering method in this example but it should be easy to adapt the process to any other method that allows you to adjust the number of clusters. I have found this visualisation useful in my analysis, particularly for seeing which clusters are very distinct and for exploring the relationships between different clusters and clusterings.</p>
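<p>As a rough sketch of what adapting the process might look like, the code below uses k-means from base R in place of <code>Seurat</code>, with the number of clusters <em>k</em> playing the role of the resolution. The built-in <code>iris</code> dataset, the range of <em>k</em> values and the name <code>kmeans_clusterings</code> are all assumptions made for illustration. The columns are named in the same <code>res.X</code> style so that, assuming the functions above only rely on that prefix, the result could be used in place of <code>clusterings</code>.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Scale the numeric columns of the built-in iris dataset
set.seed(1)
data &lt;- scale(iris[, 1:4])

# Values of k to try, here k takes the place of the clustering resolution
ks &lt;- 1:5

# Run k-means for each k and store the cluster labels, one column per k
kmeans_clusterings &lt;- sapply(ks, function(k) {
    kmeans(data, centers = k, nstart = 10)$cluster
})
kmeans_clusterings &lt;- as.data.frame(kmeans_clusterings)

# Name the columns in the same "res.X" style used above
colnames(kmeans_clusterings) &lt;- paste0("res.", ks)

head(kmeans_clusterings)</code></pre></div></div>
<p>Each row is a sample and each column holds the cluster labels for one value of <em>k</em>, which is the same shape as the table of <code>Seurat</code> clusterings used in this post.</p>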
<p>Good luck creating your own clustering trees!</p>
<section id="session-information" class="level3">
<h3 class="anchored" data-anchor-id="session-information">Session information</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1">devtools<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">session_info</span>()</span>
<span id="cb14-2"></span>
<span id="cb14-3"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Session info -------------------------------------------------------------</span></span>
<span id="cb14-4"></span>
<span id="cb14-5"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  setting  value</span></span>
<span id="cb14-6"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  version  R version 3.4.1 (2017-06-30)</span></span>
<span id="cb14-7"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  system   x86_64, darwin15.6.0</span></span>
<span id="cb14-8"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ui       RStudio (1.0.143)</span></span>
<span id="cb14-9"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  language (EN)</span></span>
<span id="cb14-10"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  collate  en_AU.UTF-8</span></span>
<span id="cb14-11"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tz       Australia/Melbourne</span></span>
<span id="cb14-12"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  date     2017-07-19</span></span>
<span id="cb14-13"></span>
<span id="cb14-14"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">## Packages -----------------------------------------------------------------</span></span>
<span id="cb14-15"></span>
<span id="cb14-16"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  package        * version  date       source</span></span>
<span id="cb14-17"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  AnnotationDbi    1.38.1   2017-06-01 Bioconductor</span></span>
<span id="cb14-18"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ape              4.1      2017-02-14 cran (@4.1)</span></span>
<span id="cb14-19"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  assertthat       0.2.0    2017-04-11 CRAN (R 3.4.0)</span></span>
<span id="cb14-20"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  backports        1.1.0    2017-05-22 CRAN (R 3.4.0)</span></span>
<span id="cb14-21"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  base           * 3.4.1    2017-07-07 local</span></span>
<span id="cb14-22"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  beeswarm         0.2.3    2016-04-25 CRAN (R 3.4.0)</span></span>
<span id="cb14-23"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  bindr            0.1      2016-11-13 CRAN (R 3.4.0)</span></span>
<span id="cb14-24"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  bindrcpp       * 0.2      2017-06-17 CRAN (R 3.4.0)</span></span>
<span id="cb14-25"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  Biobase        * 2.36.2   2017-05-04 Bioconductor</span></span>
<span id="cb14-26"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  BiocGenerics   * 0.22.0   2017-04-25 Bioconductor</span></span>
<span id="cb14-27"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  BiocParallel     1.10.1   2017-05-03 Bioconductor</span></span>
<span id="cb14-28"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  biomaRt          2.32.1   2017-06-09 Bioconductor</span></span>
<span id="cb14-29"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  bit              1.1-12   2014-04-09 CRAN (R 3.4.0)</span></span>
<span id="cb14-30"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  bit64            0.9-7    2017-05-08 CRAN (R 3.4.0)</span></span>
<span id="cb14-31"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  bitops           1.0-6    2013-08-17 CRAN (R 3.4.0)</span></span>
<span id="cb14-32"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  blob             1.1.0    2017-06-17 CRAN (R 3.4.0)</span></span>
<span id="cb14-33"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  broom            0.4.2    2017-02-13 CRAN (R 3.4.0)</span></span>
<span id="cb14-34"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  car              2.1-5    2017-07-04 cran (@2.1-5)</span></span>
<span id="cb14-35"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  caret            6.0-76   2017-04-18 cran (@6.0-76)</span></span>
<span id="cb14-36"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  caTools          1.17.1   2014-09-10 cran (@1.17.1)</span></span>
<span id="cb14-37"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  cellranger       1.1.0    2016-07-27 CRAN (R 3.4.0)</span></span>
<span id="cb14-38"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  checkmate        1.8.3    2017-07-03 CRAN (R 3.4.1)</span></span>
<span id="cb14-39"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  class            7.3-14   2015-08-30 CRAN (R 3.4.1)</span></span>
<span id="cb14-40"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  cluster          2.0.6    2017-03-10 CRAN (R 3.4.1)</span></span>
<span id="cb14-41"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  codetools        0.2-15   2016-10-05 CRAN (R 3.4.1)</span></span>
<span id="cb14-42"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  colorspace       1.3-2    2016-12-14 CRAN (R 3.4.0)</span></span>
<span id="cb14-43"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  compiler         3.4.1    2017-07-07 local</span></span>
<span id="cb14-44"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  cowplot        * 0.7.0    2016-10-28 cran (@0.7.0)</span></span>
<span id="cb14-45"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  data.table       1.10.4   2017-02-01 CRAN (R 3.4.0)</span></span>
<span id="cb14-46"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  datasets       * 3.4.1    2017-07-07 local</span></span>
<span id="cb14-47"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  DBI              0.7      2017-06-18 CRAN (R 3.4.0)</span></span>
<span id="cb14-48"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  DEoptimR         1.0-8    2016-11-19 cran (@1.0-8)</span></span>
<span id="cb14-49"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  devtools         1.13.2   2017-06-02 CRAN (R 3.4.0)</span></span>
<span id="cb14-50"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  digest           0.6.12   2017-01-27 CRAN (R 3.4.0)</span></span>
<span id="cb14-51"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  diptest          0.75-7   2016-12-05 cran (@0.75-7)</span></span>
<span id="cb14-52"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  dplyr          * 0.7.1    2017-06-22 CRAN (R 3.4.1)</span></span>
<span id="cb14-53"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  edgeR            3.18.1   2017-05-06 Bioconductor</span></span>
<span id="cb14-54"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  evaluate         0.10.1   2017-06-24 CRAN (R 3.4.1)</span></span>
<span id="cb14-55"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  fastICA          1.2-1    2017-06-12 cran (@1.2-1)</span></span>
<span id="cb14-56"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  flexmix          2.3-14   2017-04-28 cran (@2.3-14)</span></span>
<span id="cb14-57"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  FNN              1.1      2013-07-31 cran (@1.1)</span></span>
<span id="cb14-58"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  forcats          0.2.0    2017-01-23 CRAN (R 3.4.0)</span></span>
<span id="cb14-59"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  foreach          1.4.3    2015-10-13 cran (@1.4.3)</span></span>
<span id="cb14-60"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  foreign          0.8-69   2017-06-22 CRAN (R 3.4.1)</span></span>
<span id="cb14-61"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  fpc              2.1-10   2015-08-14 cran (@2.1-10)</span></span>
<span id="cb14-62"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  gdata            2.18.0   2017-06-06 cran (@2.18.0)</span></span>
<span id="cb14-63"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ggbeeswarm       0.5.3    2016-12-01 CRAN (R 3.4.0)</span></span>
<span id="cb14-64"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ggforce          0.1.1    2016-11-28 CRAN (R 3.4.0)</span></span>
<span id="cb14-65"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ggplot2        * 2.2.1    2016-12-30 CRAN (R 3.4.0)</span></span>
<span id="cb14-66"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ggraph         * 1.0.0    2017-02-24 CRAN (R 3.4.0)</span></span>
<span id="cb14-67"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ggrepel          0.6.5    2016-11-24 CRAN (R 3.4.0)</span></span>
<span id="cb14-68"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  glue             1.1.1    2017-06-21 CRAN (R 3.4.1)</span></span>
<span id="cb14-69"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  gplots           3.0.1    2016-03-30 cran (@3.0.1)</span></span>
<span id="cb14-70"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  graphics       * 3.4.1    2017-07-07 local</span></span>
<span id="cb14-71"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  grDevices      * 3.4.1    2017-07-07 local</span></span>
<span id="cb14-72"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  grid             3.4.1    2017-07-07 local</span></span>
<span id="cb14-73"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  gridExtra        2.2.1    2016-02-29 CRAN (R 3.4.0)</span></span>
<span id="cb14-74"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  gtable           0.2.0    2016-02-26 CRAN (R 3.4.0)</span></span>
<span id="cb14-75"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  gtools           3.5.0    2015-05-29 cran (@3.5.0)</span></span>
<span id="cb14-76"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  haven            1.1.0    2017-07-09 CRAN (R 3.4.1)</span></span>
<span id="cb14-77"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  hms              0.3      2016-11-22 CRAN (R 3.4.0)</span></span>
<span id="cb14-78"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  htmltools        0.3.6    2017-04-28 CRAN (R 3.4.0)</span></span>
<span id="cb14-79"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  httpuv           1.3.5    2017-07-04 CRAN (R 3.4.1)</span></span>
<span id="cb14-80"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  httr             1.2.1    2016-07-03 CRAN (R 3.4.0)</span></span>
<span id="cb14-81"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  igraph         * 1.1.1    2017-07-16 CRAN (R 3.4.1)</span></span>
<span id="cb14-82"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  IRanges          2.10.2   2017-05-25 Bioconductor</span></span>
<span id="cb14-83"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  irlba            2.2.1    2017-05-17 cran (@2.2.1)</span></span>
<span id="cb14-84"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  iterators        1.0.8    2015-10-13 cran (@1.0.8)</span></span>
<span id="cb14-85"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  jsonlite         1.5      2017-06-01 CRAN (R 3.4.0)</span></span>
<span id="cb14-86"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  kernlab          0.9-25   2016-10-03 cran (@0.9-25)</span></span>
<span id="cb14-87"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  KernSmooth       2.23-15  2015-06-29 CRAN (R 3.4.1)</span></span>
<span id="cb14-88"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  knitr            1.16     2017-05-18 CRAN (R 3.4.1)</span></span>
<span id="cb14-89"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  labeling         0.3      2014-08-23 CRAN (R 3.4.0)</span></span>
<span id="cb14-90"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  lars             1.2      2013-04-24 cran (@1.2)</span></span>
<span id="cb14-91"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  lattice          0.20-35  2017-03-25 CRAN (R 3.4.1)</span></span>
<span id="cb14-92"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  lazyeval         0.2.0    2016-06-12 CRAN (R 3.4.0)</span></span>
<span id="cb14-93"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  limma            3.32.3   2017-07-16 Bioconductor</span></span>
<span id="cb14-94"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  lme4             1.1-13   2017-04-19 cran (@1.1-13)</span></span>
<span id="cb14-95"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  locfit           1.5-9.1  2013-04-20 CRAN (R 3.4.0)</span></span>
<span id="cb14-96"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  lubridate        1.6.0    2016-09-13 CRAN (R 3.4.0)</span></span>
<span id="cb14-97"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  magrittr         1.5      2014-11-22 CRAN (R 3.4.0)</span></span>
<span id="cb14-98"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  MASS             7.3-47   2017-02-26 CRAN (R 3.4.1)</span></span>
<span id="cb14-99"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  Matrix           1.2-10   2017-05-03 CRAN (R 3.4.1)</span></span>
<span id="cb14-100"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  MatrixModels     0.4-1    2015-08-22 cran (@0.4-1)</span></span>
<span id="cb14-101"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  matrixStats      0.52.2   2017-04-14 CRAN (R 3.4.0)</span></span>
<span id="cb14-102"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  mclust           5.3      2017-05-21 cran (@5.3)</span></span>
<span id="cb14-103"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  memoise          1.1.0    2017-04-21 CRAN (R 3.4.0)</span></span>
<span id="cb14-104"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  methods        * 3.4.1    2017-07-07 local</span></span>
<span id="cb14-105"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  mgcv             1.8-17   2017-02-08 CRAN (R 3.4.1)</span></span>
<span id="cb14-106"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  mime             0.5      2016-07-07 CRAN (R 3.4.0)</span></span>
<span id="cb14-107"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  minqa            1.2.4    2014-10-09 cran (@1.2.4)</span></span>
<span id="cb14-108"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  mixtools         1.1.0    2017-03-10 cran (@1.1.0)</span></span>
<span id="cb14-109"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  mnormt           1.5-5    2016-10-15 CRAN (R 3.4.0)</span></span>
<span id="cb14-110"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ModelMetrics     1.1.0    2016-08-26 cran (@1.1.0)</span></span>
<span id="cb14-111"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  modelr           0.1.0    2016-08-31 CRAN (R 3.4.0)</span></span>
<span id="cb14-112"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  modeltools       0.2-21   2013-09-02 cran (@0.2-21)</span></span>
<span id="cb14-113"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  munsell          0.4.3    2016-02-13 CRAN (R 3.4.0)</span></span>
<span id="cb14-114"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  mvtnorm          1.0-6    2017-03-02 cran (@1.0-6)</span></span>
<span id="cb14-115"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  nlme             3.1-131  2017-02-06 CRAN (R 3.4.1)</span></span>
<span id="cb14-116"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  nloptr           1.0.4    2014-08-04 cran (@1.0.4)</span></span>
<span id="cb14-117"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  nnet             7.3-12   2016-02-02 CRAN (R 3.4.1)</span></span>
<span id="cb14-118"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  numDeriv         2016.8-1 2016-08-27 cran (@2016.8-)</span></span>
<span id="cb14-119"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  parallel       * 3.4.1    2017-07-07 local</span></span>
<span id="cb14-120"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  pbapply          1.3-3    2017-07-04 cran (@1.3-3)</span></span>
<span id="cb14-121"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  pbkrtest         0.4-7    2017-03-15 cran (@0.4-7)</span></span>
<span id="cb14-122"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  pkgconfig        2.0.1    2017-03-21 CRAN (R 3.4.0)</span></span>
<span id="cb14-123"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  plyr             1.8.4    2016-06-08 CRAN (R 3.4.0)</span></span>
<span id="cb14-124"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  prabclus         2.2-6    2015-01-14 cran (@2.2-6)</span></span>
<span id="cb14-125"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  psych            1.7.5    2017-05-03 CRAN (R 3.4.1)</span></span>
<span id="cb14-126"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  purrr          * 0.2.2.2  2017-05-11 CRAN (R 3.4.0)</span></span>
<span id="cb14-127"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  quantreg         5.33     2017-04-18 cran (@5.33)</span></span>
<span id="cb14-128"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  R6               2.2.2    2017-06-17 CRAN (R 3.4.0)</span></span>
<span id="cb14-129"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ranger           0.8.0    2017-06-20 cran (@0.8.0)</span></span>
<span id="cb14-130"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  RColorBrewer     1.1-2    2014-12-07 CRAN (R 3.4.0)</span></span>
<span id="cb14-131"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  Rcpp             0.12.12  2017-07-15 CRAN (R 3.4.1)</span></span>
<span id="cb14-132"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  RCurl            1.95-4.8 2016-03-01 CRAN (R 3.4.0)</span></span>
<span id="cb14-133"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  readr          * 1.1.1    2017-05-16 CRAN (R 3.4.0)</span></span>
<span id="cb14-134"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  readxl           1.0.0    2017-04-18 CRAN (R 3.4.0)</span></span>
<span id="cb14-135"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  reshape2         1.4.2    2016-10-22 CRAN (R 3.4.0)</span></span>
<span id="cb14-136"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  rhdf5            2.20.0   2017-04-25 Bioconductor</span></span>
<span id="cb14-137"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  rjson            0.2.15   2014-11-03 CRAN (R 3.4.0)</span></span>
<span id="cb14-138"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  rlang            0.1.1    2017-05-18 CRAN (R 3.4.0)</span></span>
<span id="cb14-139"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  rmarkdown        1.6      2017-06-15 CRAN (R 3.4.1)</span></span>
<span id="cb14-140"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  robustbase       0.92-7   2016-12-09 cran (@0.92-7)</span></span>
<span id="cb14-141"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  ROCR             1.0-7    2015-03-26 cran (@1.0-7)</span></span>
<span id="cb14-142"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  rprojroot        1.2      2017-01-16 CRAN (R 3.4.0)</span></span>
<span id="cb14-143"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  RSQLite          2.0      2017-06-19 CRAN (R 3.4.1)</span></span>
<span id="cb14-144"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  Rtsne            0.13     2017-04-14 cran (@0.13)</span></span>
<span id="cb14-145"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  rvest            0.3.2    2016-06-17 CRAN (R 3.4.0)</span></span>
<span id="cb14-146"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  S4Vectors        0.14.3   2017-06-03 Bioconductor</span></span>
<span id="cb14-147"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  scales           0.4.1    2016-11-09 CRAN (R 3.4.0)</span></span>
<span id="cb14-148"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  scater         * 1.4.0    2017-04-25 Bioconductor</span></span>
<span id="cb14-149"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  segmented        0.5-2.1  2017-06-14 cran (@0.5-2.1)</span></span>
<span id="cb14-150"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  Seurat         * 1.4.0.16 2017-07-19 Github (satijalab/seurat@3bd092a)</span></span>
<span id="cb14-151"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  shiny            1.0.3    2017-04-26 CRAN (R 3.4.0)</span></span>
<span id="cb14-152"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  shinydashboard   0.6.1    2017-06-14 CRAN (R 3.4.0)</span></span>
<span id="cb14-153"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  sn               1.5-0    2017-02-10 cran (@1.5-0)</span></span>
<span id="cb14-154"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  SparseM          1.77     2017-04-23 cran (@1.77)</span></span>
<span id="cb14-155"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  splatter       * 1.0.3    2017-05-27 Bioconductor</span></span>
<span id="cb14-156"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  splines          3.4.1    2017-07-07 local</span></span>
<span id="cb14-157"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  stats          * 3.4.1    2017-07-07 local</span></span>
<span id="cb14-158"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  stats4           3.4.1    2017-07-07 local</span></span>
<span id="cb14-159"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  stringi          1.1.5    2017-04-07 CRAN (R 3.4.0)</span></span>
<span id="cb14-160"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  stringr          1.2.0    2017-02-18 CRAN (R 3.4.0)</span></span>
<span id="cb14-161"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  survival         2.41-3   2017-04-04 CRAN (R 3.4.1)</span></span>
<span id="cb14-162"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tclust           1.2-7    2017-06-30 cran (@1.2-7)</span></span>
<span id="cb14-163"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tibble         * 1.3.3    2017-05-28 CRAN (R 3.4.0)</span></span>
<span id="cb14-164"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tidyr          * 0.6.3    2017-05-15 CRAN (R 3.4.0)</span></span>
<span id="cb14-165"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tidyverse      * 1.1.1    2017-01-27 CRAN (R 3.4.0)</span></span>
<span id="cb14-166"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tools            3.4.1    2017-07-07 local</span></span>
<span id="cb14-167"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  trimcluster      0.1-2    2012-10-29 cran (@0.1-2)</span></span>
<span id="cb14-168"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tsne             0.1-3    2016-07-15 cran (@0.1-3)</span></span>
<span id="cb14-169"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tweenr           0.1.5    2016-10-10 CRAN (R 3.4.0)</span></span>
<span id="cb14-170"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  tximport         1.4.0    2017-04-25 Bioconductor</span></span>
<span id="cb14-171"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  udunits2         0.13     2016-11-17 CRAN (R 3.4.0)</span></span>
<span id="cb14-172"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  units            0.4-5    2017-06-15 CRAN (R 3.4.0)</span></span>
<span id="cb14-173"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  utils          * 3.4.1    2017-07-07 local</span></span>
<span id="cb14-174"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  VGAM             1.0-3    2017-01-11 cran (@1.0-3)</span></span>
<span id="cb14-175"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  vipor            0.4.5    2017-03-22 CRAN (R 3.4.0)</span></span>
<span id="cb14-176"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  viridis        * 0.4.0    2017-03-27 CRAN (R 3.4.0)</span></span>
<span id="cb14-177"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  viridisLite    * 0.2.0    2017-03-24 CRAN (R 3.4.0)</span></span>
<span id="cb14-178"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  withr            1.0.2    2016-06-20 CRAN (R 3.4.0)</span></span>
<span id="cb14-179"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  XML              3.98-1.9 2017-06-19 CRAN (R 3.4.1)</span></span>
<span id="cb14-180"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  xml2             1.1.1    2017-01-24 CRAN (R 3.4.0)</span></span>
<span id="cb14-181"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  xtable           1.8-2    2016-02-05 CRAN (R 3.4.0)</span></span>
<span id="cb14-182"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  yaml             2.1.14   2016-11-12 CRAN (R 3.4.0)</span></span>
<span id="cb14-183"><span class="do" style="color: #5E5E5E;
background-color: null;
font-style: italic;">##  zlibbioc         1.22.0   2017-04-25 Bioconductor</span></span></code></pre></div></div>


</section>
</section>

 ]]></description>
  <category>R</category>
  <category>scrna-seq</category>
  <category>clustering tree</category>
  <guid>https://lazappi.id.au/posts/2017-07-19-building-a-clustering-tree/</guid>
  <pubDate>Tue, 18 Jul 2017 22:00:00 GMT</pubDate>
  <media:content url="https://lazappi.id.au/posts/2017-07-19-building-a-clustering-tree/featured.png" medium="image" type="image/png" height="103" width="144"/>
</item>
<item>
  <title>PyConAU 2016</title>
  <link>https://lazappi.id.au/posts/2016-08-18-pyconau-2016/</link>
  <description><![CDATA[ 





<p>Over the weekend I attended <a href="https://2016.pycon-au.org/">PyCon Australia</a>. This was my first time at a purely tech conference and I couldn’t help but compare it to my previous experiences at scientific conferences.</p>
<p><strong>DISCLAIMER:</strong> Like I said this was my first tech conference and my scientific conference experience is also fairly limited so some of the comments I make might be generalisations that don’t always apply.</p>
<p>PyCon started with miniconfs on Friday and continued with coding sprints on Monday and Tuesday. I didn’t attend any of these so my experience was only of the main conference on Saturday and Sunday. Here are some of the highlights for me in terms of presentations:</p>
<ul>
<li>Andrew Lonsdale - <a href="https://www.youtube.com/watch?v=PCZS9wqBUuE">Python for science, side projects and stuff!</a></li>
<li>Alexander Hogue - <a href="https://www.youtube.com/watch?v=MkSkqMvGBuo">Graphing when your Facebook friends are awake</a> - Story of discovering a hidden Facebook API and using it to track when your friends are online. Thoroughly entertaining while still providing the technical details.</li>
<li>Rachel Bunder - <a href="https://www.youtube.com/watch?v=cy5n6XAtA-w">I wish I learnt that earlier!</a> - Description of some of the slightly more advanced features available in Python. Could be a great intro for someone new to Python.</li>
<li>Russell Keith-Magee - <a href="https://www.youtube.com/watch?v=1sDyVJm3Ht0">Python All the Things</a> <img src="https://lazappi.id.au/posts/2016-08-18-pyconau-2016/python_all_the_things.jpg" class="img-fluid" alt="Python All the Things"></li>
<li>Sebastian Vetter - <a href="https://www.youtube.com/watch?v=bsJFMtQ5MZU">Click: A Pleasure To Write, A Pleasure To Use</a> - Click is an argument parsing library with additional features beyond argparse. Also apparently becoming the standard at Facebook (didn’t learn that at the conference, but it’s a fun fact).</li>
<li>Justin Warren - <a href="https://www.youtube.com/watch?v=qjTc5q7MsMg">Predicting the TripleJ Hottest 100 With Python</a> - Overview of predicting the Hottest 100 for the last few years, starting with the method used by the <a href="http://warmest100.com.au/2013/index.html">Warmest 100</a> and continuing on how to extract and process information from Instagram.</li>
<li>Jackson Fairchild - <a href="https://www.youtube.com/watch?v=Rdc06jpjVIY">Hitting the Wall and How to Get Up Again - Tackling Burnout and Strategies for Self Care</a> <img src="https://lazappi.id.au/posts/2016-08-18-pyconau-2016/tackling_burnout.jpg" class="img-fluid" alt="Hitting the Wall and How to Get Up Again - Tackling Burnout and Strategies for Self Care"></li>
</ul>
<p>(Full schedules for <a href="https://2016.pycon-au.org/programme/schedule/saturday?_code=301">Saturday</a> and <a href="https://2016.pycon-au.org/programme/schedule/sunday?_code=301">Sunday</a> and links to videos are available on the PyCon website)</p>
<p>Overall I was really impressed by the quality of the talks. There were a couple that I thought could be improved a bit or where I wasn’t that interested in the content but there were no flat-out bad talks like you often see in the scientific context. It was clear that the presenters had put a lot of effort into planning what they were going to say and how to make that interesting and engaging for an audience that might be new to the topic. I don’t think I saw any slides that were walls of text or full of multiple plots. On the other hand there was lots of code in slides, including live snippets. I’m not usually a fan of this but in the context it makes sense, particularly as you can assume that everyone has a basic grasp of Python. There were also lots of live demos, some of which were pretty impressive, and I don’t think I saw any fail.</p>
<p>What struck me as the biggest differences at PyCon compared to a scientific conference were the sense of community and the awareness of wider social issues. There was a big effort to be inclusive to all genders, sexualities, ethnic groups etc. and several of the talks touched on ethical issues or the speaker’s own experience in the community. While I would hope that a scientific gathering wouldn’t be discriminatory I can’t see diversity being embraced in the same way, but hopefully that will continue to improve. There was a sense of everyone being in it together and it was common for speakers to praise work that they hadn’t been involved in, but thought was interesting or useful. I didn’t see anyone described using their titles and it seemed that someone who had learned Python in the last year was as valued as someone who had been a major contributor for the last 10 years (although there may have been power dynamics that I wasn’t aware of).</p>
<p>I think that a lot of the differences come from the work/volunteer divide. While PyCon was an opportunity to network or advertise your work the focus seemed to be on contributing to the community and the speakers were enthusiastic and keen to present. In contrast a scientific conference is a professional opportunity. As a scientist you are judged on your ability to get a talk which means more competition and sometimes speakers who aren’t interested in presenting. Every talk is a demonstration of your worth which makes it hard to present unfinished work and encourages people to try and fit too much in. It would be great for scientific conferences to spend more time discussing issues around the communities they represent but to do so they might have to sacrifice opportunities. For example it would be great to see a talk about mental health issues like Jackson Fairchild’s but that would mean taking away a spot from someone that might need it to progress their career. Personally I think we could maybe do with fewer talks from whichever well known person is doing the rounds in favour of some outside experts.</p>
<p>Overall I enjoyed my time at PyCon. It was a bit different to a scientific conference and I think there are probably things they can learn from each other. Congratulations to all the speakers and everyone involved in organising. Given that it is in Melbourne again I hope to be back next year.</p>



 ]]></description>
  <category>python</category>
  <category>conference</category>
  <category>thoughts</category>
  <guid>https://lazappi.id.au/posts/2016-08-18-pyconau-2016/</guid>
  <pubDate>Wed, 17 Aug 2016 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Gantt charts in R</title>
  <link>https://lazappi.id.au/posts/2016-06-13-gantt-charts-in-r/</link>
  <description><![CDATA[ 





<p>Gantt charts are a project management tool designed to visualise the tasks in a project, how long they will take and the order in which they must be completed. If you haven’t seen one before, they essentially look like a modified horizontal bar chart. Along the horizontal axis is time with tasks along the vertical. Each task consists of a bar where the ends are the start and end times. Often there are also arrows indicating dependencies and a line showing the current date.</p>
<p>As part of the proposal for my PhD project I wanted to include a Gantt chart, both as a way of showing what I planned to do and as a way of keeping track of my progress. I expected there to be a simple template for Excel or Google Sheets but there wasn’t much and they didn’t quite fit what I wanted. Looking elsewhere didn’t turn up much either. What I wanted was a tool where I could enter tasks and dates in text format and produce a relatively attractive chart that I could easily update. In the end I turned to faithful old R, which had the added advantage that I could easily incorporate the chart into <a href="http://rmarkdown.rstudio.com/">R Markdown</a> documents.</p>
<p>There are a couple of packages that can make Gantt charts in R including <a href="https://cran.r-project.org/web/packages/plotrix/index.html">plotrix</a> and <a href="https://cran.r-project.org/web/packages/plan/index.html">plan</a> but in the end I went with <a href="https://rich-iannone.github.io/DiagrammeR/">DiagrammeR</a>. The Gantt functionality of DiagrammeR depends on <a href="https://knsv.github.io/mermaid/">Mermaid</a> which has a simple, almost markdown-like syntax.</p>
<pre><code>gantt
dateFormat  YYYY-MM-DD
title My Gantt chart

section First section
Task 1            :done,    des1, 2014-01-06, 2014-01-08
Task 2            :active,  des2, 2014-01-09, 3d
Task 3            :         des3, after des2, 5d
Task 4            :         des4, after des3, 5d</code></pre>
<p>Basically each task is written as:</p>
<pre><code>Task name         :status, label, start_date, end_date</code></pre>
<p>Where the start and end dates can also include durations or references to other tasks.</p>
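<p>As a minimal sketch of the task format above, the snippet below builds a few task lines in base R and joins them into a Mermaid string. The helper <code>task_line()</code> and its argument names are hypothetical and only for illustration (they are not the functions described in this post). The chart is rendered with <code>DiagrammeR::mermaid()</code> only if DiagrammeR happens to be installed.</p>
<pre><code># Hypothetical helper (an assumption, not from the original post): format one
# task as a Mermaid Gantt line of the form "name :status, label, start, end"
task_line &lt;- function(name, status, label, start, end) {
    fields &lt;- c(status, label, start, end)
    # Drop empty fields (e.g. no status) so we do not emit a stray comma
    fields &lt;- fields[nzchar(fields)]
    paste0(name, " :", paste(fields, collapse = ", "))
}

# Assemble the header and task lines, one element per line of the diagram
gantt_lines &lt;- c(
    "gantt",
    "dateFormat  YYYY-MM-DD",
    "title My Gantt chart",
    "section First section",
    task_line("Task 1", "done",   "des1", "2014-01-06", "2014-01-08"),
    task_line("Task 2", "active", "des2", "2014-01-09", "3d"),
    task_line("Task 3", "",       "des3", "after des2", "5d")
)
diagram &lt;- paste(gantt_lines, collapse = "\n")

# Print the Mermaid source so it can be inspected
cat(diagram, "\n")

# Render the chart as an htmlwidget if DiagrammeR is available
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
    DiagrammeR::mermaid(diagram)
}</code></pre>
<p>Keeping each task as a set of separate fields until the last moment is what makes a delimited file a convenient input: every row maps to one call like the ones above.</p>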
<p>While this format is easy to use I prefer to use a standard delimited format which is easier to edit and read into R. To this end I created some functions which will take a CSV or XLSX file and produce a Gantt chart.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"magrittr"</span>)</span>
<span id="cb3-2"></span>
<span id="cb3-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Take a data.frame containing tasks and build a Mermaid string</span></span>
<span id="cb3-4">tasks2string <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(tasks) {</span>
<span id="cb3-5"></span>
<span id="cb3-6">    tasks.list <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">split</span>(tasks,</span>
<span id="cb3-7">                        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(tasks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Section, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">levels =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unique</span>(tasks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>Section)))</span>
<span id="cb3-8"></span>
<span id="cb3-9">    strings <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sapply</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(tasks.list),</span>
<span id="cb3-10">                      <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(section) {</span>
<span id="cb3-11">                          tasks.list[[section]] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-12">                              dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>Section) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-13">                              tidyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unite</span>(Part1, Task, Priority,</span>
<span id="cb3-14">                                           <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sep =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">": "</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-15">                              tidyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unite</span>(String, Part1, Status, Name, Start,</span>
<span id="cb3-16">                                           End, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sep =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">", "</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-17">                              magrittr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">use_series</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"String"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-18">                              <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">collapse =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb3-19">                              <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gsub</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">" ,"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>, .) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Remove empty columns</span></span>
<span id="cb3-20">                          }</span>
<span id="cb3-21">                      )</span>
<span id="cb3-22"></span>
<span id="cb3-23">    string <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span></span>
<span id="cb3-24"></span>
<span id="cb3-25">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span>(section <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(strings)) {</span>
<span id="cb3-26">        string <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste0</span>(string, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>,</span>
<span id="cb3-27">                         <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"section "</span>, section, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>,</span>
<span id="cb3-28">                         strings[section],</span>
<span id="cb3-29">                         <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb3-30">    }</span>
<span id="cb3-31"></span>
<span id="cb3-32">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(string)</span>
<span id="cb3-33">}</span>
<span id="cb3-34"></span>
<span id="cb3-35"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Produce a Gantt chart from data.frame of tasks</span></span>
<span id="cb3-36"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Adds the Mermaid header to the tasks string</span></span>
<span id="cb3-37">buildGantt <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(tasks) {</span>
<span id="cb3-38"></span>
<span id="cb3-39">    gantt.string <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste0</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"gantt"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>,</span>
<span id="cb3-40">                           <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dateFormat YYYY-MM-DD"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>,</span>
<span id="cb3-41">                           <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"title My Gantt Chart"</span>,</span>
<span id="cb3-42">                           <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb3-43"></span>
<span id="cb3-44">    gantt.string <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste0</span>(gantt.string, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tasks2string</span>(tasks))</span>
<span id="cb3-45"></span>
<span id="cb3-46">    gantt <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> DiagrammeR<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mermaid</span>(gantt.string)</span>
<span id="cb3-47"></span>
<span id="cb3-48">    gantt<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>x<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>config <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ganttConfig =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(</span>
<span id="cb3-49">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Make sure the axis labels are formatted correctly</span></span>
<span id="cb3-50">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axisFormatter =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(</span>
<span id="cb3-51">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"%m-%y"</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># New date format</span></span>
<span id="cb3-52">            htmlwidgets<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">JS</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'function(d){ return d}'</span>) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Select dates to format</span></span>
<span id="cb3-53">        ))</span>
<span id="cb3-54">    ))</span>
<span id="cb3-55"></span>
<span id="cb3-56">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(gantt)</span>
<span id="cb3-57">}</span>
<span id="cb3-58"></span>
<span id="cb3-59"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Read a file and return a Gantt chart</span></span>
<span id="cb3-60">buildGanttFromFile <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(tasks.file, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">format =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"csv"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"xlsx"</span>)) {</span>
<span id="cb3-61"></span>
<span id="cb3-62">    format <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">match.arg</span>(format)</span>
<span id="cb3-63"></span>
<span id="cb3-64">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">switch</span>(format,</span>
<span id="cb3-65">           <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">csv =</span> {</span>
<span id="cb3-66">               tasks <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read.csv</span>(tasks.file, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stringsAsFactors =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb3-67">           },</span>
<span id="cb3-68">           <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xlsx =</span> {</span>
<span id="cb3-69">               tasks <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> gdata<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read.xls</span>(tasks.file)</span>
<span id="cb3-70">           })</span>
<span id="cb3-71"></span>
<span id="cb3-72">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">buildGantt</span>(tasks))</span>
<span id="cb3-73">}</span></code></pre></div></div>
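<p>For reference, a task table in the shape these functions expect could look something like the sketch below. The column names are the ones used in <code>tasks2string()</code>, but the tasks, labels and dates are invented for illustration and the file is written to a temporary directory.</p>
<pre><code># Made-up tasks using the columns that tasks2string() refers to
tasks &lt;- data.frame(
    Section  = c("Planning", "Planning", "Writing"),
    Task     = c("Read papers", "Write proposal", "Draft chapter"),
    Priority = c("", "crit", ""),
    Status   = c("done", "active", ""),
    Name     = c("read", "prop", "draft"),
    Start    = c("2016-06-01", "after read", "after prop"),
    End      = c("2016-06-14", "10d", "20d"),
    stringsAsFactors = FALSE
)

# Save as CSV so it can be edited by hand later
tasks.file &lt;- file.path(tempdir(), "tasks.csv")
write.csv(tasks, tasks.file, row.names = FALSE)

# With the functions above loaded, this should draw the chart
# buildGanttFromFile(tasks.file, format = "csv")</code></pre>
<p>Empty <code>Priority</code> or <code>Status</code> cells are left as empty strings so that the <code>gsub()</code> step can drop them from the Mermaid string.</p>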
<p>I can now construct my tasks by editing a CSV file and produce a Gantt chart directly from that by calling a single function. You may wonder why I have included XLSX as an input option. Surely using Excel is against the principles of data science? Firstly, I’m not that opposed to Excel (when it is used correctly), but the reason in this case is to get around one of the limitations of DiagrammeR. The Mermaid syntax allows you to define a task as starting after another task but you can’t say that a task ends before another. There are often situations where you have a hard end deadline (such as a PhD committee meeting) and you need to work backwards from that. By using Excel I can use simple formulas to calculate the dates which are then passed to R. I could do this programmatically in R (and I might at some stage) but Excel was a quicker solution that let me get on to writing.</p>
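<p>As a rough illustration of what the programmatic route might look like, here is a minimal sketch in base R that works backwards from a fixed deadline. The function name <code>backfillDates</code> and the example durations are hypothetical, and it assumes the tasks run back to back with durations given in calendar days.</p>
<pre><code># Hypothetical helper: given task durations (in days, in task order) and a
# hard deadline, work backwards so that the last task ends on the deadline
backfillDates &lt;- function(durations, deadline) {
    deadline &lt;- as.Date(deadline)
    # Number of days between the end of each task and the deadline
    offsets &lt;- rev(cumsum(rev(durations))) - durations
    end &lt;- deadline - offsets
    start &lt;- end - durations
    data.frame(Start = format(start), End = format(end),
               stringsAsFactors = FALSE)
}

# Three made-up tasks leading up to a committee meeting
backfillDates(c(5, 10, 3), "2016-09-01")</code></pre>
<p>The resulting <code>Start</code> and <code>End</code> columns could then be added to the tasks table before it is passed to <code>buildGantt()</code>.</p>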



 ]]></description>
  <category>R</category>
  <category>project management</category>
  <category>gantt chart</category>
  <category>excel</category>
  <guid>https://lazappi.id.au/posts/2016-06-13-gantt-charts-in-r/</guid>
  <pubDate>Sun, 12 Jun 2016 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Bioconductor 3.3 packages</title>
  <link>https://lazappi.id.au/posts/2016-05-03-bioconductor-3-3-packages/</link>
  <description><![CDATA[ 





<p>Bioconductor 3.3 has just been released. You can find the complete list of new packages (and changes to existing packages) <a href="https://bioconductor.org/news/bioc_3_3_release/">here</a> but here are a few I thought might be interesting based on the description. I might have more to say once I’ve had time to try a few out.</p>
<ul>
<li><strong>debrowser</strong> – Interactive plots and tables for differential expression</li>
<li><strong>DEFormats</strong> – convert between differential expression formats</li>
<li><strong>EBSEA</strong> – exon based differential expression</li>
<li><strong>EmpiricalBrownsMethod</strong> – combining dependent p-values</li>
<li><strong>Linnorm</strong> – normalisation for parametric tests, simulation of RNA-seq data</li>
<li><strong>multiClust</strong> – feature selection and clustering analysis for transcriptomic data</li>
<li><strong>RGraph2js</strong> – interactive network visualisations with D3</li>
<li><strong>tximport</strong> – import and summarise transcript-level estimates</li>
</ul>
<section id="single-cell" class="level2">
<h2 class="anchored" data-anchor-id="single-cell">Single-cell</h2>
<p>These packages are specific to single-cell RNA-seq analysis. A couple of them I am already familiar with, particularly <strong>scater</strong>.</p>
<ul>
<li><strong>cellity</strong> - identifying low-quality cells</li>
<li><strong>cellTree</strong> - model the relationship between individual cells over time or space</li>
<li><strong>scater</strong> - tools for analysis of single-cell RNA-seq data (particularly QC)</li>
<li><strong>scde</strong> - single-cell differential expression</li>
<li><strong>scran</strong> - normalisation, cell-cycle assignment, gene detection</li>
</ul>


</section>

 ]]></description>
  <category>R</category>
  <category>bioconductor</category>
  <guid>https://lazappi.id.au/posts/2016-05-03-bioconductor-3-3-packages/</guid>
  <pubDate>Mon, 02 May 2016 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Extracting alignment statistics using Python</title>
  <link>https://lazappi.id.au/posts/2016-03-30-extracting-alignment-statistics-using-python/</link>
  <description><![CDATA[ 





<p>Recently <a href="http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0888-1">this paper</a> by Ilicic et al.&nbsp;suggested a method for assessing the quality of individual cells in a single-cell RNA-seq experiment. The basic idea is to extract various biological and technical features from the reads for each cell, then use PCA with outlier detection or an SVM to classify cells as “high” or “low” quality. There are two pieces of software associated with the paper: <code>cellity</code>, an R package that performs the classification, and <code>Celloline</code>, a Python script that performs alignment, summarisation and extraction of alignment statistics such as the number of reads aligned to exons, introns, intergenic regions etc. I was interested in using <code>cellity</code> but I didn’t want to change my whole workflow to use the <code>Celloline</code> pipeline, so instead I decided to take the part responsible for extracting alignment statistics (available <a href="https://github.com/Teichlab/celloline/blob/master/lib/stats.py">here</a>) and convert it to a stand-alone Python script.</p>
<p>The core processing remains the same (except I have removed read counting which I do with <code>featureCounts</code>), but I have added a few features:</p>
<ol type="1">
<li>Multiple files - paths to multiple alignment files can now be provided as arguments on the command line.</li>
<li>BAM files - the script can now handle BAM files as well as SAM using <a href="https://github.com/pysam-developers/pysam">pysam</a>. It will work if the BAM is unsorted, but the output can be slightly different.</li>
<li>Index - reading the GTF annotation file can take a significant amount of time, particularly for a single-cell experiment where there are a large number of files with relatively few reads. To limit this overhead the object holding the annotation can be pickled to disk for future use.</li>
<li>Parallel - multiple files can now be processed in parallel using <a href="https://pythonhosted.org/joblib/">joblib</a>. This is fairly crude but it is a significant improvement, particularly when combined with a pickled index.</li>
<li>Argument handling - now performed by <a href="https://docs.python.org/3/library/argparse.html">argparse</a>, complete with handy help message.</li>
<li>Logging - progress and error messages.</li>
</ol>
<p>Putting it all together I can now extract alignment statistics from multiple BAM files in parallel with a single command:</p>
<pre><code>alignStats -o stats.csv -g annotation.gtf -i annotation.index -t bam -p 10 *.bam</code></pre>
<p>The script is available on <a href="https://github.com/lazappi/binf-scripts/blob/master/alignStats.py">GitHub</a>.</p>



 ]]></description>
  <category>python</category>
  <category>alignment</category>
  <category>statistics</category>
  <guid>https://lazappi.id.au/posts/2016-03-30-extracting-alignment-statistics-using-python/</guid>
  <pubDate>Tue, 29 Mar 2016 22:00:00 GMT</pubDate>
</item>
<item>
  <title>My Markdown thesis</title>
  <link>https://lazappi.id.au/posts/2015-09-13-my-markdown-thesis/</link>
  <description><![CDATA[ 





<p>It’s come to the stage in my Master’s where I have to start thinking about writing my thesis. Apart from all the analysis I have to do before I can do that there is also the question of what I am going to use to construct the document itself.</p>
<p>For the last year or so I have been writing using Markdown which is converted to TeX using <a href="https://pandoc.org">Pandoc</a> then used to produce a PDF. I have found this a really good way to work, combining the speed and clarity of Markdown with the ability to include LaTeX directly when I need extra flexibility. I have been using the <a href="https://sbrosinski.github.io/uberdoc/">Uberdoc</a> tool to set up projects and combine multiple Markdown files but unfortunately it’s not quite flexible enough for a complex document like a thesis.</p>
<p>I wanted to be able to incorporate my TeX, particularly so I could use John Papandriopoulos’ <a href="http://jpap.org/projects.html">thesis template</a>. Ideally I wanted to build my own tool (probably in Python or Perl) that would manage projects, including git commits, as well as produce statistics, but time doesn’t permit so I have ended up with a Make-based solution.</p>
<p>The setup allows me to be flexible with how I set up my directory as the whole project is searched for Markdown files which are converted to LaTeX in a build directory. The directory structure is flattened at this stage which means I don’t have to write the full path when including files. Figures are treated similarly and there are folders for additional LaTeX files (such as styles and templates) and bibliography files. I also have a core TeX file which is used to tie everything together. The PDF is constructed using <a href="https://www.ctan.org/pkg/latexmk/?lang=en">latexmk</a> and I can use <a href="http://app.uio.no/ifi/texcount/">texcount</a> for keeping track of my word count. So when I run <code>make</code> for the first time the following steps occur:</p>
<ol type="1">
<li>The build directory is created with the necessary subdirectories.</li>
<li>The project directory is searched for Markdown files which are converted to TeX files in the build directory.</li>
<li>TeX files are copied from the template directory to the build directory.</li>
<li>All files are copied from the style directory to a style subdirectory inside the build directory.</li>
<li>All files are copied from the bibliography directory to a bibliography subdirectory inside the build directory.</li>
<li>The figures directory is searched for image files which are copied to a figures subdirectory inside the build directory.</li>
<li><code>latexmk</code> is used to build the output file in the build directory.</li>
<li>The output PDF is copied to the main directory.</li>
</ol>
<p>It’s not perfect. For example, there is a bug that means <code>make</code> needs to be run more than once when you add a new file, which isn’t ideal, but it mostly does what I want and hopefully it will get me through. If you want to check it out the code is available on <a href="https://github.com/lazappi/thesis-template">GitHub</a>.</p>



 ]]></description>
  <category>writing</category>
  <category>markdown</category>
  <category>thesis</category>
  <category>latex</category>
  <guid>https://lazappi.id.au/posts/2015-09-13-my-markdown-thesis/</guid>
  <pubDate>Sat, 12 Sep 2015 22:00:00 GMT</pubDate>
</item>
</channel>
</rss>
