Sfaira accelerates data and model reuse in single cell genomics
single-cell
rna-seq
database
website
Abstract
Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public data sets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public data sets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate contribution of data sets using ontologies for metadata. We propose an adaption of cross-entropy loss for cell type classification tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells.
Citation
BibTeX citation:
@article{s_fischer2021,
author = {Fischer, David S. and Dony, Leander and König, Martin and
Moeed, Abdul and Zappia, Luke and Heumos, Lukas and Tritschler,
Sophie and Holmberg, Olle and Aliee, Hananeh and Theis, Fabian J.},
title = {Sfaira Accelerates Data and Model Reuse in Single Cell
Genomics},
journal = {Genome Biology},
volume = {22},
number = {1},
pages = {248},
date = {2021-08-25},
url = {https://lazappi.id.au/publications/2021-fischer-sfaira/},
doi = {10.1186/s13059-021-02452-6},
issn = {1465-6906},
langid = {en},
abstract = {Single-cell RNA-seq datasets are often first analyzed
independently without harnessing model fits from previous studies,
and are then contextualized with public data sets, requiring
time-consuming data wrangling. We address these issues with sfaira,
a single-cell data zoo for public data sets paired with a model zoo
for executable pre-trained models. The data zoo is designed to
facilitate contribution of data sets using ontologies for metadata.
We propose an adaption of cross-entropy loss for cell type
classification tailored to datasets annotated at different levels of
coarseness. We demonstrate the utility of sfaira by training models
across anatomic data partitions on 8 million cells.}
}
For attribution, please cite this work as:
Fischer, David S., Leander Dony, Martin König, Abdul Moeed, Luke Zappia,
Lukas Heumos, Sophie Tritschler, Olle Holmberg, Hananeh Aliee, and
Fabian J. Theis. 2021. “Sfaira Accelerates Data and Model Reuse in
Single Cell Genomics.” Genome Biology 22 (1): 248. https://doi.org/10.1186/s13059-021-02452-6.