Sfaira accelerates data and model reuse in single cell genomics
single-cell
rna-seq
database
website
Abstract
Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public data sets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public data sets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate contribution of data sets using ontologies for metadata. We propose an adaption of cross-entropy loss for cell type classification tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells.
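The idea behind a cross-entropy loss that tolerates labels of different coarseness can be illustrated with a minimal sketch. It assumes that a coarse label is scored by summing the predicted probabilities of all leaf cell types below it in the ontology; this is an illustrative reading of the abstract, not sfaira's actual implementation. The toy ontology, the function names and the logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_cross_entropy(logits, allowed_leaves):
    """Negative log of the probability mass assigned to the allowed leaves.

    With a single allowed leaf this reduces to standard cross-entropy.
    With several allowed leaves (a coarse label), any of them counts as correct.
    """
    probs = softmax(logits)
    mass = sum(probs[i] for i in allowed_leaves)
    return -math.log(mass)


# Hypothetical toy ontology: the classifier predicts over leaf types only.
LEAVES = ["CD4 T cell", "CD8 T cell", "B cell"]

# Assumed mapping from each label (fine or coarse) to the leaf indices below it.
DESCENDANT_LEAVES = {
    "T cell": [0, 1],      # coarse label covering both T cell subtypes
    "CD4 T cell": [0],
    "CD8 T cell": [1],
    "B cell": [2],
}

logits = [2.0, 1.0, 0.0]  # made-up model output for one cell

fine_loss = coarse_cross_entropy(logits, DESCENDANT_LEAVES["CD4 T cell"])
coarse_loss = coarse_cross_entropy(logits, DESCENDANT_LEAVES["T cell"])

print(f"loss for fine label 'CD4 T cell': {fine_loss:.4f}")
print(f"loss for coarse label 'T cell':   {coarse_loss:.4f}")
```

In this sketch a cell annotated only as "T cell" is not penalized for being predicted as either T cell subtype, so datasets annotated at different resolutions can share one classifier over the leaf types.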
Citation
BibTeX citation:
@article{fischer2021,
author = {Fischer, David S and Dony, Leander and König, Martin and
Moeed, Abdul and Zappia, Luke and Heumos, Lukas and Tritschler,
Sophie and Holmberg, Olle and Aliee, Hananeh and Theis, Fabian J},
title = {Sfaira Accelerates Data and Model Reuse in Single Cell
Genomics},
journal = {Genome biology},
volume = {22},
number = {1},
pages = {248},
date = {2021-08-25},
url = {https://doi.org/10.1186/s13059-021-02452-6},
doi = {10.1186/s13059-021-02452-6},
issn = {1465-6906},
langid = {en},
abstract = {Single-cell RNA-seq datasets are often first analyzed
independently without harnessing model fits from previous studies,
and are then contextualized with public data sets, requiring
time-consuming data wrangling. We address these issues with sfaira,
a single-cell data zoo for public data sets paired with a model zoo
for executable pre-trained models. The data zoo is designed to
facilitate contribution of data sets using ontologies for metadata.
We propose an adaption of cross-entropy loss for cell type
classification tailored to datasets annotated at different levels of
coarseness. We demonstrate the utility of sfaira by training models
across anatomic data partitions on 8 million cells.}
}
For attribution, please cite this work as:
Fischer, D. S., Dony, L., König, M., Moeed, A., Zappia, L., Heumos, L., Tritschler, S., Holmberg, O., Aliee, H. & Theis, F. J. Sfaira accelerates data and model reuse in single cell genomics. Genome biology 22, 248 (2021).