Getting Started

Installation

For most users, we recommend our prebuilt Docker image, which includes all dependencies required for standard use cases.

Docker

  1. Pull the image.

docker pull chbeltz/compocyte:latest

  2. Start an interactive container.

docker run -it --rm chbeltz/compocyte:latest bash

If you want to work with your own data, mount a local directory into the container:

docker run -it --rm \
  -v /path/to/your/data:/data \
  chbeltz/compocyte:latest bash

Your files will then be accessible inside the container at /data.
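
Once a directory is mounted, a quick check from Python inside the container can confirm that your files are visible. The following is a minimal sketch using only the standard library; the helper name `list_data_files` and the `.h5ad` suffix are illustrative assumptions and not part of Compocyte.

```python
from pathlib import Path


def list_data_files(data_dir="/data", suffix=".h5ad"):
    """Return sorted paths of files under data_dir whose names end with suffix.

    Hypothetical helper for illustration; not part of Compocyte.
    """
    root = Path(data_dir)
    if not root.is_dir():
        # The mount is missing or the path is wrong.
        return []
    # Search subdirectories as well, keeping regular files only.
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())
```

Calling `list_data_files()` with no arguments looks in /data; an empty list suggests that the `-v` mount did not take effect or that the directory holds no matching files.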

Manual installation

Alternatively, you can create a Python 3.14 environment with micromamba (or another environment manager) and then install Compocyte and its dependencies from PyPI.

micromamba create -n compocyte_python314 python=3.14
micromamba activate compocyte_python314
micromamba install gcc gxx graphviz pygraphviz
pip install Compocyte
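
After installing, it may help to confirm that the active environment is the one you expect. The sketch below uses only the standard library; the helper name `check_environment` is hypothetical, and the default minimum version simply mirrors the environment created above.

```python
import importlib.util
import sys


def check_environment(package="Compocyte", min_python=(3, 14)):
    """Report whether the interpreter is recent enough and `package` can be found.

    Hypothetical helper for illustration; not part of Compocyte.
    """
    return {
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        # find_spec returns None when a top-level package is not installed.
        "package_found": importlib.util.find_spec(package) is not None,
    }


print(check_environment())
```

If `package_found` is false, the most likely cause is that `pip install Compocyte` ran in a different environment than the one currently activated.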

Pretrained model files

Pretrained Compocyte models are available on Zenodo.

They can also be loaded from within Compocyte as follows:

import Compocyte
pbmc_hc = Compocyte.pretrained.pbmc_pretrained()
til_hc = Compocyte.pretrained.til_pretrained()

Inference

To try out a pretrained model, you can predict cell types on the included tumor-infiltrating leukocyte (TIL) test dataset as follows:

import Compocyte
from Compocyte.core.hierarchical_classifier import HierarchicalClassifier
from Compocyte.pretrained import til_pretrained, pbmc_pretrained

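# Load the pretrained TIL classifier and attach the bundled sample data.
hc = til_pretrained()
adata = Compocyte.data.sample_data()
hc.load_adata(adata)

# Run predictions for all nodes below the root node of the TIL hierarchy.
hc.predict_all_child_nodes('blood')
# Inspect the per-cell annotations after prediction.
print(hc.adata.obs)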

Because prediction is hierarchical, you need to specify the root node at which the inference run starts. Don't be confused by the root node used above: the TIL hierarchy currently starts with “blood”, which will be patched in a future version.
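
To illustrate why a starting node is needed, here is a toy sketch of top-down traversal over a label hierarchy. It is not Compocyte's implementation: the hierarchy, the `toy_classify` stub, and the function names are all hypothetical, and a real run would use a trained classifier at each node rather than fixed scores.

```python
# Toy label hierarchy: each parent maps to its child labels (hypothetical).
HIERARCHY = {
    "blood": ["lymphoid", "myeloid"],
    "lymphoid": ["T cell", "B cell"],
    "myeloid": ["monocyte", "dendritic cell"],
}


def toy_classify(cell, children):
    """Stand-in for a per-node classifier: pick the child with the highest score."""
    return max(children, key=lambda child: cell.get(child, 0.0))


def predict_path(cell, root, hierarchy=HIERARCHY):
    """Descend from `root`, choosing one child per level until a leaf is reached."""
    path = [root]
    node = root
    while node in hierarchy:
        node = toy_classify(cell, hierarchy[node])
        path.append(node)
    return path


# Made-up scores for a single cell, keyed by label.
cell = {"lymphoid": 0.9, "myeloid": 0.1, "T cell": 0.2, "B cell": 0.7}
print(predict_path(cell, "blood"))     # ['blood', 'lymphoid', 'B cell']
print(predict_path(cell, "lymphoid"))  # ['lymphoid', 'B cell']
```

Starting from a different node changes which part of the hierarchy is traversed, which is why the starting node is passed explicitly.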

Alternatively, you can do the same with the sample PBMC dataset included in scanpy:

import scanpy as sc
from Compocyte.pretrained import pbmc_pretrained

hc = pbmc_pretrained()
adata = sc.datasets.pbmc3k()
hc.load_adata(adata)
hc.predict_all_child_nodes('Blood')
print(hc.adata.obs)