Cumulus WDL workflows and Dockerfiles
All of our Docker images are publicly available on Quay and Docker Hub. Our workflows use Quay as the
default Docker registry. To use Docker Hub instead, enter cumulusprod for the workflow input
“docker_registry”. Users can also enter a custom registry name of their own choice.
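As an illustration of switching registries, the sketch below sets the docker_registry input in a Cromwell/Terra-style inputs JSON, where keys take the form workflow_name.input_name. The workflow name cellranger_workflow and the helper with_docker_registry are assumptions made for this example. Check the name of the workflow you actually run.

```python
import json


def with_docker_registry(inputs, workflow_name, registry):
    """Return a copy of `inputs` with the docker_registry input set.

    `workflow_name` is whatever prefix your inputs JSON already uses;
    this helper is hypothetical and not part of Cumulus.
    """
    updated = dict(inputs)
    updated[f"{workflow_name}.docker_registry"] = registry
    return updated


# "cellranger_workflow" is an assumed workflow name used for illustration only.
inputs = with_docker_registry({}, "cellranger_workflow", "cumulusprod")
print(json.dumps(inputs, indent=2))
```

The resulting dictionary can be written to the inputs JSON file passed to the workflow engine. A custom registry name would be supplied the same way.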
If you use Cumulus in your research, please consider citing:
Li, B., Gould, J., Yang, Y. et al. “Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq”. Nat Methods 17, 793–798 (2020). https://doi.org/10.1038/s41592-020-0905-x
Release Highlights in Current Stable
3.0.0 March 7, 2025
Overall highlights:
For data localization/delocalization on GCP, the workflows now use gcloud storage commands instead of gsutil, as gsutil is deprecated and gcloud storage is somewhat faster.
Users no longer need to specify the backend input for the AWS or local backend; it is now determined automatically from the output_directory location.
Remove support for mkfastq in Cellranger and Spaceranger workflows, as it will soon be removed from Cell Ranger and Space Ranger. Users need to run BCL Convert themselves first to generate FASTQ data.
For Cellranger and Spaceranger workflows, provide better support for shared computing environments like AWS Batch, GCP Batch and sHPC.
Resources such as prebuilt references are now held in a single-region bucket, gs://cumulus-ref, in US-CENTRAL1. Compared with the previous multi-region bucket, this reduces potentially higher network costs on the users’ side.
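The backend detection mentioned above can be pictured with a short sketch. This is not the workflows' actual implementation, which is not shown here. It assumes that a gs:// prefix indicates GCP, an s3:// prefix indicates AWS, and anything else is a local path. The function name infer_backend and the returned labels are hypothetical.

```python
def infer_backend(output_directory: str) -> str:
    """Guess the backend from the output_directory location.

    Assumed rule (for illustration only): the URI scheme decides the backend.
    """
    if output_directory.startswith("gs://"):
        return "gcp"
    if output_directory.startswith("s3://"):
        return "aws"
    # No recognized cloud scheme: treat the location as a local path.
    return "local"


# Example locations are made up for this sketch.
for location in ("gs://my-bucket/out", "s3://my-bucket/out", "/data/out"):
    print(location, "->", infer_backend(location))
```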
Cellranger workflow:
Upgrade cellranger_version default to 9.0.1.
Remove mkfastq related workflow inputs.
Remove run_count input as it’s always true.
Support input data format as a TAR file containing FASTQ files.
Change FeatureBarcodeFile column header to AuxFile (for backward compatibility, FeatureBarcodeFile is still accepted but not recommended).
Support all 3 Sample Multiplexing methods provided since Cell Ranger v9.0.
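For sample sheets written before this release, the header rename can be applied mechanically. The sketch below is a hypothetical helper (rename_legacy_header is not part of Cumulus) that rewrites only the header row of a CSV sample sheet, replacing FeatureBarcodeFile with AuxFile. The example rows and file names are made up.

```python
import csv
import io


def rename_legacy_header(sample_sheet_text: str) -> str:
    """Replace the legacy FeatureBarcodeFile header with AuxFile.

    Only the first (header) row is changed; data rows are passed through.
    """
    rows = list(csv.reader(io.StringIO(sample_sheet_text)))
    if not rows:
        return sample_sheet_text
    rows[0] = [
        "AuxFile" if name.strip() == "FeatureBarcodeFile" else name
        for name in rows[0]
    ]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up sample sheet content for illustration.
legacy = "Sample,Reference,FeatureBarcodeFile\ns1,ref,fb.csv\n"
print(rename_legacy_header(legacy))
```

Since the old header is still accepted, this conversion is optional. It simply brings older sheets in line with the recommended column name.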
For single-cell and single-nucleus RNA-seq:
Add new genome reference mRatBN7.2-2024-A.
Remove Targeted Gene Expression-related inputs, as this analysis is no longer supported since Cell Ranger v7.2.0.
For feature barcoding:
Upgrade cumulus_feature_barcoding_version to 1.0.0.
The workflow now can automatically detect the chemistry type of the data:
auto (default): The workflow checks all possible assay types to decide the correct one.
threeprime: The workflow checks all 3’ assay types to decide the correct one.
fiveprime: The workflow checks all 5’ assay types to decide the correct one.
Reorganize the keywords in the Chemistry column of the sample sheet: 5’ now has only 2 types, SC5Pv2 and SC5Pv3, for 5’ v2 and v3 chemistries where only R2 is used for alignment.
Support the new format of 10x cell barcode inclusion lists provided in Cell Ranger v9.0+.
Fix issue in processing UTF-encoded feature barcode files
For immune profiling:
Add types vdj_t, vdj_b and vdj_t_gd for the DataType column. See 10x 5’ Immune Profiling Kit for details.
Remove the chain input, as it is now decided automatically from the user-specified DataType.
For vdj_t_gd type samples, support specifying the primer sequences used to enrich cDNA for V(D)J sequences. To enable it, provide a .txt file in the AuxFile column of the sample, and it will be passed to the --inner-enrichment-primers option of cellranger vdj in execution.
For Flex Gene Expression:
Remove ProbeSet column. The probe set is now automatically decided based on user-specified Reference name.
Support Flex probe sets v1.1 which are associated with 2024-A genome references.
For CellPlex using CMO:
Remove the cmo_set input. If using custom CMOs in your experiment, just provide the custom feature reference file in the AuxFile column of the cmo type sample.
Spaceranger workflow:
Upgrade spaceranger_version default to 3.1.3.
Remove mkfastq related workflow inputs.
Remove run_count input as it’s always true.
Remove support for Targeted Gene Expression analysis, as it’s no longer supported since Space Ranger v2.1.1.