Cumulus WDL workflows and Dockerfiles
All of our docker images are publicly available on Quay and Docker Hub. Our workflows use Quay as the
default Docker registry. Users can use Docker Hub as the Docker registry by entering cumulusprod for the workflow
input “docker_registry”, or enter a custom registry name of their own choice.
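As a rough illustration of the registry switch described above, the sketch below builds a WDL inputs JSON that sets docker_registry to cumulusprod. The workflow name cellranger_workflow and the helper make_inputs are assumptions made for this example; only the input name docker_registry and the value cumulusprod come from the text above.

```python
import json


def make_inputs(workflow_name: str, docker_registry: str = "cumulusprod") -> str:
    """Return a WDL inputs JSON string that overrides the Docker registry.

    workflow_name is a placeholder: use the name of the workflow you
    actually run. WDL input keys take the form "<workflow>.<input>".
    """
    inputs = {f"{workflow_name}.docker_registry": docker_registry}
    return json.dumps(inputs, indent=2)


# Hypothetical usage: point the workflow at Docker Hub instead of Quay.
print(make_inputs("cellranger_workflow"))
```

A custom registry name can be passed as the second argument in the same way.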
If you use Cumulus in your research, please consider citing:
Li, B., Gould, J., Yang, Y. et al. “Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq”. Nat Methods 17, 793–798 (2020). https://doi.org/10.1038/s41592-020-0905-x
Release Highlights in Current Stable
3.0.0 March 7, 2025
Overall highlights:
Data localization/delocalization on GCP now uses gcloud storage commands instead of gsutil, because gsutil is deprecated and gcloud storage offers some speed improvement.
Users no longer need to specify the backend input for the AWS or local backend; it is now inferred automatically from the output_directory location.
Remove support for mkfastq in Cellranger and Spaceranger workflows, as it will soon be removed from Cell Ranger and Space Ranger. Users need to run BCL Convert themselves first to generate FASTQ data.
For Cellranger and Spaceranger workflows, provide better support for shared computing environments like AWS Batch, GCP Batch and sHPC.
Resources such as prebuilt references are now held in a single-region bucket, gs://cumulus-ref in US-CENTRAL1, to reduce the potentially higher network cost on the user's side compared with the previous multi-region bucket.
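The automatic backend detection mentioned above (deciding the backend from the output_directory location) could work along the lines of the sketch below. This is not the workflows' actual logic, which is not shown here: the function name infer_backend, the URI prefixes checked, and the returned labels gcp, aws and local are all assumptions made for illustration.

```python
def infer_backend(output_directory: str) -> str:
    """Guess the backend from the output location (illustrative only).

    Assumption: Google Cloud Storage paths start with "gs://", Amazon S3
    paths start with "s3://", and anything else is a local path.
    """
    if output_directory.startswith("gs://"):
        return "gcp"
    if output_directory.startswith("s3://"):
        return "aws"
    return "local"


# Hypothetical usage with made-up output locations.
for location in ("gs://my-bucket/out", "s3://my-bucket/out", "/data/out"):
    print(location, "->", infer_backend(location))
```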
Cellranger workflow:
Upgrade cellranger_version default to 9.0.1.
Remove mkfastq related workflow inputs.
Remove run_count input as it’s always true.
Support input data format as a TAR file containing FASTQ files.
Change FeatureBarcodeFile column header to AuxFile (for backward compatibility, FeatureBarcodeFile is still accepted but not recommended).
Support all the 3 Sample Multiplexing methods provided since Cell Ranger v9.0. See details
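For existing sample sheets that still use the old FeatureBarcodeFile header, the rename to AuxFile described above can be applied with a small pre-processing step. The sketch below is one possible way to do it and is not part of the workflows; the function name normalize_sample_sheet and the example columns Sample and Reference are assumptions made for illustration.

```python
import csv
import io


def normalize_sample_sheet(text: str) -> str:
    """Rename the legacy FeatureBarcodeFile header to AuxFile.

    Only the header row is changed; all data rows are written back as read.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return text  # nothing to rename in an empty sheet
    rows[0] = [
        "AuxFile" if col.strip() == "FeatureBarcodeFile" else col
        for col in rows[0]
    ]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical sample sheet content.
legacy = "Sample,Reference,FeatureBarcodeFile\ns1,GRCh38,fb.csv\n"
print(normalize_sample_sheet(legacy))
```

Because the old header is still accepted, this step is optional; it only keeps sample sheets aligned with the recommended column name.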
For single-cell and single-nucleus RNA-seq:
Add new genome reference mRatBN7.2-2024-A.
Remove Targeted Gene Expression-related inputs, as it has not been supported since Cell Ranger v7.2.0.
For feature barcoding:
Upgrade cumulus_feature_barcoding_version to 1.0.0.
The workflow can now automatically detect the chemistry type of the data:
auto (the default): The workflow checks all possible assay types to decide the correct one.
threeprime: The workflow checks all 3’ assay types to decide the correct one.
fiveprime: The workflow checks all 5’ assay types to decide the correct one.
Reorganize the keywords in the Chemistry column of the sample sheet: 5’ now only has 2 types, SC5Pv2 and SC5Pv3, for 5’ v2 and v3 chemistries where only R2 is used for alignment.
Support the new format of 10x cell barcode inclusion lists provided in Cell Ranger v9.0+.
Fix an issue in processing UTF-encoded feature barcode files.
For immune profiling:
Add types vdj_t, vdj_b and vdj_t_gd for the DataType column. See 10x 5’ Immune Profiling Kit for details.
Remove chain input, as it is now automatically decided by the user-specified DataType.
For vdj_t_gd type samples, support specifying the primer sequences used to enrich cDNA for V(D)J sequences. To enable it, provide a .txt file in the AuxFile column of the sample, and it will be passed to the --inner-enrichment-primers option of cellranger vdj at execution.
For Flex Gene Expression:
Remove ProbeSet column. The probe set is now automatically decided based on the user-specified Reference name.
Support Flex probe sets v1.1 which are associated with 2024-A genome references.
For CellPlex using CMO:
Remove cmo_set input. If using custom CMOs in your experiment, just provide the custom feature reference file in the AuxFile column of the cmo type sample.
Spaceranger workflow:
Upgrade spaceranger_version default to 3.1.3.
Remove mkfastq related workflow inputs.
Remove run_count input as it’s always true.
Remove support for Targeted Gene Expression analysis, as it has not been supported since Space Ranger v2.1.1.