Release notes
Version 3.0
3.0.0 March 7, 2025
Overall highlights:
For data localization/delocalization on GCP, now use gcloud storage commands instead of gsutil, as gsutil is deprecated, and gcloud storage achieves some speed improvement.
Users no longer need to specify the backend input for the AWS or local backend; it is now automatically figured out from the output_directory location.
Remove support for mkfastq in Cellranger and Spaceranger workflows, as it will soon be removed from Cell Ranger and Space Ranger. Users need to run BCL Convert themselves first to generate FASTQ data.
For Cellranger and Spaceranger workflows, provide better support for shared computing environments like AWS Batch, GCP Batch and sHPC.
Resources such as prebuilt references are now held in a single-region bucket gs://cumulus-ref in US-CENTRAL1, which reduces the potentially higher network cost on the users’ side compared with the previous multi-region bucket.
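The backend inference mentioned above can be pictured with a minimal sketch. The notes do not state the actual rule the workflows apply, so the mapping from URI scheme to backend used here (gs:// for gcp, s3:// for aws, anything else for local) is an assumption, and infer_backend is a hypothetical helper name, not part of Cumulus.

```python
def infer_backend(output_directory: str) -> str:
    """Guess the backend from the output location (illustrative only).

    Assumption: a gs:// URI means Google Cloud, an s3:// URI means AWS,
    and any other value is treated as a path on the local machine.
    """
    location = output_directory.strip()
    if location.startswith("gs://"):
        return "gcp"
    if location.startswith("s3://"):
        return "aws"
    return "local"


if __name__ == "__main__":
    for uri in ("gs://my-bucket/results", "s3://my-bucket/results", "/home/me/results"):
        print(uri, "->", infer_backend(uri))
```

The returned values reuse the gcp, aws and local keywords of the backend input introduced in version 2.0.0.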
Cellranger workflow:
Upgrade cellranger_version default to 9.0.1.
Remove mkfastq related workflow inputs.
Remove run_count input as it’s always true.
Support input data format as a TAR file containing FASTQ files.
Change FeatureBarcodeFile column header to AuxFile (for backward compatibility, FeatureBarcodeFile is still accepted but not recommended).
Support all the 3 Sample Multiplexing methods provided since Cell Ranger v9.0. See details
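As an illustration of the AuxFile rename and its backward compatibility, the sketch below reads a CSV sample sheet and maps the legacy FeatureBarcodeFile header to AuxFile. It is not the workflow's own parsing code: read_sample_sheet and LEGACY_HEADERS are hypothetical names, and the other columns in the usage example are only placeholders.

```python
import csv
import io

# Legacy header -> current header. Only the rename named in these notes is listed.
LEGACY_HEADERS = {"FeatureBarcodeFile": "AuxFile"}


def read_sample_sheet(text: str) -> list:
    """Parse CSV sample sheet text and rename legacy column headers."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # Keep every column; replace only the headers that have a newer name.
        rows.append({LEGACY_HEADERS.get(key, key): value for key, value in row.items()})
    return rows


if __name__ == "__main__":
    sheet = "Sample,DataType,FeatureBarcodeFile\ns1,citeseq,antibody_index.csv\n"
    print(read_sample_sheet(sheet))
```

A sheet that already uses the AuxFile header passes through unchanged.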
For single-cell and single-nucleus RNA-seq:
Add new genome reference mRatBN7.2-2024-A.
Remove Targeted Gene Expression-related inputs, as Targeted Gene Expression is no longer supported since Cell Ranger v7.2.0.
For feature barcoding:
Upgrade cumulus_feature_barcoding_version to 1.0.0.
The workflow can now automatically detect the chemistry type of the data:
auto (default): The workflow checks all possible assay types to decide the correct one.
threeprime: The workflow checks all 3’ assay types to decide the correct one.
fiveprime: The workflow checks all 5’ assay types to decide the correct one.
Reorganize the keywords in the Chemistry column of the sample sheet: 5’ now only has 2 types, SC5Pv2 and SC5Pv3, for 5’ v2 and v3 chemistries where only R2 is used for alignment.
Support the new format of 10x cell barcode inclusion lists provided in Cell Ranger v9.0+.
Fix issue in processing UTF-encoded feature barcode files
For immune profiling:
Add types vdj_t, vdj_b and vdj_t_gd for DataType column. See 10x 5’ Immune Profiling Kit for details.
Remove chain input, as it is now automatically decided by user-specified DataType types.
For vdj_t_gd type samples, support the feature of specifying primer sequences used to enrich cDNA for V(D)J sequences. To enable it, provide a .txt file in AuxFile column of the sample, and it will be passed to --inner-enrichment-primers option of cellranger vdj in execution.
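The primer hand-off for vdj_t_gd samples can be sketched as follows. This only shows how a .txt file listed in AuxFile might become the --inner-enrichment-primers option. The workflow's real command construction is not shown in these notes, and vdj_extra_args is a hypothetical helper.

```python
from typing import List, Optional


def vdj_extra_args(data_type: str, aux_file: Optional[str]) -> List[str]:
    """Return extra cellranger vdj arguments for one sample (illustrative only)."""
    # Assumption: only vdj_t_gd samples with a .txt AuxFile get the primer option.
    if data_type == "vdj_t_gd" and aux_file and aux_file.endswith(".txt"):
        return ["--inner-enrichment-primers", aux_file]
    return []


if __name__ == "__main__":
    print(vdj_extra_args("vdj_t_gd", "gd_primers.txt"))
    print(vdj_extra_args("vdj_t", None))
```

Samples of other types, or vdj_t_gd samples without a primer file, get no extra arguments in this sketch.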
For Flex Gene Expression:
Remove ProbeSet column. The probe set is now automatically decided based on user-specified Reference name.
Support Flex probe sets v1.1 which are associated with 2024-A genome references.
For CellPlex using CMO:
Remove cmo_set input. If using custom CMOs in your experiment, just provide the custom feature reference file in AuxFile column of the cmo type sample.
Spaceranger workflow:
Upgrade spaceranger_version default to 3.1.3.
Remove mkfastq related workflow inputs.
Remove run_count input as it’s always true.
Remove support for Targeted Gene Expression analysis, as it’s no longer supported since Space Ranger v2.1.1.
Version 2.6
2.6.3 August 2, 2024
Update Demultiplexing workflow to work with Stratocumulus v0.2.4.
2.6.2 June 19, 2024
Solve the issue of Cellranger workflow with cellranger-arc. The cellranger_arc_version default is now “2.0.2.strato”, which is compatible with workflow v2.6.1 or later.
2.6.1 May 8, 2024
In Cellranger workflow: Add new genome references for single-cell/nucleus RNA-Seq: GRCh38-2024-A for human, GRCm39-2024-A for mouse, and GRCh38_and_GRCm39-2024-A for human and mouse.
In Spaceranger workflow: Add a new probe set mouse_probe_v2 for mouse.
Some underlying workflow improvement.
2.6.0 April 22, 2024
Update:
In Cellranger workflow:
Upgrade cellranger_version default to 8.0.0.
Upgrade cumulus_feature_barcoding version default to 0.11.3.
In Spaceranger workflow:
Upgrade spaceranger_version default to 3.0.0.
Support Visium HD data.
Version 2.5
2.5.0 February 3, 2024
Improvements:
In Cellranger workflow:
Add multi_disk_space input for specifying disk size used by Fixed RNA Profiling or 10x multiome jobs, which usually require a large amount of disk space.
In Cellbender workflow:
With the default CellBender version 0.3.0, users no longer always need to specify the expected_cells or total_droplets_included input.
Updates:
In Cellranger workflow:
Upgrade cellranger_version default to 7.2.0.
Upgrade cumulus_feature_barcoding version default to 0.11.2.
Add GRCh38 VDJ v7.1.0 reference: GRCh38_vdj_v7.1.0.
In Spaceranger workflow:
Upgrade spaceranger_version default to 2.1.1.
In Demultiplexing workflow:
Upgrade souporcell_version default to 2.5.
In Cellbender workflow:
Upgrade cellbender_version default to 0.3.0.
Version 2.4
2.4.1 May 30, 2023
Improve:
In Cellranger workflow:
Fixed RNA Profiling now accepts custom probe set references.
In STARsolo workflow:
Add limitBAMsortRAM and outBAMsortingBinsN inputs to handle out-of-memory error in BAM sorting phase.
In GeoMx_fastq_to_dcc workflow:
Support multiple FASTQ folders as input.
Updates:
In Demultiplexing workflow:
Upgrade souporcell_version to 2022.12, which is based on commit 9fb527 on 2022/12/13.
In STARsolo workflow:
Upgrade star_version to 2.7.10b.
Bug Fixes:
In Spaceranger workflow:
Fix the image localization issue for CytAssist samples.
2.4.0 January 28, 2023
Updates:
In Cellranger workflow:
Upgrade cellranger_version default to 7.1.0.
Add Mouse probe set v1.0 for Fixed RNA Profiling analysis.
Add probe set v1.0.1 of both Human and Mouse for Fixed RNA Profiling analysis.
Upgrade cumulus_feature_barcoding version default to 0.11.1.
In Cellranger_create_reference workflow: upgrade cellranger_version default to 7.1.0.
In Cellranger_vdj_create_reference workflow: upgrade cellranger_version default to 7.1.0.
In Spaceranger workflow: upgrade spaceranger_version default to 2.0.1.
Version 2.3
2.3.0 October 30, 2022
New Features:
Add support for Fixed RNA Profiling to Cellranger workflow.
Updates:
In Cellranger workflow:
Upgrade cellranger_version default to 7.0.1.
Upgrade cellranger_arc_version default to 2.0.2.
In Cellranger_create_reference workflow: upgrade cellranger_version default to 7.0.1.
In Cellranger_vdj_create_reference workflow: upgrade cellranger_version default to 7.0.1.
Version 2.2
2.2.0 October 4, 2022
New Features:
Add Nanostring GeoMx DSP workflows. It consists of two steps:
GeoMx_fastq_to_dcc workflow to convert FASTQ files into DCC files by wrapping Nanostring GeoMx Digital Spatial NGS Pipeline.
GeoMx_dcc_to_count_matrix workflow to generate probe count matrix from DCC files with pathologists’ annotation.
Updates:
Spaceranger workflow:
Add support on 10x Space Ranger v2.0.0.
Add human_probe_v2 Probe Set for FFPE samples, which is compatible with CytAssist FFPE samples.
Upgrade cumulus_feature_barcoding_version default to v0.11.0 for Feature Barcoding in Cellranger workflow.
API Changes:
Across all workflows, for AWS backend:
All workflows now have awsQueueArn input, which is used for explicitly specifying the ARN string of an AWS Compute Environment.
Remove awsMaxRetries input for all workflows. Namely, use Cromwell’s default value 0.
Bug Fix:
Fix the issue on localizing GCP folders in Cellranger workflow for ATAC-Seq and 10x Multiome data.
Version 2.1
2.1.1 July 18, 2022
Make cumulus workflow work with Cromwell v81+.
2.1.0 July 13, 2022
New Features:
Add CellBender workflow for ambient RNA removal.
CellRanger:
For ATAC-Seq data, add ARC-v1 chemistry keyword for analyzing only the ATAC part of 10x multiome data. See CellRanger scATAC-seq sample sheet section for details.
For antibody/hashing/citeseq/crispr data, add multiome chemistry keyword for the feature barcoding on 10x multiome data.
STARsolo:
In workflow output, besides mtx format gene-count matrices, the workflow also generates matrices in 10x-compatible hdf5 format.
Improvements:
CellRanger: For antibody/hashing/citeseq/crispr data, cumulus_feature_barcoding v0.9.0+ now supports multi-threading and faster gzip file I/O.
Workflows check if output_directory is a valid Cloud URI based on the given backend value before execution. (Feature request #322)
Updates:
Genome Reference:
Add Cellranger VDJ v7.0.0 genome references: GRCh38_vdj_v7.0.0 and GRCm38_vdj_v7.0.0 in CellRanger scIR-seq sample sheet section.
Default version upgrade:
Version 2.0
2.0.0 March 14, 2022
Overall:
Cumulus workflows are now released on Dockstore:
Add the tutorial on importing Cumulus workflows to Terra.
Archive the legacy versions on Broad Method Registry.
Add support on multiple platforms via backend input: gcp for Google Cloud, aws for Amazon AWS, local for local machine. Enable Google Cloud support by default.
For Amazon AWS backend, add awsMaxRetries input to set the maximum retries allowed for job execution at runtime. By default, use 5.
Update the command-line job submission tutorial to work with Altocumulus v2.0.0 or later.
On Examples:
Update gene expression, hashing and CITE-Seq example tutorial.
Add tutorial on 10x CellPlex analysis using Cumulus workflows on Cloud.
Workflow-specific:
Add STARsolo_create_reference workflow to build genome references for STARsolo counting. See its documentation for details.
On Cellranger workflow:
Add support for 10x Cell Ranger version 6.1.1 and 6.1.2, and use 6.1.2 by default. See Cell Ranger v6.1 release notes.
Add support for 10x Cell Ranger ARC version 2.0.1, and use it by default. See Cell Ranger ARC v2.0 release notes.
Upgrade cumulus_feature_barcoding to version 0.7.0 to allow manually setting the barcode starting position (via input crispr_barcode_pos).
Add support for non 10x CRISPR assays. See the description of crispr DataType value in this section for details.
For input data consisting of fastq files, it’s able to handle folder structure of both flat (all fastq files in one folder) and nested (one subfolder per sample listed in the input sample sheet) forms.
Add fastq_outputs to workflow output, which contains mkfastq step output folders for samples listed in the input sample sheet.
Add count_outputs to workflow output, which contains count step output folders for samples listed in the input sample sheet.
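The flat and nested FASTQ folder layouts described above can be handled with a simple lookup, sketched here. This is not the workflow's implementation: find_sample_fastqs is a hypothetical helper, and the file-name pattern (sample name, an underscore, then anything ending in .fastq.gz) is an assumption.

```python
from pathlib import Path


def find_sample_fastqs(fastq_dir: str, sample: str) -> list:
    """Locate one sample's FASTQ files in a flat or nested folder (illustrative only)."""
    root = Path(fastq_dir)
    nested = root / sample
    # Nested form: one subfolder per sample. Flat form: all files directly in root.
    search_dir = nested if nested.is_dir() else root
    return sorted(search_dir.glob(f"{sample}_*.fastq.gz"))


if __name__ == "__main__":
    for path in find_sample_fastqs(".", "sample_1"):
        print(path)
```

The function prefers a subfolder named after the sample when one exists and falls back to the top-level folder otherwise.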
On Spaceranger workflow:
Add support for 10x Space Ranger version 1.3.0 and 1.3.1, and use 1.3.1 by default. See Space Ranger v1.3 release notes.
For input data consisting of fastq files, it’s able to handle folder structure of both flat (all fastq files in one folder) and nested (one subfolder per library) forms.
Add output section for the workflow. See here for details.
Retire old genome references:
Keep GRCh38-2020-A and mm10-2020-A.
Retire GRCh38, mm10, GRCh38-2020-A-premrna and mm10-2020-A-premrna. Users can still reach out to Cumulus team to ask for URIs to these old references, but they are not provided by default.
In the description of ReorientImages field of input sample sheet, add the information on its valid values.
On STARsolo workflow:
Add support for STAR version 2.7.9a, and use it by default. See STAR v2.7.9a release notes.
Reorganize the workflow by exposing more inputs to users.
Add support on more protocols: 10x multiome, 10x 5’ (both SC5P-R2 and SC5P-PE), Slide-Seq and Share-Seq. See here (./starsolo.html#prepare-a-sample-sheet) for details.
Use input read1_fastq_pattern and read2_fastq_pattern to support fastq files generated by Cell Ranger or SeqWell, as well as Sequence Read Archive (SRA) data.
For input data consisting of fastq files, it’s able to handle folder structure of both flat (all fastq files in one folder) and nested (one subfolder per library) forms.
Do not attach filename prefix to output files to avoid the incorrect SJ raw feature.tsv symlink error, which would cause the folder delocalization to fail. (see discussion with STAR team)
Add STAR log file to workflow output. This is the Log.out file if running STAR locally, which can be used for tracking the process and sharing with STAR team when opening an issue there.
Retire old genome references:
Keep GRCh38-2020-A, mm10-2020-A, and GRCh38-and-mm10-2020-A.
Retire old references listed here. Users can still reach out to Cumulus team to ask for URIs to them, but they are not provided by default.
On Demultiplexing workflow:
Upgrade demuxEM to version 0.1.7 for bug fix.
On Cellranger_create_reference workflow:
Add the generated reference file to the workflow output.
Bug fix in using input memory.
Update documentation to suggest only using Cell Ranger version 6.1.1 or later for building reference, as v6.0.1 has issues which leave the job running without terminating.
On Cellranger_atac_create_reference workflow:
Add the generated reference file to the workflow output.
On Cellranger_vdj_create_reference workflow:
Add the generated reference file to the workflow output.
Version 1.x
Version 1.5.1 September 15, 2021
Fix the issue of WDLs after Terra platform updates the Cromwell engine.
Version 1.5.0 July 20, 2021
- On demultiplexing workflow
Update demuxEM to v0.1.6.
- On cumulus workflow
Add Nonnegative Matrix Factorization (NMF) feature: run_nmf and nmf_n inputs.
Add integrative NMF (iNMF) data integration method: inmf option in correction_method input; the number of expected factors is also specified by nmf_n input.
When NMF or iNMF is enabled, word cloud plots and gene program UMAP plots of NMF/iNMF results will be generated.
Update Pegasus to v1.4.2.
Version 1.4.0 May 17, 2021
- On cellranger workflow
Add support for multiomics analysis using linked samples; cellranger-arc count, cellranger multi and cellranger count will be automatically triggered based on the sample sheet
Add support for cellranger version 6.0.1 and 6.0.0
Add support for cellranger-arc version 2.0.0, 1.0.1, 1.0.0
Add support for cellranger-atac version 2.0.0
Add support for cumulus_feature_barcoding version 0.6.0, which handles CellPlex CMO tags
Add GRCh38-2020-A_arc_v2.0.0, mm10-2020-A_arc_v2.0.0, GRCh38-2020-A_arc_v1.0.0 and mm10-2020-A_arc_v1.0.0 references for cellranger-arc.
Fixed bugs in cellranger_atac_create_reference
Add delete undetermined FASTQs option for mkfastq
- On demultiplexing workflow
Replace demuxlet with popscle, which includes both demuxlet and freemuxlet
- On cumulus workflow
Fixed bug that remap_singlets and subset_singlets don’t work when input is in sample sheet format.
Modified workflows to remove trailing spaces and support spaces within output_directory
Version 1.3.0 February 2, 2021
- On cumulus workflow:
Change cumulus_version to pegasus_version to avoid confusion.
Update to use Pegasus v1.3.0 for analysis.
Version 1.2.0 January 19, 2021
- Add spaceranger workflow:
Wrap up spaceranger version 1.2.1
- On cellranger workflow:
Fix workflow WDL to support both single index and dual index
Add support for cellranger version 5.0.1 and 5.0.0
Add support for targeted gene expression analysis
Add support for --include-introns and --no-bam options for cellranger count
Remove --force-cells option for cellranger vdj as noted in cellranger 5.0.0 release note
Add GRCh38_vdj_v5.0.0 and GRCm38_vdj_v5.0.0 references
Bug fix on cumulus workflow.
Reorganize the sidebar of Cumulus documentation website.
Version 1.1.0 December 28, 2020
- On cumulus workflow:
Add CITE-Seq data analysis back. (See section Run CITE-Seq analysis for details)
Add doublet detection. (See infer_doublets, expected_doublet_rate, and doublet_cluster_attribute input fields)
For tSNE visualization, only support FIt-SNE algorithm. (see run_tsne and plot_tsne input fields)
Improve efficiency on log-normalization and DE tests.
Support multiple marker JSON files used in cell type annotation. (see organism input field)
More preset gene sets provided in gene score calculation. (see calc_signature_scores input field)
- Add star_solo workflow (see STARsolo section for details):
Use STARsolo to generate count matrices from FASTQ files.
Support chemistry protocols such as 10X-V3, 10X-V2, DropSeq, and SeqWell.
Update the example of analyzing hashing and CITE-Seq data (see Example section) with the new workflows.
Bug fix.
Version 1.0.0 September 23, 2020
Add demultiplexing workflow for cell-hashing/nucleus-hashing/genetic-pooling analysis.
Add support on CellRanger version 4.0.0.
- Update cumulus workflow with Pegasus version 1.0.0:
Use zarr file format to handle data, which has a better I/O performance in general.
Support focus analysis on Unimodal data, and appending other Unimodal data to it. (focus and append inputs in cluster step).
Quality-Control: Change percent_mito default from 10.0 to 20.0; by default remove bounds on UMIs (min_umis and max_umis inputs in cluster step).
Quality-Control: Automatically figure out name prefix of mitochondrial genes for GRCh38 and mm10 genome reference data.
Support signature / gene module score calculation. (calc_signature_scores input in cluster step)
Add Scanorama method to batch correction. (correction_method input in cluster step).
Calculate UMAP embedding by default, instead of FIt-SNE.
Differential Expression (DE) analysis: remove inputs mwu and auc as they are calculated by default. And cell-type annotation uses MWU test result by default.
Remove cumulus_subcluster workflow.
Version 0.x
Version 0.15.0 May 6, 2020
Update all workflows to OpenWDL version 1.0.
Cumulus now supports multi-job execution from Terra data table input.
Cumulus generates Cirrocumulus input in .cirro folder, instead of a huge .parquet file.
Version 0.14.0 February 28, 2020
Added support for gene-count matrices generation using alternative tools (STARsolo, Optimus, Salmon alevin, Kallisto BUStools).
Cumulus can process demultiplexed data with remapped singlets names and subset of singlets.
Update VDJ related inputs in Cellranger workflow.
SMART-Seq2 and Count workflows are in OpenWDL version 1.0.
Version 0.13.0 February 7, 2020
Added support for aggregating scATAC-seq samples.
Cumulus now accepts mtx format input.
Version 0.12.0 December 14, 2019
Added support for building references for sc/snRNA-seq, scATAC-seq, single-cell immune profiling, and SMART-Seq2 data.
Version 0.11.0 December 4, 2019
Reorganized Cumulus documentation.
Version 0.10.0 October 2, 2019
scCloud is renamed to Cumulus.
Cumulus can accept either a sample sheet or a single file.
Version 0.7.0 February 14, 2019
Added support for 10x genomics scATAC assays.
scCloud runs FIt-SNE as default.
Version 0.6.0 January 31, 2019
Added support for 10x genomics V3 chemistry.
Added support for extracting feature matrix for Perturb-Seq data.
Added R script to convert output_name.seurat.h5ad to Seurat object. Now the raw.data slot stores filtered raw counts.
Added min_umis and max_umis to filter cells based on UMI counts.
Added QC plots and improved filtration spreadsheet.
Added support for plotting UMAP and FLE.
Now users can upload their JSON file to annotate cell types.
Improved documentation.
Added lightGBM based marker detection.
Version 0.5.0 November 18, 2018
Added support for plate-based SMART-Seq2 scRNA-Seq data.
Version 0.4.0 October 26, 2018
Added CITE-Seq module for analyzing CITE-Seq data.
Version 0.3.0 October 24, 2018
Added the demuxEM module for demultiplexing cell-hashing/nuclei-hashing data.
Version 0.2.0 October 19, 2018
Added support for V(D)J and CITE-Seq/cell-hashing/nuclei-hashing.
Version 0.1.0 July 27, 2018
KCO tools released!