Cumulus WDL workflows and Dockerfiles

All of our Docker images are publicly available on Quay and Docker Hub. Our workflows use Quay as the default Docker registry. Users can switch to Docker Hub by entering cumulusprod for the workflow input “docker_registry”, or use a custom registry by entering its name instead.
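As an illustration of the registry setting described above, here is a minimal sketch of building a workflow inputs file that selects Docker Hub. It assumes the common WDL convention of qualifying each input key with the workflow name; the helper `build_inputs` and the workflow name used in the demo are hypothetical and chosen only for this example.

```python
import json
from typing import Any, Dict, Optional


def build_inputs(workflow_name: str,
                 docker_registry: Optional[str] = None,
                 **inputs: Any) -> Dict[str, Any]:
    """Build an inputs mapping whose keys are qualified as '<workflow>.<input>'.

    Hypothetical helper: when docker_registry is None, the key is omitted so
    that the workflow falls back to its own default registry.
    """
    qualified = {f"{workflow_name}.{key}": value for key, value in inputs.items()}
    if docker_registry is not None:
        qualified[f"{workflow_name}.docker_registry"] = docker_registry
    return qualified


if __name__ == "__main__":
    # "cumulusprod" is the Docker Hub registry name given in the text above;
    # the workflow name here is an assumption made for illustration.
    inputs = build_inputs("cellranger_workflow", docker_registry="cumulusprod")
    print(json.dumps(inputs, indent=2))
```

Leaving `docker_registry` unset keeps the key out of the file entirely, which is the simplest way to stay on the default registry.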

If you use Cumulus in your research, please consider citing:

Li, B., Gould, J., Yang, Y. et al. “Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq”. Nat Methods 17, 793–798 (2020). https://doi.org/10.1038/s41592-020-0905-x

Release Highlights in Current Stable

2.0.0 March 14, 2022

Overall:

Workflow-specific:

  • Add STARsolo_create_reference workflow to build genome references for STARsolo counting. See its documentation for details.

  • On Cellranger workflow:

    • Add support for 10x Cell Ranger version 6.1.1 and 6.1.2, and use 6.1.2 by default. See Cell Ranger v6.1 release notes.
    • Add support for 10x Cell Ranger ARC version 2.0.1, and use it by default. See Cell Ranger ARC v2.0 release notes.
    • Upgrade cumulus_feature_barcoding to version 0.7.0 to allow manually setting the barcode starting position (via input crispr_barcode_pos).
    • Add support for non-10x CRISPR assays. See the description of crispr DataType value in this section for details.
    • For input data consisting of fastq files, handle both flat (all fastq files in one folder) and nested (one subfolder per sample listed in the input sample sheet) folder structures.
    • Add fastq_outputs to workflow output, which contains mkfastq step output folders for samples listed in the input sample sheet.
    • Add count_outputs to workflow output, which contains count step output folders for samples listed in the input sample sheet.
  • On Spaceranger workflow:

    • Add support for 10x Space Ranger version 1.3.0 and 1.3.1, and use 1.3.1 by default. See Space Ranger v1.3 release notes.

    • For input data consisting of fastq files, handle both flat (all fastq files in one folder) and nested (one subfolder per library) folder structures.

    • Add output section for the workflow. See here for details.

    • Retire old genome references:

      • Keep GRCh38-2020-A and mm10-2020-A.
      • Retire GRCh38, mm10, GRCh38-2020-A-premrna and mm10-2020-A-premrna. Users can still reach out to the Cumulus team to ask for URIs to these old references, but they are no longer provided by default.
    • In the description of the ReorientImages field of the input sample sheet, document its valid values.

  • On STARsolo workflow:

    • Add support for STAR version 2.7.9a, and use it by default. See STAR v2.7.9a release notes.

    • Reorganize the workflow by exposing more inputs to users.

    • Add support for more protocols: 10x multiome, 10x 5’ (both SC5P-R2 and SC5P-PE), Slide-Seq and Share-Seq. See the sample sheet section (./starsolo.html#prepare-a-sample-sheet) for details.

    • Use input read1_fastq_pattern and read2_fastq_pattern to support fastq files generated by Cell Ranger or SeqWell, as well as Sequence Read Archive (SRA) data.

    • For input data consisting of fastq files, handle both flat (all fastq files in one folder) and nested (one subfolder per library) folder structures.

    • Do not attach a filename prefix to output files. This avoids the incorrect SJ raw feature.tsv symlink error, which would cause folder delocalization to fail. (See discussion with STAR team.)

    • Add the STAR log file to workflow output. This is the Log.out file that STAR writes when run locally; it can be used for tracking the process and for sharing with the STAR team when opening an issue there.

    • Retire old genome references:

      • Keep GRCh38-2020-A, mm10-2020-A, and GRCh38-and-mm10-2020-A.
      • Retire old references listed here. Users can still reach out to the Cumulus team to ask for URIs to them, but they are no longer provided by default.
  • On Demultiplexing workflow:

    • Upgrade demuxEM to version 0.1.7 for a bug fix.
  • On Cellranger_create_reference workflow:

    • Add the generated reference file to the workflow output.
    • Fix a bug in the handling of the memory input.
    • Update documentation to suggest only using Cell Ranger version 6.1.1 or later for building references, as v6.0.1 has issues which leave the job running without terminating.
  • On Cellranger_atac_create_reference workflow:

    • Add the generated reference file to the workflow output.
  • On Cellranger_vdj_create_reference workflow:

    • Add the generated reference file to the workflow output.
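
The flat and nested fastq folder structures mentioned above for the Cellranger, Spaceranger and STARsolo workflows can be pictured with a small sketch. This is not the workflows' actual implementation: the function `find_sample_fastqs`, the accepted file suffixes, and the rule of matching flat-layout files by a `<sample>_` filename prefix are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import List

# Assumed set of fastq filename suffixes (illustrative only).
FASTQ_SUFFIXES = (".fastq.gz", ".fastq", ".fq.gz", ".fq")


def is_fastq(path: Path) -> bool:
    """Return True if path is a regular file with a fastq-like suffix."""
    return path.is_file() and path.name.endswith(FASTQ_SUFFIXES)


def find_sample_fastqs(root: Path, sample: str) -> List[Path]:
    """Collect fastq files for one sample under either folder layout.

    Nested: root/<sample>/ holds that sample's fastq files.
    Flat:   root/ holds every fastq file; files are matched by the
            assumed '<sample>_' filename prefix.
    """
    nested_dir = root / sample
    if nested_dir.is_dir():
        return sorted(p for p in nested_dir.iterdir() if is_fastq(p))
    prefix = sample + "_"
    return sorted(p for p in root.iterdir()
                  if is_fastq(p) and p.name.startswith(prefix))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Flat layout for sample "s1", nested layout for sample "s2".
        (root / "s1_S1_L001_R1_001.fastq.gz").touch()
        (root / "s2").mkdir()
        (root / "s2" / "s2_S2_L001_R1_001.fastq.gz").touch()
        for name in ("s1", "s2"):
            print(name, [p.name for p in find_sample_fastqs(root, name)])
```

Checking for the per-sample subfolder first means a mixed directory still resolves each sample unambiguously; matching on `<sample>_` rather than the bare sample name keeps a sample named s1 from also picking up files for s10.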