Nanostring GeoMx DSP

This section contains two workflows: geomxngs_fastq_to_dcc and geomxngs_dcc_to_count_matrix.

The geomxngs_fastq_to_dcc workflow wraps the Nanostring GeoMx Digital Spatial NGS Pipeline and converts FASTQ files into DCC files.

The geomxngs_dcc_to_count_matrix workflow takes the DCC zip file from geomxngs_fastq_to_dcc, together with other files produced by the GeoMx DSP machine, as inputs and outputs an area of illumination (AOI) by probe count matrix with pathologists’ annotation.


Convert FASTQ files into DCC files by the Nanostring GeoMx Digital Spatial NGS Pipeline

The geomxngs_fastq_to_dcc workflow converts FASTQ files to DCC files by wrapping the Nanostring GeoMx Digital Spatial NGS Pipeline. After generating DCC files, use the geomxngs_dcc_to_count_matrix workflow to generate an area of illumination (AOI) by probe count matrix.

Workflow Input

Relevant workflow inputs are described below (required inputs in bold).

Name Description Example Default
fastq_directory FASTQ directory URL “gs://foo/bar/fastqs” or “s3://foo/bar/fastqs”  
ini Configuration file in INI format, containing pipeline processing parameters “gs://foo/bar/config.ini”  
output_directory URL to write results “gs://foo/bar/out” or “s3://foo/bar/out”  
fastq_rename Optional two-column TSV file with no header, used to map original FASTQ names to FASTQ names that GeoMx recognizes. “gs://foo/bar/fastq_rename.tsv”  
delete_fastq_directory Whether to delete the input fastqs upon successful completion true false
geomxngs_version Version of the GeoMx NGS Pipeline software; currently only “2.3.3.10” is available. “2.3.3.10” “2.3.3.10”
docker_registry Docker registry to use for this workflow. Options: - “quay.io/cumulus” for images on Red Hat registry - “cumulusprod” for backup images on Docker Hub “quay.io/cumulus” “quay.io/cumulus”
backend Backend for computation. Available options: - “gcp” for Google Cloud - “aws” for Amazon AWS - “local” for local machine “aws” “gcp”
zones Google Cloud zones “us-central1-a” “us-central1-a us-central1-b us-central1-c us-central1-f”
preemptible Number of preemptible tries 2 2
memory Memory string “64GB” “64GB”
cpu Number of CPUs 4 4
disk_space Disk space in GB 500 500
aws_queue_arn The ARN of the AWS job queue to use. Only applies when backend is “aws”. “arn:aws:batch:us-east-1:xxx:job-queue/priority-gwf” “”
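
As a minimal sketch of how the inputs above might be assembled, the Python snippet below renders a fastq_rename TSV and builds an inputs dictionary that can be saved as JSON. Several things here are assumptions rather than documented behavior: the keys are prefixed with the workflow name (the usual convention for WDL inputs), fastq_directory, ini and output_directory are treated as the required inputs because they have no default, the first TSV column is taken to hold the original name and the second the new name, and the FASTQ file names are placeholders.

```python
import csv
import io
import json

WORKFLOW = "geomxngs_fastq_to_dcc"
BACKENDS = ("gcp", "aws", "local")


def fastq_rename_tsv(mapping):
    """Render a header-less, two-column TSV.

    Assumed column order: original FASTQ name, then the new name.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for original, renamed in mapping.items():
        writer.writerow([original, renamed])
    return buf.getvalue()


def fastq_to_dcc_inputs(fastq_directory, ini, output_directory, **optional):
    """Build an inputs dictionary keyed as '<workflow>.<input name>'."""
    backend = optional.get("backend", "gcp")  # "gcp" is the documented default
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    if optional.get("aws_queue_arn") and backend != "aws":
        raise ValueError("aws_queue_arn only applies when backend is 'aws'")
    values = dict(
        fastq_directory=fastq_directory,
        ini=ini,
        output_directory=output_directory,
        **optional,
    )
    return {f"{WORKFLOW}.{name}": value for name, value in values.items()}


# Placeholder file names, for illustration only.
rename_text = fastq_rename_tsv(
    {"original_R1.fastq.gz": "recognized_R1.fastq.gz"}
)

inputs = fastq_to_dcc_inputs(
    "gs://foo/bar/fastqs",
    "gs://foo/bar/config.ini",
    "gs://foo/bar/out",
    fastq_rename="gs://foo/bar/fastq_rename.tsv",
    preemptible=2,
)
print(json.dumps(inputs, indent=2))
```

Inputs that are left out of the dictionary fall back to the defaults listed in the table.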

Workflow Output

Name Description Type
dcc_zip URL to the output DCC zip file String
geomxngs_output URL to the output of geomxngspipeline; the DCC zip file is part of the output here String

Generate probe count matrix with pathologists’ annotation

The geomxngs_dcc_to_count_matrix workflow generates an area of illumination (AOI) by probe count matrix with pathologists’ annotation from the output of the geomxngs_fastq_to_dcc workflow and user inputs.

Workflow Input

Workflow inputs are described below (required inputs in bold).

Name Description Example Default
dcc_zip DCC zip file from geomxngs_fastq_to_dcc workflow output “gs://foo/bar/out/DCC-20221001.zip”  
ini Configuration file in INI format, containing pipeline processing parameters “gs://foo/bar/config.ini”  
lab_worksheet A text file containing library setups “gs://foo/bar/LabWorksheet.txt”  
dataset Data QC and annotation file (Excel) downloaded from the instrument after uploading the DCC zip file; only the first tab (SegmentProperties) is used “gs://foo/bar/BioprobeQC.xlsx”  
pkc GeoMx DSP configuration file to associate assay targets with GeoMx HybCode barcodes and Seq Code primers. Options: - CTA_v1.0-4 for Cancer Transcriptome Atlas - COVID-19_v1.0 for COVID-19 Immune Response Atlas - Human_WTA_v1.0 for Human Whole Transcriptome Atlas - Mouse_WTA_v1.0 for Mouse Whole Transcriptome Atlas If your configuration file is not listed, you can provide a URL to a PKC zip file or PKC file instead. “Human_WTA_v1.0”  
output_directory URL to write results “gs://foo/bar/out” or “s3://foo/bar/out”  
backend Backend for computation. Available options: - “gcp” for Google Cloud - “aws” for Amazon AWS - “local” for local machine “aws” “gcp”
docker_registry Docker registry to use for this workflow. Options: - “quay.io/cumulus” for images on Red Hat registry - “cumulusprod” for backup images on Docker Hub “quay.io/cumulus” “quay.io/cumulus”
docker_version Docker image version. “1.0.0” “1.0.0”
preemptible Number of preemptible tries 2 2
memory Memory string “8GB” “8GB”
cpu Number of CPUs 1 1
extra_disk_space Extra disk space in GB. 5 5
aws_queue_arn The ARN of the AWS job queue to use. Only applies when backend is “aws”. “arn:aws:batch:us-east-1:xxx:job-queue/priority-gwf” “”
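
The pkc input accepts either one of the listed configuration names or a URL to a PKC file or PKC zip file. The Python sketch below shows one way to tell the two forms apart before submitting a job. The helper name classify_pkc, the extension-based check, and the example URL are illustrative assumptions; the workflow's own validation may differ.

```python
# Configuration names listed in the pkc row of the input table.
BUILT_IN_PKC = (
    "CTA_v1.0-4",
    "COVID-19_v1.0",
    "Human_WTA_v1.0",
    "Mouse_WTA_v1.0",
)


def classify_pkc(pkc):
    """Return 'keyword' for a listed name, 'file' for a PKC or zip path.

    Assumption: a user-supplied configuration is recognized by its
    '.pkc' or '.zip' extension.
    """
    if pkc in BUILT_IN_PKC:
        return "keyword"
    if pkc.lower().endswith((".pkc", ".zip")):
        return "file"
    raise ValueError(f"not a listed configuration or PKC file: {pkc!r}")


print(classify_pkc("Human_WTA_v1.0"))
# Hypothetical URL, for illustration only.
print(classify_pkc("gs://foo/bar/custom_panel.pkc"))
```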

Workflow Output

Name Description Type
count_matrix_h5ad URL to a count matrix in h5ad format. .X contains the count matrix, .obs contains AOI information, and .var contains probe metadata String
count_matrix_text URL to a count matrix in text format. Each row is one probe and each column is one AOI. The first column is RTS_ID (Readout Tag Sequence-ID, RTS-ID). The second column is Gene (probes that map to the same gene share the same value). The third column is Probe (probes that map to the same gene have distinct values, e.g. control_1, control_2). Counts start at column 4. String
count_matrix_metadata URL to the count matrix metadata in text format. All columns from the dataset file are included; each row describes one AOI (area of illumination) String
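
To illustrate the count_matrix_text layout described above, here is a minimal Python sketch that reads such a file and sums the counts per AOI. It assumes the file is tab-delimited, has a header row, and stores integer counts; none of these is stated in the table. The RTS IDs, gene names, and AOI names in the sample are made up.

```python
import csv
import io


def read_count_matrix(handle, delimiter="\t"):
    """Parse a probe-by-AOI matrix with RTS_ID, Gene, Probe as columns 1-3.

    Returns the AOI names, per-probe metadata, and counts keyed by RTS_ID.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    aoi_names = header[3:]  # counts start at column 4
    probes = []
    counts = {}
    for row in reader:
        rts_id, gene, probe = row[:3]
        probes.append({"RTS_ID": rts_id, "Gene": gene, "Probe": probe})
        counts[rts_id] = dict(zip(aoi_names, (int(v) for v in row[3:])))
    return aoi_names, probes, counts


# Made-up sample: two probes share a Gene value but differ in Probe.
sample = (
    "RTS_ID\tGene\tProbe\tAOI_1\tAOI_2\n"
    "RTS0000001\tGENE_A\tGENE_A\t10\t3\n"
    "RTS0000002\tNegProbe\tcontrol_1\t1\t0\n"
    "RTS0000003\tNegProbe\tcontrol_2\t2\t1\n"
)

aoi_names, probes, counts = read_count_matrix(io.StringIO(sample))

# Total counts per AOI across all probes.
totals = {aoi: sum(c[aoi] for c in counts.values()) for aoi in aoi_names}
print(totals)
```

For a real output file, pass an open file handle in place of the io.StringIO sample.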