Nanostring GeoMx DSP
This section contains two workflows: geomxngs_fastq_to_dcc and geomxngs_dcc_to_count_matrix.
The geomxngs_fastq_to_dcc workflow wraps the Nanostring GeoMx Digital Spatial NGS Pipeline and converts FASTQ files into DCC files.
The geomxngs_dcc_to_count_matrix workflow takes the DCC zip file from geomxngs_fastq_to_dcc, together with other files produced by the GeoMx DSP machine, and outputs an area of illumination (AOI) by probe count matrix with pathologists’ annotation.
Convert FASTQ files into DCC files with the Nanostring GeoMx Digital Spatial NGS Pipeline
The geomxngs_fastq_to_dcc workflow converts FASTQ files to DCC files by wrapping the Nanostring GeoMx Digital Spatial NGS Pipeline. After generating DCC files, use the geomxngs_dcc_to_count_matrix workflow to generate an area of illumination (AOI) by probe count matrix.
Workflow Input
Relevant workflow inputs are described below (required inputs in bold).
Name | Description | Example | Default |
---|---|---|---|
fastq_directory | FASTQ directory URL | “gs://foo/bar/fastqs” or “s3://foo/bar/fastqs” | |
ini | Configuration file in INI format, containing pipeline processing parameters | “gs://foo/bar/config.ini” | |
output_directory | URL to write results | “gs://foo/bar/out” or “s3://foo/bar/out” | |
fastq_rename | Optional two-column TSV file with no header, used to map original FASTQ names to FASTQ names that GeoMx recognizes | “gs://foo/bar/fastq_rename.tsv” | |
delete_fastq_directory | Whether to delete the input FASTQ files upon successful completion | true | false |
geomxngs_version | Version of the GeoMx software; currently only “2.3.3.10” is available | “2.3.3.10” | “2.3.3.10” |
docker_registry | Docker registry to use for this workflow | “quay.io/cumulus” | “quay.io/cumulus” |
backend | Backend for computation | “aws” | “gcp” |
zones | Google Cloud zones | “us-central1-a” | “us-central1-a us-central1-b us-central1-c us-central1-f” |
preemptible | Number of preemptible tries | 2 | 2 |
memory | Memory string | “64GB” | “64GB” |
cpu | Number of CPUs | 4 | 4 |
disk_space | Disk space in GB | 500 | 500 |
aws_queue_arn | The ARN URI of the AWS job queue to be used. Only works when backend is “aws”. | “arn:aws:batch:us-east-1:xxx:job-queue/priority-gwf” | “” |
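The inputs above are usually supplied to the workflow as a JSON document. The following Python sketch assembles one from the example values in the table. It assumes inputs are keyed as `<workflow name>.<input name>`, the usual convention for WDL inputs files; the helper `build_inputs` is a hypothetical name used for illustration and is not part of the workflow.

```python
import json

# Assumption: inputs are keyed as "<workflow name>.<input name>".
WORKFLOW = "geomxngs_fastq_to_dcc"


def build_inputs(workflow, values):
    """Prefix each input name with the workflow name.

    Entries whose value is None are dropped so that the workflow's own
    defaults apply to them.
    """
    return {
        f"{workflow}.{name}": value
        for name, value in values.items()
        if value is not None
    }


inputs = build_inputs(
    WORKFLOW,
    {
        # Example values copied from the table above.
        "fastq_directory": "gs://foo/bar/fastqs",
        "ini": "gs://foo/bar/config.ini",
        "output_directory": "gs://foo/bar/out",
        "fastq_rename": None,  # not needed in this example
        "delete_fastq_directory": False,
        "backend": "gcp",
    },
)

inputs_json = json.dumps(inputs, indent=2)
print(inputs_json)
```

Leaving an optional input out of the dictionary, or setting it to `None`, keeps the default listed in the table.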
Workflow Output
Name | Description | Type |
---|---|---|
dcc_zip | URL to the output DCC zip file | String |
geomxngs_output | URL to the output of geomxngspipeline; the DCC zip file is part of this output | String |
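Before passing dcc_zip to the next workflow, it can be useful to check which DCC files the archive contains. The sketch below lists them using only the Python standard library. It assumes the zip has already been copied to local storage and that DCC files carry a `.dcc` extension; the function name `list_dcc_files` and the member names in the toy archive are made up for illustration.

```python
import io
import zipfile


def list_dcc_files(zip_source):
    """Return the sorted names of .dcc members in a zip archive.

    zip_source may be a local path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".dcc")
        )


# Toy archive built in memory so the example runs without real data.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sample_A01.dcc", "")
    archive.writestr("sample_A02.dcc", "")
    archive.writestr("summary.txt", "")
buffer.seek(0)

dcc_files = list_dcc_files(buffer)
print(dcc_files)
```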
Generate probe count matrix with pathologists’ annotation
The geomxngs_dcc_to_count_matrix workflow generates an area of illumination (AOI) by probe count matrix with pathologists’ annotation from the output of the geomxngs_fastq_to_dcc workflow and user inputs.
Workflow Input
Workflow inputs are described below (required inputs in bold).
Name | Description | Example | Default |
---|---|---|---|
dcc_zip | DCC zip file from geomxngs_fastq_to_dcc workflow output | “gs://foo/bar/out/DCC-20221001.zip” | |
ini | Configuration file in INI format, containing pipeline processing parameters | “gs://foo/bar/config.ini” | |
lab_worksheet | A text file containing library setups | “gs://foo/bar/LabWorksheet.txt” | |
dataset | Data QC and annotation file (Excel) downloaded from the instrument after uploading the DCC zip file; only the first tab (SegmentProperties) is used | “gs://foo/bar/BioprobeQC.xlsx” | |
pkc | GeoMx DSP configuration file to associate assay targets with GeoMx HybCode barcodes and Seq Code primers. Options: “CTA_v1.0-4” for Cancer Transcriptome Atlas; “COVID-19_v1.0” for COVID-19 Immune Response Atlas; “Human_WTA_v1.0” for Human Whole Transcriptome Atlas; “Mouse_WTA_v1.0” for Mouse Whole Transcriptome Atlas. If your configuration file is not listed, you can provide a URL to a PKC zip file or PKC file instead. | “Human_WTA_v1.0” | |
output_directory | URL to write results | “gs://foo/bar/out” or “s3://foo/bar/out” | |
backend | Backend for computation. Available options: “gcp” for Google Cloud; “aws” for Amazon AWS; “local” for local machine | “aws” | “gcp” |
docker_registry | Docker registry to use for this workflow | “quay.io/cumulus” | “quay.io/cumulus” |
docker_version | Docker image version | “1.0.0” | “1.0.0” |
preemptible | Number of preemptible tries | 2 | 2 |
memory | Memory string | “8GB” | “8GB” |
cpu | Number of CPUs | 1 | 1 |
extra_disk_space | Extra disk space in GB | 5 | 5 |
aws_queue_arn | The ARN URI of the AWS job queue to be used. Only works when backend is “aws”. | “arn:aws:batch:us-east-1:xxx:job-queue/priority-gwf” | “” |
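As a rough illustration of how these inputs fit together, the Python sketch below builds an inputs dictionary from the example values in the table and checks the pkc value first. It assumes inputs are keyed as `<workflow name>.<input name>`, the usual convention for WDL inputs files. The check in `check_pkc` (a listed name, or anything ending in `.pkc` or `.zip`) is this sketch's own simplification, not the workflow's actual validation logic, and both helper names are hypothetical.

```python
import json

# Assumption: inputs are keyed as "<workflow name>.<input name>".
WORKFLOW = "geomxngs_dcc_to_count_matrix"

# Named PKC options listed in the table above.
KNOWN_PKC = {"CTA_v1.0-4", "COVID-19_v1.0", "Human_WTA_v1.0", "Mouse_WTA_v1.0"}


def check_pkc(pkc):
    """Accept a listed PKC name, or a location ending in .pkc or .zip."""
    if pkc in KNOWN_PKC:
        return pkc
    # Simplifying assumption: custom files are recognised by extension only.
    if pkc.lower().endswith((".pkc", ".zip")):
        return pkc
    raise ValueError(f"Unrecognised pkc value: {pkc!r}")


def build_inputs(workflow, values):
    """Prefix each input name with the workflow name."""
    return {f"{workflow}.{name}": value for name, value in values.items()}


inputs = build_inputs(
    WORKFLOW,
    {
        # Example values copied from the table above.
        "dcc_zip": "gs://foo/bar/out/DCC-20221001.zip",
        "ini": "gs://foo/bar/config.ini",
        "lab_worksheet": "gs://foo/bar/LabWorksheet.txt",
        "dataset": "gs://foo/bar/BioprobeQC.xlsx",
        "pkc": check_pkc("Human_WTA_v1.0"),
        "output_directory": "gs://foo/bar/out",
    },
)

print(json.dumps(inputs, indent=2))
```

Checking pkc before submission catches a misspelled panel name early, rather than after the job has been scheduled.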
Workflow Output
Name | Description | Type |
---|---|---|
count_matrix_h5ad | URL to a count matrix in h5ad format. .X contains the count matrix, .obs contains AOI information, and .var contains probe metadata | String |
count_matrix_text | URL to a count matrix in text format. Each row is one probe and each column is one AOI. The first column is RTS_ID (Readout Tag Sequence-ID, or RTS-ID). The second column is Gene (probes that map to the same gene share the same value). The third column is Probe (probes that map to the same gene have distinct values, e.g. control_1, control_2). Counts start at column 4. | String |
count_matrix_metadata | URL to count matrix metadata in text format. All columns from the dataset file are included; each row describes one AOI (area of illumination) | String |
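The column layout of count_matrix_text can be read with the Python standard library alone. The sketch below parses a small made-up matrix in that layout. It assumes the file is tab-delimited, has a header row, and stores counts as integers; none of these details are stated above, so adjust them to the real file. The AOI names, RTS_ID values, and the function name `read_count_matrix` are hypothetical.

```python
import csv
import io

# Made-up toy data: RTS_ID, Gene, Probe, then one count column per AOI.
toy_text = (
    "RTS_ID\tGene\tProbe\tAOI_1\tAOI_2\n"
    "RTS001\tGeneA\tGeneA\t10\t3\n"
    "RTS002\tcontrol\tcontrol_1\t5\t7\n"
    "RTS003\tcontrol\tcontrol_2\t2\t1\n"
)


def read_count_matrix(handle):
    """Parse the text matrix into (aoi_names, probes, counts).

    probes is a list of (RTS_ID, Gene, Probe) tuples; counts maps each
    RTS_ID to a list of integer counts, one per AOI.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    aoi_names = header[3:]  # counts start at column 4
    probes, counts = [], {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        rts_id, gene, probe = row[:3]
        probes.append((rts_id, gene, probe))
        counts[rts_id] = [int(value) for value in row[3:]]
    return aoi_names, probes, counts


aoi_names, probes, counts = read_count_matrix(io.StringIO(toy_text))

# Total counts per AOI, summed over all probes.
aoi_totals = [sum(column) for column in zip(*counts.values())]
print(aoi_names, aoi_totals)
```

In the toy data the two control probes share the same Gene value but have distinct Probe values, matching the description of columns two and three.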