Add final integration tests #15

Merged: 10 commits, Jun 20, 2024
25 changes: 5 additions & 20 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -3,30 +3,15 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## In-development
## 0.1.0 - 2024/06/18

- Fixed nf-core tools linting failures introduced in version 2.12.1.
- Added phac-nml prefix to nf-core config

## 1.0.3 - 2024/02/23

- Pinned [email protected] plugin

## 1.0.2 - 2023/12/18

- Removed GitHub workflows that weren't needed.
- Adding additional parameters for testing purposes.

## 1.0.1 - 2023/12/06

Allowing non-gzipped FASTQ files as input. Default branch is now main.

## 1.0.0 - 2023/11/30

Initial release of phac-nml/gasnomenclature, created with the [nf-core](https://nf-co.re/) template.
Initial release of the Genomic Address Nomenclature pipeline to be used to assign cluster addresses to samples based on existing cluster designations.

### `Added`

- Input of cg/wgMLST allele calls produced from [locidex](https://github.com/phac-nml/locidex).
- Output of assigned cluster addresses for any **query** samples using [profile_dists](https://github.com/phac-nml/profile_dists) and [gas call](https://github.com/phac-nml/genomic_address_service).

### `Fixed`

### `Dependencies`
12 changes: 12 additions & 0 deletions CITATIONS.md
@@ -10,6 +10,18 @@

## Pipeline tools

- [locidex](https://github.com/phac-nml/locidex) (in-development, citation subject to change)

> Robertson, James, Wells, Matthew, Christy-Lynn, Peterson, Kyrylo Bessonov, Reimer, Aleisha, Schonfeld, Justin. LOCIDEX: Distributed allele calling engine. 2024. https://github.com/phac-nml/locidex

- [profile_dists](https://github.com/phac-nml/profile_dists) (in-development, citation subject to change)

> Robertson, James, Wells, Matthew, Schonfeld, Justin, Reimer, Aleisha. Profile Dists: Convenient package for comparing genetic similarity of samples based on allelic profiles. 2023. https://github.com/phac-nml/profile_dists

- [genomic_address_service (GAS)](https://github.com/phac-nml/genomic_address_service) (in-development, citation subject to change)

> Robertson, James, Wells, Matthew, Schonfeld, Justin, Reimer, Aleisha. Genomic Address Service: Convenient package for de novo clustering and sample assignment to existing clusters. 2023. https://github.com/phac-nml/genomic_address_service

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) Aaron Petkau
Copyright (c) Government of Canada

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
5 changes: 3 additions & 2 deletions docs/output.md
@@ -20,7 +20,7 @@ The IRIDA Next-compliant JSON output file will be named `iridanext.output.json.g

The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:

- [Input check](#input-check) - Performs a validation check on the samplesheet inputs to ensure that the sampleID precisely matches the MLST JSON key.
- [Input assure](#input-assure) - Performs a validation check on the samplesheet inputs to ensure that the sampleID precisely matches the MLST JSON key and enforces necessary changes where discrepancies are found.
- [Locidex merge](#locidex-merge) - Merges MLST profile JSON files into a single profiles file for reference and query samples.
- [Profile dists](#profile-dists) - Computes pairwise distances between genomes using MLST allele differences.
- [Cluster file](#cluster-file) - Generates the expected_clusters.txt file from reference sample addresses for use in GAS_call.
@@ -29,13 +29,14 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [IRIDA Next Output](#irida-next-output) - Generates a JSON output file that is compliant with IRIDA Next
- [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution

### Input Check
### Input Assure

<details markdown="1">
<summary>Output files</summary>

- `input/`
- `sampleID_error_report.csv`
- `sampleID.mlst.json.gz`

</details>
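
As a rough illustration of what an "input assure" step could do, the Python sketch below re-keys an MLST JSON whose top-level key differs from the samplesheet ID and returns a message suitable for an error report. The function name `assure_sample_id`, the message wording, and the handling of files with zero or several keys are assumptions made for illustration, not the pipeline's implementation.

```python
import json
from typing import Optional, Tuple


def assure_sample_id(sample_id: str, mlst: dict) -> Tuple[dict, Optional[str]]:
    """Return (possibly re-keyed MLST dict, error message or None)."""
    keys = list(mlst.keys())
    if keys == [sample_id]:
        return mlst, None  # key already matches; nothing to report
    if len(keys) == 1:
        # Single mismatched key: re-key the profile under the samplesheet ID.
        fixed = {sample_id: mlst[keys[0]]}
        return fixed, f"{sample_id}: JSON key '{keys[0]}' replaced with '{sample_id}'"
    # Zero or several keys: ambiguous, so report rather than guess (assumption).
    return mlst, f"{sample_id}: expected exactly one key, found {len(keys)}"


# Hypothetical mismatch: the JSON key is "sample_2" but the samplesheet says "sample2".
profile = json.loads('{"sample_2": {"l1": "-", "l2": "1", "l3": "1"}}')
fixed, message = assure_sample_id("sample2", profile)
print(json.dumps(fixed))
print(message)
```

In this sketch the corrected JSON would correspond to `sampleID.mlst.json.gz` and the message to a row of `sampleID_error_report.csv`.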

5 changes: 5 additions & 0 deletions tests/data/called/expected_results_count-missing.txt
@@ -0,0 +1,5 @@
id address level_1
sample1 1 1
sample2 1 1
sample3 2 2
sampleQ 1 1
5 changes: 5 additions & 0 deletions tests/data/called/expected_results_loci-missing.txt
@@ -0,0 +1,5 @@
id address level_1
sample1 1 1
sample2 1 1
sample3 2 2
sampleQ 1 1
5 changes: 5 additions & 0 deletions tests/data/called/expected_results_missing.txt
@@ -0,0 +1,5 @@
id address level_1
sample1 1 1
sample2 1 1
sample3 2 2
sampleQ 1 1
5 changes: 5 additions & 0 deletions tests/data/called/expected_results_thresh_1.txt
@@ -0,0 +1,5 @@
id address level_1
sample1 1 1
sample2 1 1
sample3 1 1
sampleQ 1 1
5 changes: 5 additions & 0 deletions tests/data/called/expected_results_thresh_1_0.txt
@@ -0,0 +1,5 @@
id address level_1 level_2
sample1 1.1 1 1
sample2 1.1 1 1
sample3 1.1 1 1
sampleQ 1.2 1 2
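
One way to read the two threshold fixtures above: at each address level the query inherits the code of its nearest reference if that reference lies within the level's threshold, and otherwise opens a new code. The Python sketch below implements that simple rule. It is not the genomic_address_service algorithm; the function name `assign_address`, the thresholds `[1]` and `[1, 0]` (inferred from the fixture file names), and the tie-breaking are all assumptions. The distances are copied from the `expected_dists_thresh_1*.txt` fixtures in this PR.

```python
def assign_address(dists, ref_addresses, thresholds, delimiter="."):
    """Assign a query address level by level from its nearest reference."""
    refs = {rid: addr.split(delimiter) for rid, addr in ref_addresses.items()}
    address = []
    candidates = set(refs)  # references that still share the query's prefix
    for level, threshold in enumerate(thresholds):
        within = [r for r in candidates if dists[r] <= threshold]
        if within:
            # Closest reference wins; ties are broken by sample ID (assumption).
            nearest = min(within, key=lambda r: (dists[r], r))
            code = refs[nearest][level]
        else:
            # Nothing close enough: open the next unused code under this prefix.
            used = [int(refs[r][level]) for r in candidates]
            code = str(max(used, default=0) + 1)
        address.append(code)
        candidates = {r for r in candidates if refs[r][level] == code}
    return delimiter.join(address)


# Query-to-reference distances from the threshold fixtures.
dists = {"sample1": 1, "sample2": 1, "sample3": 2}

one_level = assign_address(dists, {"sample1": "1", "sample2": "1", "sample3": "1"}, [1])
two_level = assign_address(dists, {"sample1": "1.1", "sample2": "1.1", "sample3": "1.1"}, [1, 0])
print(one_level, two_level)
```

Under these assumptions the rule reproduces the `sampleQ` addresses in both fixtures (`1` and `1.2`), which is the only sense in which it is consistent with the expected results.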
4 changes: 4 additions & 0 deletions tests/data/clusters/expected_clusters_missing.txt
@@ -0,0 +1,4 @@
id address level_1
sample1 1 1
sample2 1 1
sample3 2 2
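
The cluster file lists each reference sample with its address and one column per address level. A minimal Python sketch of building such a table follows; the function name `clusters_table`, the tab separator, and the `.` level delimiter are assumptions (the delimiter is suggested by addresses such as `1.1` in the threshold fixtures), and the pipeline's own script may differ.

```python
def clusters_table(ref_addresses, delimiter="."):
    """Render reference addresses as an id/address/level_N table."""
    n_levels = max(len(addr.split(delimiter)) for addr in ref_addresses.values())
    header = ["id", "address"] + [f"level_{i}" for i in range(1, n_levels + 1)]
    lines = ["\t".join(header)]
    for sample_id, address in ref_addresses.items():
        # One column per level, taken directly from the address string.
        lines.append("\t".join([sample_id, address, *address.split(delimiter)]))
    return "\n".join(lines)


table = clusters_table({"sample1": "1", "sample2": "1", "sample3": "2"})
print(table)
```

The sketch assumes every reference address has the same number of levels; mixed depths would produce ragged rows and would need validation first.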
1 change: 1 addition & 0 deletions tests/data/columns/keep-zero-loci-empty-file.txt
@@ -0,0 +1 @@

5 changes: 5 additions & 0 deletions tests/data/distances/expected_dists_count-missing.txt
@@ -0,0 +1,5 @@
query_id ref_id dist
sampleQ sampleQ 0
sampleQ sample1 1
sampleQ sample2 2
sampleQ sample3 3
5 changes: 5 additions & 0 deletions tests/data/distances/expected_dists_loci-missing.txt
@@ -0,0 +1,5 @@
query_id ref_id dist
sampleQ sampleQ 0
sampleQ sample1 1
sampleQ sample2 2
sampleQ sample3 3
5 changes: 5 additions & 0 deletions tests/data/distances/expected_dists_missing.txt
@@ -0,0 +1,5 @@
query_id ref_id dist
sampleQ sampleQ 0
sampleQ sample1 1
sampleQ sample2 1
sampleQ sample3 2
5 changes: 5 additions & 0 deletions tests/data/distances/expected_dists_thresh_1.txt
@@ -0,0 +1,5 @@
query_id ref_id dist
sampleQ sampleQ 0
sampleQ sample1 1
sampleQ sample2 1
sampleQ sample3 2
5 changes: 5 additions & 0 deletions tests/data/distances/expected_dists_thresh_1_0.txt
@@ -0,0 +1,5 @@
query_id ref_id dist
sampleQ sampleQ 0
sampleQ sample1 1
sampleQ sample2 1
sampleQ sample3 2
13 changes: 13 additions & 0 deletions tests/data/irida/count-missing_iridanext.output.json
@@ -0,0 +1,13 @@
{
"files": {
"global": [],
"samples": {}
},
"metadata": {
"samples": {
"sampleQ": {
"address": "1"
}
}
}
}
13 changes: 13 additions & 0 deletions tests/data/irida/loci-missing_iridanext.output.json
@@ -0,0 +1,13 @@
{
"files": {
"global": [],
"samples": {}
},
"metadata": {
"samples": {
"sampleQ": {
"address": "1"
}
}
}
}
13 changes: 13 additions & 0 deletions tests/data/irida/missing_iridanext.output.json
@@ -0,0 +1,13 @@
{
"files": {
"global": [],
"samples": {}
},
"metadata": {
"samples": {
"sampleQ": {
"address": "1"
}
}
}
}
13 changes: 13 additions & 0 deletions tests/data/irida/thresh1.0_iridanext.output.json
@@ -0,0 +1,13 @@
{
"files": {
"global": [],
"samples": {}
},
"metadata": {
"samples": {
"sampleQ": {
"address": "1.2"
}
}
}
}
13 changes: 13 additions & 0 deletions tests/data/irida/thresh1_iridanext.output.json
@@ -0,0 +1,13 @@
{
"files": {
"global": [],
"samples": {}
},
"metadata": {
"samples": {
"sampleQ": {
"address": "1"
}
}
}
}
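
The IRIDA Next fixtures above share one shape: empty `files` entries and a `metadata.samples` object that holds only the query sample's address. The Python sketch below builds that structure from rows of a called-results table. The function name `iridanext_metadata` and the idea of passing the query IDs explicitly are assumptions for illustration; the example rows are taken from `expected_results_thresh_1_0.txt`.

```python
import json


def iridanext_metadata(called_rows, query_ids):
    """Keep only query samples and wrap them in the IRIDA Next layout."""
    samples = {
        row["id"]: {"address": row["address"]}
        for row in called_rows
        if row["id"] in query_ids
    }
    return {"files": {"global": [], "samples": {}}, "metadata": {"samples": samples}}


called = [
    {"id": "sample1", "address": "1.1"},
    {"id": "sample2", "address": "1.1"},
    {"id": "sample3", "address": "1.1"},
    {"id": "sampleQ", "address": "1.2"},
]
output = iridanext_metadata(called, {"sampleQ"})
print(json.dumps(output, indent=4))
```

Reference samples are dropped because their addresses are inputs rather than results, which matches the fixtures listing `sampleQ` alone.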
5 changes: 5 additions & 0 deletions tests/data/profiles/expected-profile_missing1.tsv
@@ -0,0 +1,5 @@
sample_id l1 l2 l3
sampleQ 1 2 1
sample1 1 1 1
sample2 - 1 1
sample3 - 1 2
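
The profile above, read against the `expected_dists_missing.txt` and `expected_dists_count-missing.txt` fixtures in this PR, shows how missing alleles (`-`) affect the distances. The Python sketch below computes a Hamming distance that either skips loci with a missing call or counts them as differences. It is an illustration, not the profile_dists implementation; the function name `hamming` and the `count_missing` argument are hypothetical, and treating two missing calls as identical is an assumption that the fixture does not exercise.

```python
MISSING = "-"  # missing-allele symbol used in the fixture

# Rows of expected-profile_missing1.tsv (loci l1, l2, l3).
profiles = {
    "sampleQ": ["1", "2", "1"],
    "sample1": ["1", "1", "1"],
    "sample2": ["-", "1", "1"],
    "sample3": ["-", "1", "2"],
}


def hamming(query, ref, count_missing=False):
    """Count loci at which two allele profiles differ."""
    dist = 0
    for q, r in zip(query, ref):
        if q == r:
            continue  # identical calls never add distance (assumption for "-" vs "-")
        if MISSING in (q, r):
            # One call is missing: skip it, or count it when count_missing is set.
            dist += 1 if count_missing else 0
        else:
            dist += 1
    return dist


query = profiles["sampleQ"]
ignore = [hamming(query, profiles[s]) for s in ("sampleQ", "sample1", "sample2", "sample3")]
count = [hamming(query, profiles[s], count_missing=True) for s in ("sampleQ", "sample1", "sample2", "sample3")]
print(ignore)  # [0, 1, 1, 2], as in expected_dists_missing.txt
print(count)   # [0, 1, 2, 3], as in expected_dists_count-missing.txt
```

The only change between the two runs is locus `l1`, which is missing in `sample2` and `sample3`; counting it adds one to each of those distances.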
2 changes: 2 additions & 0 deletions tests/data/profiles/expected-profile_missing2.tsv
@@ -0,0 +1,2 @@
sample_id l1 l2 l3
sampleQ 1 2 1
7 changes: 7 additions & 0 deletions tests/data/reports/sample2_missing.mlst.json
@@ -0,0 +1,7 @@
{
"sample2": {
"l1": "-",
"l2": "1",
"l3": "1"
}
}
7 changes: 7 additions & 0 deletions tests/data/reports/sample3_missing.mlst.json
@@ -0,0 +1,7 @@
{
"sample3": {
"l1": "-",
"l2": "1",
"l3": "2"
}
}
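
Each MLST report maps one sample ID to its locus calls, and the merge step flattens several reports into one profile table. A minimal Python sketch of that flattening follows. It is not locidex's merge code; the function name `merge_profiles`, the sorted locus order, the tab separator, and filling absent loci with `-` are assumptions.

```python
import json


def merge_profiles(mlst_reports):
    """Flatten per-sample MLST dicts into a sample_id-by-locus table."""
    merged = {}
    for report in mlst_reports:
        merged.update(report)  # each report maps one sample ID to its loci
    loci = sorted({locus for calls in merged.values() for locus in calls})
    lines = ["\t".join(["sample_id", *loci])]
    for sample_id, calls in merged.items():
        # A locus absent from a report is written as missing (assumption).
        lines.append("\t".join([sample_id, *(calls.get(locus, "-") for locus in loci)]))
    return "\n".join(lines)


sample2 = json.loads('{"sample2": {"l1": "-", "l2": "1", "l3": "1"}}')
sample3 = json.loads('{"sample3": {"l1": "-", "l2": "1", "l3": "2"}}')
profile_table = merge_profiles([sample2, sample3])
print(profile_table)
```

For the two reports above, the rows produced match the `sample2` and `sample3` rows of `expected-profile_missing1.tsv`.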
5 changes: 5 additions & 0 deletions tests/data/samplesheets/samplesheet-hash_missing.csv
@@ -0,0 +1,5 @@
sample,mlst_alleles,address
sampleQ,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sampleQ.mlst.json,
sample1,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sample1.mlst.json,1
sample2,https://raw.githubusercontent.com/phac-nml/gasnomenclature/add_tests/tests/data/reports/sample2_missing.mlst.json,1
sample3,https://raw.githubusercontent.com/phac-nml/gasnomenclature/add_tests/tests/data/reports/sample3_missing.mlst.json,2
5 changes: 5 additions & 0 deletions tests/data/samplesheets/samplesheet_thresh_1.csv
@@ -0,0 +1,5 @@
sample,mlst_alleles,address
sampleQ,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sampleQ.mlst.json,
sample1,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sample1.mlst.json,1
sample2,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sample2.mlst.json,1
sample3,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sample3.mlst.json,1
5 changes: 5 additions & 0 deletions tests/data/samplesheets/samplesheet_thresh_1_0.csv
@@ -0,0 +1,5 @@
sample,mlst_alleles,address
sampleQ,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sampleQ.mlst.json,
sample1,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sample1.mlst.json,1.1
sample2,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sample2.mlst.json,1.1
sample3,https://raw.githubusercontent.com/phac-nml/gasnomenclature/dev/tests/data/reports/sample3.mlst.json,1.1
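
In each samplesheet fixture, `sampleQ` is the only row with an empty `address` column and the only sample reported in the IRIDA Next fixtures, which suggests that an empty address marks a query sample. The Python sketch below splits a samplesheet on that basis. The rule itself is an inference from the fixtures, the function name `split_samples` is hypothetical, and the file paths are shortened stand-ins for the full URLs.

```python
import csv
import io

# Paths shortened for readability; the fixture uses full raw.githubusercontent.com URLs.
SAMPLESHEET = """sample,mlst_alleles,address
sampleQ,reports/sampleQ.mlst.json,
sample1,reports/sample1.mlst.json,1.1
sample2,reports/sample2.mlst.json,1.1
sample3,reports/sample3.mlst.json,1.1
"""


def split_samples(text):
    """Return (query IDs, {reference ID: address}) from samplesheet text."""
    queries, references = [], {}
    for row in csv.DictReader(io.StringIO(text)):
        address = (row["address"] or "").strip()
        if address:
            references[row["sample"]] = address  # known address: reference sample
        else:
            queries.append(row["sample"])  # empty address: treated as a query
    return queries, references


queries, references = split_samples(SAMPLESHEET)
print(queries)
print(references)
```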