Release 1.0.0 #8

Merged 54 commits on Aug 27, 2024
Commits
db6492d
Release 1.0.0
charles-plessy Jul 19, 2024
eb13134
Remove the PSEUDO seed from the schema.
charles-plessy Jul 23, 2024
b58818a
Correct duplicated text
charles-plessy Jul 24, 2024
94545b9
Reduce redundancy with `doc/output.md` as suggested in PR #9.
charles-plessy Jul 24, 2024
18a2ba1
Also cite the paper describing the original implementation.
charles-plessy Jul 24, 2024
c9d2be8
Correct the list of accepted file suffixes
charles-plessy Jul 24, 2024
adda591
Remove mention of unported parameters.
charles-plessy Jul 24, 2024
27b0c0a
Improve wording of docs/usage.md
charles-plessy Jul 24, 2024
a58961b
Mention --input explicitely
charles-plessy Jul 24, 2024
2ed941b
Fix typo
charles-plessy Jul 24, 2024
157e675
Fix markdown formatting.
charles-plessy Jul 24, 2024
5463b6b
modified the pipeline logo, now png formatted
U13bs1125 Jul 24, 2024
6df876d
modified the pipeline logo, now png formatted
U13bs1125 Jul 24, 2024
8eefc01
added the new svg formatted pipeline logo/map
U13bs1125 Jul 24, 2024
cd92f37
Merge pull request #10 from oist/devop
charles-plessy Jul 24, 2024
3476a81
Merge branch 'dev' of github.com:nf-core/pairgenomealign into dev
charles-plessy Jul 24, 2024
2940837
Indent workflows/pairgenomealign.nf
charles-plessy Jul 24, 2024
2baf4b4
Remove un-needed example
charles-plessy Jul 24, 2024
054e436
Remove dangling filename.
charles-plessy Jul 24, 2024
f279c5c
Show the full sample sheet as an example.
charles-plessy Jul 24, 2024
11d1457
Multi-query example
charles-plessy Jul 24, 2024
c5ec40a
Slim nextflow.config
charles-plessy Jul 24, 2024
5d36b75
Move LAST output to `alignment/`
charles-plessy Jul 24, 2024
7fb4f49
Merge branch 'dev' of github.com:nf-core/pairgenomealign into dev
charles-plessy Jul 24, 2024
850605e
Put `seqtk cutN` output in `cutn/` and document it.
charles-plessy Jul 24, 2024
cabd248
Remove mention of FastQC
charles-plessy Jul 24, 2024
d3e2e86
[automated] Fix code linting
nf-core-bot Jul 24, 2024
0bb9ad7
Remove duplicated documentation.
charles-plessy Jul 24, 2024
d2406ac
Merge branch 'dev' of github.com:nf-core/pairgenomealign into dev
charles-plessy Jul 24, 2024
a93536f
Fix typo
charles-plessy Jul 24, 2024
c029ae4
Add a human–monkey alignment as example.
charles-plessy Jul 25, 2024
3e349a9
Update workflows/pairgenomealign.nf
charles-plessy Jul 25, 2024
340d9c6
Rename the custom module and document its output.
charles-plessy Jul 25, 2024
6949ad5
Merge branch 'dev' of github.com:nf-core/pairgenomealign into dev
charles-plessy Jul 25, 2024
ab93bb4
Revert "Update workflows/pairgenomealign.nf"
charles-plessy Jul 25, 2024
420a929
Polish parameter description.
charles-plessy Jul 25, 2024
0b417aa
Move tube map to docs/ hoping it solves display problem.
charles-plessy Jul 25, 2024
e9fb4bd
Add an example dot-plot
charles-plessy Jul 25, 2024
eca7b83
Remove FASTQC examples.
charles-plessy Jul 25, 2024
823bcdc
Add new multiqc examples
charles-plessy Jul 25, 2024
591ee73
Merge branch 'dev' of github.com:nf-core/pairgenomealign into dev
charles-plessy Jul 25, 2024
057a097
Display example MultiQC plots
charles-plessy Jul 25, 2024
123d9dc
prettier
charles-plessy Jul 25, 2024
44fff18
modified the logomap again as advised by nfcore team
U13bs1125 Jul 25, 2024
cc8234a
Merge pull request #13 from oist/devlogo
charles-plessy Jul 26, 2024
dd56788
Add a codename
charles-plessy Jul 26, 2024
d5df279
Fix filename
charles-plessy Jul 26, 2024
3737d98
Merge branch 'dev' of github.com:nf-core/pairgenomealign into dev
charles-plessy Jul 26, 2024
872991d
Use a Markdown link instead of HTML.
charles-plessy Jul 26, 2024
57224ea
pre-commit fixes
charles-plessy Jul 26, 2024
2d4f08c
Rename and document some table columns
charles-plessy Aug 7, 2024
830d557
Thank Martin and teammates
charles-plessy Aug 7, 2024
a298b19
Remove mention of lastdb -P because it does not impact the alignment …
charles-plessy Aug 8, 2024
c493be3
Update release date in CHANGELOG.md
charles-plessy Aug 26, 2024
10 changes: 1 addition & 9 deletions CHANGELOG.md
Expand Up @@ -3,14 +3,6 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v1.0dev - [date]
## v1.0.0 "Sweet potato" - [August 27th, 2024]

Initial release of nf-core/pairgenomealign, created with the [nf-core](https://nf-co.re/) template.

### `Added`

### `Fixed`

### `Dependencies`

### `Deprecated`
4 changes: 4 additions & 0 deletions CITATIONS.md
Expand Up @@ -8,6 +8,10 @@

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline design

> Charles Plessy, Michael J. Mansfield, Aleksandra Bliznina, Aki Masunaga, Charlotte West, Yongkai Tan, Andrew W. Liu, Jan Grašič, María Sara del Río Pisula, Gaspar Sánchez-Serna, Marc Fabrega-Torrus, Alfonso Ferrández-Roldán, Vittoria Roncalli, Pavla Navratilova, Eric M. Thompson, Takeshi Onuma, Hiroki Nishida, Cristian Cañestro, Nicholas M. Luscombe. Extreme genome scrambling in marine planktonic Oikopleura dioica cryptic species. Genome Res. 2024. 34: 426-440; doi: [10.1101/gr.278295.123](https://doi.org/10.1101/gr.278295.123). PubMed ID: [38621828](https://pubmed.ncbi.nlm.nih.gov/38621828/)

## Pipeline tools

- [LAST](https://gitlab.com/mcfrith/last/)
Expand Down
15 changes: 7 additions & 8 deletions README.md
Expand Up @@ -21,14 +21,9 @@

**nf-core/pairgenomealign** is a bioinformatics pipeline that aligns one or more _query_ genomes to a _target_ genome, and plots pairwise representations.

<img src= "assets/tube_map.svg">
![Tubemap workflow summary](docs/images/pairgenomealign-tubemap.png "Tubemap workflow summary")

The pipeline can generate four kinds of outputs, depending on whether sequences of one genome can match the other genome multiple times or not.

- _**many-to-many**_ (M2M): All computed alignments between the _target_ and a _query_ genome.
- _**many-to-one**_ (M2O): Alignments where regions of the _target_ genome are matched at most once by a _query_ genome.
- _**one-to-many**_ (O2M): Alignments where regions of a _query_ genome are matched at most once by the _target_ genome.
- _**one-to-one**_ (O2O): Alignments where regions of the _target_ and _query_ genomes are used at most once.
The pipeline can generate four kinds of outputs, called _many-to-many_, _many-to-one_, _one-to-many_ and _one-to-one_, depending on whether sequences of one genome are allowed to match the other genome multiple times or not.

These alignments are output in [MAF](https://genome.ucsc.edu/FAQ/FAQformat.html#format5) format, and optional line plot representations are output in PNG format.
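
To make the MAF output concrete, here is a minimal Python sketch of a reader for the `a` (alignment block) and `s` (sequence) lines described in the UCSC MAF specification linked above. It is not part of the pipeline: the function name `parse_maf`, the `MafRow` type, and the sequence names in the example are hypothetical, and other MAF line types are simply skipped.

```python
from __future__ import annotations

from typing import Iterable, Iterator, NamedTuple


class MafRow(NamedTuple):
    """One 's' (sequence) line of a MAF alignment block."""
    src: str
    start: int
    size: int
    strand: str
    src_size: int
    text: str


def parse_maf(lines: Iterable[str]) -> Iterator[list[MafRow]]:
    """Yield each alignment block as the list of its sequence rows."""
    block: list[MafRow] = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank lines and header/comment lines
        if fields[0] == "a":
            # An 'a' line opens a new alignment block.
            if block:
                yield block
            block = []
        elif fields[0] == "s":
            _, src, start, size, strand, src_size, text = fields
            block.append(
                MafRow(src, int(start), int(size), strand, int(src_size), text)
            )
        # Any other line type is ignored in this sketch.
    if block:
        yield block


# Hypothetical two-block alignment, for illustration only.
EXAMPLE = """\
##maf version=1
a score=27
s target.chr1 10 8 + 1000 ACGTACGT
s query.contig7 52 7 - 500 ACG-ACGT

a score=12
s target.chr2 0 4 + 2000 TTGA
s query.contig9 3 4 + 300 TTGA
"""

blocks = list(parse_maf(EXAMPLE.splitlines()))
total_columns = sum(len(block[0].text) for block in blocks)
print(len(blocks), "blocks,", total_columns, "aligned columns")
```

In each `s` line the `size` field counts only non-gap characters, which is why the second row of the first block has size 7 for an 8-column alignment.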

Expand Down Expand Up @@ -77,7 +72,11 @@ For more details about the output files and reports, please refer to the

We thank the following people for their extensive assistance in the development of this pipeline:

- [Mahdi Mohammed](https://github.com/U13bs1125): ported the original pipeline to _nf-core_ template 2.14.x.
- [Mahdi Mohammed](https://github.com/U13bs1125) ported the original pipeline to _nf-core_ template 2.14.x.
- [Martin Frith](https://github.com/mcfrith/), the author of LAST, gave us extensive feedback and advice.
- [Michael Mansfield](https://github.com/mjmansfi) tested the pipeline and provided critical comments.
- [Aleksandra Bliznina](https://github.com/aleksandrabliznina) contributed to the creation of the initial `last/*` modules.
- [Jiashun Miao](https://github.com/miaojiashun) and [Huyen Pham](https://github.com/ngochuyenpham) tested the pipeline on vertebrate genomes.

## Contributions and Support

Expand Down
33 changes: 31 additions & 2 deletions assets/multiqc_config.yml
@@ -1,7 +1,7 @@
report_comment: >
  This report has been generated by the <a href="https://github.com/nf-core/pairgenomealign/tree/dev" target="_blank">nf-core/pairgenomealign</a>
  This report has been generated by the <a href="https://github.com/nf-core/pairgenomealign/releases/tag/1.0.0" target="_blank">nf-core/pairgenomealign</a>
  analysis pipeline. For information about how to interpret these results, please see the
  <a href="https://nf-co.re/pairgenomealign/dev/docs/output" target="_blank">documentation</a>.
  <a href="https://nf-co.re/pairgenomealign/1.0.0/docs/output" target="_blank">documentation</a>.
report_section_order:
  "nf-core-pairgenomealign-methods-description":
    order: -1000
Expand All @@ -19,10 +19,39 @@ custom_data:
    file_format: "tsv"
    section_name: "Training parameter statistics"
    plot_type: "table"
    headers:
      id:
        title: "ID"
        description: "target___query"
      substitution_percent_identity:
        title: "Substitution Percent Identity"
      "last -t":
        title: "Temperature"
        description: "Parameter for converting between scores and probability ratios. This affects the column ambiguity estimates. A score is converted to a probability ratio by this formula: exp(score / TEMPERATURE). The default value is 1/lambda, where lambda is the scale factor of the scoring matrix, which is calculated by the method of Yu and Altschul (YK Yu et al. 2003, PNAS 100(26):15688-93)."
      "last -a":
        title: "Gap existence"
        description: "Gap existence cost (lastal -a)"
      "last -b":
        title: "Gap extension"
        description: "Gap extension cost (lastal -b)"
      "last -A":
        title: "Insertion existence"
        description: "Insertion existence cost (lastal -A)"
      "last -B":
        title: "Insertion extension"
        description: "Insertion extension cost (lastal -B)"
  last_o2o:
    file_format: "tsv"
    section_name: "Alignment statistics"
    plot_type: "table"
    headers:
      id:
        title: "ID"
        description: "target__query"
      TotalAlignmentLength:
        title: "Total alignment length"
      PercentSimilarity:
        title: "Percent similarity"

sp:
  last_o2o:
Expand Down
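
The "Temperature" description above states that a score is converted to a probability ratio by `exp(score / TEMPERATURE)`, with a default temperature of `1/lambda`. The following Python sketch only works through that arithmetic; the function names and the value `lambda = 0.5` are made up for illustration and are not taken from LAST or from this pipeline.

```python
import math


def default_temperature(lam: float) -> float:
    """Default temperature as described above: 1 / lambda."""
    return 1.0 / lam


def score_to_probability_ratio(score: float, temperature: float) -> float:
    """Apply the formula from the description: exp(score / TEMPERATURE)."""
    return math.exp(score / temperature)


# Hypothetical scale factor, chosen only to keep the arithmetic simple.
lam = 0.5
temperature = default_temperature(lam)  # 1 / 0.5 = 2.0

for score in (0, 4, -4):
    ratio = score_to_probability_ratio(score, temperature)
    print(f"score={score:>3}  ratio={ratio:.4f}")
```

A score of 0 always gives a ratio of 1, positive scores give ratios above 1, and a higher temperature flattens the ratios toward 1.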
2 changes: 1 addition & 1 deletion assets/schema_input.json
Expand Up @@ -18,7 +18,7 @@
"format": "file-path",
"exists": true,
"pattern": "^\\S+\\.f(ast|n)?a(\\.gz)?$",
"errorMessage": "Fasta file for genomes must be provided, cannot contain spaces and must have extension '.fa', '.fa.gz', '.fna', '.fna.gz', '.fasta' or '.fasta.gz'"
"errorMessage": "Fasta file for genomes must be provided, cannot contain spaces and must have extension `.fa`, `.fa.gz`, `.fna`, `.fna.gz`, `.fasta` or `.fasta.gz`"
}
},
"required": ["sample", "fasta"]
Expand Down
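
As a quick check of the `pattern` shown above, the sketch below applies the same regular expression in Python, with the JSON escaping (doubled backslashes) removed. The file names are hypothetical, and Python's `re` module is assumed to behave like the pipeline's schema validator for this simple anchored pattern, which has not been verified here.

```python
import re

# Pattern copied from assets/schema_input.json, JSON escaping removed.
FASTA_PATTERN = re.compile(r"^\S+\.f(ast|n)?a(\.gz)?$")


def is_accepted_fasta(path: str) -> bool:
    """Return True if the path matches the sample sheet's FASTA pattern."""
    return FASTA_PATTERN.search(path) is not None


# Hypothetical file names, for illustration only.
candidates = [
    "genome.fa",
    "genome.fna.gz",
    "genome.fasta",
    "my genome.fa",   # contains a space
    "genome.fastq",   # wrong suffix
    "genome.fa.bz2",  # unsupported compression
]

for name in candidates:
    print(f"{name!r}: {is_accepted_fasta(name)}")
```

The first three names match the six suffixes listed in the error message (`.fa`, `.fna`, `.fasta`, each optionally followed by `.gz`); the last three are rejected.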