Running Metalign

Preliminaries

The example below assumes that Metalign's repo has been installed in "/my/directory/Metalign/" (according to the wiki page), that your reads file "my_reads.fq" is in that same directory, and that you run the commands from "/my/directory/Metalign/". The output file will then be "/my/directory/Metalign/metalign_results.tsv".

You do not have to run Metalign from within its install directory; if you run it from elsewhere, adjust the file paths accordingly.

The Easy Way

Metalign comes with a wrapper script called metalign.py that runs both stages of the method. The quick usage is:

python3 metalign.py my_reads.fq data/ --output metalign_results.tsv

For more sensitive or more precise results, try adding the "--sensitive" or "--precise" flag to this command.
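As a rough sketch, the wrapper command can also be assembled from Python, which makes it easier to switch modes or to point at an install directory other than the current one. The helper name build_metalign_command, its parameters, and the MODE_FLAGS table are illustrative assumptions and not part of Metalign; only metalign.py, the data/ directory, --output, --sensitive and --precise come from this page. The sketch assumes data/ sits inside the install directory, as in the example above, and it only prints the command rather than running it.

```python
import shlex
from pathlib import Path

# Flags named on this page. Whether they can be combined is not stated here,
# so this sketch accepts at most one of them (an assumption).
MODE_FLAGS = {"sensitive": "--sensitive", "precise": "--precise"}


def build_metalign_command(reads, output, install_dir=".", mode=None):
    """Return the argv list for the metalign.py wrapper (hypothetical helper)."""
    install = Path(install_dir)
    cmd = [
        "python3",
        str(install / "metalign.py"),
        str(reads),
        # Assumes the database directory is <install_dir>/data/.
        str(install / "data") + "/",
        "--output",
        str(output),
    ]
    if mode is not None:
        if mode not in MODE_FLAGS:
            raise ValueError(f"mode must be one of {sorted(MODE_FLAGS)}, got {mode!r}")
        cmd.append(MODE_FLAGS[mode])
    return cmd


# Print the command instead of running it, so it can be inspected first.
print(shlex.join(build_metalign_command("my_reads.fq", "metalign_results.tsv",
                                        mode="sensitive")))
```

When running from another directory, pass install_dir="/my/directory/Metalign/" and give the reads and output paths relative to wherever you are.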

Running the two steps separately

You can also run the two steps of Metalign (pre-filtering the database, then alignment+profiling) separately. This is useful if you want to keep the pre-filtered database for other uses, or if you already have a SAM file to pass into the alignment+profiling stage (which will then perform only the profiling step). It also lets you tinker with the parameters more.

The simplest case looks something like this:

python3 select_db.py my_reads.fq data/ --db cmash_db.fna
python3 map_and_profile.py my_reads.fq data/ --db cmash_db.fna --output metalign_results.tsv
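The two commands share the same reads file, data directory and database file, so one way to keep them consistent is to generate both from a single set of arguments. The sketch below does that in Python; the function names two_step_commands and run_two_steps and the dry_run parameter are hypothetical, and only the script names and the --db and --output options are taken from the commands above. By default it only prints the commands.

```python
import shlex
import subprocess


def two_step_commands(reads, data_dir, db, output):
    """Return the argv lists for the pre-filtering and alignment+profiling steps."""
    select = ["python3", "select_db.py", reads, data_dir, "--db", db]
    profile = ["python3", "map_and_profile.py", reads, data_dir,
               "--db", db, "--output", output]
    return [select, profile]


def run_two_steps(reads, data_dir, db, output, dry_run=True):
    """Run both steps in order; with dry_run=True only print the commands."""
    commands = two_step_commands(reads, data_dir, db, output)
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code, so
            # the second step is skipped if the first one fails.
            subprocess.run(cmd, check=True)
    return commands


run_two_steps("my_reads.fq", "data/", "cmash_db.fna", "metalign_results.tsv")
```

To actually execute the steps, call run_two_steps with dry_run=False from the Metalign install directory, or adjust the script and file paths for your setup.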

Other options include modulating the filtering parameters for the pre-filtering and profiling stages, normalizing abundances by genome length, and more. For details, see the "Command line option descriptions" wiki page.
