Arcadia-Science/2024-worm-tracking

Analyzing C. elegans motility phenotypes with Tierpsy tracker

Purpose

This repository accompanies the pub, "An experimental and computational workflow to characterize nematode motility behavior". This repository implements an automated approach for analyzing worm motility phenotypes. This pipeline is designed to assess worm motility phenotypes from images captured on an upright wide field microscope. Our images have the following profile:

  • 30 second acquisitions
  • 24.5 frames per second
  • Field of view:
    • 1976 x 1976 pixels
    • 1.625 microns per pixel
  • Worms plated on agar without a bacterial lawn (OP50).
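As a quick sanity check on the acquisition profile above, the expected frame count and the physical size of the field of view follow directly from the listed values. The sketch below is only illustrative arithmetic; the function names are hypothetical and are not part of the pipeline.

```python
# Acquisition parameters copied from the profile listed above.
DURATION_S = 30
FRAMES_PER_SECOND = 24.5
FIELD_OF_VIEW_PX = 1976
MICRONS_PER_PIXEL = 1.625


def expected_frame_count(duration_s, fps):
    """Number of frames expected in one acquisition (hypothetical helper)."""
    return round(duration_s * fps)


def field_of_view_mm(pixels, microns_per_pixel):
    """Side length of the square field of view in millimeters (hypothetical helper)."""
    return pixels * microns_per_pixel / 1000.0


frames = expected_frame_count(DURATION_S, FRAMES_PER_SECOND)
fov_mm = field_of_view_mm(FIELD_OF_VIEW_PX, MICRONS_PER_PIXEL)
print(f"{frames} frames per acquisition, {fov_mm:.3f} mm per side")
```

Videos that deviate substantially from these values may need different tracker settings than the ones provided in this repository.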

The image analysis pipeline produces statistical estimates of motility phenotype differences between two strains (typically wild type and mutant).

Installation and Setup

This repository primarily uses conda to manage software environments and installations. You can find operating system-specific instructions for installing miniconda here. After installing conda and mamba, run the following commands to create and activate the pipeline run environment.

mamba env create -n wormmotility --file envs/dev.yml
conda activate wormmotility

In addition, the tool Tierpsy tracker recommends/requires installation via Docker. Because of the way the Docker container is configured, we had trouble running it with Singularity inside of snakemake (see this issue). We came up with a workaround where the Docker container runs in the background and then commands are executed in the Docker container by Snakemake. This requires starting the Docker container before running the pipeline. This is a sub-par solution, but we decided this was the best approach given time and bandwidth limitations.

To enable Tierpsy tracker execution within the Docker container and via snakemake, start by installing Docker Desktop according to your operating system. (Linux Ubuntu instructions are available here). Also note that even when installed via Docker, Tierpsy tracker will not work on Mac computers with ARM-based processors (Apple silicon, M* chips). Once Docker is installed, you may need to adjust user permissions to allow Docker to run without invoking sudo privileges for every command. We used the following commands to configure Docker for this.

# Check whether you're in the docker group. The output should include "docker" among the listed group names
groups

# If the above doesn't say docker, add user to docker group
sudo usermod -aG docker $USER

# Export docker host to env variable
export DOCKER_HOST=unix:///var/run/docker.sock

Once configured, start Docker. This command still requires sudo privileges.

sudo systemctl start docker

Once Docker is started, pull the Tierpsy tracker Docker container. We provide a modified Docker container that does not open the graphical user interface on startup.

docker pull arcadiascience/tierpsy-tracker-no-gui:fc691a090d8a

Then, start a non-interactive session where the Docker container will run in the background.

bash scripts/tierpsy_linux_no_gui_background.sh

Check that the container is running using the command below. There should be a row in the returned table with the name my_tierpsy_container.

docker ps

We're now ready to execute the pipeline using Snakemake. Snakemake itself is installed in the main development conda environment as specified in the dev.yml file.

To start the pipeline, run:

snakemake -j 1 \
    --software-deployment-method conda \
    --rerun-incomplete \
    --config input_dirpath=/path/to/raw/dataset/dir input_prefix=/prefix/to/remove/from/input_dirpath output_dirpath=outputs

Where:

  • -j: designates the number of cores used by Snakemake to parallelize rules.
  • --software-deployment-method: tells Snakemake to launch each rule in a conda environment where specified.
  • --rerun-incomplete: tells Snakemake to check that all files are completely written and to re-run the jobs for those that are not.
  • --config: feeds pipeline-specific configuration parameters to snakemake.
    • input_dirpath: The directory where input files are located. If files are located in subdirectories, this is the root filepath for all directories to be analyzed by the snakemake run.
    • input_prefix: Portion of input_dirpath to omit from output file names. A file's absolute path is used as the identifier by this pipeline. When an input_prefix is supplied, the prefix will be removed from the output filepath (for example, instead of having /home/username/ in every output file path, this prefix would be removed). This removes non-identifying information from the output filepaths so that the directory structure doesn't become unnecessarily deep.
    • output_dirpath: Directory path to write output files.
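The effect of input_prefix can be sketched in a few lines of Python. This is a minimal illustration of the behavior described above, not the pipeline's actual code: the function name strip_input_prefix and the example file path are hypothetical.

```python
from pathlib import PurePosixPath


def strip_input_prefix(filepath, input_prefix=""):
    """Return filepath with input_prefix removed from its start.

    Hypothetical helper: if no prefix is given, or the path does not
    start with the prefix, the path is returned unchanged.
    """
    path = PurePosixPath(filepath)
    if input_prefix:
        try:
            return str(path.relative_to(input_prefix))
        except ValueError:
            # The path is not located under the prefix; keep it as-is.
            pass
    return str(path)


# Example with a made-up file path under /home/username/.
identifier = strip_input_prefix(
    "/home/username/data/strainA/plate1.nd2", "/home/username/"
)
print(identifier)
```

With the prefix removed, the identifier keeps only the directories that distinguish one acquisition from another.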

Data

This pipeline is designed to run on videos (time series of images collected from a single field of view) of live adult C. elegans. Importantly, the videos should have a relatively homogeneous background (i.e., little variation in intensity or contrast). It takes raw image files (in Nikon's ND2 format) as input and outputs motility phenotypes for the worms, statistical analysis comparing strains, and quality control reports.

The raw data (ND2 format) we analyzed as well as the Tierpsy Tracker outputs are available for download from the BioImage Archive (accession S-BIAD1563).

Overview

This repository implements a simple workflow to estimate and compare worm motility phenotypes. It is centered around Tierpsy tracker, a tool that detects and tracks worms and produces motility data for them (publication). The pipeline uses the per-worm, per-frame time series motility estimates to generate simple summary features (e.g., mean length, mean speed, and mean tail speed) that are then compared between strains.
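To make the summarization step concrete, the sketch below collapses per-worm, per-frame measurements into one mean value per worm. It is a minimal illustration under stated assumptions: the record layout (worm ID, length, speed), the function name, and the toy numbers are all invented for the example and do not reflect the Tierpsy tracker output schema or the pipeline's own scripts.

```python
from collections import defaultdict
from statistics import mean


def summarize_per_worm(records):
    """Average per-frame measurements for each worm.

    records: iterable of (worm_id, length, speed) tuples, one per frame.
    This layout is an assumption made for illustration only.
    """
    lengths = defaultdict(list)
    speeds = defaultdict(list)
    for worm_id, length, speed in records:
        lengths[worm_id].append(length)
        speeds[worm_id].append(speed)
    return {
        worm_id: {
            "mean_length": mean(lengths[worm_id]),
            "mean_speed": mean(speeds[worm_id]),
        }
        for worm_id in lengths
    }


# Toy values, not real measurements.
toy_records = [
    (1, 1000.0, 50.0),
    (1, 1010.0, 70.0),
    (2, 900.0, 20.0),
]
summary = summarize_per_worm(toy_records)
print(summary)
```

Per-worm summaries of this kind are the unit that gets compared between strains.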

Description of the folder structure

Folders and files in this repository

  • conf/: Configuration files for the tools executed by the pipeline, mainly Tierpsy tracker.
  • docker/: Tierpsy tracker needs to be installed via Docker. We provide a Dockerfile documenting the changes we made to the Tierpsy tracker image to allow it to start without a GUI.
  • envs/: This repository uses conda to manage software installations and versions. Other than Tierpsy tracker, all software installations are managed by environment files in this directory.
  • scripts/: Python, R and bash scripts used by the Snakefile in this repository.
  • LICENSE: License specifying the re-use terms for the code in this repository.
  • README.md: File outlining the contents of this repository and how to use the image analysis pipeline.
  • Snakefile: The snakemake workflow file that orchestrates the full image analysis pipeline.
  • .github/, .vscode/, .gitignore, .pre-commit-config.yaml, Makefile, pyproject.toml: Files that control the development environment of the repository.

Folders and files output by the workflow

In the user-specified output directory, snakemake creates the following intermediate folders: dogfilter_mov/, dogfilter_projection/, and dogfilter_tiff/.

The Tierpsy tracker results are in the folder tierpsy_out/. The HDF5 files in the masks directory are intermediate files that record a mask that shows which worms are tracked. These files are used by the quality control portion of the pipeline. The HDF5 files in the results directory record the motility information for worms. See the Tierpsy tracker documentation for more information about these outputs.

Methods

The Snakefile in this repository orchestrates the analysis of raw time series images (videos) to extract and compare motility phenotypes between strains of C. elegans. The pipeline proceeds as follows.

Motility analysis and comparison:

  1. Convert raw images from Nikon's ND2 format to TIFF format.
  2. Apply a difference of Gaussians (DoG) filter to the TIFF images. This highlights differences between the background and foreground, retaining the foreground (the worms) while masking out the background.
  3. Convert the TIFF image series to MOV files, as MOV is the format required by Tierpsy tracker.
  4. Run the Tierpsy tracker analysis to produce motility estimates for each worm.
  5. Process the Tierpsy tracker raw motility estimates and perform statistical analysis to compare strains.
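The DoG filtering in step 2 can be illustrated with a small NumPy sketch. This is not the pipeline's implementation: the sigma values, the reflect padding, and all function names are assumptions chosen for the example, and the test image is synthetic.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: convolve along rows, then along columns."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image.astype(float), radius, mode="reflect")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def difference_of_gaussians(image, sigma_small=1.0, sigma_large=4.0):
    """Narrow blur minus wide blur. The sigma defaults are illustrative assumptions."""
    return gaussian_blur(image, sigma_small) - gaussian_blur(image, sigma_large)


# Synthetic frame: flat background with one small bright object.
frame = np.zeros((64, 64))
frame[30:34, 30:34] = 1.0
dog = difference_of_gaussians(frame)
print(dog.shape, dog[31, 31] > 0, abs(dog[0, 0]) < 1e-9)
```

Because both kernels are normalized, a uniform background cancels to zero while small objects remain, which is why the filter works best on videos with a relatively even background.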

Quality control:

  1. Make a projection from the DoG-filtered TIFF. This creates a summary PNG where all TIFF files from a single time series are overlaid, so that all movement of the worms over the 30 second acquisition is summarized in a single image.
  2. Compare the Tierpsy tracker mask to the raw image for a single frame.
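The projection in the first quality control step can be sketched as a per-pixel reduction over the frames of one time series. The example below uses a maximum-intensity projection, which is an assumption; the pipeline may combine frames differently (for example with a minimum or a sum). The function name and the toy frames are hypothetical.

```python
import numpy as np


def max_projection(frames):
    """Overlay a sequence of equally sized 2D frames by taking the per-pixel maximum."""
    stack = np.stack(list(frames), axis=0)
    return stack.max(axis=0)


# Two toy frames in which a single bright pixel moves one step to the right.
frame_a = np.zeros((3, 3))
frame_a[1, 0] = 1.0
frame_b = np.zeros((3, 3))
frame_b[1, 1] = 1.0

projection = max_projection([frame_a, frame_b])
print(projection)
```

In the resulting image, every position the object occupied is visible at once, so the path of a moving worm appears as a trail.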

Compute Specifications

We executed this pipeline on a Linux Ubuntu machine. While the machine has 64 cores and 512 GB RAM, we ran the pipeline on a single core using a small fraction of the available RAM. While many components of the pipeline could be run on a Mac with an Intel chip, we have tailored the pipeline to Ubuntu (specifically the Tierpsy Tracker installation and launch).

Contributing

See how we recognize feedback and contributions to our code.