update readme and add docker
brentp committed Dec 17, 2018
1 parent 6456512 commit 67f198e
Showing 2 changed files with 71 additions and 5 deletions.
36 changes: 31 additions & 5 deletions README.md
@@ -1,15 +1,31 @@
# slivar: filter/annotate variants in VCF/BCF format with simple expressions

slivar finds all trios in a VCF, PED pair and lets the user specify an expression with identifiers
of `kid`, `mom`, `dad` that is applied to each possible trio. samples that pass that filter have the id
of the kid added to the INFO field.
of `kid`, `mom`, `dad` that is applied to each possible trio. For example, a simple expression to call
*de novo* variants:

```javascript
variant.FILTER == 'PASS' &&
variant.call_rate > 0.95 &&                         // genotype must be known for most of the cohort
INFO.gnomad_af < 0.001 &&                           // rare in gnomad (must be in INFO)
kid.alts == 1 && mom.alts == 0 && dad.alts == 0 &&  // alts are 0:hom_ref, 1:het, 2:hom_alt, -1:unknown
kid.DP > 7 && mom.DP > 7 && dad.DP > 7              // sufficient depth in all
&& (mom.AD[1] + dad.AD[1]) == 0                     // no alternate-allele reads in the parents (AD[1] is the alt depth)
```
bpbio slivar \
--pass-only \ # output only variants that pass one of the filters.

A passing variant must be rare in gnomad, have the expected genotypes, and show no evidence of the
alternate allele in the parents. If there are 200 trios in the `ped::vcf` given, then this expression
is tested on each of those 200 trios.

The expressions are javascript so the user can make these as complex as needed.


```
slivar \
--pass-only \ # output only variants that pass one of the filters (default is to output all variants)
--vcf $vcf \
--ped $ped \
--load functions.js \
--load functions.js \ # any valid javascript is allowed here.
--out-vcf annotated.bcf \
--info "variant.call_rate > 0.9" \ # this filter is applied before the trio filters and can speed evaluation if it is stringent.
--trio "denovo:kid.alts == 1 && mom.alts == 0 && dad.alts == 0 \
@@ -20,6 +36,8 @@ bpbio slivar \
--trio "informative:kid.GQ > 20 && dad.GQ > 20 && mom.GQ > 20 && kid.alts == 1 && ((mom.alts == 1 && dad.alts == 0) || (mom.alts == 0 && dad.alts == 1))" \
--trio "recessive:recessive_func(kid, mom, dad)"
```
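
As a rough illustration of what a file passed to `--load` might contain, here is a minimal sketch of a `functions.js`. The body of `recessive_func` and the `hq` helper are assumptions made for this example, not taken from the slivar repository, and the thresholds are illustrative only. The code sticks to ES5 syntax, which duktape supports.

```javascript
// functions.js: a hypothetical helper file for `--load`.
// Assumption: kid, mom and dad expose `alts`, `DP` and `GQ` as in the expressions above.

// Hypothetical quality gate; the thresholds are illustrative only.
function hq(sample) {
  return sample.alts !== -1 && sample.DP > 7 && sample.GQ > 20;
}

// Hypothetical body for the `recessive_func` named in the command above:
// the kid is homozygous alternate and both parents are heterozygous carriers.
function recessive_func(kid, mom, dad) {
  return hq(kid) && hq(mom) && hq(dad) &&
    kid.alts === 2 && mom.alts === 1 && dad.alts === 1;
}
```

Keeping longer logic in a loaded file leaves the `--trio` expression short, e.g. `recessive:recessive_func(kid, mom, dad)`.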

Note that `slivar` does not give direct access to the genotypes, instead exposing `alts` where 0 is homozygous reference, 1 is heterozygous, 2 is
homozygous alternate and -1 when the genotype is unknown. It is recommended to **decompose** a VCF before sending it to `slivar`.
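
To make the `alts` encoding concrete, the following sketch maps a diploid genotype to the values described above. `altsFromGenotype` is a hypothetical helper written for this illustration; it is not part of slivar, and the array-of-allele-indexes input is an assumption.

```javascript
// Hypothetical helper, not part of slivar: count alternate alleles in a
// diploid genotype given as allele indexes, e.g. [0, 1].
// A null, undefined or negative entry marks a missing allele.
function altsFromGenotype(gt) {
  if (!gt || gt.length !== 2) return -1;
  var alts = 0;
  for (var i = 0; i < gt.length; i++) {
    if (gt[i] === null || gt[i] === undefined || gt[i] < 0) return -1; // unknown
    if (gt[i] > 0) alts += 1; // any non-reference allele counts as an alt
  }
  return alts; // 0: hom_ref, 1: het, 2: hom_alt
}
```

In this sketch a multi-allelic genotype such as `[1, 2]` also counts as 2, which is the kind of ambiguity a decomposed VCF avoids.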

@@ -32,3 +50,11 @@ homozygous alternate and -1 when the genotype is unknown. It is recommended to *

+ sample attributes (via `kid`, `mom`, `dad`) include the fields in FORMAT, available as, e.g. `kid.AD[1]`.
+ sample attributes from the ped for `affected`, `sex` are available as, e.g. kid.sex.


## How it works

`slivar` embeds the [duktape javascript engine](https://duktape.org/) to allow the user to specify expressions.
For each variant, each trio (and each sample), it fills the appropriate `attributes`. This can be intensive for
VCFs with many samples, but this is done **as efficiently as possible** such that `slivar` can evaluate tens of
thousands of variants per second even with dozens of trios.
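
The per-variant, per-trio evaluation described above can be pictured with the following conceptual sketch. It is not how slivar is implemented (slivar embeds duktape); `compileTrioFilter`, `evaluateVariant`, and the shapes of `variant` and `trios` are hypothetical names chosen for this example. Only the `name:expression` form of the filter is taken from the `--trio` arguments shown earlier.

```javascript
// Conceptual sketch only; slivar itself embeds duktape rather than working this way.
// `variant`, `trios` and their field names are hypothetical stand-ins.

// Compile a "name:expression" string once into a named predicate.
function compileTrioFilter(spec) {
  var idx = spec.indexOf(":");
  var name = spec.slice(0, idx);
  var body = spec.slice(idx + 1);
  return {
    name: name,
    test: new Function("kid", "mom", "dad", "variant", "INFO",
                       "return (" + body + ");")
  };
}

// Evaluate every filter on every trio for one variant and collect,
// per filter name, the ids of the kids whose trio passed.
function evaluateVariant(variant, trios, filters) {
  var passed = {};
  for (var f = 0; f < filters.length; f++) {
    for (var t = 0; t < trios.length; t++) {
      var trio = trios[t];
      if (filters[f].test(trio.kid, trio.mom, trio.dad, variant, variant.INFO)) {
        var name = filters[f].name;
        (passed[name] = passed[name] || []).push(trio.kid.id);
      }
    }
  }
  return passed;
}
```

Compiling each expression once and reusing it for every variant and trio keeps the inner loop cheap, which matters because the work grows with the number of variants times the number of trios.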
40 changes: 40 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,40 @@
FROM centos:centos6

RUN yum install -y git curl wget zlib-devel xz-devel bzip2-devel openssl-devel libcurl-devel && \
wget http://people.centos.org/tru/devtools-2/devtools-2.repo -O /etc/yum.repos.d/devtools-2.repo && \
yum install -y devtoolset-2-gcc devtoolset-2-binutils devtoolset-2-gcc-c++ && \
source scl_source enable devtoolset-2 && \
echo "source scl_source enable devtoolset-2" >> ~/.bashrc && \
echo "source scl_source enable devtoolset-2" >> ~/.bash_profile && \
wget --quiet https://ftp.gnu.org/gnu/m4/m4-1.4.18.tar.gz && \
tar xzf m4-1.4.18.tar.gz && cd m4* && ./configure && make && make install && cd .. && rm -rf m4* && \
wget --quiet http://ftp.gnu.org/gnu/autoconf/autoconf-2.69.tar.gz && \
tar xzf autoconf-2.69.tar.gz && \
cd autoconf* && ./configure && make && make install && cd .. && rm -rf autoconf* && \
git clone --depth 1 https://github.com/ebiggers/libdeflate.git && \
cd libdeflate && make -j 2 CFLAGS='-fPIC -O3' libdeflate.a && \
cp libdeflate.a /usr/local/lib && cp libdeflate.h /usr/local/include && \
cd .. && rm -rf libdeflate && \
git clone https://github.com/samtools/htslib && \
cd htslib && git checkout 1.9 && autoheader && autoconf && \
./configure --enable-libcurl --with-libdeflate && \
cd .. && make -j4 CFLAGS="-fPIC -O3" -C htslib install && \
echo "/usr/local/lib" >> /etc/ld.so.conf && \
ldconfig

RUN cd / && \
git clone -b devel --depth 1000 https://github.com/nim-lang/nim nim && \
cd nim && \
chmod +x ./build_all.sh && \
scl enable devtoolset-2 ./build_all.sh && \
export PATH=/nim/bin:$PATH && \
echo 'PATH=/nim/bin:$PATH' >> ~/.bashrc && \
echo 'PATH=/nim/bin:$PATH' >> ~/.bash_profile && \
cd / && \
git clone --depth 1 https://github.com/brentp/slivar.git && \
cd slivar && \
nimble install -y hts && \
source scl_source enable devtoolset-2 && \
nim c -d:release -o:/usr/bin/slivar --passC:-flto src/slivar && \
rm -rf /slivar && slivar -h
