vcfparser

Python code to analyze a VCF file

Introduction

The Python script parses the position and the alternate alleles at each position of a VCF file. It also reports the percentage of reads that support the alternate allele, calculated by summing the reads placed to the left (5' end, RPL tag) and to the right (3' end, RPR tag) of the allele and dividing by the total number of reads covering that locus. It then looks up the allele frequency and the dbSNP ID of each variant in the ExAC database through its REST API. If no allele frequency is found, "NA" is returned; if no dbSNP ID is found, "." is returned. The output is stored as a table in the file annotated.{vcf filename}.txt
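The read-support calculation can be sketched as follows. This is a minimal illustration, not the code in vcfparser.py: the function names are hypothetical, and it assumes that the total read depth is stored in the INFO tag DP and that RPL and RPR hold one comma-separated value per alternate allele.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=100;RPL=20,5;RPR=30,5' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info[key] = value
        elif entry:
            info[entry] = True  # flag-style entries carry no value
    return info


def alt_read_support(info_field, depth_tag="DP"):
    """Return the percentage of supporting reads for each alternate allele.

    Assumption: `depth_tag` (DP by default) holds the total number of reads
    covering the locus; the README does not name the tag that is used.
    """
    info = parse_info(info_field)
    depth = float(info[depth_tag])
    left = [float(x) for x in info["RPL"].split(",")]
    right = [float(x) for x in info["RPR"].split(",")]
    if depth == 0:
        return [0.0] * len(left)  # avoid dividing by zero at uncovered loci
    return [100.0 * (l + r) / depth for l, r in zip(left, right)]


# Made-up INFO field with two alternate alleles.
print(alt_read_support("DP=100;RPL=20,5;RPR=30,5"))
```

Returning one value per alternate allele keeps multi-allelic positions aligned with their ALT entries.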

Dependencies

Python 3.6+
Python libraries used: a) requests (for the REST API) b) json (for parsing JSON records) c) argparse (for passing inputs from bash into Python) d) time (for calculating execution time)
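As a rough sketch of how requests and json might fit together for the ExAC lookup, consider the following. The URL pattern and the field names allele_freq and rsid are assumptions about the ExAC REST service rather than details taken from this repository, the helper names are hypothetical, and the service may no longer be reachable.

```python
import json

# Assumed URL pattern of the ExAC REST service (not confirmed by this README).
EXAC_VARIANT_URL = (
    "http://exac.hms.harvard.edu/rest/variant/variant/{chrom}-{pos}-{ref}-{alt}"
)


def extract_annotation(record):
    """Return (allele frequency, dbSNP ID) with the fallbacks "NA" and "."."""
    if not isinstance(record, dict):
        record = {}
    allele_freq = record.get("allele_freq")  # assumed field name
    rsid = record.get("rsid")                # assumed field name
    return ("NA" if allele_freq is None else allele_freq, rsid if rsid else ".")


def lookup_variant(chrom, pos, ref, alt, timeout=10):
    """Query the assumed endpoint for one variant and extract its annotation."""
    import requests  # third-party; imported here so the parser above works offline

    url = EXAC_VARIANT_URL.format(chrom=chrom, pos=pos, ref=ref, alt=alt)
    response = requests.get(url, timeout=timeout)
    if response.status_code != 200:
        return ("NA", ".")
    try:
        return extract_annotation(response.json())
    except ValueError:  # body was not valid JSON
        return ("NA", ".")


# Offline demonstration with made-up JSON records.
print(extract_annotation(json.loads('{"allele_freq": 0.0123, "rsid": "rs123"}')))
print(extract_annotation(json.loads("{}")))
```

Keeping the parsing step separate from the network call makes the fallback behaviour easy to check without contacting the server.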

Syntax

python3 vcfparser.py --vcf {vcf filename}
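A command line of this shape could be handled with argparse roughly as shown here. Only the --vcf option comes from the syntax above; the function name, description, and help text are illustrative.

```python
import argparse


def build_parser():
    """Create a parser that accepts the single --vcf option."""
    parser = argparse.ArgumentParser(description="Annotate the variants in a VCF file.")
    parser.add_argument("--vcf", required=True, help="path to the input VCF file")
    return parser


# Parse an explicit argument list; a script would call parse_args() with no
# arguments so that the values are read from the command line instead.
args = build_parser().parse_args(["--vcf", "Challenge_data (1).vcf"])
print(args.vcf)
```

File names that contain spaces or parentheses must be quoted in the shell so that they reach the script as a single argument.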

Example

python3 vcfparser.py --vcf "Challenge_data (1).vcf"

Output: annotated.Challenge_data (1).csv
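The output name in this example can be derived from the input name as sketched below, assuming the .csv extension shown here. The helper names and the column headers are hypothetical and may differ from those used by vcfparser.py.

```python
import csv
import io
import os

# Hypothetical column headers for the annotation table.
COLUMNS = ["position", "alt_allele", "percent_alt_reads", "allele_frequency", "dbsnp_id"]


def output_name(vcf_path, extension=".csv"):
    """Build 'annotated.<input name without .vcf><extension>' next to the input."""
    directory, filename = os.path.split(vcf_path)
    stem, _ = os.path.splitext(filename)
    return os.path.join(directory, "annotated." + stem + extension)


def write_table(rows, handle):
    """Write a header line followed by one line per variant."""
    writer = csv.writer(handle)
    writer.writerow(COLUMNS)
    writer.writerows(rows)


print(output_name("Challenge_data (1).vcf"))

# Write one made-up row to an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_table([[12345, "T", 50.0, "NA", "."]], buffer)
print(buffer.getvalue())
```

To write to disk, open the path returned by output_name with newline="" and pass the handle to write_table.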

robinpaul85/vcfparser
