![Logo](/Gero1999/code/raw/main/R/LD-proxy/LD_plot.png)
If a variant is not present, LD-Proxy takes the closest alternative!
This project implements an algorithm that uses PLINK reference data and the snpStats package to identify the variant in strongest linkage disequilibrium with a given genetic marker. Proxy selection of this kind is useful in genetic association studies, where it lets researchers pick a proxy variant that captures the same genetic signal as the target variant.
LD-Proxy, short for Linkage Disequilibrium Proxy, is a method that measures how strongly two variants are correlated based on their linkage disequilibrium (LD) patterns and uses that measure to find stand-ins for a target variant. In genetics, LD refers to the non-random association of alleles at different loci, typically due to genetic linkage. By identifying proxy variants in high LD with the target variant, researchers can perform association analyses more efficiently, as proxy variants are often more readily available or less expensive to genotype.
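For reference, the standard textbook definitions of pairwise LD are given below. The symbols are assumptions introduced here for illustration: two biallelic loci with alleles $A$ and $B$, allele frequencies $p_A$ and $p_B$, and haplotype frequency $p_{AB}$. This README does not state which LD statistic the algorithm ranks by; $r^2$ is shown because it is the usual choice for proxy selection.

$$
D = p_{AB} - p_A \, p_B
$$

$$
r^2 = \frac{D^2}{p_A (1 - p_A) \, p_B (1 - p_B)}
$$

An $r^2$ of 1 means the two variants carry the same information in the reference sample, and an $r^2$ of 0 means they are uncorrelated.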
The LD-Proxy algorithm is written in R, uses PLINK reference data and the snpStats package, and runs in three steps:
- Data Preprocessing: The algorithm takes as input the target genetic marker and a set of variants to consider as proxies. It preprocesses the PLINK reference data and performs data manipulation tasks to prepare the data for analysis.
- LD Calculation: Using the snpStats package, LD-Proxy calculates the pairwise linkage disequilibrium (LD) between the target variant and each potential proxy variant. This step ensures that only variants in high LD with the target are selected as proxies.
- Proxy Variant Selection: The algorithm identifies the variant with the highest LD value as the closest proxy variant to the target. This proxy variant is then recommended for downstream genetic association analyses.
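The three steps above can be sketched in base R on a toy reference panel. This is a minimal illustration under stated assumptions, not the project's actual code: `ref_geno` is made-up data, `find_proxy` is a hypothetical helper name, and LD is approximated as the squared correlation of 0/1/2 allele dosages rather than computed with snpStats.

```r
# Toy reference panel (assumed data): rows are samples, columns are variants,
# values are allele dosages coded 0/1/2.
ref_geno <- cbind(
  target = c(0, 1, 2, 1, 0, 2, 1, 0),
  snpA   = c(0, 1, 2, 1, 0, 2, 1, 1),
  snpB   = c(1, 1, 0, 2, 1, 0, 1, 2),
  snpC   = c(1, 0, 1, 2, 0, 1, 2, 0)
)

# Hypothetical helper: return the target if it is available, otherwise the
# available variant with the highest r^2 against the target in the panel.
find_proxy <- function(ref_geno, target, available) {
  stopifnot(target %in% colnames(ref_geno))
  if (target %in% available) return(target)

  # Step 1: keep only candidates that exist in the reference panel.
  candidates <- intersect(available, colnames(ref_geno))
  if (length(candidates) == 0) stop("no candidate variants in the reference panel")

  # Step 2: pairwise r^2 between the target and each candidate
  # (squared dosage correlation, an assumption of this sketch).
  r <- cor(ref_geno[, target], ref_geno[, candidates, drop = FALSE],
           use = "pairwise.complete.obs")
  r2 <- setNames(as.vector(r)^2, candidates)

  # Step 3: the candidate with the highest r^2 is the proxy.
  names(r2)[which.max(r2)]
}

find_proxy(ref_geno, "target", c("snpA", "snpB", "snpC"))  # "snpA"
```

With real PLINK files, the genotypes would presumably come from `snpStats::read.plink()`, and the r² values from `snpStats::ld()` called with `stats = "R.squared"` on the resulting `SnpMatrix`, in place of the `cor()` call used here.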
To use the LD-Proxy Algorithm, you'll need to have R installed on your system. Additionally, ensure that the snpStats package is installed to handle the LD calculations.
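snpStats is distributed through Bioconductor rather than CRAN, so a plain `install.packages("snpStats")` will not find it. One way to install it, assuming a recent R version and network access:

```r
# Install the Bioconductor package manager from CRAN if it is missing.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install snpStats from Bioconductor, then load it.
BiocManager::install("snpStats")
library(snpStats)
```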