Major update of Taiji with enhanced downstream analysis, featuring:
- Cell-state specificity analysis
- TF edge weight per locus calculation
- TF transcriptional wave
- TF-regulatee analysis
- TF-TF interaction network
- Perturb-seq integration and heuristic score calculation
First, install Taiji (see the Taiji GitHub repository for details):
curl -L https://github.com/Taiji-pipeline/Taiji/releases/latest/download/taiji-CentOS-x86_64 -o taiji
chmod +x taiji
./taiji --help
Then prepare the configuration file and input file following the instructions on the Taiji website, and run the pipeline:
taiji run --config config.yml -n 3 +RTS -N3
To replicate the paper's results, use the configuration file and input file in this repo.
After running Taiji, you will have a GeneRanks.tsv file in /some_path_to_Taiji/output/, which stores the PageRank scores of TFs across samples. This is the main input for the downstream analysis that follows.
Additionally, the gene expression file expression_profile.tsv (normalized by TPM) is available in the folder /some_path_to_Taiji/output/RNASeq/.
In addition to the PageRank scores and gene expression matrices, a group file with cell-state annotations is also needed. See the demo group file for an example.
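As a rough sketch of how these inputs fit together, the R snippet below builds a toy PageRank matrix and a toy group table, then averages the scores per cell state. The layout used here is an assumption for illustration only: TFs in rows, samples in columns, tab-separated, and a group table with hypothetical columns `sample` and `state`. Check your actual GeneRanks.tsv and the demo group file before adapting it.

```r
# Toy stand-in for GeneRanks.tsv: TFs in rows, samples in columns (assumed layout).
ranks <- matrix(c(1, 3, 5, 7,
                  2, 2, 4, 4),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("TF_A", "TF_B"), c("s1", "s2", "s3", "s4")))

# Round-trip through a temporary TSV to mimic reading the real file.
tmp <- tempfile(fileext = ".tsv")
write.table(ranks, tmp, sep = "\t", quote = FALSE, col.names = NA)
ranks_in <- as.matrix(read.delim(tmp, row.names = 1, check.names = FALSE))

# Hypothetical group table: one row per sample with its cell-state label.
group <- data.frame(sample = c("s1", "s2", "s3", "s4"),
                    state  = c("Naive", "Naive", "TexTerm", "TexTerm"))

# Mean PageRank score of each TF within each cell state.
states <- unique(group$state)
state_mean <- sapply(states, function(s) {
  rowMeans(ranks_in[, group$sample[group$state == s], drop = FALSE])
})
print(state_mean)
```

To use real data, replace `tmp` with the path to your own GeneRanks.tsv and build `group` from your group file.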
Follow the tutorial step by step. Required R packages are listed at the beginning. Alternatively, you can download the original code for further customization.
TF transcriptional waves are distinct patterns of TF groups that govern different differentiation pathways. To construct waves, you need a pre-defined layout for the differentiation path. An example data frame is shown below, where x and y are the coordinates of each cell state, and labelposx and labelposy are the label coordinates.
wavedf <- data.frame(x = c(1,2,2,3,3,4,4,5,5),
                     y = c(3,5,1,3,1,3,0,4,2),
                     samplename = c("Naive","TE","TexProg","MP","TexInt","TRM","TexTerm","TEM","TCM"),
                     labelposx = c(1,2,2,3,3,4,4,5,5),
                     labelposy = c(3,5,1,3,1.4,3,0,4,2)-0.4)
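Before building waves, it can help to draw the layout and confirm that the cell states and labels land where you expect. The following is a minimal base-R sketch, not the tutorial's plotting code; the axis limits are arbitrary choices for this example.

```r
# Same layout as the example data frame.
wavedf <- data.frame(x = c(1,2,2,3,3,4,4,5,5),
                     y = c(3,5,1,3,1,3,0,4,2),
                     samplename = c("Naive","TE","TexProg","MP","TexInt","TRM","TexTerm","TEM","TCM"),
                     labelposx = c(1,2,2,3,3,4,4,5,5),
                     labelposy = c(3,5,1,3,1.4,3,0,4,2)-0.4)

# Draw to a null PDF device so the sketch also runs non-interactively;
# remove the pdf()/dev.off() calls to plot on screen instead.
pdf(NULL)
plot(wavedf$x, wavedf$y, pch = 19, axes = FALSE, xlab = "", ylab = "",
     xlim = c(0.5, 5.5), ylim = c(-1, 5.5))
# Place each cell-state name at its label coordinates.
text(wavedf$labelposx, wavedf$labelposy, labels = wavedf$samplename)
invisible(dev.off())
```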
Follow the tutorial step by step. Alternatively, you can download the original code for further customization.
Taiji generates a regulatory network describing the relationship between each TF and its target genes (regulatees), with an edge weight that represents the regulatory strength. The TF-regulatee network files are located at "/some_path_to_taiji/output/Network/sample_name/edges_combined.csv". Each sample has its own network file.
Based on this, we can calculate the log2 fold change of edge weights between two cell states, in our case between TexTerm and TRM. See the example file.
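The fold-change step can be sketched in a few lines of base R. The toy tables below stand in for two edges_combined.csv files; the column names `tf`, `target`, and `weight` are hypothetical and may not match the real file header, so adjust them after inspecting your own output.

```r
# Toy stand-ins for the TexTerm and TRM network files (hypothetical columns).
edges_texterm <- data.frame(tf     = c("TF_A", "TF_A", "TF_B"),
                            target = c("g1", "g2", "g1"),
                            weight = c(4, 1, 2))
edges_trm     <- data.frame(tf     = c("TF_A", "TF_A", "TF_B"),
                            target = c("g1", "g2", "g1"),
                            weight = c(1, 1, 8))

# Pair up edges that are present in both networks (inner join on TF and target).
both <- merge(edges_texterm, edges_trm, by = c("tf", "target"),
              suffixes = c("_TexTerm", "_TRM"))

# log2 fold change of edge weight, TexTerm relative to TRM.
both$log2fc <- log2(both$weight_TexTerm / both$weight_TRM)
print(both)
```

Edges missing from one network are dropped by the inner join, and zero weights would give infinite values; how to handle either case (for example with a pseudocount) is a choice this sketch leaves open.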
Follow the tutorial step by step. Alternatively, you can download the original code for further customization.
First, we constructed the TF communities. From the above TF-regulatee network, we can derive TF-TF correlation based on the edge weight profile. Check the file format here.
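A minimal sketch of the correlation step, assuming the edge weights have already been arranged as a matrix with TFs in rows and target genes in columns (an assumed layout, not the documented file format). The hierarchical clustering at the end is only a stand-in to show how correlated TFs could be grouped; it is not necessarily the community-detection method used in the tutorial.

```r
# Toy edge-weight profiles: one row per TF, one column per target gene.
w <- matrix(c(1, 2, 3, 4,
              2, 4, 6, 8,
              4, 3, 2, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("TF_A", "TF_B", "TF_C"), paste0("g", 1:4)))

# cor() correlates columns, so transpose to get TF-by-TF correlations.
tf_cor <- cor(t(w), method = "pearson")

# Stand-in grouping (assumption): average-linkage clustering on 1 - correlation.
hc <- hclust(as.dist(1 - tf_cor), method = "average")
community <- cutree(hc, k = 2)

print(round(tf_cor, 2))
print(community)
```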
Follow the tutorial step by step. Alternatively, you can download the original code for further customization.
Next, we visualized the TF communities following the tutorial. The original code can be downloaded here.
Functional analysis can be performed as well. Check the tutorial and raw code.
Heuristic regulatory scores can be calculated by combining perturb-seq data with the TF-gene edge weights calculated by Taiji. Check out the tutorial and raw code.
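The tutorial defines the actual score. Purely to illustrate what combining the two data sources can look like, the sketch below weights each target gene's perturbation log2 fold change by the Taiji edge weight from the perturbed TF. This weighted mean is an assumption made for the example and is not the paper's formula; all column names and values are hypothetical.

```r
# Hypothetical Taiji edges for one perturbed TF.
edges <- data.frame(tf     = "TF_A",
                    target = c("g1", "g2", "g3"),
                    weight = c(3, 1, 0))

# Hypothetical perturb-seq result: log2 fold change of each gene
# after perturbing TF_A.
perturb <- data.frame(target = c("g1", "g2", "g3"),
                      log2fc = c(-2, 2, 5))

# Join on target gene, then take the edge-weight-weighted mean of log2fc
# (illustrative combination only, not the tutorial's score).
m <- merge(edges, perturb, by = "target")
score <- sum(m$weight * m$log2fc) / sum(m$weight)
print(score)
```

In this toy case g3 has zero edge weight, so its large fold change does not contribute to the score.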
GSEA and module score calculations were also performed to analyze perturb-seq data. See the following tutorials: Figure 2h, Figure 3i, and Figure 4f
- Paper website
- Taiji website
- Download the Seurat objects here and move them to the appropriate folders to run the tutorials
Chung, H. Kay, Cong Liu, Ming Sun, Eduardo Casillas, Timothy Chen, Brent Chick, Jun Wang et al. "Multiomics atlas-assisted discovery of transcription factors enables specific cell state programming." bioRxiv (2023): 2023-01. https://www.biorxiv.org/content/10.1101/2023.01.03.522354v3.abstract.