Commit
1 parent 9c08722, commit 582dda6
Showing 1 changed file with 10 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,10 @@ | ||
# TODO provide GENA LM parameters here | ||
Trained in this study,Public name,Layers/Heads/Hiddens,Number of parameters,Architechture,Positional information,Pre-LN,Pre-training SeqLen (tokens),Pre-training task,Vocabulary size,Tokenizer type,Training dataset,Learning rate,Warm-up steps,Optimizer,LR Scheduler,init from,Public link | ||
no,DNABERT,12/12/768,,BERT - Full Attention,BERT absolute position embeddings,FALSE,512,,-,kmer,GRCh38.p13,,,,,,"https://academic.oup.com/bioinformatics/article/37/15/2112/6128680 , trained by authors" | ||
yes,gena-lm-bert-base,12/12/768,110M,BERT - Full Attention,BERT absolute position embeddings,"TRUE, w/o the last layer norm",512,MLM+NSP,32000,BPE,"T2T, split v1",1e-04,10000,AdamW,constant,,https://huggingface.co/AIRI-Institute/gena-lm-bert-base | ||
yes,gena-lm-bigbird-base-sparse,12/12/768,110M,BigBird - Sparse Attention (DeepSpeed),RoPE position embeddings,"TRUE, w/o the last layer norm",4096,MLM+NSP,32000,BPE,"T2T, split v1",1e-04,10000,FusedAdamW,constant,,https://huggingface.co/AIRI-Institute/gena-lm-bigbird-base-sparse | ||
yes,gena-lm-bert-base-t2t,12/12/768,110M,BERT - Full Attention,BERT absolute position embeddings,"TRUE, w/o the last layer norm",512,MLM,32000,BPE,"T2T, augment. 1000G SNPs",1e-04,10000,FusedAdamW,constant,,https://huggingface.co/AIRI-Institute/gena-lm-bert-base-t2t | ||
yes,gena-lm-bert-base-t2t-multi,12/12/768,110M,BERT - Full Attention,BERT absolute position embeddings,"TRUE, w/o the last layer norm",512,MLM,32000,BPE,"T2T, augment. 1000G SNPs, Multispecies",1e-04,0,FusedAdamW,constant,gena-lm-bert-base-t2t,https://huggingface.co/AIRI-Institute/gena-lm-bert-base-t2t-multi | ||
yes,gena-lm-bigbird-base-sparse-t2t,12/12/768,110M,BigBird - Sparse Attention (DeepSpeed),RoPE position embeddings,TRUE,4096,MLM,32000,BPE,"T2T, augment. 1000G SNPs",1e-04,10000,FusedAdamW,linear,,https://huggingface.co/AIRI-Institute/gena-lm-bigbird-base-sparse-t2t | ||
yes,gena-lm-bigbird-base-t2t,12/12/768,110M,BigBird - Sparse Attention (HuggingFace),BERT absolute position embeddings,FALSE,4096,MLM,32000,BPE,"T2T, augment. 1000G SNPs",1e-04,10000,FusedAdamW,linear,,https://huggingface.co/AIRI-Institute/gena-lm-bigbird-base-t2t | ||
yes,gena-lm-bert-large-t2t,24/16/1024,336M,BERT-large - Full Attention,BERT absolute position embeddings,TRUE,512,MLM,32000,BPE,"T2T, augment. 1000G SNPs",1e-04,10000,FusedAdamW,constant,,https://huggingface.co/AIRI-Institute/gena-lm-bert-large-t2t | ||
yes,gena-lm-bert-base-lastln-t2t,12/12/768,110M,BERT - Full Attention,BERT absolute position embeddings,TRUE,512,MLM,32000,BPE,"T2T, augment. 1000G SNPs",1e-04,0,FusedAdamW,linear,,https://huggingface.co/AIRI-Institute/gena-lm-bert-base-lastln-t2t |
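
Several fields in the rows above are quoted because they contain commas (for example `"T2T, augment. 1000G SNPs"`), so splitting lines on `,` would misalign the columns. Below is a minimal sketch of reading the table with Python's standard `csv` module. It assumes the file is plain comma-separated text as shown in the diff. The helper names `load_models` and `parse_shape` are hypothetical and not part of the repository, and the inline `SAMPLE` holds only the header and one row copied from the diff.

```python
import csv
import io

# Header and one data row copied verbatim from the diff above (inline sample,
# used here instead of a file path, which this page does not show).
SAMPLE = (
    "Trained in this study,Public name,Layers/Heads/Hiddens,Number of parameters,"
    "Architechture,Positional information,Pre-LN,Pre-training SeqLen (tokens),"
    "Pre-training task,Vocabulary size,Tokenizer type,Training dataset,Learning rate,"
    "Warm-up steps,Optimizer,LR Scheduler,init from,Public link\n"
    "yes,gena-lm-bert-large-t2t,24/16/1024,336M,BERT-large - Full Attention,"
    "BERT absolute position embeddings,TRUE,512,MLM,32000,BPE,"
    '"T2T, augment. 1000G SNPs",1e-04,10000,FusedAdamW,constant,,'
    "https://huggingface.co/AIRI-Institute/gena-lm-bert-large-t2t\n"
)


def load_models(text):
    """Parse CSV text into a list of dicts keyed by column name (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_shape(row):
    """Split the 'Layers/Heads/Hiddens' field into three integers (hypothetical helper)."""
    layers, heads, hidden = (int(x) for x in row["Layers/Heads/Hiddens"].split("/"))
    return layers, heads, hidden


models = load_models(SAMPLE)
row = models[0]
# The quoted dataset field comes back as one value despite its embedded comma.
print(row["Public name"], parse_shape(row), row["Training dataset"])
```

Empty cells such as `init from` are returned as empty strings, so callers that need a missing-value marker would have to convert them explicitly.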