We plan to release the official FoT large-scale continual pre-training (FoT fine-tuning) code within two weeks; it will be written in JAX. The instruction fine-tuning code does not use FoT (strictly speaking, it uses a modified version with cross_batch=1, but this is not the version used to tune the base models; see #12 for details).
def mem_attn_layer(Ql, Kl, Vl, Cl, Km, Vm, Kp, Vp, attn_scf, mode):
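For intuition, the core of such a memory-augmented attention layer can be sketched as attention over local keys/values concatenated with memory keys/values. The sketch below is a guess at the general pattern, not the repository's implementation: the roles of `Cl`, `Kp`, `Vp`, and `mode` in the signature above are not specified here, so they are omitted, and the interpretation of `attn_scf` as a softmax scale factor is an assumption.

```python
import jax
import jax.numpy as jnp

def mem_attn_sketch(Ql, Kl, Vl, Km, Vm, attn_scf):
    """Hypothetical sketch: attend with local queries over memory + local K/V.

    Assumed shapes: Ql [..., q, d], Kl/Vl [..., l, d], Km/Vm [..., m, d].
    attn_scf is assumed to be the attention scale factor (e.g. 1/sqrt(d)).
    """
    # Prepend memory keys/values to the local ones along the sequence axis.
    K = jnp.concatenate([Km, Kl], axis=-2)
    V = jnp.concatenate([Vm, Vl], axis=-2)
    # Standard scaled dot-product attention: softmax(Q K^T * scale) V.
    scores = jnp.einsum('...qd,...kd->...qk', Ql, K) * attn_scf
    weights = jax.nn.softmax(scores, axis=-1)
    return jnp.einsum('...qk,...kd->...qd', weights, V)
```

With cross_batch=1 the memory would effectively come from a single previous context window; larger cross_batch values (as in the FoT-tuned base models) would extend `Km`/`Vm` across more contexts.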