Replaced call to _prepare_decoder_attention_mask() with _prepare_4d_causal_attention_mask()
#2550
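
For context on what the replacement helper is expected to produce, here is a minimal, dependency-free sketch of a 4D causal attention mask. It is not the library's implementation: the function name `build_4d_causal_mask`, its parameters, and the use of `-inf` for blocked positions are all assumptions made for illustration. The result has shape `[batch][1][tgt_len][past_len + tgt_len]`, combining the causal constraint with a per-token padding mask.

```python
# Illustrative sketch only. build_4d_causal_mask is a hypothetical helper,
# not the function named in this PR and not any library's API.
NEG_INF = float("-inf")  # assumption: blocked positions are marked with -inf


def build_4d_causal_mask(padding_mask, tgt_len, past_len=0):
    """Combine a 2D padding mask with a causal constraint.

    padding_mask: list of rows, one per batch item, each of length
                  past_len + tgt_len, with 1 for a real token and 0 for padding.
    Returns a nested list shaped [batch][1][tgt_len][past_len + tgt_len]
    holding 0.0 where attention is allowed and -inf where it is blocked.
    """
    src_len = past_len + tgt_len
    mask = []
    for row in padding_mask:
        if len(row) != src_len:
            raise ValueError("each padding_mask row must have past_len + tgt_len entries")
        per_query = []
        for q in range(tgt_len):
            q_pos = past_len + q  # absolute position of this query token
            per_query.append([
                # A key is visible only if it is not in the future and not padding.
                0.0 if (k <= q_pos and row[k] == 1) else NEG_INF
                for k in range(src_len)
            ])
        mask.append([per_query])  # singleton dimension broadcast over heads
    return mask


# Two sequences of length 3; the second one is left-padded by one token.
mask = build_4d_causal_mask([[1, 1, 1], [0, 1, 1]], tgt_len=3)
```

The singleton second dimension lets one mask be broadcast across all attention heads when it is added to the attention scores.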
Job | Run time |
---|---|
 | 12s |
 | 11s |
 | 23s |