Add _compute_score method to PPOTrainer
#2560
+64 −7
Draft