Using "beam search" strategy while generating the responses #2534

SachinVashisth · 2024-12-31T22:05:22Z

Hi

I am using flan-t5-xl to generate output.
When I call ppo_trainer.generate(....), I get the desired output, but I assume it is only the top beam, i.e. the single best sequence.
I would like to return the 4 best sequences from beam search. For now I am using this custom generate function:

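import torch

def select(list_data, model, tokenizer, strategy="beam"):
    # Assumes the model has already been moved to this device.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    beams, num_seq = 5, 4  # num_seq must not exceed beams
    batch = tokenizer(list_data, return_tensors="pt", padding=True).to(device)
    if strategy != "beam":
        # Only beam search is implemented; fail loudly instead of hitting a NameError below.
        raise ValueError(f"Unsupported strategy: {strategy!r}")
    # Returns num_seq sequences per input, flattened along the batch dimension.
    generated = model.generate(
        input_ids=batch["input_ids"], attention_mask=batch["attention_mask"],
        num_beams=beams, early_stopping=True, num_return_sequences=num_seq,
        max_new_tokens=60, min_length=5,
    )
    final_list = [tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True).lower() for g in generated]
    return final_list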
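
A note on the return value: with num_return_sequences set, generate returns the candidates for all inputs as one flat batch, ordered input by input, so final_list has len(list_data) * num_seq entries. A minimal sketch of regrouping them per input is below; group_by_prompt is a hypothetical helper name, not part of trl or transformers.

```python
from typing import List


def group_by_prompt(flat_outputs: List[str], num_return_sequences: int) -> List[List[str]]:
    """Split a flat list of decoded sequences into one sub-list per input.

    Assumes the flat list is ordered input by input, i.e. the first
    num_return_sequences entries belong to the first input, and so on.
    """
    if num_return_sequences <= 0:
        raise ValueError("num_return_sequences must be positive")
    if len(flat_outputs) % num_return_sequences != 0:
        raise ValueError("output count is not a multiple of num_return_sequences")
    return [
        flat_outputs[start:start + num_return_sequences]
        for start in range(0, len(flat_outputs), num_return_sequences)
    ]


# Example with two inputs and two candidates each (placeholder strings).
grouped = group_by_prompt(["a1", "a2", "b1", "b2"], num_return_sequences=2)
```

With the function above, this would be called as group_by_prompt(select(list_data, model, tokenizer), 4).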

Is it possible to generate output for multiple beams using the ppo_trainer.generate(....) function?


edbeeching · 2025-01-20T09:02:23Z

While we do not expose this functionality at the moment, if you fork trl you should be able to add your beam search options to the GenerationConfig here:

trl/trl/trainer/ppo_trainer.py

Line 682 in 88514d5

generation_config = GenerationConfig(

If you find this change beneficial, we would welcome a PR to expose the options. I will close the issue but feel free to reopen if needed.
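
For anyone attempting this in a fork, here is a rough sketch of how the beam search options could be merged into the keyword arguments before they are handed to GenerationConfig. The helper name beam_search_kwargs and the base settings are assumptions for illustration only; the arguments actually used at that line in ppo_trainer.py are not reproduced here. The three option names (num_beams, num_return_sequences, early_stopping) are standard GenerationConfig fields.

```python
from typing import Any, Dict


def beam_search_kwargs(
    base: Dict[str, Any],
    num_beams: int = 5,
    num_return_sequences: int = 4,
) -> Dict[str, Any]:
    """Return a copy of `base` with beam search options added (hypothetical helper)."""
    if num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    # Beam search cannot return more sequences than it keeps beams.
    if num_return_sequences > num_beams:
        raise ValueError("num_return_sequences must not exceed num_beams")
    merged = dict(base)  # leave the caller's dict untouched
    merged.update(
        num_beams=num_beams,
        num_return_sequences=num_return_sequences,
        early_stopping=True,
    )
    # Note: if `base` sets do_sample=True, transformers would run beam search
    # with multinomial sampling rather than plain deterministic beam search.
    return merged


# Placeholder base settings taken from the custom function in the question.
kwargs = beam_search_kwargs({"max_new_tokens": 60})
# In a fork, these could then be unpacked as GenerationConfig(**kwargs).
```

Keep in mind that returning several sequences per query changes the shape of the generated batch, so the code that consumes the responses downstream would probably need adjusting as well.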

August-murr added the "🙋 help from community wanted" and "🏋 PPO" labels on Jan 1, 2025

edbeeching closed this as completed Jan 20, 2025
