
Code for zero-shot arxiv evaluation #10

Open
bronyayang opened this issue Jul 17, 2023 · 1 comment

Comments

@bronyayang

Hi,

Can you provide the code, or more detail on how you zero-shot evaluate the arXiv dataset?
I cannot get good results when trying arXiv summarization. I guess it is because I don't know the prompt, or because the model size is not 7B?

@syzymon
Collaborator

syzymon commented Jul 20, 2023

Hi,

Thanks for your interest in our work! In our paper, the only results we report on arXiv are language modeling perplexity numbers for small models. We do not evaluate LongLLaMA on the arXiv summarization downstream task. Note that our model is not instruction-tuned, which means it cannot really do zero-shot summarization. You could try few-shot summarization (not quite sure whether a 3B model can really do that), or prompt engineering to match the format of your target document. Also, please stay tuned for the upcoming instruction-tuned models, which will definitely be able to do some summarization!
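For reference, few-shot summarization with a base (non-instruction-tuned) model usually means concatenating a few (document, summary) demonstrations before the target document, so the model continues the pattern. The sketch below is just an illustration of that idea, not the authors' evaluation setup; the prompt format and the commented-out model usage (a LongLLaMA checkpoint via `transformers` with `trust_remote_code`) are assumptions.

```python
# Sketch of a few-shot summarization prompt for a base language model.
# The "Document:/Summary:" format is an illustrative assumption, not an
# official recipe from this repository.

def build_few_shot_prompt(examples, target_document):
    """Concatenate (document, summary) demonstrations before the target
    document; a base LM is expected to continue after the final "Summary:"."""
    parts = []
    for doc, summary in examples:
        parts.append(f"Document:\n{doc}\n\nSummary:\n{summary}\n\n")
    parts.append(f"Document:\n{target_document}\n\nSummary:\n")
    return "".join(parts)

# Toy demonstration pair (made up for illustration).
demos = [
    ("A paper on sparse attention for long-context language modeling.",
     "Proposes sparse attention to extend the usable context length."),
]
prompt = build_few_shot_prompt(demos, "A paper on tuning memory layers.")

# Hypothetical usage with a LongLLaMA checkpoint (downloads weights, so it
# is left commented out here):
# from transformers import AutoTokenizer, AutoModelForCausalLM
# tokenizer = AutoTokenizer.from_pretrained("syzymon/long_llama_3b")
# model = AutoModelForCausalLM.from_pretrained(
#     "syzymon/long_llama_3b", trust_remote_code=True)
# inputs = tokenizer(prompt, return_tensors="pt")
# output = model.generate(**inputs, max_new_tokens=64)
```

Matching the demonstrations' formatting to your target documents (as suggested above) tends to matter a lot for base models, since they rely entirely on pattern continuation.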
