
Code for zero-shot arxiv evaluation #10

Open
bronyayang opened this issue Jul 17, 2023 · 1 comment

Comments

@bronyayang

Hi,

Can you provide the code, or more detail on how you zero-shot evaluate the arXiv dataset?
I cannot get good results when trying arXiv summarization. I guess it is because I don't know the prompt, or because the model size is not 7B?

@syzymon
Collaborator

syzymon commented Jul 20, 2023

Hi,

Thanks for your interest in our work! In our paper, the only results we report on arXiv are language modeling perplexity numbers for small models. We do not evaluate LongLLaMA on the arXiv summarization downstream task. Note that our model is not instruction-tuned, which means it cannot really do zero-shot summarization. You could try few-shot summarization (not quite sure whether a 3B model can really do that), or prompt engineering to match the format of your target document. Also, please stay tuned for the upcoming instruction-tuned models, which will definitely be able to do some summarization!
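For reference, few-shot summarization with a base (non-instruction-tuned) model usually means concatenating a few (document, summary) demonstrations before the target document, so the model continues the pattern. The sketch below is just an illustration of that idea, not the authors' evaluation setup; the prompt format and the commented-out model usage (a LongLLaMA checkpoint via `transformers` with `trust_remote_code`) are assumptions.

```python
# Sketch of a few-shot summarization prompt for a base language model.
# The "Document:/Summary:" format is an illustrative assumption, not an
# official recipe from this repository.

def build_few_shot_prompt(examples, target_document):
    """Concatenate (document, summary) demonstrations before the target
    document; a base LM is expected to continue after the final "Summary:"."""
    parts = []
    for doc, summary in examples:
        parts.append(f"Document:\n{doc}\n\nSummary:\n{summary}\n\n")
    parts.append(f"Document:\n{target_document}\n\nSummary:\n")
    return "".join(parts)

# Toy demonstration pair (made up for illustration).
demos = [
    ("A paper on sparse attention for long-context language modeling.",
     "Proposes sparse attention to extend the usable context length."),
]
prompt = build_few_shot_prompt(demos, "A paper on tuning memory layers.")

# Hypothetical usage with a LongLLaMA checkpoint (downloads weights, so it
# is left commented out here):
# from transformers import AutoTokenizer, AutoModelForCausalLM
# tokenizer = AutoTokenizer.from_pretrained("syzymon/long_llama_3b")
# model = AutoModelForCausalLM.from_pretrained(
#     "syzymon/long_llama_3b", trust_remote_code=True)
# inputs = tokenizer(prompt, return_tensors="pt")
# output = model.generate(**inputs, max_new_tokens=64)
```

Matching the demonstrations' formatting to your target documents (as suggested above) tends to matter a lot for base models, since they rely entirely on pattern continuation.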
