Rouge Score Evaluator
| | |
| --- | --- |
| Score range | Float [0-1] |
| What is this metric? | Measures the quality of the generated text by comparing it to a reference text using n-gram recall, precision, and F1-score. |
| How does it work? | The ROUGE score (Recall-Oriented Understudy for Gisting Evaluation) evaluates the similarity between the generated text and the reference text based on n-gram overlap, including ROUGE-N (unigram, bigram, etc.) and ROUGE-L (longest common subsequence). It calculates precision, recall, and F1 scores to capture how well the generated text matches the reference text (see the sketch below this table). |
| When to use it? | Use the ROUGE score when you need a robust evaluation metric for text summarization, machine translation, and other natural language processing tasks, especially when focusing on recall and the ability to capture relevant information from the reference text. |
| What does it need as input? | Ground Truth Response, Generated Response |
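
The precision/recall/F1 computation described above can be reproduced locally. Below is a minimal sketch using the open-source `rouge-score` package (`pip install rouge-score`); this illustrates the metric itself and is an assumption about tooling, not necessarily this evaluator's exact implementation. The example texts are illustrative.

```python
from rouge_score import rouge_scorer

# Score ROUGE-1 (unigram overlap), ROUGE-2 (bigram overlap), and
# ROUGE-L (longest common subsequence) in one pass.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

ground_truth = "The cat sat on the mat."    # reference text
response = "A cat was sitting on the mat."  # generated text

# score(target, prediction) returns a dict mapping each ROUGE type
# to a Score namedtuple with precision, recall, and fmeasure fields.
scores = scorer.score(ground_truth, response)
for rouge_type, score in scores.items():
    print(f"{rouge_type}: P={score.precision:.2f} "
          f"R={score.recall:.2f} F1={score.fmeasure:.2f}")
```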
Version: 1 (Preview)
View in Studio: https://ml.azure.com/registries/azureml/models/Rouge-Score-Evaluator/version/1
is-promptflow: True
is-evaluator: True
show-artifact: True
_default-display-file: ./evaluator/_rouge.py
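
For reference, here is a hedged usage sketch of calling this evaluator through the `azure-ai-evaluation` SDK. The `RougeScoreEvaluator` / `RougeType` names and the output keys shown in the comment are assumptions that may vary across SDK versions; consult the Studio page above for the authoritative interface.

```python
from azure.ai.evaluation import RougeScoreEvaluator, RougeType

# Instantiate the evaluator for ROUGE-L; other RougeType values
# (e.g. ROUGE_1, ROUGE_2) select different n-gram variants.
rouge = RougeScoreEvaluator(rouge_type=RougeType.ROUGE_L)

result = rouge(
    ground_truth="The cat sat on the mat.",
    response="A cat was sitting on the mat.",
)
# Assumed output keys: rouge_precision, rouge_recall, rouge_f1_score
print(result)
```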