
Docs/updating the LangSmith docs #1828

Binary file added docs/_static/langsmith_dashboard.png
75 changes: 0 additions & 75 deletions docs/howtos/integrations/_langsmith.md

This file was deleted.

104 changes: 104 additions & 0 deletions docs/howtos/integrations/langsmith.md
@@ -0,0 +1,104 @@
# LangSmith

[LangSmith](https://docs.smith.langchain.com/) is an advanced tool designed to enhance the development and deployment of applications utilizing large language models (LLMs). It provides a comprehensive framework for tracing, analyzing, and optimizing LLM workflows, making it easier for developers to manage complex interactions within their applications.

This tutorial explains how to log traces of Ragas evaluations to LangSmith. Since Ragas is built on LangChain, you only need to set up LangSmith; it will then log the traces automatically.

## Setting Up LangSmith

To set up LangSmith, make sure you set the following environment variables (refer to the [LangSmith documentation](https://docs.smith.langchain.com/#quick-start) for more details):

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project> # Defaults to "default" if not set
```
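If you are working in a notebook rather than a shell, the same variables can be set from Python before running the evaluation. This is a minimal sketch; the project name `ragas-langsmith-demo` is a hypothetical placeholder, and you should substitute your own API key.

```python
import os

# Same configuration as the shell exports above, set from Python (e.g. in a notebook).
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"] = "<your-api-key>"  # substitute your real key
os.environ["LANGCHAIN_PROJECT"] = "ragas-langsmith-demo"  # hypothetical project name
```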

## Getting the Dataset

When creating an evaluation dataset or a single evaluation instance, make sure the field names match the schema used in `SingleTurnSample` or `MultiTurnSample`; the sketch after the dataset below shows the same fields constructed explicitly.


```python
from ragas import EvaluationDataset


dataset = [
{
"user_input": "Which CEO is widely recognized for democratizing AI education through platforms like Coursera?",
"retrieved_contexts": [
"Andrew Ng, CEO of Landing AI, is known for his pioneering work in deep learning and for democratizing AI education through Coursera."
],
"response": "Andrew Ng is widely recognized for democratizing AI education through platforms like Coursera.",
"reference": "Andrew Ng, CEO of Landing AI, is known for democratizing AI education through Coursera.",
},
{
"user_input": "Who is Sam Altman?",
"retrieved_contexts": [
"Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe, beneficial AI technologies."
],
"response": "Sam Altman is the CEO of OpenAI and advocates for safe, beneficial AI technologies.",
"reference": "Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe AI.",
},
{
"user_input": "Who is Demis Hassabis and how did he gain prominence?",
"retrieved_contexts": [
"Demis Hassabis, CEO of DeepMind, is known for developing systems like AlphaGo that master complex games."
],
"response": "Demis Hassabis is the CEO of DeepMind, known for developing systems like AlphaGo.",
"reference": "Demis Hassabis, CEO of DeepMind, is known for developing AlphaGo.",
},
{
"user_input": "Who is the CEO of Google and Alphabet Inc., praised for leading innovation across Google's product ecosystem?",
"retrieved_contexts": [
"Sundar Pichai, CEO of Google and Alphabet Inc., leads innovation across Google's product ecosystem."
],
"response": "Sundar Pichai is the CEO of Google and Alphabet Inc., praised for leading innovation across Google's product ecosystem.",
"reference": "Sundar Pichai, CEO of Google and Alphabet Inc., leads innovation across Google's product ecosystem.",
},
{
"user_input": "How did Arvind Krishna transform IBM?",
"retrieved_contexts": [
"Arvind Krishna, CEO of IBM, transformed the company by focusing on cloud computing and AI solutions."
],
"response": "Arvind Krishna transformed IBM by focusing on cloud computing and AI solutions.",
"reference": "Arvind Krishna, CEO of IBM, transformed the company through cloud computing and AI.",
},
]

evaluation_dataset = EvaluationDataset.from_list(dataset)
```
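If you want the field names validated up front, the same data can be built from explicit sample objects instead of dictionaries. This is a minimal sketch assuming `SingleTurnSample` and `EvaluationDataset` are importable from `ragas.dataset_schema`, as in recent Ragas releases.

```python
from ragas.dataset_schema import EvaluationDataset, SingleTurnSample

# One dictionary from the list above, expressed as an explicit sample;
# the keyword arguments must match the SingleTurnSample schema exactly.
sample = SingleTurnSample(
    user_input="Who is Sam Altman?",
    retrieved_contexts=[
        "Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe, beneficial AI technologies."
    ],
    response="Sam Altman is the CEO of OpenAI and advocates for safe, beneficial AI technologies.",
    reference="Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe AI.",
)

explicit_dataset = EvaluationDataset(samples=[sample])
```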

## Tracing Ragas Metrics

Run the Ragas evaluation on your dataset, and the traces will appear in your LangSmith dashboard under the specified project name (or "default" if no project was set).


```python
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from langchain_openai import ChatOpenAI
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness

llm = ChatOpenAI(model="gpt-4o-mini")
evaluator_llm = LangchainLLMWrapper(llm)

result = evaluate(
dataset=evaluation_dataset,
metrics=[LLMContextRecall(), Faithfulness(), FactualCorrectness()],
llm=evaluator_llm,
)

result
```

Output
```
Evaluating: 0%| | 0/15 [00:00<?, ?it/s]

{'context_recall': 1.0000, 'faithfulness': 0.9333, 'factual_correctness': 0.8520}
```
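Besides the aggregate scores above, per-sample scores can be inspected locally before reviewing the traces in LangSmith. This is a minimal sketch assuming the result object exposes `to_pandas()` (available in recent Ragas releases) and that pandas is installed.

```python
# Per-sample scores as a pandas DataFrame; columns follow the metric keys shown above.
df = result.to_pandas()
print(df.head())
```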

## LangSmith Dashboard
![jpeg](../../_static/langsmith_dashboard.png)
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -98,10 +98,11 @@ nav:
- Debug LLM Based Metrics: howtos/applications/_metrics_llm_calls.md
- Integrations:
- howtos/integrations/index.md
- LlamaIndex: howtos/integrations/_llamaindex.md
- Arize: howtos/integrations/_arize.md
- LangChain: howtos/integrations/langchain.md
- LangGraph: howtos/integrations/_langgraph_agent_evaluation.md
- LangSmith: howtos/integrations/langsmith.md
- LlamaIndex: howtos/integrations/_llamaindex.md
- Migrations:
- From v0.1 to v0.2: howtos/migrations/migrate_from_v01_to_v02.md
- 📖 References:
2 changes: 1 addition & 1 deletion src/ragas/metrics/_answer_relevance.py
@@ -1,5 +1,6 @@
from __future__ import annotations

import asyncio
import logging
import typing as t
from dataclasses import dataclass, field
@@ -16,7 +17,6 @@
SingleTurnMetric,
)
from ragas.prompt import PydanticPrompt
import asyncio

logger = logging.getLogger(__name__)

2 changes: 1 addition & 1 deletion src/ragas/sdk.py
@@ -91,4 +91,4 @@ def check_api_response(response: requests.Response) -> None:


def build_evaluation_app_url(app_url: str, run_id: str) -> str:
return f"{app_url}/dashboard/alignment/evaluation/{run_id}"
return f"{app_url}/dashboard/alignment/evaluation/{run_id}"
2 changes: 1 addition & 1 deletion src/ragas/testset/synthesizers/testset_schema.py
@@ -16,7 +16,7 @@
SingleTurnSample,
)
from ragas.exceptions import UploadException
from ragas.sdk import upload_packet, get_app_url
from ragas.sdk import get_app_url, upload_packet


class TestsetSample(BaseSample):