Readme + Docs #10

Merged
merged 19 commits into from
Feb 6, 2025
Changes from 11 commits
60 changes: 55 additions & 5 deletions README.md
@@ -1,16 +1,66 @@
# Cleanlab Codex [![Build Status](https://github.com/cleanlab/cleanlab-codex/actions/workflows/ci.yml/badge.svg)](https://github.com/cleanlab/cleanlab-codex/actions/workflows/ci.yml) [![PyPI - Version](https://img.shields.io/pypi/v/cleanlab-codex.svg)](https://pypi.org/project/cleanlab-codex) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/cleanlab-codex.svg)](https://pypi.org/project/cleanlab-codex)
# Cleanlab Codex - Closing the AI Knowledge Gap

## Table of Contents
[![Build Status](https://github.com/cleanlab/cleanlab-codex/actions/workflows/ci.yml/badge.svg)](https://github.com/cleanlab/cleanlab-codex/actions/workflows/ci.yml) [![PyPI - Version](https://img.shields.io/pypi/v/cleanlab-codex.svg)](https://pypi.org/project/cleanlab-codex) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/cleanlab-codex.svg)](https://pypi.org/project/cleanlab-codex)

- [Installation](#installation)
- [License](#license)
Codex enables you to seamlessly leverage knowledge from Subject Matter Experts (SMEs) to improve your RAG/Agentic applications.

## Installation
The `cleanlab-codex` library provides a simple interface to integrate Codex's capabilities into your RAG application.
See immediate impact with just a few lines of code!

## Demo

Install the package:

```console
pip install cleanlab-codex
```

Integrating Codex into your RAG application as a tool is as simple as:

```python
from cleanlab_codex import CodexTool

def rag(question, system_prompt, tools) -> str:
    """Your RAG/Agentic code here"""
    ...

# Initialize the Codex tool
codex_tool = CodexTool.from_access_key("your-access-key")

# Update your system prompt to include information on how to use the Codex tool
system_prompt = f"""Answer the user's Question based on the following Context. If the Context doesn't adequately address the Question, use the {codex_tool.tool_name} tool to ask an outside expert."""

# Convert the Codex tool to a framework-specific tool
framework_specific_codex_tool = codex_tool.to_<framework_name>_tool()  # e.g. codex_tool.to_llamaindex_tool() or codex_tool.to_openai_tool()

# Pass the Codex tool to your RAG/Agentic framework
response = rag(question, system_prompt, [framework_specific_codex_tool])
```

(Note: exact code will depend on the RAG/Agentic framework you are using)
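
As a rough illustration of the flow above, here is a minimal offline sketch. `StubCodexTool`, its `ask_advisor` name, and the toy `rag` function are hypothetical stand-ins rather than part of `cleanlab-codex`. They only mimic the documented shape of a tool that exposes `tool_name` and a `query(question)` method returning an answer or a fallback.

```python
from typing import Callable, Dict, Optional


class StubCodexTool:
    """Hypothetical stand-in for CodexTool so the sketch runs without network access."""

    tool_name = "ask_advisor"  # assumed name for illustration only

    def __init__(self, answers: Dict[str, str], fallback_answer: Optional[str] = None):
        self._answers = answers
        self._fallback_answer = fallback_answer

    def query(self, question: str) -> Optional[str]:
        # Return a stored expert answer if one exists, otherwise the fallback.
        return self._answers.get(question, self._fallback_answer)


def rag(
    question: str,
    context: Dict[str, str],
    tools: Dict[str, Callable[[str], Optional[str]]],
) -> str:
    """Toy RAG step: answer from the context, otherwise ask the expert tool."""
    if question in context:
        return context[question]
    answer = tools[StubCodexTool.tool_name](question)
    return answer if answer is not None else "I don't know."


tool = StubCodexTool(
    {"What is the refund window?": "30 days"},
    fallback_answer="Please contact support.",
)
tools = {tool.tool_name: tool.query}
print(rag("What is the refund window?", {}, tools))
```

In a real application the framework (not a hand-written `if`) decides when to call the tool, guided by the system prompt and the tool description.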
<!-- TODO: add demo video -->
<!-- Video should show Codex tool added to a RAG system, question asked that requires knowledge from an outside expert, Codex tool used to ask an outside expert, and expert response returned to the user -->

## Why Codex?
- **Identify Knowledge Gaps**: Codex captures knowledge gaps in your application so that you can easily identify which questions require expert input.
- **Efficiently Leverage SMEs**: Codex ensures the SMEs see the most critical knowledge gaps first. <!-- not sure if we should include this rn since it's not implemented yet -->
- **Easy Integration**: Integrate Codex into your RAG/Agentic application with just a few lines of code.
- **Immediate Impact**: SME responses instantly enhance your AI applications.

## How does Codex interact with my AI application?
Review comment (Member):
Let's avoid duplication with (planned/WIP) docs, and leave these TODO sections out of the README.

<!-- TODO: add architecture diagram w/ brief explanation -->


## What impact will I see?
<!-- TODO: benchmarks -->

## Documentation

Comprehensive documentation along with tutorials and examples can be found [here](https://help.cleanlab.ai/codex).

## Contributing
<!-- TODO: add contributing section or consider leaving out for now -->

## License

`cleanlab-codex` is distributed under the terms of the [MIT](https://spdx.org/licenses/MIT.html) license.
32 changes: 19 additions & 13 deletions src/cleanlab_codex/codex.py
@@ -1,18 +1,24 @@
"""Client for Cleanlab Codex."""

from __future__ import annotations

from typing import TYPE_CHECKING, Optional
from typing import TYPE_CHECKING as _TYPE_CHECKING
from typing import Optional

from cleanlab_codex.internal.project import create_project, query_project
from cleanlab_codex.internal.utils import init_codex_client

if TYPE_CHECKING:
if _TYPE_CHECKING:
from cleanlab_codex.types.entry import Entry, EntryCreate
from cleanlab_codex.types.organization import Organization


class Codex:
"""
A client to interact with Cleanlab Codex.
Client for interacting with Cleanlab Codex. In order to use this client, make sure you have an account at [codex.cleanlab.ai](https://codex.cleanlab.ai).

We recommend using the [Web UI](https://codex.cleanlab.ai) to [set up Codex projects](TODO: link to docs) and then using one of our abstractions around the client such as [`CodexTool`](/reference/python/codex_tool) to integrate Codex into your RAG/Agentic system.
This client can be used to programmatically set up Codex projects. The [`query`](#method-query) method can also be used directly if none of our existing abstractions are sufficient for your use case.
"""

def __init__(self, key: str | None = None):
@@ -34,7 +40,7 @@ def list_organizations(self) -> list[Organization]:
"""List the organizations the authenticated user is a member of.

Returns:
list[Organization]: A list of organizations the authenticated user is a member of.
list[Organization]: A list of organizations the authenticated user is a member of. See [`Organization`](/reference/python/codex_types#class-organization) for more information.

Raises:
AuthenticationError: If the client is not authenticated with a user-level API Key.
@@ -47,7 +53,7 @@ def create_project(self, name: str, organization_id: str, description: Optional[
Args:
name (str): The name of the project.
organization_id (str): The ID of the organization to create the project in. Must be authenticated as a member of this organization.
description (:obj:`str`, optional): The description of the project.
description (str, optional): The description of the project.

Returns:
int: The ID of the created project.
@@ -63,7 +69,7 @@ def add_entries(self, entries: list[EntryCreate], project_id: str) -> None:
"""Add a list of entries to the Codex project.

Args:
entries (list[EntryCreate]): The entries to add to the Codex project.
entries (list[EntryCreate]): The entries to add to the Codex project. See [`EntryCreate`](/reference/python/codex_types#class-entrycreate).
project_id (int): The ID of the project to add the entries to.

Raises:
@@ -84,7 +90,7 @@ def create_project_access_key(
Args:
project_id (int): The ID of the project to create the access key for.
access_key_name (str): The name of the access key.
access_key_description (:obj:`str`, optional): The description of the access key.
access_key_description (str, optional): The description of the access key.

Returns:
str: The access key token.
@@ -99,25 +105,25 @@ def query(
self,
question: str,
*,
project_id: Optional[str] = None, # TODO: update to uuid once project IDs are changed to UUIDs
project_id: Optional[str] = None,
fallback_answer: Optional[str] = None,
read_only: bool = False,
) -> tuple[Optional[str], Optional[Entry]]:
"""Query Codex to check if the Codex project contains an answer to this question and add the question to the Codex project for SME review if it does not.

Args:
question (str): The question to ask the Codex API.
project_id (:obj:`int`, optional): The ID of the project to query.
project_id (int, optional): The ID of the project to query.
If the client is authenticated with a user-level API Key, this is required.
If the client is authenticated with a project-level Access Key, this is optional. The client will use the Access Key's project ID by default.
fallback_answer (:obj:`str`, optional): Optional fallback answer to return if Codex is unable to answer the question.
read_only (:obj:`bool`, optional): Whether to query the Codex API in read-only mode. If True, the question will not be added to the Codex project for SME review.
fallback_answer (str, optional): Optional fallback answer to return if Codex is unable to answer the question.
read_only (bool, optional): Whether to query the Codex API in read-only mode. If True, the question will not be added to the Codex project for SME review.
This can be useful for testing purposes when setting up your project configuration.

Returns:
tuple[Optional[str], Optional[Entry]]: A tuple representing the answer for the query and the existing or new entry in the Codex project.
If Codex is able to answer the question, the first element will be the answer returned by Codex and the second element will be the existing entry in the Codex project.
If Codex is unable to answer the question, the first element will be `fallback_answer` if provided, otherwise None, and the second element will be a new entry in the Codex project.
If Codex is able to answer the question, the first element will be the answer returned by Codex and the second element will be the existing [`Entry`](/reference/python/codex_types#class-entry) in the Codex project.
If Codex is unable to answer the question, the first element will be `fallback_answer` if provided, otherwise None. The second element will be a new [`Entry`](/reference/python/codex_types#class-entry) in the Codex project.
"""
return query_project(
client=self._client,
Expand Down
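
The return contract described in the `query` docstring can be sketched with an in-memory stand-in. `FakeCodex` and `FakeEntry` below are hypothetical and do not reflect the real client or the real `Entry` fields. In particular, returning `None` as the entry in read-only mode is an assumption, since the docstring only says the question is not added.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class FakeEntry:
    """Hypothetical stand-in for Entry; the real type may have different fields."""

    question: str
    answer: Optional[str] = None


class FakeCodex:
    """Hypothetical in-memory client that mirrors the documented `query` contract."""

    def __init__(self) -> None:
        self.entries: Dict[str, FakeEntry] = {}

    def query(
        self,
        question: str,
        *,
        fallback_answer: Optional[str] = None,
        read_only: bool = False,
    ) -> Tuple[Optional[str], Optional[FakeEntry]]:
        entry = self.entries.get(question)
        if entry is not None and entry.answer is not None:
            # Answered: return the answer and the existing entry.
            return entry.answer, entry
        if entry is None and not read_only:
            # Unanswered: record the question so an SME can review it.
            entry = FakeEntry(question)
            self.entries[question] = entry
        # Assumption: in read-only mode nothing is stored and the entry is None.
        return fallback_answer, entry


codex = FakeCodex()
answer, entry = codex.query("How do I reset my password?", fallback_answer="Ask support.")
print(answer, entry)
```

Callers that only need the text can take element `[0]` of the tuple, which is what `CodexTool.query` does in this PR.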
56 changes: 48 additions & 8 deletions src/cleanlab_codex/codex_tool.py
@@ -1,3 +1,5 @@
"""Tool abstraction for Cleanlab Codex."""

from __future__ import annotations

from typing import Any, ClassVar, Optional
@@ -38,7 +40,16 @@ def from_access_key(
project_id: Optional[str] = None,
fallback_answer: Optional[str] = DEFAULT_FALLBACK_ANSWER,
) -> CodexTool:
"""Creates a CodexTool from an access key. The project ID that the CodexTool will use is the one that is associated with the access key."""
"""Creates a CodexTool from an access key. The project ID that the CodexTool will use is the one that is associated with the access key.

Args:
access_key (str): The access key for the Codex project.
project_id (str, optional): The ID of the project to use. If not provided, the project ID will be inferred from the access key. If provided, the project ID must be the ID of the project that the access key is associated with.
fallback_answer (str, optional): The fallback answer to use if the Codex project cannot answer the question.

Returns:
CodexTool: The CodexTool.
"""
return cls(
codex_client=Codex(key=access_key),
project_id=project_id,
@@ -56,6 +67,14 @@ def from_client(
"""Creates a CodexTool from a Codex client.
If the Codex client is initialized with a project access key, the CodexTool will use the project ID that is associated with the access key.
If the Codex client is initialized with a user API key, a project ID must be provided.

Args:
codex_client (Codex): The Codex client to use.
project_id (str, optional): The ID of the project to use. If not provided and the Codex client is authenticated with a project-level access key, the project ID will be inferred from the access key.
fallback_answer (str, optional): The fallback answer to use if the Codex project cannot answer the question.

Returns:
CodexTool: The CodexTool.
"""
return cls(
codex_client=codex_client,
@@ -65,17 +84,31 @@ def from_client(

@property
def tool_name(self) -> str:
"""The name to use for the tool when passing to an LLM."""
"""The name to use for the tool when passing to an LLM. This is the name the LLM will use when determining whether to call the tool.

Note: We recommend using the default tool name which we've benchmarked. Only override this if you have a specific reason."""
return self._tool_name

@tool_name.setter
def tool_name(self, value: str) -> None:
"""Sets the name to use for the tool when passing to an LLM."""
self._tool_name = value

@property
def tool_description(self) -> str:
"""The description to use for the tool when passing to an LLM."""
"""The description to use for the tool when passing to an LLM. This is the description that the LLM will see when determining whether to call the tool.

Note: We recommend using the default tool description which we've benchmarked. Only override this if you have a specific reason."""
return self._tool_description

@tool_description.setter
def tool_description(self, value: str) -> None:
"""Sets the description to use for the tool when passing to an LLM."""
self._tool_description = value

@property
def fallback_answer(self) -> Optional[str]:
"""The fallback answer to use if the Codex project cannot answer the question."""
"""The fallback answer to use if the Codex project cannot answer the question. This will be returned by the tool if the Codex project does not have an answer to the question."""
return self._fallback_answer

@fallback_answer.setter
@@ -90,12 +123,13 @@ def query(self, question: str) -> Optional[str]:
question: The question to ask the advisor. This should be the same as the original user question, except in cases where the user question is missing information that could be additionally clarified.

Returns:
The answer to the question, or None if the answer is not available.
The answer to the question if available. If no answer is available, the fallback answer is returned if provided, otherwise None is returned.
"""
return self._codex_client.query(question, project_id=self._project_id, fallback_answer=self._fallback_answer)[0]

def to_openai_tool(self) -> dict[str, Any]:
"""Converts the tool to an OpenAI tool."""
"""Converts the tool to the expected format for an [OpenAI function tool](https://platform.openai.com/docs/guides/function-calling).
See more information on defining functions for OpenAI tool calls [here](https://platform.openai.com/docs/guides/function-calling#defining-functions)."""
from cleanlab_codex.utils import format_as_openai_tool

return format_as_openai_tool(
@@ -106,7 +140,10 @@ def to_openai_tool(self) -> dict[str, Any]:
)

def to_smolagents_tool(self) -> Any:
"""Converts the tool to a smolagents tool."""
"""Converts the tool to a [smolagents tool](https://huggingface.co/docs/smolagents/reference/tools#smolagents.Tool).

Note: You must have the [`smolagents` library installed](https://github.com/huggingface/smolagents/tree/main?tab=readme-ov-file#quick-demo) to use this method.
Review comment (Member):
Suggested change
Note: You must have the [`smolagents` library installed](https://github.com/huggingface/smolagents/tree/main?tab=readme-ov-file#quick-demo) to use this method.
Note: You must have the [`smolagents` library installed](https://github.com/huggingface/smolagents) to use this method.

"""
from cleanlab_codex.utils.smolagents import CodexTool as SmolagentsCodexTool

return SmolagentsCodexTool(
@@ -117,7 +154,10 @@ def to_llamaindex_tool(self) -> Any:
)

def to_llamaindex_tool(self) -> Any:
"""Converts the tool to a LlamaIndex FunctionTool."""
"""Converts the tool to a [LlamaIndex FunctionTool](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/tools/#functiontool).

Note: You must have the [`llama-index` library installed](https://docs.llamaindex.ai/en/stable/getting_started/installation/) to use this method.
"""
from llama_index.core.tools import FunctionTool

from cleanlab_codex.utils.llamaindex import get_function_schema
Expand Down