Merge pull request #167 from huggingface/module/agents-module

smolagents module in smolcourse
huggingface · Jan 13, 2025 · 1820d6b
2 parents 882ba52 + b62eb3a
Showing 5 changed files with 472 additions and 0 deletions.
diff --git a/8_agents/README.md b/8_agents/README.md
@@ -0,0 +1,36 @@
+# Agents
+
+AI Agents are autonomous systems that can understand user requests, break them down into steps, and execute actions to accomplish tasks. They combine language models with tools and external functions to interact with their environment. This module covers how to build effective agents using the [`smolagents`](https://github.com/huggingface/smolagents) library, which provides a lightweight framework for creating capable AI agents.
+
+## Module Overview
+
+Building effective agents requires understanding three key components. First, retrieval capabilities allow agents to access and use relevant information from various sources. Second, function calling enables agents to take concrete actions in their environment. Finally, domain-specific knowledge and tooling equip agents for specialized tasks like code manipulation.
+
+## Contents
+
+### 1️⃣ [Retrieval Agents](./retrieval_agents.md)
+
+Retrieval agents combine models with knowledge bases. These agents can search and synthesize information from multiple sources, leveraging vector stores for efficient retrieval and implementing RAG (Retrieval Augmented Generation) patterns. They are great at combining web search with custom knowledge bases while maintaining conversation context through memory systems. The module covers implementation strategies including fallback mechanisms for robust information retrieval.
+
+### 2️⃣ [Code Agents](./code_agents.md)
+
+Code agents are specialized autonomous systems designed for software development tasks. These agents excel at analyzing and generating code, performing automated refactoring, and integrating with development tools. The module covers best practices for building code-focused agents that can understand programming languages, work with build systems, and interact with version control while maintaining high code quality standards.
+
+### 3️⃣ [Custom Functions](./custom_functions.md)
+
+Custom function agents extend basic AI capabilities through specialized function calls. This module explores how to design modular and extensible function interfaces that integrate directly with your application's logic. You'll learn to implement proper validation and error handling while creating reliable function-driven workflows. The focus is on building simple systems where agents can predictably interact with external tools and services.
+
+### Exercise Notebooks
+
+| Title | Description | Exercise | Link | Colab |
+|-------|-------------|----------|------|-------|
+| Building a Research Agent | Create an agent that can perform research tasks using retrieval and custom functions | 🐢 Build a simple RAG agent <br> 🐕 Add custom search functions <br> 🦁 Create a full research assistant | [Notebook](./notebooks/agents.ipynb) | <a target="_blank" href="https://colab.research.google.com/github/huggingface/smol-course/blob/main/8_agents/notebooks/agents.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> |
+
+## Resources
+
+- [smolagents Documentation](https://huggingface.co/docs/smolagents) - Official docs for the `smolagents` library
+- [Building Effective Agents](https://www.anthropic.com/research/building-effective-agents) - Anthropic's article on agent design patterns
+- [Agent Guidelines](https://huggingface.co/docs/smolagents/tutorials/building_good_agents) - Best practices for building reliable agents
+- [LangChain Agents](https://python.langchain.com/docs/modules/agents/) - Additional examples of agent implementations
+- [Function Calling Guide](https://platform.openai.com/docs/guides/function-calling) - Understanding function calling in LLMs
+- [RAG Best Practices](https://www.pinecone.io/learn/retrieval-augmented-generation/) - Guide to implementing effective RAG
diff --git a/8_agents/code_agents.md b/8_agents/code_agents.md
@@ -0,0 +1,68 @@
+# Code Agents
+
+Code agents are specialized autonomous systems that handle coding tasks like analysis, generation, refactoring, and testing. These agents leverage domain knowledge about programming languages, build systems, and version control to enhance software development workflows.
+
+## Why Code Agents?
+
+Code agents accelerate development by automating repetitive tasks while maintaining code quality. They excel at generating boilerplate code, performing systematic refactoring, and identifying potential issues through static analysis. The agents combine retrieval capabilities to access external documentation and repositories with function calling to execute concrete actions like creating files or running tests.
+
+## Building Blocks of a Code Agent
+
+Code agents are built on specialized language models fine-tuned for code understanding. These models are augmented with development tools like linters, formatters, and compilers to interact with real-world environments. Through retrieval techniques, agents maintain contextual awareness by accessing documentation and code histories to align with organizational patterns and standards. Action-oriented functions enable agents to perform concrete tasks such as committing changes or initiating merge requests.
+
+In the following example, we create a code agent that can search the web using DuckDuckGo much like the retrieval agent we built earlier.
+
+```python
+from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel
+
+agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
+
+agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")
+```
+
+In the following example, we create a code agent that can get the travel time between two locations. Here, we use the `@tool` decorator to define a custom function that can be used as a tool.
+
+```python
+from typing import Optional
+
+from smolagents import CodeAgent, HfApiModel, tool
+
+@tool
+def get_travel_duration(start_location: str, destination_location: str, departure_time: Optional[int] = None) -> str:
+    """Gets the travel time by public transit between two places.
+    
+    Args:
+        start_location: the place from which you start your ride
+        destination_location: the place of arrival
+        departure_time: the departure time, provide only a `datetime.datetime` if you want to specify this
+    """
+    import googlemaps # All imports are placed within the function, to allow for sharing to Hub.
+    import os
+
+    gmaps = googlemaps.Client(os.getenv("GMAPS_API_KEY"))
+
+    if departure_time is None:
+        from datetime import datetime
+        departure_time = datetime(2025, 1, 6, 11, 0)
+
+    directions_result = gmaps.directions(
+        start_location,
+        destination_location,
+        mode="transit",
+        departure_time=departure_time
+    )
+    return directions_result[0]["legs"][0]["duration"]["text"]
+
+agent = CodeAgent(tools=[get_travel_duration], model=HfApiModel(), additional_authorized_imports=["datetime"])
+
+agent.run("Can you give me a nice one-day trip around Paris with a few locations and the times? Could be in the city or outside, but should fit in one day. I'm travelling only via public transportation.")
+
+```
+
+These examples are just the beginning of what you can do with code agents. You can learn more about how to build code agents in the [smolagents documentation](https://huggingface.co/docs/smolagents).
+
+The `smolagents` library provides a lightweight framework for building code agents, with a core implementation of approximately 1,000 lines of code. The framework specializes in agents that write and execute Python code snippets, offering sandboxed execution for security. It supports both open-source and proprietary language models, making it adaptable to various development environments.
+
+## Further Reading
+
+- [smolagents Blog](https://huggingface.co/blog/smolagents) - Introduction to `smolagents` and code interactions
+- [smolagents: Building Good Agents](https://huggingface.co/docs/smolagents/tutorials/building_good_agents) - Best practices for reliable agents
+- [Building Effective Agents - Anthropic](https://www.anthropic.com/research/building-effective-agents) - Agent design principles
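
Review note on `8_agents/code_agents.md`: the second example passes `additional_authorized_imports=["datetime"]`, and the closing paragraph mentions sandboxed execution, but neither says what an import allowlist does. The sketch below is one way such a check could work, using only the standard library. It is not the `smolagents` implementation; `BASE_ALLOWED` and `find_unauthorized_imports` are hypothetical names chosen for illustration.

```python
import ast

# Hypothetical default allowlist; the real library's defaults may differ.
BASE_ALLOWED = {"math", "random", "re"}


def find_unauthorized_imports(source: str, extra_allowed=()) -> list:
    """Return the sorted top-level modules that `source` imports but may not use."""
    allowed = BASE_ALLOWED | set(extra_allowed)
    blocked = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name; flag them explicitly.
            modules = [node.module or "<relative>"]
        else:
            continue
        for module in modules:
            root = module.split(".")[0]  # "os.path" is checked as "os"
            if root not in allowed:
                blocked.add(root)
    return sorted(blocked)


snippet = "import math\nfrom datetime import datetime\nimport os.path\n"
print(find_unauthorized_imports(snippet))
print(find_unauthorized_imports(snippet, extra_allowed=["datetime"]))
```

A static check like this only inspects `import` statements. It does not by itself make execution safe, so it should be read as an illustration of the allowlist idea rather than a sandbox.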
diff --git a/8_agents/custom_functions.md b/8_agents/custom_functions.md
@@ -0,0 +1,69 @@
+# Custom Function Agents
+
+Custom Function Agents are AI agents that leverage specialized function calls (or “tools”) to perform tasks. Unlike general-purpose agents, Custom Function Agents focus on powering advanced workflows by integrating directly with your application's logic. For example, you can expose database queries, system commands, or any custom utility as isolated functions for the agent to invoke.
+
+## Why Custom Function Agents?
+
+- **Modular and Extensible**: Instead of building one monolithic agent, you can design individual functions that represent discrete capabilities, making your architecture more extensible.
+- **Fine-Grained Control**: Developers can carefully control the agent’s actions by specifying exactly which functions are available and what parameters they accept.
+- **Improved Reliability**: By structuring each function with clear schemas and validations, you reduce errors and unexpected behaviors.
+
+## Basic Workflow
+
+1. **Identify Functions**  
+   Determine which tasks can be transformed into custom functions (e.g., file I/O, database queries, streaming data processing).
+
+2. **Define the Interface**  
+   Use a function signature or schema that precisely outlines each function’s inputs, outputs, and expected behavior. This enforces strong contracts between your agent and its environment.
+
+3. **Register with the Agent**  
+   Your agent needs to “learn” which functions are available. Typically, you pass metadata describing each function’s interface to the language model or agent framework.
+
+4. **Invoke and Validate**  
+   Once the agent selects a function to call, run the function with the provided arguments and validate the results. If valid, feed the results back to the agent for context to drive subsequent decisions.
+
+## Example
+
+Below is a simplified example demonstrating how custom function calls might look in pseudocode. The objective is to perform a user-defined search and retrieve relevant content:
+
+```python
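+# Pseudocode: `database`, `DatabaseError`, and `agent` are placeholders for your own objects.
+# `register_function`, `invoke`, and `process_results` are illustrative names, not smolagents APIs.
+import logging
+
+# Define a custom function with clear input/output types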
+def search_database(query: str) -> list:
+    """
+    Search the database for articles matching the query.
+    
+    Args:
+        query (str): The search query string
+        
+    Returns:
+        list: List of matching article results
+    """
+    try:
+        results = database.search(query)
+        return results
+    except DatabaseError as e:
+        logging.error(f"Database search failed: {e}")
+        return []
+
+# Register the function with the agent
+agent.register_function(
+    name="search_database",
+    function=search_database,
+    description="Searches database for articles matching a query"
+)
+
+# Example usage
+def process_search():
+    query = "Find recent articles on AI"
+    results = agent.invoke("search_database", query)
+
+    if results:
+        agent.process_results(results)
+    else:
+        logging.info("No results found for query")
+```
+
+## Further Reading
+
+- [smolagents Blog](https://huggingface.co/blog/smolagents) - Learn about the latest advancements in AI agents and how they can be applied to custom function agents.
+- [Building Good Agents](https://huggingface.co/docs/smolagents/tutorials/building_good_agents) - A comprehensive guide on best practices for developing reliable and effective custom function agents.
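
Review note on `8_agents/custom_functions.md`: the "Basic Workflow" steps (define the interface, register, invoke and validate) can be made concrete without any agent framework. The sketch below uses only the standard library. `FunctionRegistry` and `word_count` are hypothetical names for illustration and are not part of `smolagents`; type checking is limited to plain classes such as `str` and `int`.

```python
import inspect
from typing import Any, Callable, Dict


class FunctionRegistry:
    """Hypothetical registry illustrating register / describe / invoke."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., Any]] = {}

    def register(self, func: Callable[..., Any]) -> Callable[..., Any]:
        # Step 3: make the function discoverable under its own name.
        self._functions[func.__name__] = func
        return func

    def describe(self) -> Dict[str, str]:
        # Metadata a model could be shown: name mapped to its signature.
        return {n: f"{n}{inspect.signature(f)}" for n, f in self._functions.items()}

    def invoke(self, name: str, **kwargs: Any) -> Any:
        # Step 4: validate the call, run it, then validate the result.
        if name not in self._functions:
            raise KeyError(f"Unknown function: {name}")
        func = self._functions[name]
        signature = inspect.signature(func)
        try:
            bound = signature.bind(**kwargs)  # rejects missing or extra arguments
        except TypeError as exc:
            raise ValueError(f"Bad arguments for {name}: {exc}") from exc
        for arg_name, value in bound.arguments.items():
            expected = signature.parameters[arg_name].annotation
            if expected is inspect.Parameter.empty:
                continue  # no annotation, nothing to check
            if isinstance(expected, type) and not isinstance(value, expected):
                raise ValueError(f"{arg_name} must be {expected.__name__}")
        result = func(*bound.args, **bound.kwargs)
        returns = signature.return_annotation
        if (
            returns is not inspect.Signature.empty
            and isinstance(returns, type)
            and not isinstance(result, returns)
        ):
            raise ValueError(f"{name} returned an unexpected type")
        return result


registry = FunctionRegistry()


@registry.register
def word_count(text: str) -> int:
    """Count whitespace-separated words in a string."""
    return len(text.split())


print(registry.describe())
print(registry.invoke("word_count", text="agents call tools"))
```

Raising on a bad call, rather than returning an empty result, lets the caller feed the error message back to the model so it can retry with corrected arguments.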
diff --git a/8_agents/notebooks/agents.ipynb b/8_agents/notebooks/agents.ipynb
@@ -0,0 +1,191 @@
+{
+    "cells": [
+        {
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
+                "# Building AI Agents\n",
+                "\n",
+                "This notebook contains exercises to help you learn how to build different types of agents using the `smolagents` library. We'll progress from basic to more complex implementations.\n",
+                "\n",
+                "## Setup\n",
+                "\n",
+                "First, let's install the required packages:"
+            ]
+        },
+        {
+            "cell_type": "code",
+            "execution_count": null,
+            "metadata": {},
+            "outputs": [],
+            "source": [
+                "!pip install smolagents\n",
+                "\n",
+                "# Install the requirements in Google Colab\n",
+                "# !pip install transformers datasets trl huggingface_hub\n",
+                "\n",
+                "# Authenticate to Hugging Face\n",
+                "from huggingface_hub import login\n",
+                "\n",
+                "login()"
+            ]
+        },
+        {
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
+                "## 🐢 Exercise 1: Basic Code Agent\n",
+                "\n",
+                "Let's start by creating a simple code agent that can answer programming-related questions using web search."
+            ]
+        },
+        {
+            "cell_type": "code",
+            "execution_count": null,
+            "metadata": {},
+            "outputs": [],
+            "source": [
+                "from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel\n",
+                "\n",
+                "# Initialize the agent\n",
+                "agent = CodeAgent(\n",
+                "    tools=[DuckDuckGoSearchTool()],\n",
+                "    model=HfApiModel()\n",
+                ")\n",
+                "\n",
+                "# Test the agent\n",
+                "response = agent.run(\"What's the difference between a list and a tuple in Python?\")\n",
+                "print(response)"
+            ]
+        },
+        {
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
+                "### 🤔 Exercise 1 Challenge\n",
+                "Try asking the agent to explain different programming concepts and evaluate its responses. How well does it handle:\n",
+                "1. Basic syntax questions\n",
+                "2. Language-specific features\n",
+                "3. Code examples"
+            ]
+        },
+        {
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
+                "## 🐕 Exercise 2: Agent with Custom Functions\n",
+                "\n",
+                "Now let's create an agent that can perform specific tasks using custom functions. We'll implement a simple calculator tool."
+            ]
+        },
+        {
+            "cell_type": "code",
+            "execution_count": null,
+            "metadata": {},
+            "outputs": [],
+            "source": [
+                "from smolagents import CodeAgent, HfApiModel, tool\n",
+                "from typing import List\n",
+                "\n",
+                "@tool\n",
+                "def calculate(operation: str, numbers: List[float]) -> float:\n",
+                "    \"\"\"Performs basic mathematical operations on a list of numbers.\n",
+                "    \n",
+                "    Args:\n",
+                "        operation: One of 'sum', 'average', 'multiply', 'min', 'max'\n",
+                "        numbers: List of numbers to operate on\n",
+                "    \n",
+                "    Returns:\n",
+                "        float: Result of the operation\n",
+                "    \"\"\"\n",
+                "    if operation == 'sum':\n",
+                "        return sum(numbers)\n",
+                "    elif operation == 'average':\n",
+                "        return sum(numbers) / len(numbers)\n",
+                "    elif operation == 'multiply':\n",
+                "        result = 1\n",
+                "        for n in numbers:\n",
+                "            result *= n\n",
+                "        return result\n",
+                "    elif operation == 'min':\n",
+                "        return min(numbers)\n",
+                "    elif operation == 'max':\n",
+                "        return max(numbers)\n",
+                "    else:\n",
+                "        raise ValueError(f\"Unknown operation: {operation}\")\n",
+                "\n",
+                "# Create agent with custom tool\n",
+                "math_agent = CodeAgent(\n",
+                "    tools=[calculate],\n",
+                "    model=HfApiModel()\n",
+                ")\n",
+                "\n",
+                "# Test the agent\n",
+                "response = math_agent.run(\"What is the average of 10, 15, 20, 25, and 30?\")\n",
+                "print(response)"
+            ]
+        },
+        {
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
+                "### 🤔 Exercise 2 Challenge\n",
+                "1. Add more mathematical operations to the calculator tool\n",
+                "2. Create a new custom tool (e.g., for string manipulation or date calculations)\n",
+                "3. Combine multiple custom tools in one agent"
+            ]
+        },
+        {
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
+                "## 🦁 Exercise 3: Advanced Retrieval Agent\n",
+                "\n",
+                "Finally, let's build a more sophisticated agent that combines web search with memory to maintain context across conversations."
+            ]
+        },
+        {
+            "cell_type": "code",
+            "execution_count": null,
+            "metadata": {},
+            "outputs": [],
+            "source": [
+                "from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel\n",
+                "\n",
+                "# Initialize the agent with memory\n",
+                "research_agent = CodeAgent(\n",
+                "    ... # TODO: Define the agent\n",
+                ")\n",
+                "\n",
+                "# Test with a multi-turn conversation\n",
+                "questions = [\n",
+                "    \"What are the main types of machine learning?\",\n",
+                "    \"Can you explain supervised learning in more detail?\",\n",
+                "    \"What are some popular algorithms for this type?\"\n",
+                "]\n",
+                "\n",
+                "# TODO: Test the agent  "
+            ]
+        },
+        {
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
+                "### 🤔 Exercise 3 Challenge\n",
+                "1. Test how well the agent maintains context across different topics\n",
+                "2. Implement a custom knowledge base tool (as shown in the retrieval_agents.md example)\n",
+                "3. Create a hybrid agent that combines code understanding with research capabilities"
+            ]
+        }
+    ],
+    "metadata": {
+        "kernelspec": {
+            "display_name": "Python 3",
+            "language": "python",
+            "name": "python3"
+        }
+    },
+    "nbformat": 4,
+    "nbformat_minor": 2
+}