feat(skills): add skills to the pandas-ai library #653

Merged
merged 12 commits into from
Oct 21, 2023
54 changes: 54 additions & 0 deletions docs/examples.md
@@ -260,3 +260,57 @@ for question in questions:
response = agent.explain()
print(response)
```

## Add Skills to the Agent

You can add custom functions for the agent to use, allowing the agent to expand its capabilities. These custom functions are integrated with the agent as skills, enabling a wide range of user-defined operations.

```python
import pandas as pd
from pandasai import Agent

from pandasai.llm.openai import OpenAI
from pandasai.skills import skill

employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}

salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}

employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)


@skill(
    name="Display employee salary",
    description="Plots the employee salaries against names",
    usage="Displays the plot having name on x axis and salaries on y axis",
)
def plot_salaries(merged_df: pd.DataFrame) -> str:
    import matplotlib.pyplot as plt

    plt.bar(merged_df["Name"], merged_df["Salary"])
    plt.xlabel("Employee Name")
    plt.ylabel("Salary")
    plt.title("Employee Salaries")
    plt.xticks(rotation=45)
    plt.savefig("temp_chart.png")
    plt.close()


llm = OpenAI("YOUR_API_KEY")
agent = Agent([employees_df, salaries_df], config={"llm": llm}, memory_size=10)

agent.add_skills(plot_salaries)

# Chat with the agent
response = agent.chat("Plot the employee salaries against names")
print(response)

```
109 changes: 109 additions & 0 deletions docs/skills.md
@@ -0,0 +1,109 @@
# Skills

You can add custom functions for the agent to use, allowing the agent to expand its capabilities. These custom functions are integrated with the agent as skills, enabling a wide range of user-defined operations.

## Example Usage

```python

import pandas as pd
from pandasai import Agent

from pandasai.llm.openai import OpenAI
from pandasai.skills import skill

employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}

salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}

employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)


@skill(
    name="Display employee salary",
    description="Plots the employee salaries against names",
    usage="Displays the plot having name on x axis and salaries on y axis",
)
def plot_salaries(merged_df: pd.DataFrame) -> str:
Review comment (Collaborator):

We should probably show a less rigid implementation. What I have in mind is something that lets the LLM orchestrate and pass the variables accordingly.

For example:

def plot_salaries(names: ..., salaries: ...) -> ...:
    plt.bar(names, salaries)
    plt.xlabel("Employee Name")
    plt.ylabel("Salary")
    plt.title("Employee Salaries")
    plt.xticks(rotation=45)
    plt.savefig("temp_chart.png")
    plt.close()

That way it is the LLM that figures out which data to pass to the skill. The skill shouldn't be specific to one use case or dataframe; it should be a more "generalist" function that can be used in many similar cases, with the LLM orchestrating and figuring out which skill, if any, is the right one to use.

    import matplotlib.pyplot as plt

    plt.bar(merged_df["Name"], merged_df["Salary"])
    plt.xlabel("Employee Name")
    plt.ylabel("Salary")
    plt.title("Employee Salaries")
    plt.xticks(rotation=45)
    plt.savefig("temp_chart.png")
    plt.close()
gventuri marked this conversation as resolved.


llm = OpenAI("YOUR_API_KEY")
agent = Agent([employees_df, salaries_df], config={"llm": llm}, memory_size=10)

agent.add_skills(plot_salaries)

# Chat with the agent
response = agent.chat("Plot the employee salaries against names")


```
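
The review comment on `plot_salaries` above asks for a skill that is not tied to one dataframe. One possible reading of that suggestion is sketched below; it is not the code merged in this PR. The name `plot_bar` and its parameters are hypothetical, the `@skill` decorator is left out so the snippet runs without pandasai, and the last few lines merely stand in for the code an LLM might generate when deciding which columns to pass.

```python
import os
import tempfile
from typing import Sequence

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_bar(
    labels: Sequence[str],
    values: Sequence[float],
    xlabel: str,
    ylabel: str,
    title: str,
    path: str,
) -> str:
    """Hypothetical generalist skill: bar chart of `values` against `labels`."""
    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    ax.tick_params(axis="x", labelrotation=45)
    fig.savefig(path)
    plt.close(fig)
    return path  # return the file path so the annotation `-> str` holds


# Stand-in for LLM-generated code: the caller, not the skill, chooses the data.
names = ["John", "Emma", "Liam", "Olivia", "William"]
salaries = [5000, 6000, 4500, 7000, 5500]
chart_path = os.path.join(tempfile.mkdtemp(), "temp_chart.png")
saved = plot_bar(
    names, salaries, "Employee Name", "Salary", "Employee Salaries", chart_path
)
```

Because the function only sees sequences and labels, the same skill could serve any similar "plot X against Y" query, which seems to be the point of the comment.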

## Add Streamlit Skill

```python
import pandas as pd
from pandasai import Agent

from pandasai.llm.openai import OpenAI
from pandasai.skills import skill
import streamlit as st

employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}

salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}

employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)


@skill(
    name="Display employee salary",
    description="Plots the employee salaries against names",
    usage="Displays the plot having name on x axis and salaries on y axis",
)
def plot_salaries_using_streamlit(merged_df: pd.DataFrame) -> str:
    import matplotlib.pyplot as plt

    plt.bar(merged_df["Name"], merged_df["Salary"])
    plt.xlabel("Employee Name")
    plt.ylabel("Salary")
    plt.title("Employee Salaries")
    plt.xticks(rotation=45)
    plt.savefig("temp_chart.png")
    fig = plt.gcf()
    st.pyplot(fig)
gventuri marked this conversation as resolved.


llm = OpenAI("YOUR_API_KEY")
agent = Agent([employees_df, salaries_df], config={"llm": llm}, memory_size=10)

agent.add_skills(plot_salaries_using_streamlit)

# Chat with the agent
response = agent.chat("Plot the employee salaries against names")
print(response)
```
46 changes: 46 additions & 0 deletions examples/skills_example.py
@@ -0,0 +1,46 @@
import pandas as pd
from pandasai import Agent

from pandasai.llm.openai import OpenAI
from pandasai.skills import skill

employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}

salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}

employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)


@skill(
    name="Display employee salary",
    description="Plots the employee salaries against names",
    usage="Displays the plot having name on x axis and salaries on y axis",
)
def plot_salaries(merged_df: pd.DataFrame) -> str:
    import matplotlib.pyplot as plt

    plt.bar(merged_df["Name"], merged_df["Salary"])
    plt.xlabel("Employee Name")
    plt.ylabel("Salary")
    plt.title("Employee Salaries")
    plt.xticks(rotation=45)
    plt.savefig("temp_chart.png")
    plt.close()


llm = OpenAI("YOUR_API_KEY")
agent = Agent([employees_df, salaries_df], config={"llm": llm}, memory_size=10)

agent.add_skills(plot_salaries)

# Chat with the agent
response = agent.chat("Plot the employee salaries against names")
print(response)
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -26,6 +26,7 @@ nav:
- callbacks.md
- custom-instructions.md
- custom-prompts.md
- skills.md
- custom-whitelisted-dependencies.md
- Examples:
- examples.md
10 changes: 9 additions & 1 deletion pandasai/__init__.py
@@ -45,6 +45,7 @@
from .schemas.df_config import Config
from .helpers.cache import Cache
from .agent import Agent
from .skills import skill

__version__ = importlib.metadata.version(__package__ or __name__)

@@ -257,4 +258,11 @@ def clear_cache(filename: str = None):
cache.clear()


__all__ = ["PandasAI", "SmartDataframe", "SmartDatalake", "Agent", "clear_cache"]
__all__ = [
    "PandasAI",
    "SmartDataframe",
    "SmartDatalake",
    "Agent",
    "clear_cache",
    "skill",
]
8 changes: 8 additions & 0 deletions pandasai/agent/__init__.py
@@ -1,5 +1,7 @@
import json
from typing import Union, List, Optional

from pandasai.skills import skill
from ..helpers.df_info import DataFrameType
from ..helpers.logger import Logger
from ..helpers.memory import Memory
@@ -43,6 +45,12 @@ def __init__(
        self._lake = SmartDatalake(dfs, config, logger, memory=Memory(memory_size))
        self._logger = self._lake.logger

    def add_skills(self, *skills: List[skill]):
        """
        Add Skills to PandasAI
        """
        self._lake.add_skills(*skills)

    def _call_llm_with_prompt(self, prompt: AbstractPrompt):
        """
        Call LLM with prompt using error handling to retry based on config
2 changes: 1 addition & 1 deletion pandasai/assets/prompt_templates/generate_python_code.tmpl
@@ -10,7 +10,7 @@ This is the initial python code:
```python
{current_code}
```

{skills}
Use the provided dataframes (`dfs`) to update the python code within the `analyze_data` function.
If the new query from the user is not relevant with the code, rewrite the content of the `analyze_data` function from scratch.
It is very important that you do not change the params that are passed to `analyze_data`.
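
The `{skills}` placeholder added to this template presumably receives a text description of the registered skills, though the code that fills it is not visible in this part of the diff. The sketch below shows one way such a description could be built and substituted. Everything in it is a hypothetical stand-in written for illustration (`toy_skill`, `ToySkillRegistry`, `render`, and the wording of the rendered text); it is not the implementation in `pandasai.skills`.

```python
import inspect
from typing import Callable, List


def toy_skill(name: str, description: str, usage: str) -> Callable:
    """Hypothetical decorator that attaches metadata to a function."""

    def decorator(func: Callable) -> Callable:
        func.skill_info = {"name": name, "description": description, "usage": usage}
        return func

    return decorator


class ToySkillRegistry:
    """Hypothetical registry: collects skills and renders them for a prompt."""

    def __init__(self) -> None:
        self._skills: List[Callable] = []

    def add_skills(self, *skills: Callable) -> None:
        for new_skill in skills:
            # Reject duplicates so each function name in the prompt is unique.
            if any(s.__name__ == new_skill.__name__ for s in self._skills):
                raise ValueError(f"Skill {new_skill.__name__!r} already registered")
            self._skills.append(new_skill)

    def render(self) -> str:
        # An empty string leaves the template unchanged when no skill is added.
        if not self._skills:
            return ""
        lines = ["You can call the following functions:"]
        for s in self._skills:
            info = s.skill_info
            signature = f"{s.__name__}{inspect.signature(s)}"
            lines.append(f"- {signature}: {info['description']}. {info['usage']}.")
        return "\n".join(lines)


@toy_skill(
    name="Display employee salary",
    description="Plots the employee salaries against names",
    usage="Displays the plot having name on x axis and salaries on y axis",
)
def plot_salaries(names: list, salaries: list) -> str:
    return "temp_chart.png"  # placeholder body; plotting is omitted here


registry = ToySkillRegistry()
registry.add_skills(plot_salaries)

# Shortened stand-in for the template text, keeping only the placeholder line.
template = "{skills}\nUse the provided dataframes (`dfs`) to update the python code."
prompt = template.format(skills=registry.render())
```

Rendering the function signature alongside the description gives the model what it needs to generate a call with the right arguments, which matches the orchestration idea raised in the review.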
1 change: 1 addition & 0 deletions pandasai/constants.py
@@ -86,4 +86,5 @@
"base64",
"scipy",
"streamlit",
"pandasai.skills",
]