Skip to content

Commit

Permalink
fix(llmobs): fix content arg extraction for vertex ai integration (#1…
Browse files Browse the repository at this point in the history
…2034)

In [MLOS-42](https://datadoghq.atlassian.net/browse/MLOS-42) a customer
was experiencing the following error:
```
ddtrace.internal.utils.ArgumentError: contents (at position 0)
```
where the `content` argument was not being extracted properly from the
list of keyword arguments inputted into the `chat.send_message` method.

This is because in the Vertex AI integration, we look for the `contents`
keyword argument. However, the content field is titled `content` in the
[send_message](https://github.com/google-gemini/generative-ai-python/blob/main/google/generativeai/generative_models.py#L514)
method and `contents` in the
[generate_content](https://github.com/google-gemini/generative-ai-python/blob/main/google/generativeai/generative_models.py#L239)
method, so it is necessary to differentiate between these two cases.

This PR is a small fix that corrects this error by differentiating
between chat and completion requests in order to extract either
`content` or `contents` respectively.

## Testing

### Automatic Testing
I edited some of the currently existing tests to use the keyword
argument extraction rather than the positional argument extraction to
get the content in order to confirm that this fix resolves the error.

### Manual Testing
Running the following code reproduced the error; furthermore, I
confirmed that with this fix, the error is no longer present and the
request completes successfully.
```
llm = GenerativeModel("gemini-1.5-flash")
chat = llm.start_chat()
resp = chat.send_message(content="hello")
```

I also verified that the following code which uses the generate_content
method is not impacted (continues to work as before) as a result of this
fix.

```
llm = GenerativeModel("gemini-1.5-flash")
resp = llm.generate_content(contents="hello")
```

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

[MLOS-42]:
https://datadoghq.atlassian.net/browse/MLOS-42?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Yun Kim <[email protected]>
(cherry picked from commit 67a5a9c)
  • Loading branch information
ncybul committed Jan 23, 2025
1 parent 88bc85b commit d4773c0
Show file tree
Hide file tree
Showing 6 changed files with 22 additions and 13 deletions.
4 changes: 2 additions & 2 deletions ddtrace/contrib/internal/vertexai/_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -177,13 +177,13 @@ def _tag_request_content(span, integration, content, content_idx):
tag_request_content_part_google("vertexai", span, integration, part, part_idx, content_idx)


def tag_request(span, integration, instance, args, kwargs):
def tag_request(span, integration, instance, args, kwargs, is_chat):
"""Tag the generation span with request details.
Includes capturing generation configuration, system prompt, and messages.
"""
# instance is either a chat session or a model itself
model_instance = instance if isinstance(instance, GenerativeModel) else instance._model
contents = get_argument_value(args, kwargs, 0, "contents")
contents = get_argument_value(args, kwargs, 0, "content" if is_chat else "contents")
history = _get_attr(instance, "_history", [])
if history:
if isinstance(contents, list):
Expand Down
4 changes: 2 additions & 2 deletions ddtrace/contrib/internal/vertexai/patch.py
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ def _traced_generate(vertexai, pin, func, instance, args, kwargs, model_instance
# history must be copied since it is modified during the LLM interaction
history = getattr(instance, "history", [])[:]
try:
tag_request(span, integration, instance, args, kwargs)
tag_request(span, integration, instance, args, kwargs, is_chat)
generations = func(*args, **kwargs)
if stream:
return TracedVertexAIStreamResponse(
Expand Down Expand Up @@ -99,7 +99,7 @@ async def _traced_agenerate(vertexai, pin, func, instance, args, kwargs, model_i
# history must be copied since it is modified during the LLM interaction
history = getattr(instance, "history", [])[:]
try:
tag_request(span, integration, instance, args, kwargs)
tag_request(span, integration, instance, args, kwargs, is_chat)
generations = await func(*args, **kwargs)
if stream:
return TracedAsyncVertexAIStreamResponse(
Expand Down
7 changes: 6 additions & 1 deletion ddtrace/llmobs/_integrations/vertexai.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
from typing import Optional

from ddtrace import Span
from ddtrace.internal.utils import ArgumentError
from ddtrace.internal.utils import get_argument_value
from ddtrace.llmobs._constants import INPUT_MESSAGES
from ddtrace.llmobs._constants import METADATA
Expand Down Expand Up @@ -45,7 +46,11 @@ def _llmobs_set_tags(
metadata = llmobs_get_metadata_google(kwargs, instance)

system_instruction = get_system_instructions_from_google_model(instance)
input_contents = get_argument_value(args, kwargs, 0, "contents")
input_contents = None
try:
input_contents = get_argument_value(args, kwargs, 0, "content")
except ArgumentError:
input_contents = get_argument_value(args, kwargs, 0, "contents")
input_messages = self._extract_input_message(input_contents, history, system_instruction)

output_messages = [{"content": ""}]
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
---
fixes:
- |
vertexai: Resolves an issue with ``chat.send_message()`` where the content keyword argument was not parsed correctly.
8 changes: 4 additions & 4 deletions tests/contrib/vertexai/test_vertexai.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ def test_vertexai_completion(vertexai):
llm = vertexai.generative_models.GenerativeModel("gemini-1.5-flash")
llm._prediction_client.responses["generate_content"].append(_mock_completion_response(MOCK_COMPLETION_SIMPLE_1))
llm.generate_content(
"Why do bears hibernate?",
contents="Why do bears hibernate?",
generation_config=vertexai.generative_models.GenerationConfig(
stop_sequences=["x"], max_output_tokens=30, temperature=1.0
),
Expand Down Expand Up @@ -118,7 +118,7 @@ def test_vertexai_completion_stream(vertexai):
(_mock_completion_stream_chunk(chunk) for chunk in MOCK_COMPLETION_STREAM_CHUNKS)
]
response = llm.generate_content(
"How big is the solar system?",
contents="How big is the solar system?",
generation_config=vertexai.generative_models.GenerationConfig(
stop_sequences=["x"], max_output_tokens=30, temperature=1.0
),
Expand Down Expand Up @@ -278,7 +278,7 @@ def test_vertexai_chat(vertexai):
llm._prediction_client.responses["generate_content"].append(_mock_completion_response(MOCK_COMPLETION_SIMPLE_1))
chat = llm.start_chat()
chat.send_message(
"Why do bears hibernate?",
content="Why do bears hibernate?",
generation_config=vertexai.generative_models.GenerationConfig(
stop_sequences=["x"], max_output_tokens=30, temperature=1.0
),
Expand Down Expand Up @@ -371,7 +371,7 @@ def test_vertexai_chat_stream(vertexai):
]
chat = llm.start_chat()
response = chat.send_message(
"How big is the solar system?",
content="How big is the solar system?",
generation_config=vertexai.generative_models.GenerationConfig(
stop_sequences=["x"], max_output_tokens=30, temperature=1.0
),
Expand Down
8 changes: 4 additions & 4 deletions tests/contrib/vertexai/test_vertexai_llmobs.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ def test_completion(self, vertexai, mock_llmobs_writer, mock_tracer):
llm = vertexai.generative_models.GenerativeModel("gemini-1.5-flash")
llm._prediction_client.responses["generate_content"].append(_mock_completion_response(MOCK_COMPLETION_SIMPLE_1))
llm.generate_content(
"Why do bears hibernate?",
contents="Why do bears hibernate?",
generation_config=vertexai.generative_models.GenerationConfig(
stop_sequences=["x"], max_output_tokens=30, temperature=1.0
),
Expand Down Expand Up @@ -126,7 +126,7 @@ def test_completion_stream(self, vertexai, mock_llmobs_writer, mock_tracer):
(_mock_completion_stream_chunk(chunk) for chunk in MOCK_COMPLETION_STREAM_CHUNKS)
]
response = llm.generate_content(
"How big is the solar system?",
contents="How big is the solar system?",
generation_config=vertexai.generative_models.GenerationConfig(
stop_sequences=["x"], max_output_tokens=30, temperature=1.0
),
Expand Down Expand Up @@ -293,7 +293,7 @@ def test_chat(self, vertexai, mock_llmobs_writer, mock_tracer):
llm._prediction_client.responses["generate_content"].append(_mock_completion_response(MOCK_COMPLETION_SIMPLE_1))
chat = llm.start_chat()
chat.send_message(
"Why do bears hibernate?",
content="Why do bears hibernate?",
generation_config=vertexai.generative_models.GenerationConfig(
stop_sequences=["x"], max_output_tokens=30, temperature=1.0
),
Expand Down Expand Up @@ -389,7 +389,7 @@ def test_chat_stream(self, vertexai, mock_llmobs_writer, mock_tracer):
]
chat = llm.start_chat()
response = chat.send_message(
"How big is the solar system?",
content="How big is the solar system?",
generation_config=vertexai.generative_models.GenerationConfig(
stop_sequences=["x"], max_output_tokens=30, temperature=1.0
),
Expand Down

0 comments on commit d4773c0

Please sign in to comment.