Prompt caching does not work for models/gemini-1.5-pro-002 #661

Open
nharada1 opened this issue Jan 2, 2025 · 10 comments
Labels: component:python sdk (Issue/PR related to Python SDK), status:triaged (Issue/PR triaged to the corresponding sub-team), type:help (Support-related issues)

Comments

@nharada1

nharada1 commented Jan 2, 2025

Description of the bug:

Using 'models/gemini-1.5-pro-002' as the model for prompt caching fails when creating the cache. To reproduce, run the example code at https://ai.google.dev/gemini-api/docs/caching?lang=python but replace the model name with pro-002.
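
For reference, my reproduction is essentially the doc example with the model string swapped; a rough sketch (file name, prompt, and TTL are placeholders):

```python
import datetime
import time

import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# Upload the sample video from the caching docs via the File API
# and wait for it to finish processing.
video_file = genai.upload_file(path="Sherlock_Jr_FullMovie.mp4")
while video_file.state.name == "PROCESSING":
    time.sleep(2)
    video_file = genai.get_file(video_file.name)

# Create the cache with the pro-002 model instead of pro-001.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-pro-002",
    display_name="sherlock jr movie",
    system_instruction="You are an expert video analyzer.",
    contents=[video_file],
    ttl=datetime.timedelta(minutes=5),
)

# Generate against the cached content -- with pro-002 this flow produces
# the 400 "Request contains an invalid argument" error described below.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("Summarize this movie please.")
print(response.usage_metadata)
```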

Actual vs expected behavior:

Expected: Caching writes and reads successfully

Actual: On write, I get the error "BadRequest: 400 POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-002:generateContent?%24alt=json%3Benum-encoding%3Dint: Request contains an invalid argument."

Any other information you'd like to share?

The context caching doc claims: "Note: Context caching is only available for stable models with fixed versions (for example, gemini-1.5-pro-001). You must include the version postfix (for example, the -001 in gemini-1.5-pro-001)."

So I'd expect that pro-002, being a stable version, should work.

@manojssmk added the status:triaged and component:python sdk labels on Jan 2, 2025
@vishal-dharm
Collaborator

Hey @nharada1, could you try using the Caching notebook? I tested with this notebook after updating the model to 1.5 Pro 002 and it worked as expected.

I'll also review the documentation example and update it with a fix.

@nharada1
Author

nharada1 commented Jan 4, 2025

The caching notebook works fine for me with the Apollo text, but it fails with a video, so maybe this is specifically a video issue? I tried the caching notebook with the Sherlock video and it fails with the same error:

WARNING:tornado.access:400 POST /v1beta/models/gemini-1.5-pro-002:generateContent?%24alt=json%3Benum-encoding%3Dint (127.0.0.1) 3463.38ms
---------------------------------------------------------------------------
BadRequest                                Traceback (most recent call last)
<ipython-input-38-befb1830dc79> in <cell line: 3>()
      1 apollo_model = genai.GenerativeModel.from_cached_content(cached_content=apollo_cache)
      2 
----> 3 response = apollo_model.generate_content("Summarize this movie please.")
      4 print(response)

9 frames
/usr/local/lib/python3.10/dist-packages/google/ai/generativelanguage_v1beta/services/generative_service/transports/rest.py in __call__(self, request, retry, timeout, metadata)
    845             # subclass.
    846             if response.status_code >= 400:
--> 847                 raise core_exceptions.from_http_response(response)
    848 
    849             # Return the response

BadRequest: 400 POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-002:generateContent?%24alt=json%3Benum-encoding%3Dint: Request contains an invalid argument.
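
For comparison, the text-only path from the notebook that does work for me looks roughly like this (the transcript file and prompt are the ones the notebook uses; names approximate):

```python
import google.generativeai as genai
from google.generativeai import caching

# Upload the Apollo 11 transcript (plain text) and cache it -- this path
# succeeds with models/gemini-1.5-pro-002.
transcript_file = genai.upload_file(path="a11.txt")

transcript_cache = caching.CachedContent.create(
    model="models/gemini-1.5-pro-002",
    display_name="apollo 11 transcript",
    system_instruction="You are an expert at analyzing transcripts.",
    contents=[transcript_file],
)

transcript_model = genai.GenerativeModel.from_cached_content(cached_content=transcript_cache)
response = transcript_model.generate_content("Find a lighthearted moment from this transcript.")
print(response.text)
```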

@Gunand3043

Hey @nharada1, I tried context caching with the video, and it worked fine. Please take a look.

Video_Caching

Thanks.

@Gunand3043 added the status:awaiting user response and type:help labels and removed the status:triaged label on Jan 6, 2025
@nharada1
Author

nharada1 commented Jan 6, 2025 via email

@Giom-V
Copy link
Contributor

Giom-V commented Jan 7, 2025

@nharada1 My guess is that there's something in your video that triggers the safety settings.
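
One quick way to test that hypothesis (a rough sketch; the relaxed thresholds are only for debugging) is to pass explicit safety settings to the cached-content model and inspect the prompt feedback when a response does come back:

```python
from google.generativeai.types import HarmBlockThreshold, HarmCategory

relaxed_settings = {
    HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
}

# apollo_model is the model built from the cached video content in the notebook.
response = apollo_model.generate_content(
    "Summarize this movie please.",
    safety_settings=relaxed_settings,
)

# If safety filtering is the culprit, block_reason / finish_reason / safety_ratings
# should say so (assuming a response comes back at all rather than a hard 400).
print(response.prompt_feedback)
for candidate in response.candidates:
    print(candidate.finish_reason, candidate.safety_ratings)
```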

@nharada1
Author

nharada1 commented Jan 8, 2025

Is that the error that's occurring when it gives me the 400? I don't see anything about safety settings being triggered, just a 400 POST error. When I do the same query without caching, I don't see an issue either.

Also, that video is from the published Google documentation and works fine with both of the other models.

Here is my reproduction notebook:

https://colab.research.google.com/drive/1FtSL3BmiwomOCAekDReo5D4pnw8gMdGs?usp=sharing

@Giom-V
Contributor

Giom-V commented Jan 9, 2025

@nharada1 it seems we have an issue with the 1.5 model and the file API. Let me check if that could be related and I'll come back to you.

@Gunand3043 added the status:triaged label and removed the status:awaiting user response label on Jan 10, 2025
@Giom-V
Contributor

Giom-V commented Jan 13, 2025

The issue I was checking has been fixed, but it did not solve yours. I'm still trying to figure out what the problem is.

@nharada1
Author

nharada1 commented Jan 13, 2025 via email

@MarkDaoust
Collaborator

Could the problem here just be that Flash has a free tier for caching, but Pro doesn't? I would have expected some kind of quota error...

https://ai.google.dev/pricing#1_5flash
https://ai.google.dev/pricing#1_5pro
