How can I use the openbuddy-llama3.2 model? #2

Closed
mhama opened this issue Dec 28, 2024 · 6 comments

Comments

mhama commented Dec 28, 2024

I updated the LLM Module's firmware to v1.3.0 with AXDL and installed llm-openbuddy-llama3.2-1B-ax630c_0.2-m5stack1_arm64.deb via SD card. But I only get a null reply with the latest version of the M5Module-LLM library when I pass the model name "openbuddy-llama3.2-1B-ax630c" to the setup method.

Are there any other instructions needed to use the openbuddy-llama3.2 model?

Abandon-ht (Contributor) commented:

Which platform are you using, Arduino or UIFlow?

Abandon-ht (Contributor) commented:

https://uiflow2.m5stack.com/?pkey=57b3b457cf2247ce827a09d3cc245e5d You can refer to this method to make sure that you have successfully loaded this model.

mhama (Author) commented Dec 31, 2024

I'm using Arduino with the latest GitHub version of this library: https://github.com/m5stack/M5Module-LLM. I searched for a "get model list" API but couldn't find one.

In this repository there is an lsmode API. Maybe UIFlow's "get model list" API calls this?

int sys_lsmode(int com_id, const nlohmann::json &json_obj)

So I did a little hack on M5Module-LLM and sent the "sys_lsmode" command.
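For reference, here is a rough reconstruction of that test (not the exact hack): it pushes a raw StackFlow-style JSON command to the module's UART and dumps the reply. The Serial2 pins and the exact object/data fields of the sys.lsmode request are assumptions on my side and may need adjusting for your board.

    // Rough reconstruction: send a raw sys.lsmode command and print the reply.
    // The Serial2 pins and the "object"/"data" fields are assumptions.
    #include <Arduino.h>

    void setup() {
        Serial.begin(115200);
        Serial2.begin(115200, SERIAL_8N1, 16, 17);  // RX, TX: assumed pins

        // Ask the unit to list the installed models
        Serial2.println(
            "{\"request_id\":\"sys_lsmode\",\"work_id\":\"sys\","
            "\"action\":\"lsmode\",\"object\":\"None\",\"data\":\"None\"}");
    }

    void loop() {
        // Forward whatever JSON the module returns to the USB serial monitor
        while (Serial2.available()) Serial.write(Serial2.read());
    }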

I found this info about the openbuddy model in the result:

        {
            "capabilities": [
                "text_generation",
                "chat"
            ],
            "input_type": [
                "llm.utf-8",
                "llm.utf-8.stream",
                "llm.chat_completion",
                "llm.chat_completion.stream"
            ],
            "mode": "openbuddy-llama3.2-1B-ax630c",
            "mode_param": {
                "axmodel_num": 16,
                "b_bos": false,
                "b_dynamic_load_axmodel_layer": false,
                "b_eos": false,
                "b_use_mmap_load_embed": true,
                "b_use_topk": false,
                "filename_post_axmodel": "llama_post.axmodel",
                "filename_tokenizer_model": "http://localhost:8080",
                "filename_tokens_embed": "model.embed_tokens.weight.bfloat16.bin",
                "template_filename_axmodel": "llama_p128_l%d_together.axmodel",
                "tokenizer_type": 2,
                "tokens_embed_num": 128256,
                "tokens_embed_size": 2048
            },
            "output_type": [
                "llm.utf-8",
                "llm.utf-8.stream"
            ],
            "type": "llm"
        },

This is just the same as this file:
https://github.com/m5stack/StackFlow/blob/9d1dfced01e365694f6aa69b78aed29d0bf49e26/projects/llm_framework/main_llm/mode_openbuddy-llama3.2-1B-ax630c.json

I'm using LLM inference like this in my .ino, but it always replies with null.

    llm_config = m5_module_llm::ApiLlmSetupConfig_t();
    llm_config.model = "openbuddy-llama3.2-1B-ax630c";
    llm_work_id = module_llm.llm.setup(llm_config);
...
      module_llm.llm.inferenceAndWaitResult(llm_work_id, question.c_str(), [](String& result) {
          /* Show result on screen */
          M5.Display.printf("%s", result.c_str());
          Serial.printf("%s", result.c_str());
      });

mhama (Author) commented Dec 31, 2024

I modified the M5Module-LLM library to output the error code/message at inference time (ApiLlm::inferenceAndWaitResult).

error code: -4
error message: inference data push false

So maybe it fails around here:

usr_print_error(request_id, work_id, "{\"code\":-4, \"message\":\"inference data push false\"}", com_id);

Abandon-ht (Contributor) commented:

It seems that the model has been installed on the Module LLM. The problem is in loading the model. You can try using the following JSON to load the model, to determine whether it loads successfully.

{
    "request_id": "llm001",
    "work_id": "llm",
    "action": "setup",
    "object": "llm.setup",
    "data": {
        "model": "openbuddy-llama3.2-1B-ax630c",
        "response_format": "llm.utf-8.stream",
        "input": "llm.utf-8",
        "enoutput": true,
        "max_token_len": 127,
        "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information."
    }
}
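If you prefer to stay inside the Arduino library, the JSON above maps roughly onto the setup config like this. This is only a sketch; the exact field names of ApiLlmSetupConfig_t may differ in your version of M5Module-LLM, so check the header.

    // Rough equivalent of the JSON above via the library's setup config.
    // Field names are assumed to mirror the JSON keys; verify against your header.
    m5_module_llm::ApiLlmSetupConfig_t cfg;
    cfg.model           = "openbuddy-llama3.2-1B-ax630c";
    cfg.response_format = "llm.utf-8.stream";
    cfg.input           = "llm.utf-8";
    cfg.enoutput        = true;
    cfg.max_token_len   = 127;
    cfg.prompt          = "You are a knowledgeable assistant capable of answering "
                          "various questions and providing information.";

    String work_id = module_llm.llm.setup(cfg);
    Serial.println(work_id);  // an empty / "null" work_id means setup failed or timed out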

mhama (Author) commented Jan 2, 2025

OK, I got it working. Thank you Abandon-ht!
The problem was that it timed out during model setup.
The openbuddy-llama3.2-1B-ax630c model takes about 15 seconds to set up, but the setup timeout in M5Module-LLM was 10 seconds. So the setup timed out, the returned work_id was null, and the inference did not work.
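In case anyone else hits this, here is a sketch of the workaround: give setup() more headroom than the default 10 s. The extra request_id/timeout arguments are an assumption about the library API; if your ApiLlm::setup() does not accept them, raise the hard-coded wait inside the library instead.

    // Workaround sketch: allow ~30 s for model setup instead of the default 10 s.
    // NOTE: the (config, request_id, timeout_ms) signature is an assumption;
    // older library versions may hard-code the timeout inside ApiLlm::setup().
    llm_config = m5_module_llm::ApiLlmSetupConfig_t();
    llm_config.model = "openbuddy-llama3.2-1B-ax630c";
    llm_work_id = module_llm.llm.setup(llm_config, "llm_setup", 30000);

    if (llm_work_id.isEmpty()) {
        Serial.println("LLM setup failed or timed out");
    }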
