How can I use the openbuddy-llama3.2 model? #2

Closed
mhama opened this issue Dec 28, 2024 · 6 comments

Comments

mhama commented Dec 28, 2024

I updated the LLM Module's firmware to v1.3.0 with AXDL and installed llm-openbuddy-llama3.2-1B-ax630c_0.2-m5stack1_arm64.deb via SD card. But I only get a null reply with the latest version of the M5Module-LLM library when I pass the model name "openbuddy-llama3.2-1B-ax630c" to the setup method.

Are there any other instructions needed to use the openbuddy-llama3.2 model?

Abandon-ht (Contributor) commented:

Which platform are you using, Arduino or UIFlow?

Abandon-ht (Contributor) commented:

https://uiflow2.m5stack.com/?pkey=57b3b457cf2247ce827a09d3cc245e5d You can refer to this method to make sure that you have successfully loaded this model.

mhama (Author) commented Dec 31, 2024

I'm using Arduino with the latest GitHub version of this library: https://github.com/m5stack/M5Module-LLM. I searched for a "get model list" API but couldn't find one.

In this repository there is an lsmode API. Maybe UIFlow's "get model list" API calls this?

int sys_lsmode(int com_id, const nlohmann::json &json_obj)

So I did a little hack on M5Module-LLM and sent the "sys_lsmode" command.
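For reference, here is a rough reconstruction of that test (not the exact hack): it pushes a raw StackFlow-style JSON command to the module's UART and dumps the reply. The Serial2 pins and the exact object/data fields of the sys.lsmode request are assumptions on my side and may need adjusting for your board.

    // Rough reconstruction: send a raw sys.lsmode command and print the reply.
    // The Serial2 pins and the "object"/"data" fields are assumptions.
    #include <Arduino.h>

    void setup() {
        Serial.begin(115200);
        Serial2.begin(115200, SERIAL_8N1, 16, 17);  // RX, TX: assumed pins

        // Ask the unit to list the installed models
        Serial2.println(
            "{\"request_id\":\"sys_lsmode\",\"work_id\":\"sys\","
            "\"action\":\"lsmode\",\"object\":\"None\",\"data\":\"None\"}");
    }

    void loop() {
        // Forward whatever JSON the module returns to the USB serial monitor
        while (Serial2.available()) Serial.write(Serial2.read());
    }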

I found this info about the openbuddy model in the result:

        {
            "capabilities": [
                "text_generation",
                "chat"
            ],
            "input_type": [
                "llm.utf-8",
                "llm.utf-8.stream",
                "llm.chat_completion",
                "llm.chat_completion.stream"
            ],
            "mode": "openbuddy-llama3.2-1B-ax630c",
            "mode_param": {
                "axmodel_num": 16,
                "b_bos": false,
                "b_dynamic_load_axmodel_layer": false,
                "b_eos": false,
                "b_use_mmap_load_embed": true,
                "b_use_topk": false,
                "filename_post_axmodel": "llama_post.axmodel",
                "filename_tokenizer_model": "http://localhost:8080",
                "filename_tokens_embed": "model.embed_tokens.weight.bfloat16.bin",
                "template_filename_axmodel": "llama_p128_l%d_together.axmodel",
                "tokenizer_type": 2,
                "tokens_embed_num": 128256,
                "tokens_embed_size": 2048
            },
            "output_type": [
                "llm.utf-8",
                "llm.utf-8.stream"
            ],
            "type": "llm"
        },

This is just the same as this file:
https://github.com/m5stack/StackFlow/blob/9d1dfced01e365694f6aa69b78aed29d0bf49e26/projects/llm_framework/main_llm/mode_openbuddy-llama3.2-1B-ax630c.json

I'm using LLM inference like this in my .ino, but it always replies with null.

    llm_config = m5_module_llm::ApiLlmSetupConfig_t();
    llm_config.model = "openbuddy-llama3.2-1B-ax630c";
    llm_work_id = module_llm.llm.setup(llm_config);
...
      module_llm.llm.inferenceAndWaitResult(llm_work_id, question.c_str(), [](String& result) {
          /* Show result on screen */
          M5.Display.printf("%s", result.c_str());
          Serial.printf("%s", result.c_str());
      });

mhama (Author) commented Dec 31, 2024

I modified the M5Module-LLM library to output the error code/message at inference time (ApiLlm::inferenceAndWaitResult).

error code: -4
error message: inference data push false

So maybe it fails around here:

usr_print_error(request_id, work_id, "{\"code\":-4, \"message\":\"inference data push false\"}", com_id);

Abandon-ht (Contributor) commented:

It seems that the model has been installed on the Module LLM. The problem is in loading the model. You can try using the following JSON to load the model, to determine whether it loads successfully.

{
    "request_id": "llm001",
    "work_id": "llm",
    "action": "setup",
    "object": "llm.setup",
    "data": {
        "model": "openbuddy-llama3.2-1B-ax630c",
        "response_format": "llm.utf-8.stream",
        "input": "llm.utf-8",
        "enoutput": true,
        "max_token_len": 127,
        "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information."
    }
}
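If you prefer to stay inside the Arduino library, the JSON above maps roughly onto the setup config like this. This is only a sketch; the exact field names of ApiLlmSetupConfig_t may differ in your version of M5Module-LLM, so check the header.

    // Rough equivalent of the JSON above via the library's setup config.
    // Field names are assumed to mirror the JSON keys; verify against your header.
    m5_module_llm::ApiLlmSetupConfig_t cfg;
    cfg.model           = "openbuddy-llama3.2-1B-ax630c";
    cfg.response_format = "llm.utf-8.stream";
    cfg.input           = "llm.utf-8";
    cfg.enoutput        = true;
    cfg.max_token_len   = 127;
    cfg.prompt          = "You are a knowledgeable assistant capable of answering "
                          "various questions and providing information.";

    String work_id = module_llm.llm.setup(cfg);
    Serial.println(work_id);  // an empty / "null" work_id means setup failed or timed out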

mhama (Author) commented Jan 2, 2025

OK, I got it working. Thank you Abandon-ht!
The problem was that it timed out during model setup.
The openbuddy-llama3.2-1B-ax630c model takes about 15 seconds to set up, but the setup timeout in M5Module-LLM was 10 seconds. So the setup timed out, the returned work_id was null, and the inference did not work.
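In case anyone else hits this, here is a sketch of the workaround: give setup() more headroom than the default 10 s. The extra request_id/timeout arguments are an assumption about the library API; if your ApiLlm::setup() does not accept them, raise the hard-coded wait inside the library instead.

    // Workaround sketch: allow ~30 s for model setup instead of the default 10 s.
    // NOTE: the (config, request_id, timeout_ms) signature is an assumption;
    // older library versions may hard-code the timeout inside ApiLlm::setup().
    llm_config = m5_module_llm::ApiLlmSetupConfig_t();
    llm_config.model = "openbuddy-llama3.2-1B-ax630c";
    llm_work_id = module_llm.llm.setup(llm_config, "llm_setup", 30000);

    if (llm_work_id.isEmpty()) {
        Serial.println("LLM setup failed or timed out");
    }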
