A comprehensive benchmark for evaluating the performance and resource consumption of LLMs and LMMs in on-device use cases.
To install MobileAIBench, follow these steps:
- Clone the Repository:
git clone --recurse-submodules https://github.com/SalesforceAIResearch/MobileAIBench.git
- Create a Conda Environment:
conda create -n mobile_bench python=3.10
conda activate mobile_bench
- Run the Makefile:
make
- Add OpenAI API Key:
export OPENAI_API_KEY=<OPENAI_API_KEY>
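Before launching a run, it can help to confirm that the key from the last step is visible to Python. The following is a minimal sketch, not part of MobileAIBench; the helper name `openai_key_is_set` is hypothetical, and the only thing taken from the steps above is the `OPENAI_API_KEY` variable name.

```python
import os


def openai_key_is_set(env=None):
    """Return True when OPENAI_API_KEY is present and non-empty.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to try without touching real settings.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if __name__ == "__main__":
    if openai_key_is_set():
        print("OPENAI_API_KEY is set.")
    else:
        print("OPENAI_API_KEY is missing; export it before running the benchmark.")
```

Run it in the same shell session where the key was exported, since `export` only affects that session and its child processes.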
Here are some usage examples for running MobileAIBench:
- Dataset: hotpot_qa & databricks-15k
- Model: xgen2-3b.gguf
- Run on GPU:
python ./src/mobile_bench.py --task question_answering --model_lib llama_cpp_python --model_name xgen2-3b.gguf --use_gpu
- Run on CPU:
python ./src/mobile_bench.py --task question_answering --model_lib llama_cpp_python --model_name xgen2-3b.gguf
- Model: xgen2-3b.gguf
- Run on GPU:
python ./src/mobile_bench.py --task all --model_lib llama_cpp_python --model_name xgen2-3b.gguf --use_gpu
- Run on CPU:
python ./src/mobile_bench.py --task all --model_lib llama_cpp_python --model_name xgen2-3b.gguf
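The four commands above differ only in the `--task` value and the presence of `--use_gpu`. As a rough sketch, the command line could be assembled in Python as shown below. The helper `build_command` is hypothetical and not part of MobileAIBench; it uses only the script path and flags that appear in the commands above, and it prints the command rather than executing it.

```python
import shlex


def build_command(task, model_name, use_gpu=False, model_lib="llama_cpp_python"):
    """Assemble the argument list for one mobile_bench.py run.

    Only flags shown in the usage examples are used: --task, --model_lib,
    --model_name, and the optional --use_gpu switch.
    """
    cmd = [
        "python", "./src/mobile_bench.py",
        "--task", task,
        "--model_lib", model_lib,
        "--model_name", model_name,
    ]
    if use_gpu:
        cmd.append("--use_gpu")
    return cmd


if __name__ == "__main__":
    # Print the GPU and CPU variants for both task settings from the examples.
    for task in ("question_answering", "all"):
        for use_gpu in (True, False):
            print(shlex.join(build_command(task, "xgen2-3b.gguf", use_gpu)))
```

To actually launch a run, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.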
MobileAIBench supports both iOS and Android.
- Apple Developer Account: Ensure you have signed up for an Apple Developer account (free version).
Open the Project in Xcode:
- Open MobileAIBench.xcodeproj, located at ios-app/MobileAIBench/MobileAIBench.xcodeproj, in Xcode.
Configure Xcode Settings:
- Change the Xcode developer settings to your personal team.
- Change the bundle identifier to a name of your choice. Make sure it is in the following format: ORGANIZATION_NAME.APP_NAME. If no organization exists, keep just APP_NAME.
Build and Run the App:
- Build and run the app on your iPhone.
- When running on a physical device, enable Developer Mode and trust the developer of the app in the device settings.
Add Models to the App Directory:
- Copy the following models into the iPhone's app directory, ensuring they are in .gguf format and named correctly:
tinyllama-1.1b-chat_Q4_K_M.gguf
phi-2_Q4_K_M.gguf
gemma-2b-it_Q4_K_M.gguf
stablelm-zephyr-3b_Q4_K_M.gguf
- Additionally, copy the multimodal model and projector, named as follows:
llava-phi-2.gguf
llava-phi-2-mmproj.gguf
Run the App:
- Select your iPhone as the target device.
- Run the app using the play button in Xcode.
- After building and running the app on the mobile device using Xcode, close Xcode.
Record Performance Metrics:
- To record CPU, memory (RAM), and thermal usage, open the Instruments app.
- Run the LLMBench project using the Instruments app.
- Depending on the metrics you want to record, choose profiles such as Activity Monitor, GPU, Metal Application, and Thermal State.
Download all the required models in .gguf format from the following link: Download
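Because the iOS app expects exact file names, it may be worth checking the downloaded files before copying them to the device. The sketch below is illustrative only: the function `missing_models` and the `./models` folder are assumptions, while the file names are the ones listed in the iOS instructions.

```python
from pathlib import Path

# File names taken from the "Add Models to the App Directory" step.
EXPECTED_MODELS = [
    "tinyllama-1.1b-chat_Q4_K_M.gguf",
    "phi-2_Q4_K_M.gguf",
    "gemma-2b-it_Q4_K_M.gguf",
    "stablelm-zephyr-3b_Q4_K_M.gguf",
    "llava-phi-2.gguf",
    "llava-phi-2-mmproj.gguf",
]


def missing_models(directory):
    """Return the expected model file names not found in `directory`."""
    directory = Path(directory)
    return [name for name in EXPECTED_MODELS if not (directory / name).is_file()]


if __name__ == "__main__":
    # "./models" is a hypothetical download folder; adjust it to your setup.
    missing = missing_models("./models")
    if missing:
        print("Missing or misnamed model files:")
        for name in missing:
            print("  " + name)
    else:
        print("All expected model files are present.")
```

The check compares names only; it does not validate that a file is a well-formed .gguf model.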
Before you begin, ensure that you have the following:
- Android Studio: Installed on your computer. You can download it from here.
- Android Device: Optional but recommended for testing on a real device.
- Android Developer Account: Optional, but needed if you want to distribute your app on the Google Play Store.
Open the Project in Android Studio:
- Select Open an existing project, or go to File > Open, and open MobileAIBench, located at android-app/MobileAIBench/examples/llama.android, in Android Studio.
Build the App:
- When you open a project, Android Studio usually prompts you to sync the Gradle files automatically. If this happens, just click Sync Now in the prompt that appears.
- If the prompt does not appear, or if you’ve made changes to your build.gradle files, you may need to manually sync. Click on the Sync Project with Gradle Files button (usually represented by an elephant icon) or go to File > Sync Project with Gradle Files.
Run the App:
- When running on a physical device, enable developer mode on the device and trust the developer of the app in the settings.
- Select your Android device as the target device.
- Run the app using the play button in Android Studio.
- After building and running the app on the mobile device using Android Studio, close Android Studio.
- The app is now ready to use on the target device!
- This Android app is based on the llama.android implementation from the official llama.cpp repository. You can find it here.
This release is for research purposes only in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend users evaluate and address potential concerns related to accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and leverage best practices when selecting use cases, particularly for high-risk scenarios where errors or misuse could significantly impact people’s lives, rights, or safety. For further guidance on use cases, refer to our AUP and AI AUP.