v0.5.1
Scenarios
- Updated VLM scenarios for VHELM (#1592)
- Added `trust_remote_code=True` for `math` and `legalbench` scenarios (#2597)
Models
- Added Google Gemini 1.5 Pro as a VLM (#2607)
- Added Mixtral 8x22b Instruct, Llama 3 Chat, and Yi Chat (#2610)
- Updated VLM models for VHELM (#1592)
- Improved handling of Gemini content blocking (#2603)
- Added default instructions prefix for multiple-choice joint adaptation for Claude 3
- Renamed some models for consistency (#2596)
- Added Snowflake Arctic Instruct (#2599, #2591)
Frontend
- Added global landing page (#2593)
Evaluation Results
- VHELM v1.0.0
- Initial release with Gemini 1.5 Pro, Claude 3 Sonnet, Claude 3 Opus, Gemini 1.0 Pro Vision, IDEFICS 2 and GPT-4V
Contributors
Thank you to the following contributors for your work on this HELM release!