
How to merge GGUF-quantized models? #494

Open
lexasub opened this issue Jan 24, 2025 · 1 comment
Comments


lexasub commented Jan 24, 2025

Does MergeKit support GGUF? If not, then this is a feature request: Support for Merging GGUF Quantized Models via MergeKit.
Currently, MergeKit doesn't natively support merging GGUF-format quantized models. This feature request proposes adding the capability to merge locally stored GGUF models while preserving the benefits of quantization.

Use Case

Merge specialized quantized models (e.g., code+math+language)

Combine LoRA adapters with quantized base models

Avoid reconversion to FP16/F32 for merging

Proposed Solution

Add GGUF loader/writer interface

Implement quantization-aware merging:

class GGUFMerger:
    def __init__(self, config):
        self.config = config
        # Dispatch table: GGUF quantization type -> merge routine
        self.quant_methods = {
            'q4_k': self._merge_q4_k,
            'q5_k': self._merge_q5_k,
        }

    def _merge_q4_k(self, tensors, weights):
        raise NotImplementedError  # quantization-aware merge for Q4_K tensors

    def _merge_q5_k(self, tensors, weights):
        raise NotImplementedError  # quantization-aware merge for Q5_K tensors

Support mixed quantization levels with auto-upcasting
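As a rough illustration of the auto-upcasting idea, the sketch below dequantizes each input tensor to float32, takes a weighted average, and re-quantizes to the highest-precision input format. The dequantize/quantize helpers are hypothetical placeholders for llama.cpp's actual Q4_K/Q5_K codecs, and none of this is existing MergeKit API:

import numpy as np

# Hypothetical codec helpers; real implementations would wrap llama.cpp's
# Q4_K / Q5_K (de)quantization kernels.
def dequantize(blob, quant_type) -> np.ndarray:
    raise NotImplementedError

def quantize(arr: np.ndarray, quant_type):
    raise NotImplementedError

# Rough precision ordering used to pick the "highest" input quantization.
QUANT_RANK = {'q4_k': 4, 'q5_k': 5, 'q5_k_m': 5, 'q8_0': 8}

def merge_with_upcast(tensors, weights, quant_types, output_quant=None):
    # Upcast every tensor to float32 so mixed quantization levels can be averaged.
    upcast = [dequantize(t, q) for t, q in zip(tensors, quant_types)]
    merged = sum(w * t for w, t in zip(weights, upcast)) / sum(weights)
    # Default output: the highest-precision quantization among the inputs.
    target = output_quant or max(quant_types, key=QUANT_RANK.__getitem__)
    return quantize(merged.astype(np.float32), target)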

Implementation Steps

Add GGUF file loader using llama.cpp's Python bindings (see the sketch after this list)

Create quantization-aware merging strategies

Implement memory-efficient partial loading

Add GGUF writer with quantization preservation
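For the GGUF file loader, the gguf Python package maintained in the llama.cpp repository already memory-maps tensor data, which would also help with memory-efficient partial loading. A minimal reader sketch, assuming gguf's GGUFReader interface (attribute names may differ between package versions):

from gguf import GGUFReader  # pip install gguf

def iter_gguf_tensors(path):
    # GGUFReader memory-maps the file, so tensor data is not copied into RAM up front.
    reader = GGUFReader(path)
    for tensor in reader.tensors:
        # tensor.data is a numpy view over the still-quantized bytes;
        # tensor.tensor_type reports the quantization type (e.g. Q4_K, Q5_K).
        yield tensor.name, tensor.tensor_type.name.lower(), tensor.data

for name, quant, data in iter_gguf_tensors("./wizardmath-7b.Q4_K.gguf"):
    print(name, quant, data.nbytes)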

Configuration Example

merge_method: dare_ties  
models:  
  - model: ./codellama-34b.Q5_K.gguf  
    parameters:  
      weight: 0.6  
      quant: q5_k  
  - model: ./wizardmath-7b.Q4_K.gguf  
    parameters:  
      weight: 0.4  
      quant: q4_k  
base_model: llama  
dtype: quantized  
output_quant: q5_k_m  # Auto-convert lighter quant  
CLI Arguments
mergekit-yaml config.yaml ./merged-model.gguf \  
  --quant-method q5_k_m \  
  --gguf-context 4096 \  
  --quant-merge-strategy conservative  

Possible Challenges

Handling different quantization methods

Preserving quantization accuracy

Memory management for large merges

Additional Context
Tested with:

llama.cpp 0.8.0

MergeKit 0.3.1

GGUF v3

Community Benefit
This would enable users to:

Reduce VRAM requirements by 4-10x during merging

Maintain quantization benefits end-to-end

Combine models from different quantization sources

Request
Please 👍 if you need this feature and share your use case below.
CC: @mergekit-maintainers

@tushar-31093

Looking for the same
