[Feature] Add sampler custom logits processor #2396

Merged

Conversation

@hongpeng-guo (Contributor) commented Dec 8, 2024

Motivation

This PR adds support for user-registered custom logits processors, so users can implement their own sampling methods without modifying the sglang codebase.

Related Issue

#2291

Modifications

  1. Introduce a common abstract interface, CustomLogitProcessor, in sglang/srt/sampling/custom_logit_processor.py. The interface contains (1) a callable that processes the logits, and (2) from_str and to_str methods so that any user-defined subclass can be serialized and passed to the server's /generate endpoint via requests.post(url, json=data). A sketch of this interface is shown after this list.
  2. Add custom_params as a field of SamplingParams.
  3. Add custom_logit_processor as a field of GenerateReqInput, TokenizedGenerateReqInput, Req, and SamplingBatchInfo.
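
For reference, here is a minimal sketch of what the CustomLogitProcessor interface described in item 1 might look like. The use of dill for serialization and the exact payload format are assumptions for illustration, not necessarily what this PR implements.

import json
from abc import ABC, abstractmethod
from typing import Any, Dict, List

import dill  # assumption: the serializer actually used by the PR may differ
import torch


class CustomLogitProcessor(ABC):
    """Abstract base class for user-defined logit processors (sketch)."""

    @abstractmethod
    def __call__(
        self, logits: torch.Tensor, custom_param_list: List[Dict[str, Any]]
    ) -> torch.Tensor:
        """Return modified logits; custom_param_list holds one dict per request."""
        ...

    def to_str(self) -> str:
        """Serialize the processor so it can be embedded in a JSON request body."""
        return json.dumps({"callable": dill.dumps(self).hex()})

    @classmethod
    def from_str(cls, s: str) -> "CustomLogitProcessor":
        """Reconstruct the processor from the string produced by to_str()."""
        return dill.loads(bytes.fromhex(json.loads(s)["callable"]))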

Examples

First, define a dummy AddLogitProcessor:

import torch

from sglang.srt.sampling.custom_logit_processor import CustomLogitProcessor


class AddLogitProcessor(CustomLogitProcessor):
    """Add a per-request constant (custom_params["arg1"]) to every logit."""

    def __call__(self, logits, custom_param_list):
        # Expect one custom_params dict per request in the batch.
        assert logits.shape[0] == len(custom_param_list)
        key = "arg1"

        # Gather the per-request offsets and move them to the logits device.
        merged_params = torch.tensor(
            [params[key] for params in custom_param_list], dtype=torch.float
        ).to(device=logits.device, non_blocking=True)
        # Reshape to [batch, 1] so the addition broadcasts over the vocab dimension.
        return logits + merged_params.unsqueeze(-1)

Endpoint usage example:

import requests

url = "http://localhost:30000/generate"
data = {
    "text": "What is the capital of France?",
    "sampling_params": {
        "custom_params": {
            "arg1": 5.0,
        },
    },
    # The processor is serialized to a string so it can travel in the JSON body.
    "custom_logit_processor": AddLogitProcessor().to_str(),
}

response = requests.post(url, json=data)
print(response.json())

Offline engine usage example:

import sglang as sgl

llm = sgl.Engine(model_path="meta-llama/Meta-Llama-3.1-8B-Instruct")
prompt = "The president of the United States is"
sampling_params = {"temperature": 0.8, "top_p": 0.95, "custom_params": {"arg1": 5.0}}

output = llm.generate(
    prompt, 
    sampling_params,
    custom_logit_processor=AddLogitProcessor().to_str()
)
print(output['text'])

TODO

Update the docs as suggested by @merrymercy.

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@hongpeng-guo marked this pull request as a draft on December 8, 2024 at 09:39
@hongpeng-guo changed the title from "Add sampler logit processor" to "[WIP] Add sampler logit processor" on Dec 8, 2024
@merrymercy (Contributor)

Thanks for taking this. Can you add some end-to-end tests and examples?

@hongpeng-guo (Contributor, Author)

> Thanks for taking this. Can you add some end-to-end tests and examples?

@merrymercy Thanks for taking a look at this. I am still trying to figure out the appropriate layer at which users should register their customized_logit_processor. The goal is to enable the customized_logit_processor functionality without requiring changes to the internal sglang codebase. To achieve this, the function registration should happen at the API layer; the customized_logit_processor_fn and custom_params will then be passed from the program driver down to the internal modules, such as Sampler and SamplingParams.

If the above seems correct, I will try to get it reviewable within this week.

@merrymercy (Contributor) commented Dec 26, 2024

Your proposal sounds good. We should be able to register this function through all interfaces.
For example, it should work with both the native /generate API and the offline Engine API.

@merrymercy self-assigned this on Dec 26, 2024
@hongpeng-guo (Contributor, Author) left a comment

A few performance and accuracy unit tests are a bit flaky; they do not seem to be directly related to this PR, though.

@merrymercy (Contributor) left a comment

We are almost there! Some final comments.

python/sglang/srt/managers/schedule_batch.py (outdated comment, resolved)
@hongpeng-guo (Contributor, Author)

> We are almost there! Some final comments.

Thanks a lot for shepherding this! Just handled the comments. PTAL

@@ -76,15 +88,48 @@ def from_schedule_batch(
[r.sampling_params.min_p for r in reqs], dtype=torch.float
).to(device, non_blocking=True)

# Check if any request has custom logit processor
has_custom_logit_processor = any(r.custom_logit_processor for r in reqs)
Contributor comment:

We can even skip this for-loop if the custom logit processor is not enabled by the server.
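
To illustrate the suggestion, here is a sketch of the gated check; server_args and enable_custom_logit_processor are hypothetical names used only to show the idea, not necessarily what the PR ended up with.

# Sketch only: `server_args.enable_custom_logit_processor` is a hypothetical
# server-level flag; the real flag name and plumbing may differ.
if server_args.enable_custom_logit_processor:
    has_custom_logit_processor = any(r.custom_logit_processor for r in reqs)
else:
    # Feature disabled server-side: skip scanning the batch entirely.
    has_custom_logit_processor = False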

self, unfinished_indices: List[int], new_indices: torch.Tensor
):
"""Filter the custom logit processor and custom params"""
if not self.custom_logit_processor:
Contributor comment:

This seems unnecessary. If self.custom_logit_processor is None, why would execution reach this function at all?

Author reply:

Agreed, I can remove this or make it a simple assert.
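
For concreteness, a small sketch of the assert variant discussed here; the method name and its enclosing class are placeholders, since the diff above does not show them.

def filter_custom_logit_processor(self, unfinished_indices, new_indices):
    """Filter the custom logit processor and custom params (sketch)."""
    # Replace the early-return guard with an assertion: callers should only
    # reach this path when a processor is actually present.
    assert self.custom_logit_processor is not None
    # ... per-request filtering of processors and custom_params goes here ...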

@merrymercy merged commit e403d23 into sgl-project:main on Jan 19, 2025
16 checks passed
@merrymercy (Contributor)

@hongpeng-guo Thanks. It is merged.

Some follow-up items:
