
server : use common_token_to_piece instead of common_detokenize #11740

Merged
merged 2 commits into from
Feb 11, 2025

Conversation

@danbev (Collaborator) commented Feb 7, 2025

This commit replaces the call to common_detokenize with
common_token_to_piece in populate_token_probs.

The motivation for this change is to avoid an issue where
common_detokenize would remove the word boundary character for tokens,
which caused a regression in the server-generated token probabilities.

Resolves: #11728

Review comment on examples/server/server.cpp (outdated, resolved)
@ngxson (Collaborator) commented Feb 7, 2025

Also, a clue is to look around common_detokenize; we are probably missing certain params when constructing these detokenized strings. The lines around result.probs.reserve(max_probs); should be related.

@danbev changed the title from "server : use text_to_send instead of detokenized token" to "llama : add remove_space_prefix to llama_detokenize" on Feb 10, 2025
@CyberShadow left a comment

I can't comment on the code, but I've tested this patch and it builds successfully and resolves issue #11728. Thank you!

@danbev force-pushed the server-logprobs-11728 branch from 456dabf to cc1fd2f on February 10, 2025 15:27
@danbev marked this pull request as ready for review on February 10, 2025 15:28
@danbev force-pushed the server-logprobs-11728 branch from cc1fd2f to b70fd3a on February 10, 2025 18:33
@ngxson (Collaborator) commented Feb 10, 2025

IIRC there is another place with the same logic (near the speculative decoding logic) that should be fixed too.

Commit: Use common_token_to_piece for post_sampling_probs as well.
@danbev (Collaborator, Author) commented Feb 11, 2025

IIRC there is another place with the same logic (near the speculative decoding logic) that should be fixed too.

I've taken a look but could not find this in or near the speculative decoding code. I did find this TODO though.

I noticed that this same logic is part of post_sampling_probs and have updated it in 5deee0a.

@danbev changed the title from "llama : add remove_space_prefix to llama_detokenize" to "server : use common_token_to_piece instead of common_detokenize" on Feb 11, 2025
@ngxson ngxson merged commit a18f481 into ggerganov:master Feb 11, 2025
46 checks passed
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request on Feb 13, 2025: server : use common_token_to_piece instead of common_detokenize (…ganov#11740)

Successfully merging this pull request may close these issues.

Misc. bug: Tokens in top_probs / top_logprobs are missing whitespace