Optimize text_style using bit packing #4363

Open
wants to merge 3 commits into
base: master
Conversation

LocalSpook
Contributor

Currently, text_style takes up 20 bytes, but we can pack it into 8.

First, a side benefit: text_style::operator| becomes smaller and faster (before-and-after), though I imagine that function is usually evaluated at compile time anyway.

The main benefit is reduced binary size per function call using colors. This comes from two factors:

  • it becomes possible to pass text_style through registers (before-and-after); and
  • styled_arg becomes smaller, so less data needs to be copied around (before-and-after).

This absolutely breaks ABI. (Should the v11 namespace be bumped to v12?) I'm actually not sure whether it breaks API: it changes the API of detail::color_type, which you would think is an internal type, but which is part of the public interface of text_style.

The first commit adds more color tests, because I don't feel the existing suite is comprehensive enough to prove that this change doesn't affect semantics.
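
To make the size argument in the description concrete, here is a minimal sketch. Every name in it (`sketch::unpacked_color`, `sketch::unpacked_style`, `sketch::packed_style`) is a hypothetical stand-in, not the actual fmt definition. It only assumes that the old layout stores flags and color payloads as separate members, and that the new one keeps everything in a single 64-bit word.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; these are NOT the fmt types.
namespace sketch {

// Unpacked layout: a flag plus a payload per color.
struct unpacked_color {
  bool is_rgb;          // flag, followed by padding on typical ABIs
  std::uint32_t value;  // RGB value or terminal color code
};

// Unpacked style: more flags, an emphasis byte, and two colors.
struct unpacked_style {
  bool has_foreground;
  bool has_background;
  std::uint8_t emphasis;
  unpacked_color foreground;
  unpacked_color background;
};

// Packed layout: all of the above squeezed into one 64-bit word
// (assumed field placement is left unspecified here).
struct packed_style {
  std::uint64_t bits;
};

// On platforms with 8-bit bytes the packed form is exactly 8 bytes.
static_assert(sizeof(packed_style) == 8, "one 64-bit word");
// Flags and alignment padding make the unpacked form strictly larger.
static_assert(sizeof(unpacked_style) > sizeof(packed_style),
              "separate members cost more than one packed word");

}  // namespace sketch
```

On common 64-bit ABIs the unpacked struct in this sketch comes out at 20 bytes because of alignment padding, and an 8-byte trivially copyable struct is small enough to travel in a single general-purpose register under the usual calling conventions.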

-> text_style {
return lhs |= rhs;
}

FMT_CONSTEXPR auto operator==(text_style rhs) const noexcept -> bool {
Contributor Author

operator== and operator!= should really be friends, but I ran into this GCC bug when doing that.

// bit is impossible in that case, because 00 (unset color) means the
// 24 bits that precede the discriminator are all zero.
//
// This test can be applied to both colors simultaneously.
Contributor Author

This is a very bit-hacky algorithm; I hope the detailed explanation compensates for that.

Contributor

vitaut left a comment

Thanks for the PR! Overall LGTM, just one minor comment inline.

return (value_ & (1 << 25)) != 0;
}

FMT_CONSTEXPR auto get_value() const noexcept -> uint32_t {
Contributor

I suggest dropping get_ here.

Comment on lines +270 to +279
// We do that check by adding the styles. Consider what adding does to each
// possible pair of discriminators:
// 00 + 00 = 000
// 01 + 00 = 001
// 11 + 00 = 011
// 01 + 01 = 010
// 11 + 01 = 100 (!!)
// 11 + 11 = 110 (!!)
// In the last two cases, the ones we want to catch, the third bit——the
// overflow bit——is set. Bingo.
Contributor

Clever =)
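
The addition trick from the quoted comment can be sketched in isolation for a single color. This is an illustration under stated assumptions, not the code from the PR: `make_color` and `conflicts` are hypothetical helpers. The 2-bit discriminator is assumed to sit at bits 24 and 25, directly above a 24-bit payload, with bit 26 acting as the overflow bit. Discriminator `00` is assumed to mean "unset" with an all-zero payload, as the comment states.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; NOT the code from the PR.
namespace sketch {

constexpr std::uint32_t payload_mask = 0xFFFFFFu;  // 24 payload bits
constexpr std::uint32_t disc_shift = 24;           // discriminator at bits 24-25
constexpr std::uint32_t overflow_bit = 1u << 26;   // third discriminator bit

// Builds a packed color from a 2-bit discriminator (0, 1 or 3) and a payload.
constexpr std::uint32_t make_color(std::uint32_t disc, std::uint32_t payload) {
  return (disc << disc_shift) | (payload & payload_mask);
}

// True when adding the two packed colors sets the overflow bit, which per the
// table happens only for the discriminator pairs 11 + 01 and 11 + 11.
constexpr bool conflicts(std::uint32_t lhs, std::uint32_t rhs) {
  return ((lhs + rhs) & overflow_bit) != 0;
}

// 01 + 01 stays below the overflow bit even when the payloads carry.
static_assert(!conflicts(make_color(1, 0xFFFFFF), make_color(1, 0xFFFFFF)),
              "01 + 01 with carry gives 011");
// 11 + 00 cannot carry, because an unset color has an all-zero payload.
static_assert(!conflicts(make_color(3, 0xFFFFFF), make_color(0, 0)),
              "11 + 00 gives 011");
// The two pairs the check is meant to catch.
static_assert(conflicts(make_color(3, 0), make_color(1, 0)), "11 + 01");
static_assert(conflicts(make_color(3, 0), make_color(3, 0)), "11 + 11");

}  // namespace sketch
```

The same mask test extends to two colors packed into one word by OR-ing a second overflow bit into the mask, which is what "applied to both colors simultaneously" suggests. The exact bit positions for the second color are not shown in the excerpts here, so they are left out of the sketch.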

Contributor

vitaut commented Feb 26, 2025

One potential concern is the ABI but it seems to be OK because all the functions involved are inline or templated.
