-
-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a feature flag to disable extra debug info #595
Comments
This allows - More optimization (smallest final binary size) - No internal code structure leak (no enum name leak).
The PR is ready: #596 |
Using a feature flag to control this is a non-starter since setting the feature would infect all other uses of |
As mentioned in #596, you can find a minimal reproducible code below. The code in question is in the fn main() {
let bytes = bincode::encode_to_vec(Foo::Bar, bincode::config::standard()).map_err(|_| ()).unwrap();
println!("{bytes:?}");
let (_, _): (Foo, _) = bincode::decode_from_slice(&bytes, bincode::config::standard()).unwrap();
}
#[derive(bincode::Encode, bincode::Decode)]
enum Foo {
Bar
} Which gives ( $ cargo build --release
$ strip errors-and-strings
$ strings errors-and-strings | grep "Foo"
Foolibrary/alloc/src/raw_vec.rscapacity overflowlibrary/alloc/src/ffi/c_str.rs You can also check that with the #![feature(prelude_import)]
#[prelude_import]
use std::prelude::rust_2021::*;
#[macro_use]
extern crate std;
fn main() {
let bytes = bincode::encode_to_vec(Foo::Bar, bincode::config::standard())
.map_err(|_| ())
.unwrap();
{
::std::io::_print(
::core::fmt::Arguments::new_v1(
&["", "\n"],
&[::core::fmt::ArgumentV1::new_debug(&bytes)],
),
);
};
let (_, _): (Foo, _) = bincode::decode_from_slice(
&bytes,
bincode::config::standard(),
)
.unwrap();
}
enum Foo {
Bar,
}
impl ::bincode::Encode for Foo {
fn encode<__E: ::bincode::enc::Encoder>(
&self,
encoder: &mut __E,
) -> core::result::Result<(), ::bincode::error::EncodeError> {
match self {
Self::Bar => {
<u32 as ::bincode::Encode>::encode(&(0u32), encoder)?;
Ok(())
}
}
}
}
impl ::bincode::Decode for Foo {
fn decode<__D: ::bincode::de::Decoder>(
decoder: &mut __D,
) -> core::result::Result<Self, ::bincode::error::DecodeError> {
let variant_index = <u32 as ::bincode::Decode>::decode(decoder)?;
match variant_index {
0u32 => Ok(Self::Bar {}),
variant => {
Err(::bincode::error::DecodeError::UnexpectedVariant {
found: variant,
type_name: "Foo",
allowed: &::bincode::error::AllowedEnumVariants::Range {
min: 0,
max: 0,
},
})
}
}
}
}
impl<'__de> ::bincode::BorrowDecode<'__de> for Foo {
fn borrow_decode<__D: ::bincode::de::BorrowDecoder<'__de>>(
decoder: &mut __D,
) -> core::result::Result<Self, ::bincode::error::DecodeError> {
let variant_index = <u32 as ::bincode::Decode>::decode(decoder)?;
match variant_index {
0u32 => Ok(Self::Bar {}),
variant => {
Err(::bincode::error::DecodeError::UnexpectedVariant {
found: variant,
type_name: "Foo",
allowed: &::bincode::error::AllowedEnumVariants::Range {
min: 0,
max: 0,
},
})
}
}
}
} In the same idea, your debug file in impl :: bincode :: Decode for Foo
{
fn decode < __D : :: bincode :: de :: Decoder > (decoder : & mut __D) ->
core :: result :: Result < Self, :: bincode :: error :: DecodeError >
{
let variant_index = < u32 as :: bincode :: Decode > :: decode(decoder)
? ; match variant_index
{
0u32 => Ok(Self :: Bar {}), variant =>
Err(:: bincode :: error :: DecodeError :: UnexpectedVariant
{
found : variant, type_name : "Foo", allowed : & :: bincode ::
error :: AllowedEnumVariants :: Range { min : 0, max : 0 }
})
}
}
} impl < '__de > :: bincode :: BorrowDecode < '__de > for Foo
{
fn borrow_decode < __D : :: bincode :: de :: BorrowDecoder < '__de > >
(decoder : & mut __D) -> core :: result :: Result < Self, :: bincode ::
error :: DecodeError >
{
let variant_index = < u32 as :: bincode :: Decode > :: decode(decoder)
? ; match variant_index
{
0u32 => Ok(Self :: Bar {}), variant =>
Err(:: bincode :: error :: DecodeError :: UnexpectedVariant
{
found : variant, type_name : "Foo", allowed : & :: bincode ::
error :: AllowedEnumVariants :: Range { min : 0, max : 0 }
})
}
}
}
Our objective is with a trigger (e.g. like here, a feature) we want to disable any enum name leak, and if it propagates to any crate using
For us the issue is regarding optimization (binary size, which will be increased for each enum in the code with the |
Again, I will need to see concrete data on how much this affects overall binary size before I would consider this to be a bloat issue. Bincode is a general purpose library and not designed for situations where you need absolutely every string possible removed from the final binary. Even on microcontrollers, which I do work on, this is not an issue. As for the leaking of enum names being a problem. Please elaborate on how this in particular is an issue. Rust crates work by building against the source of other crates. The source would also leak the enum name. |
Additionally try throwing away the error from your decode |
Indeed it may not be a problem for most of the use cases. What make me afraid is that the added code is proportional to the number of
For my use case, I build a proprietary binary (top of the dependency tree), which is not meant to be built against any other crate .
Even without used any unwrap, the strings are still even in release mode. I can get rid of them in some cases with extra compiler optimization, but it do not work every time. |
Realistically your enum variant names shouldn't be more than 20 characters. If you have an application that has over 100 enums, how much of an impact is 2kb of your final binary size going to matter? Especially because all of your log lines will probably be a significantly larger impact on your binary size. And again, you can get around this by simply dropping the error from any bincode invocation. Making something like enum YourError {
Bincode,
// ...
}
let _ = bincode::decode_from_slice(..).map_err(|_| YourError::Bincode)?; In combination with LTO will get rid of all information you're worried about leaking. If you're not using logging or custom error types, I can highly recommend adding that, especially from the importance of the application that it sounds like you're making.
Is bincode the only crate you're building against? There are many, many good crates out there that will make your life a lot easier. I can highly recommending using more of them. |
We tested your solution, and in a minimal environment it works, but in our more complex and main project it doesn't seem to work. Maybe LTO is failing to optimize this in a more complex project? In our project we use other crate (than bincode), but we made sure there was no information leak and maximum optimization. When not possible, we just use a feature flag to allow more optimization, since all cases of information leaks we found were for debug purpose. So, we just made a feature flag to enable maximum optimization (losing debug information), and when developing we just have everything we need. For us it will be painful to have to maintain a parallel version for a small line of code, do you think it's possible to find a solution, or there is no other option for us? Maybe a more generic flag could work, by also solving other issues like #594 (like |
#594 is not a optimization issue, but an issue about cluttering the As Zoey said, feature flags are not a good solution for this, and I don't see us merging any PR that adds a feature that disables functionality. |
I was more thinking about a "release" mode where you don't have to output generated code in files (#594), and optimize at a maximum binary size and internal structure leak (this issue). But I understand your point.
So for you, is there any solution other than have to maintain our own version of bincode for this single line change? |
I would argue that if the security and obfuscation of your proprietary application is this important all of your dependencies should be vendored and possibly patched. This level of binary obfuscation is not a common requirement in most applications. |
Ok, I understand. Thanks for your help |
The issue is in bincode_derive crate, where we are adding extra debug info that are leaking enum names in the final binary (which imply less optimization with a bigger binary size, and internal structure leak). I think we should use a proper feature flag to disable this. I'm working on a pull request to address that issue.
It seems there is only one leak in the code in this part of the code. I agree this debug line can be important, but should be optional for release build if required. A feature flag could be used to address that situation, and maybe it is useful for other cases like in #594.
The text was updated successfully, but these errors were encountered: