Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINIDUMP_CALLBACK_INPUT packing/alignment problem #460

Closed
qrnch-jan opened this issue May 4, 2021 · 33 comments
Closed

MINIDUMP_CALLBACK_INPUT packing/alignment problem #460

qrnch-jan opened this issue May 4, 2021 · 33 comments
Assignees
Labels
broken api An API is inaccurate and could lead to runtime failure

Comments

@qrnch-jan
Copy link

qrnch-jan commented May 4, 2021

I'm running MiniDumpWriteDump() in Rust with a callback. Accessing the fields of MINIDUMP_CALLBACK_INPUT yields "garbage" values (clearly wrong, but consistent, values). There's a noticeable size discrepancy between the C and Rust structs; it's 1000 in Rust, while it's 1312 in C. The offset of the CallbackType member is 12 in C and 16 in Rust. The Rust binding does not include a packed(4).

The entire include file is apparently wrapped in pack(4), but other types in the Rust bindings are missing packed(4), like MINIDUMP_CALLBACK_OUTPUT.

@sotteson1
Copy link
Contributor

That pack attribute is there in the metadata. Is this a Rust projection specific issue?

@kennykerr
Copy link
Contributor

I don't see it in the latest published version. Perhaps its not yet been released?

image

@sotteson1
Copy link
Contributor

Weird. I don't know what would have changed in more recent builds for this to show up. I'm about due for a release anyway.

And I just noticed my per-arch difference checking didn't notice that pack 4 is on the x64/arm64 but not x86. That might have something to do with it.

@mikebattista mikebattista added the broken api An API is inaccurate and could lead to runtime failure label May 7, 2021
@AraHaan
Copy link

AraHaan commented Jun 22, 2021

Weird. I don't know what would have changed in more recent builds for this to show up. I'm about due for a release anyway.

And I just noticed my per-arch difference checking didn't notice that pack 4 is on the x64/arm64 but not x86. That might have something to do with it.

You sure that that struct is not packed(4) inside the api header on all arch's windows supports and said api (even on 32-bit arm)?

@qrnch-jan
Copy link
Author

Are there any updates on this issue? The windows 0.23.0 crate seems to still have this issue.

@riverar
Copy link
Collaborator

riverar commented Feb 9, 2022

@qrnch-jan @AraHaan This seems to be working great, can we close this issue? I've verified it's annotated correctly in metadata (17.0.15.55535) as well.

[dependencies.windows]
version = "0.32.0"
features = [
    "Win32_Foundation",
    "Win32_Graphics_Dxgi",
    "Win32_Storage_FileSystem",
    "Win32_System_Kernel",
    "Win32_System_Memory",
    "Win32_System_Diagnostics_Debug",
]
#[test]
fn test_minidump_callback_input_size() {
    unsafe {
        assert_eq!(std::mem::size_of::<MINIDUMP_CALLBACK_INPUT>(), 1312);
    }
}

@qrnch-jan
Copy link
Author

The sizes match, though the main problem I was having is still having an issue -- but I need an extra pair of eyes, just to make sure I'm not reading the results wrong.

The issue is that I'm getting some very odd values for CallbackType in Rust code.

This is from a C program which I'm using to sanity-check the Rust code's output:

BOOL CALLBACK MyMiniDumpCallback(
  PVOID pParam,
  const PMINIDUMP_CALLBACK_INPUT pInput,
  PMINIDUMP_CALLBACK_OUTPUT pOutput)
{
  BOOL bRet = FALSE;
  int i, j;
  const uint8_t *p;
  unsigned offs;

  /* ... */

  offs = (uint8_t*)(&(pInput->CallbackType)) - (uint8_t*)pInput;
  printf("%p   %d      %u    offs=%u\n", pInput, (int)pInput->CallbackType,
      (unsigned)sizeof(MINIDUMP_CALLBACK_INPUT), offs);

I.e. it calculates the offset to the CallbackType member. When run, offs is 12.

This Rust code:

unsafe extern "system" fn minidump_callback(
  param: *mut c_void,
  input: *const MINIDUMP_CALLBACK_INPUT,
  output: *mut MINIDUMP_CALLBACK_OUTPUT
) -> BOOL {
  let param = &mut *(param as *mut MDCContext);

  // ...

  let memberaddr = &(*input).CallbackType as *const _;
  let offs = memberaddr as u64 - input as u64;

  println!(
    "input={:p}    {}    calltypeoffs={}",
    input,
    std::mem::size_of::<MINIDUMP_CALLBACK_INPUT>(),
    offs
  );

... outputs that calltypeoffs is 16.

To me, this looks like the C structure is pack(4), while the Rust version is not (and indeed, inspection of the relevant mod.rs indicates this to be the case). However, that made a lot more sense when there was a size difference. Now that the sizes match, I sort-of expected the offset to be right .. but I guess there's some padding throwing me for a loop.

I'm guessing that marking the relevant structs as pack(4) in mod.rs will solve this issue.

@riverar
Copy link
Collaborator

riverar commented Feb 9, 2022

Got it. I took another look and I must have looked at *_INFORMATION and not *_INPUT. It is indeed missing the packing attribute in metadata. Will get that patched up now.

Along with MINIDUMP_CALLBACK_OUTPUT, MINIDUMP_DIRECTORY, MINIDUMP_FUNCTION_TABLE_STREAM, and friends.

@riverar
Copy link
Collaborator

riverar commented Feb 9, 2022

@mikebattista @sotteson1 Do you remember what the final outcome of #418 #521 were? That is, what do we do with structures that don't have architecture-specific variants? Target x64 (pack=4) and downstream x86 generators just ignore it? Or do we invent a new type? Or am I discovering the reason this issue is still open? 😆

@tim-weis
Copy link
Contributor

tim-weis commented May 7, 2022

This issue just hit another client.

@kennykerr
Copy link
Contributor

What's the status of this old issue? There's another question about the type of the callback field.

microsoft/windows-rs#1892

@kennykerr
Copy link
Contributor

This looks like a duplicate of #1744. The issue is still there in the latest metadata (24.0.1-preview).

Originally posted by @tim-weis in microsoft/windows-rs#1892 (comment)

@mikebattista
Copy link
Collaborator

@mikebattista @sotteson1 Do you remember what the final outcome of #418 #521 were? That is, what do we do with structures that don't have architecture-specific variants? Target x64 (pack=4) and downstream x86 generators just ignore it? Or do we invent a new type? Or am I discovering the reason this issue is still open? 😆

@sotteson1 / @AArnott for comment.

@AArnott
Copy link
Member

AArnott commented Jul 12, 2022

I don't understand the question, or it's been resolved long ago. Structs that are the same for all architectures are declared once in the metadata without special attributes. When a struct exists for only a subset of architectures or is declared differently across architectures, it is declared with attributes to indicate which architectures this particular declaration belongs to. The struct may be declared multiple times with unique architecture attributes. In such a case the type may be declared multiple times in the exact same namespace and with the same name, which means projection should take care to select the right declaration based on the target architecture.

@kennykerr
Copy link
Contributor

@AArnott's description is consistent with how Rust treats metadata. So I'm guessing there's something else going on here. If someone could clarify that would be great.

@riverar
Copy link
Collaborator

riverar commented Jul 12, 2022

#include <pshpack4.h> is in use in minidumpapiset.h therefore it appears all structures in this file need pack=4 unconditionally. Some are missing in metadata (e.g. #460 (comment)).

This gets a little hairy for the architecture-specific structures; not sure if the current tooling supports overriding the pack across all those. (The architecture-specific split is still needed as some structures are missing fields, for example the Pad member of MINIDUMP_THREAD_CALLBACK on ARM64.)

@tim-weis
Copy link
Contributor

Structs that are the same for all architectures are declared once in the metadata without special attributes.

That doesn't sound right. MINIDUMP_CALLBACK_INPUT shares the same tokens in source code across architectures, but isn't actually the same. Specifically, it's ignoring the structure packing. While 4 is the natural alignment for 32-bit targets, it isn't for 64-bit targets. That information needs to make its way into the metadata.

To illustrate the issue here's both a C++ and Rust implementation that should produce identical output:

main.cpp

#include <Windows.h>
#include <DbgHelp.h>

#include <cstddef>
#include <format>
#include <cstdio>

int main()
{
    std::puts(
        std::format("Offset of ProcessHandle: {}",
                    offsetof(MINIDUMP_CALLBACK_INPUT, ProcessHandle)).c_str()
    );
}

Compiling and running this from an x64 Visual Studio command prompt yields the following output:

Offset of ProcessHandle: 4

Cargo.toml

[package]
name = "minidump"
version = "0.0.0"
edition = "2021"

[dependencies.windows-sys]
version = "0.36.1"
features = [
    "Win32_System_Diagnostics_Debug",
    "Win32_Foundation",
    "Win32_Storage_FileSystem",
    "Win32_System_Kernel"
]

[dependencies]
memoffset = "0.6.5"

main.rs

use windows_sys::Win32::System::Diagnostics::Debug::MINIDUMP_CALLBACK_INPUT;

fn main() {
    println!(
        "Offset of ProcessHandle: {}",
        memoffset::offset_of!(MINIDUMP_CALLBACK_INPUT, ProcessHandle)
    );
}

cargo run --target x86_64-pc-windows-msvc produces the following output:

Offset of ProcessHandle: 8

Using the latest windows crate (0.38.0) produces the same output as the version based on windows-sys (0.36.1).

@kennykerr
Copy link
Contributor

Thanks for the repro! So the issue is that MINIDUMP_CALLBACK_INPUT is missing an alternative representation in metadata for non-x86 architectures to accurately capture the unique packing on those architectures. Once that's provided in the win32 metadata, then the languages can correctly represent this structure.

@kennykerr
Copy link
Contributor

@sotteson1 / @mikebattista is this something we can quickly rectify? This issue has been open for some time and is clearly causing problems for a number of customers.

@tim-weis
Copy link
Contributor

Another note on the previous repro that touches on an issue that was discussed here earlier: Adding output for the size of the MINIDUMP_CALLBACK_INPUT structure, i.e.

    std::puts(
        std::format("Size of MINIDUMP_CALLBACK_INPUT: {}",
                    sizeof(MINIDUMP_CALLBACK_INPUT)).c_str()
    );

and

    println!(
        "Size of MINIDUMP_CALLBACK_INPUT: {}",
        std::mem::size_of::<MINIDUMP_CALLBACK_INPUT>()
    );

produces identical output

Size of MINIDUMP_CALLBACK_INPUT: 1312

I would have expected the size of the Rust structure to be larger due to the wider structure packing, so maybe something else is wrong in addition to the missing structure packing attribute (as @riverar noted here).

@kennykerr
Copy link
Contributor

Once we have the correct metadata, I'm happy to add cross-arch tests to validate this is Rust.

@riverar
Copy link
Collaborator

riverar commented Jul 12, 2022

Packing aside.

The current architecture neutral MINIDUMP_CALLBACK_INPUT points to x64-specific MINIDUMP_THREAD_CALLBACK. One could argue this is correct given the metadata targets x64, but I don't think the Rust side is expecting this relationship. We should probably instead ensure architecture-specific structs are generated if any members have architecture-specific variants. (i.e. The winmd should have three architecture-specific variants of MINIDUMP_CALLBACK_INPUT.)

There's a fun bag of issues here!

@kennykerr
Copy link
Contributor

That sort of problem is not new: #725

@AArnott
Copy link
Member

AArnott commented Jul 12, 2022

The metadata contains typeref's from neutral APIs to arch-specific APIs, yes. In CsWin32 we always resolve a typeref 'loosely' by checking whether the referenced type is arch-specific and if so, we may look for another declaration of that type with the right architecture on it. Metadata simply isn't great at representing multiple architectures in a single file.
FWIW though, it is useful and will become even more useful going forward in CsWin32 for the metadata to be willing to declare neutral structs that depend on arch-specific structs. I would not want to see an otherwise neutral struct be declared as arch-specific just because it has a transitive dependency on an arch-specific one. CsWin32 handles this and we'd lose information in the metadata if such a change was made.

The use case I'm thinking of is a neutral struct with a field that is a pointer to an arch-specific struct. When C# targets AnyCPU, we should be able to emit the neutral struct just fine, and simply leave the field pointer as IntPtr or void* instead of using ArchSpecific* as the field type. That makes C# AnyCPU (which is by far the common case) much more useful.

@kennykerr
Copy link
Contributor

Well, @sotteson1 has already said he wants to fix such instances rather than calling it a feature.

Yep, it's a bug in the metadata we can fix. I should be able to come up with a test that prevents this kind of thing in the future.

#725 (comment)

A lot of languages (probably most) generate code ahead of time so the kind of fancy "loose typing" you employ in C# isn't necessarily possible for everyone. Either way, this needs to be fixed and ideally done so consistently.

@AArnott
Copy link
Member

AArnott commented Jul 12, 2022

The loose typing I'm talking about isn't a C# or .NET runtime feature. It's something I do at projection time, and I would expect any projection can do it. The only requirement is that the projection know which CPU architecture is being targeted and that if the project is targeting multiple CPU architectures that unique projections be created for each. But that's a requirement regardless.

I think this issue should focus on the packing/alignment problem, which if I read correctly comes from the C parser we're using not recognizing packing settings that don't appear in syntax on the struct itself. Discussing redesigning how arch-specific structs are represented in metadata should probably go in a different issue.

@riverar
Copy link
Collaborator

riverar commented Jul 12, 2022

I would not want to see an otherwise neutral struct be declared as arch-specific just because it has a transitive dependency on an arch-specific one. CsWin32 handles this and we'd lose information in the metadata if such a change was made.

Hm, what information would you lose?

It seems weird that we would do all the work to generate architecture-specific structures and then leave them floating around with no references, instead relying on downstream bindings to "determine if type reference is pointing to the wrong arch and automatically tidy up".

@AArnott
Copy link
Member

AArnott commented Jul 12, 2022

what information would you lose?

Consider that struct A is neutral but for a field with a pointer to struct B which is arch-specific. Today, struct A is described in metadata as neutral. If instead the metadata declared A as arch-specific because it references B which is arch-specific, we lose the data that A is actually itself neutral and therefore can be safely represented in an AnyCPU targeted assembly.

@sotteson1
Copy link
Contributor

Hey @tannergooding, we're noticing that these two structs from minidumpapiset.h are wrapped by #include <pshpack4.h>, but only one of them is marked with [StructLayout(LayoutKind.Sequential, Pack = 4)] by ClangSharp. Any idea what might be causing that?

typedef struct _MINIDUMP_VM_POST_READ_CALLBACK
{
...
} MINIDUMP_VM_POST_READ_CALLBACK, *PMINIDUMP_VM_POST_READ_CALLBACK;

typedef struct _MINIDUMP_CALLBACK_INPUT {
...
}

(This looks the same for x86, x64 and arm64)

    [StructLayout(LayoutKind.Sequential, Pack = 4)]
    public unsafe partial struct MINIDUMP_VM_POST_READ_CALLBACK
    {
...
    }

    public partial struct MINIDUMP_CALLBACK_INPUT
    {
....
    }

@tannergooding
Copy link
Member

I can take a look later today.

I'd guess this is a case where Clang isn't surfacing the relevant information because it matches the expected default alignment -or- potentially that its only present for say 64-bit but the header is being processed as 32-bit or vice-versa.

@sotteson1
Copy link
Contributor

sotteson1 commented Sep 8, 2022

I can take a look later today.

I'd guess this is a case where Clang isn't surfacing the relevant information because it matches the expected default alignment -or- potentially that its only present for say 64-bit but the header is being processed as 32-bit or vice-versa.

Sorry, I found an existing issue in ClangSharp for this @tannergooding. It looks like this sometime happens for 32-bit but never for 64-bit. Since I already scan some headers in multiple architectures, I can for this one and get the correct packing from the 64-bit one.

@sotteson1
Copy link
Contributor

Fixed with #1233

@riverar
Copy link
Collaborator

riverar commented Sep 9, 2022

@sotteson1 There's still the issue of arch-neutral MINIDUMP_CALLBACK_INPUT referencing arch-specific types (e.g. MINIDUMP_THREAD_CALLBACK). Can create a new issue for that, though some of the discussion occurred above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broken api An API is inaccurate and could lead to runtime failure
Projects
None yet
Development

No branches or pull requests

9 participants