-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compiler can silently generate a bad image when there's 0xFFFF parameters in the module #94892
Comments
Michal pointed at the similarity with his old issue: dotnet/coreclr#20865 |
I found a similar error but with fields: #94890 |
@FSDKO yes, everything that is represented as a “range” in metadata (Fields, Methods, Parameters) has this issue. |
Tagging subscribers to this area: @dotnet/area-system-reflection Issue DetailsVersion Used: Steps to Reproduce: Started as an investigation of jbevain/cecil#913
using System.Reflection;
var a = Assembly.LoadFile(typeof(P3).Assembly.Location);
foreach (var t in a.GetTypes())
{
Console.WriteLine(t.FullName);
foreach (var m in t.GetMethods(BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance | BindingFlags.Static))
{
Console.WriteLine(m.Name);
foreach (var p in m.GetParameters())
{
Console.WriteLine(p.Name);
}
}
}
Expected Behavior:
Actual Behavior:
Note that the last row of the
|
Tagging subscribers to this area: @dotnet/area-system-reflection-metadata Issue DetailsVersion Used: Steps to Reproduce: Started as an investigation of jbevain/cecil#913
using System.Reflection;
var a = Assembly.LoadFile(typeof(P3).Assembly.Location);
foreach (var t in a.GetTypes())
{
Console.WriteLine(t.FullName);
foreach (var m in t.GetMethods(BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance | BindingFlags.Static))
{
Console.WriteLine(m.Name);
foreach (var p in m.GetParameters())
{
Console.WriteLine(p.Name);
}
}
}
Expected Behavior:
Actual Behavior:
Note that the last row of the
|
Probably the same as #94890, but with parameters. |
@AlekseyTs There was in the past interest to work around this in the Roslyn compiler itself by generating a dummy field/method. If we do a file format change to fix this file format bug, it would very likely not be backported to desktop CLR. |
I do not think compilers would want to do that. In my opinion, if |
It would also be interesting to see how native metadata writer was dealing with the situation. |
It doesn't. The linked PR (dotnet/coreclr#20865) that didn't merge has the details. The file format spec has allowance to represent this situation but neither the writers nor the readers implement it. If we fix this bug by implementing the spec, it would be netcore only. That's why the referenced comment from Tomas calls for a separate Roslyn issue to emit a workaround method/field instead. |
Implementing the workaround in |
All of our metadata writers give callers full control over emitted tokens. Emitting extra tokens behind user's back would likely just replace a bug with a new bug. The ECMA-335 metadata formats are used for things that are not .NET programs (winmd, toc files, IBC profile data, coff object files to be used with link.exe and metadata merging etc.). The metadata writer doesn't know what it's s actually emitting and how legal a dummy would be. I don't think it can emit a dummy. Emitting a dummy set of tokens from S.R.Metadata behind the back may even break Roslyn's EnC scenarios (but I'm not authoritave on EnC). |
Isn't not emitting the dummy definitely wrong, for any kind of output? I am specifically talking about scenarios when there is a need to refer past the end of the table. Do we know of any scenario for which not emitting a dummy is desired? BTW, EnC scenario, is compiler scenario as well. |
The proper fix is to implement the file format spec which allows representing this situation differently. It is what the linked PR was trying to do in the native writer. There would be a similar fix for S.R.Metadata reader and writer. It would be netcore only. Roslyn would need to emit a workaround token if it still cares about desktop because desktop is very unlikely to get a fix in its reader. For desktop CLR, such netcore-only fix wouldn't make things any worse than they are now because without the fix, we would have a record that is not readable, and with the fix, we would have a record that is not readable either. (See the PR description on how that's achieved.) |
It looks like we are going in circle. The questions that I am asking are not getting answered. At the moment, I am not convinced that pushing the fix to compilers is the right thing to do. And the fix in S.R.Metadata looks like a viable alternative to me. I am happy to continue discussion off-line (Teams meeting/call, etc). |
I know what you mean - you want to replace a bug in S.R.Metadata (emitting assemblies that are not readable) with another bug (emitting assemblies that have tokens the user didn't ask for). You are asking for this because Roslyn doesn't care about the other bug (if we rule out EnC is not affected, which I'm not sure about). S.R.Metadata does care about the other bug too. Such fix would not be acceptable for the other non-Roslyn scenarios where S.R.Metadata is used. We could consider a flag to metadata writer called "EmitWorkaroundTokensThereMaybeBugs" to allow emitting workaround tokens. If Roslyn opts into this and runs into bugs due to that (netmodules generated like that might not be mergable with link.exe due to duplicate workaround methods, for example), it will be Roslyn bugs. I don't think it fixes much. |
If you are proposing going with a format change instead and compilers emitting a dummy token only for the benefit a legacy desktop metadata reader, That is probably an option. But, then, I think the S.R.Metadata should be fixed first, and then we in Compilers can decide if we care about the legacy case, which is broken today anyway. |
I do not think I agree with this interpretation. Or, if we call the workaround "another bug". I do not see why "another bug" in Compilers is better than "another bug" in S.R.Metadata.
I asked for a single concrete example that would support this statement. So far, none were given. Until then, I remain very skeptical that this is actually the case and assume that the statement is false. |
I don't believe the contract of the C# compiler is to generate 1:1 type/method mapping with the input C# source code or there would be no
I gave concrete examples where the ECMA-335 file format is used for purposes that are not ".NET program" above ("winmd, toc files, IBC profile data, coff object files"). The file format is a convenient way to represent various things that are not "executable .NET programs" and we cannot make an assumption that adding an extra type would be harmless. We already have uses within Microsoft where this is invalid or requires special care and that doesn't count customer uses that we don't know about. WinMD is a file format based on ECMA-335. A dummy that is valid to emit in a WinMD looks different from a dummy for a .NET program (I don't actually know a dummy would look so that it doesn't upset tools that read WinMD). S.R.Metadata writer doesn't have much awareness about whether it's emitting WinMD. WinMD is just an example where someone took a .NET file format and used it for non-.NET things. A TOC file looks like an assembly, but it's an input to a compiler, not a runnable thing. It provides information about how an assembly got compiled into a shared native library. It contains a subset of an existing assembly that got compiled to native code. If this subset happens to hit the 0xFFFF case and the metadata writer inserts a type/method that didn't exist in the original assembly (original that had more than 0xFFFF methods and didn't need a dummy), the TOC no longer matches the input assembly. Whether the reader will be able to handle it or crash is up to chance. |
None of these are concrete examples. A concrete example is a specific scenario. This is what we do today and it works great. This is how it would look like with the dummy and this is how it breaks. With details where it breaks and why that part cannot be adjusted to not break. |
Even if we decide that the behavior of Note, at the moment, I am not even convinced that the suggested workaround (with the dummy) is appropriate for compiler scenarios. That is another reason why I am not comfortable with implementing it in compilers. For example, when a .NET module is produced, compiler is constrained in terms of what extra types it embeds. Presence of those can cause unexpected failures when multimodule assemblies are produced, or multiple modules are merged into a single assembly. |
Another example where S.R.Metadata injecting random things into the output would likely break is the https://github.com/microsoft/win32metadata project. It's another example of a project using ECMA-335 file format to store things that are not executable .NET programs and have its own rules (like visibility/accessibility being meaningless because this format pretty much encodes a C header file). |
I guess what I am still having a hard time to understand is how already broken scenarios (one cannot read the metadata back) can be broken by a fix. The proposed workaround might not be considered as a "fix" for them (i.e. they remain broken, but in a different way), that I can understand. |
They may not be broken. The ECMA-335 file format does have a provision for this, it's just S.R.Metadata and the native reader/writer in CoreCLR wasn't implemented to support what the spec says. Win32Metadata is used by multiple other projects. E.g. one of them is a Rust repo that has a metadata reader written in Rust. Maybe that one implements the spec and would handle it. If a reader implements the spec, it would work. Spec has a provision for it (put 0 for the method list instead of num_methods+1). We'd need to update our readers to support this because they failed to implement the spec. |
Could you provide a specific quote and where the provision can be found? For example, I was able to find the following in ECMA-335 "II.22.37 TypeDef : 0x02" section. And it looks like putting 0 would be a violation of the spec.
|
"Indexes to tables begin at 1, so index 1 means the first row in any given metadata table. (An index |
Combined with "18. MethodList can be null or non-null" |
Thank you for the quotes. Well, I am not surprised that we ended up with the current implementation. Different sections of the spec are, essentially, contradict each other. Or, perhaps, some of the sections are incomplete, while they could easily be interpreted as specifying complete set of possible values. |
Since the issue with 0xFFFF applies to parameters, fields and methods we should just use this issue. Other issues: |
Version Used:
.NET 7.0.400-preview.23225.8
Steps to Reproduce:
Started as an investigation of jbevain/cecil#913
dotnet run
the console project:Expected Behavior:
Should print:
Actual Behavior:
Note that the last row of the
Method
table has aParam
of0x80000000
which is unexpected:The text was updated successfully, but these errors were encountered: