Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make more Opcodes hold a FieldType #40

Closed
wants to merge 14 commits into from

Conversation

C0D3-M4513R
Copy link
Contributor

@C0D3-M4513R C0D3-M4513R commented Jul 31, 2024

I wanted to include a more exhaustive test, but that was prevented by #41

@C0D3-M4513R C0D3-M4513R marked this pull request as draft July 31, 2024 20:45
Currently the test tests that `anewarray` gets parsed correctly
The run-time constant pool item at the index must be a symbolic reference to a class, array, or interface type.
- https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.checkcast
The run-time constant pool item at the index must be a symbolic reference to a class, array, or interface type.
- https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.instanceof
@C0D3-M4513R C0D3-M4513R marked this pull request as ready for review July 31, 2024 21:29
@C0D3-M4513R C0D3-M4513R changed the title Make Multianewarray hold a FieldType Make more Opcodes hold a FieldType Jul 31, 2024
src/bytecode.rs Outdated
Comment on lines 638 to 641
let ty = read_cp_classinfo(code, &mut ix, pool)?;
match FieldType::parse(&ty) {
Ok(ty) => ty,
Err(_) => FieldType::Ty(Ty::Object(ty)),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right to me. Conceptually the classinfo field contains the name of a class, which is different from a field type. For instance if the class happened to be called "B", parsing it as a field type would turn it into FieldType::Ty(Ty::Base(BaseType::Byte)) which is incorrect. On top of that this approach of handling Err and assuming it is an Ty::Object seems not very robust. Err can be returned for any number of reasons and is not a good way to handle control flow. You're basically hacking around the fact that ClassInfo structures contain a different syntax (e.g. java/lang/String) than what field type is intended to parse (Ljava/lang/String;) but this is not a good solution.

Copy link
Contributor Author

@C0D3-M4513R C0D3-M4513R Aug 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to circle back to this, but having to manually parse the Cow of anewarray, instanceof and checkcast for array depths also seems a little strange though

Like imagine casting an Object to an int[].

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah like I said in the comment below, I'm not opposed to putting in a structured type here instead of the Cow. I'm just saying using FieldType is wrong, and it should be a different structured type that precisely represents the possible range of values here.

Copy link
Contributor Author

@C0D3-M4513R C0D3-M4513R Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is it wrong? can you not checkcast from Object to [J or [B?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance if the class happened to be called "B", parsing it as a field type would turn it into FieldType::Ty(Ty::Base(BaseType::Byte)) which is incorrect.

Copy link
Contributor Author

@C0D3-M4513R C0D3-M4513R Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I said I'm not at all opposed to this. I just want the implementation to use types with precision.

I proposed multiple solutions now, but you find none sufficient.
Tbh I do not know what you want from me at this point.

You say you see too many issues with my solution, so is there something I'm not seeing?

Copy link
Contributor Author

@C0D3-M4513R C0D3-M4513R Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also are you saying that the following sample opcode is wrong too? (also taken from #41)

21: checkcast #21 // class "[Ljava/lang/String;"

There's a number of problems with this code, starting with the fact that the class name in these instructions is not encoded with L...;.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code works when you use strip_prefix instead of split_prefix. I added the L...; bit as an extra bit of precaution, just to be sure.

fn main() {
    let ty = "[[[test";
    let mut ty = ty;
    let mut dims = 0;
    while let Some(new_ty) = ty.strip_prefix("[") {
        dims+=1;
        ty=new_ty;
    }
    
    let ty = ty.strip_prefix("L").unwrap_or(ty);
    let ty = ty.strip_suffix(";").unwrap_or(ty);
    
    println!("{ty}, {dims}");
}

Try this java code:

class L {
    public L tryCast(Object o) {
        L l = (L)o;
        return l;
    }
}

Look at what the checkcast instruction generates, the string value that you get, and what your parsing code does to it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my new parsing code (ReferenceType) I'd expect it to correctly resolve to the Object Variant of ReferenceType.

I did miss a couple of cases of FieldType though, that should be replaced with ReferenceType.

@staktrace
Copy link
Owner

I think this PR is fundamentally flawed because you're trying to use a FieldType out of convenience (i.e it represents approximately the things you care about) rather than out of correctness. If you need to have a more strongly typed operand here for the opcodes, I'd suggest creating a new type to represent that more exactly. Let me know if you disagree!

@C0D3-M4513R
Copy link
Contributor Author

I think this is entirely justified.
See the bytecode in #41. You have some types there that entirely match the FieldType.

@staktrace
Copy link
Owner

I think this is entirely justified.

Are you referring to your PR, or my response?

See the bytecode in #41. You have some types there that entirely match the FieldType.

But would those types (in the case where they are valid) be used for these opcodes? And even if there is some overlap, I wouldn't want to use the type unless it is semantically equivalent, which it doesn't seem like it is. In my experience reusing code for syntactic/structural reasons usually ends up getting a mistake. I'm happy to explain more in detail if you'd like what I mean by that.

@C0D3-M4513R
Copy link
Contributor Author

@C0D3-M4513R C0D3-M4513R closed this Aug 4, 2024
@C0D3-M4513R C0D3-M4513R reopened this Aug 5, 2024
@C0D3-M4513R C0D3-M4513R requested a review from staktrace August 5, 2024 22:28
@staktrace
Copy link
Owner

I implemented this to my satisfaction in #49.

@staktrace staktrace closed this Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants