sema_asm cannot differentiate between instructions with the same name but different args #1727

welrox · 2024-12-27T14:32:43Z

In sema_asm.c, we have the following:

bool sema_analyse_asm(SemaContext *context, AsmInlineBlock *block, Ast *asm_stmt)
{
	ASSERT0(compiler.platform.asm_initialized);

	AsmInstruction *instr = asm_instr_by_name(asm_stmt->asm_stmt.instruction);
	if (!instr) RETURN_SEMA_ERROR(asm_stmt, "Unknown instruction");

	// Check arguments
	Expr **args = asm_stmt->asm_stmt.args;
	unsigned expected_params = instr->param_count;
	unsigned arg_count = vec_size(args);
	if (expected_params != arg_count)
	{
		RETURN_SEMA_ERROR(asm_stmt, "Too %s arguments to instruction '%s', expected %d.",
						  expected_params > arg_count ? "few" : "many",
						  instr->name, expected_params);
	}

	...
}

From this, we see that when an asm statement AST node is encountered, its corresponding instruction is looked up by the instruction name, which is then used to do arg checking.

But what if we had several instructions with the same name, but different numbers of arguments? Then it would arbitrarily pick one of them, disregarding the number of args in the AST node. This creates false Too few/many arguments to instruction errors, as it always expects a particular form of the instruction, even when there is a form that matches the given arg count.

The text was updated successfully, but these errors were encountered:

lerno · 2024-12-27T15:32:11Z

There should never be instructions with the same name but different number of arguments. That is what the asm instruction variant is for.

welrox · 2024-12-27T16:01:09Z

It seems to be fairly common in aarch64 at least. For example, you have LDR (immediate) and LDR (literal) which take 3 and 2 arguments despite having the same name. I haven't looked much into instruction variants, but it seems that variants would force you into writing ldr.imm and ldr.lit instead of using ldr for both of them.

lerno · 2024-12-27T17:58:04Z

In the LDR case, isn't that just an address? So it's not different arity in the C3 asm version, because it's treated as a memory location.

welrox · 2024-12-27T19:13:42Z

I gave a bad example in my previous comment, sorry about that.
In aarch64 a lot of instructions have optional arguments, so an instruction can have different arities when written.
Some examples:

ldr x0, [x1]; // LDR immediate, no offset (x0 = [x1])
ldr x0, [x1], 8; // LDR immediate, post-index offset = 8 (x0 = [x1]; x1 = x1 + 8)

add x0, x1, x2; // ADD register (x0 = x1 + x2)
add x0, x1, x2, lsl 4; // ADD register with lshift 4 (x0 = x1 + (x2 << 4))

The problem is that there is currently no way to register an instruction with optional parameters, and if you register the same instruction multiple times you run into the issue described in my original post. So currently the only way to write these instructions is asm string asm("...") statements instead of asm blocks asm {...}, which is not ideal since you wouldn't be able to access variables or have expressions.

lerno · 2024-12-27T19:45:51Z

The add instruction with lsl doesn't fit the grammar any, so the actual way to do that is:

add $x0, $x1 + $x2 << 4;

lerno · 2024-12-27T21:36:13Z

So, it's possible it's needed to patch this, but let's first find the case where it's needed.

welrox · 2024-12-28T20:26:06Z

The thing is add x0, x1, x2, lsl 4 is a single, valid instruction in aarch64 that includes both the add and left shift, while if you do add $x0, $x1, $x2 << 4 in C3 the backend emits two instructions: one for the left shift and one for add. So the point still stands that if you want to have good support for aarch64, you will need a way to encode instructions with optional arguments, unless you are fine with generating extra code to have the same effect.

lerno · 2024-12-28T22:39:54Z

It should support add $x0, $x1 + $x2 << 4 and translate that into add x0, x1, x2, lsl 4

lerno · 2024-12-28T22:40:28Z

It doesn't though, and that is a bug. So please file a bug for that and any similar instructions that should support it.

lerno added the Discussion needed This feature needs discussion to iron out details label Dec 31, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sema_asm cannot differentiate between instructions with the same name but different args #1727

sema_asm cannot differentiate between instructions with the same name but different args #1727

welrox commented Dec 27, 2024

lerno commented Dec 27, 2024

welrox commented Dec 27, 2024

lerno commented Dec 27, 2024

welrox commented Dec 27, 2024

lerno commented Dec 27, 2024

lerno commented Dec 27, 2024

welrox commented Dec 28, 2024

lerno commented Dec 28, 2024

lerno commented Dec 28, 2024

sema_asm cannot differentiate between instructions with the same name but different args #1727

sema_asm cannot differentiate between instructions with the same name but different args #1727

Comments

welrox commented Dec 27, 2024

lerno commented Dec 27, 2024

welrox commented Dec 27, 2024

lerno commented Dec 27, 2024

welrox commented Dec 27, 2024

lerno commented Dec 27, 2024

lerno commented Dec 27, 2024

welrox commented Dec 28, 2024

lerno commented Dec 28, 2024

lerno commented Dec 28, 2024