Name	Name	Last commit message	Last commit date
parent directory ..
.vscode	.vscode
src	src
Cargo.lock	Cargo.lock
Cargo.toml	Cargo.toml
README.MD	README.MD
assembler TODO.md	assembler TODO.md

Assembler

Assembler

Introduction

This is an assembler for the Rusty Virtual Machine. It's designed to be simple to use and easy to understand, yet powerful enough to write complex programs.

Basic usage

To assemble a file:

./assembler my_file.asm

For full usage instructions, run with the --help flag.

Assembly language

$: current address

The $ symbol is a special label that represents the current address in the binary as it's being assembled.
The assembler will replace every $ symbol with the literal current address at the time of the assembly.
Note that $ has no knowledge of runtime stack pointers, so it's undefined behavior to use it inside procedures.

mov1 r1 $

&: unique symbol

The & symbol represents a unique symbol that will be replaced with a unique text string at the time of the assembly.

# Define a unique label
@&

Address literals

Address literals are used to specify a memory address in the binary. Address lierals are 8-byte unsigned integers enclosed in square brackets.

mov1 r1 [1234]

Labels

A label is a compile time symbol that represents a memory address in the binary. Labels are declared in the assembly code using the @ symbol. To export a label, prefix it with @@ instead.
Label names can only contain alphabetic characters and underscores. Also, they must not overwrite any of the reserved assembly instructions or registers.

# Regular label
@my_label

# Exported label
@@my_exported_label

Sections

A section is a distinct of the assembly program, identified by a label that points to the start of the section.

.my_section:
  # The assembly code here is inside of the .my_section section

Some section names are reserved for specific functions:

.text section

The .text section is treated as the entry point of the program. This means that the label text points to the first byte that will be executed by the VM.

.include section

The .include section is used to include other assembly units and uses a special syntax.
The paths of the units to include are speficied as a string literal, optionally preceded by a @@ token to re-export the included unit's contents.

.include:

  # Include the `archlib.asm` file
  "archlib.asm"

  # Include the `stdio.asm` file and re-export its contents
  @@ "stdio.asm"

Re-exporting symbols works as follows:

unit foo.asm exports a @@foo label
unit bar.asm includes and re-exports foo.asm via @@ "foo.asm"
unit foobar.asm includes bar.asm and has access to the foo label

The assembly unit path resolution works as follows:

If the provided path is absolute, stop searching.
Check if the provided path is relative to the current working directory.
Check if the provided path is relative to any specified include paths.

Include paths are directories where the assembler searches for included units. Include paths can be specified through the -L option or via the RUSTYVM_ASM_LIB environment variable.

Function-like macros

A function-like macro is a parametrized compile-time metaprogramming tool that allows you to generate assembly code in-place. Function-like macros are declared thorugh the % prefix and invoked via the ! operator.
Macros work internally by replacing every invocation with the expanded macro body.

# Defining a macro
%my_macro:

  mov1 r1 1
  mov1 r2 2
  mov1 r3 3

# End of macro definition
%endmacro

# Invoking a macro
!my_macro

Macros may accept a number of positional parameters, specified after the macro name.
The parameters are referenced in the macro body by enclosing their name in curly brackets: {arg_name}.

# Defining a macro with arguments
%my_macro arg1 arg2:

  mov1 r1 {arg1}
  mov1 r2 {arg2}

# End of macro definition
%endmacro

# Using a macro with arguments
!my_macro 1 2

To export a macro, prefix it with a double %% instead.

# Exporting a macro
%%my_macro:

  mov1 r1 1
  mov1 r2 2
  mov1 r3 3

# End of macro definition
%endmacro

Inline macros

Inline macros are a non-parametrized compile-time metaprogramming tool that allows you to paste a stream of tokens in-place.
Inline macros are declared in the assembly code through the %- prefix. To expand a constant macro, prefix it with the = symbol.

# Defining an inline macro
%-ZERO: 0

# Expanding an inline macro
mov1 r1 =ZERO

# Note that inline macros can be recursively expanded
%-r1_zero: r1 =ZERO

# This expands to `mov1 r1 0`
mov1 =r1_zero

To export an inline macro, prefix it with %%- instead.

# Exporting an inline macro
%%-ZERO: 0

Inline macros are useful to define compile-time constants and they don't take up program space, unlike static data.

Assembly instructions

Every assembly intruction can be represented as a 1-byte integer code.

By convention, move-like operations treat the first argument as the destination and the second argument as the source.

Arithmetical instructions

Instruction	Description
`iadd`	Add the integer values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the arithmetical flags.
`isub`	Subtract the integer values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the arithmetical flags.
`imul`	Multiply the integer values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the arithmetical flags.
`idiv`	Divide the integer values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the arithmetical flags.
`imod`	Store the remainder of the integer division between the values stored in registers `r1` and `r2` in register `r1`. Update the arithmetical flags.
`fadd`	Add the floating point values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the arithmetical flags.
`fsub`	Subtract the floating point values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the arithmetical flags.
`fmul`	Multiply the floating point values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the arithmetical flags.
`fdiv`	Divide the floating point values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the arithmetical flags.
`fmod`	Store the remainder of the floating point division between the values stored in registers `r1` and `r2` in register `r1`. Update the arithmetical flags.
`inc a`	Increment the integer value stored in the specified register `a`. Update the arithmetical flags.
`inc1 a`	Increment the 1-byte integer value stored at `a`. Update the arithmetical flags.
`inc2 a`	Increment the 2-byte integer value stored at `a`. Update the arithmetical flags.
`inc4 a`	Increment the 4-byte integer value stored at `a`. Update the arithmetical flags.
`inc8 a`	Increment the 8-byte integer value stored at `a`. Update the arithmetical flags.
`dec a`	Decrement the integer value stored in the specified register `a`. Update the arithmetical flags.
`dec1 a`	Decrement the 1-byte integer value stored at `a`. Update the arithmetical flags.
`dec2 a`	Decrement the 2-byte integer value stored at `a`. Update the arithmetical flags.
`dec4 a`	Decrement the 4-byte integer value stored at `a`. Update the arithmetical flags.
`dec8 a`	Decrement the 8-byte integer value stored at `a`. Update the arithmetical flags.

No operation instructions

Instruction	Description
`nop`	Do nothing for this cycle.

Memory instructions

Instruction	Description
`mov a b`	Copy the second register `b` value into the first register `a`.
`mov1 a b`	Copy 1 byte from `b` into `a`.
`mov2 a b`	Copy 2 bytes from `b` into `a`.
`mov4 a b`	Copy 4 bytes from `b` into `a`.
`mov8 a b`	Copy 8 bytes from `b` into `a`.
`push a`	Push the value stored in the specified register `a` onto the stack.
`push1 a`	Push 1 bytes from `a` onto the stack.
`push2 a`	Push 2 bytes from `a` onto the stack.
`push4 a`	Push 4 bytes from `a` onto the stack.
`push8 a`	Push 8 bytes from `a` onto the stack.
`pushsp a`	Increase the stack pointer by `a`.
`pushsp1 a`	Increase the stack pointer by the 1-byte value `a`.
`pushsp2 a`	Increase the stack pointer by the 2-byte value `a`.
`pushsp4 a`	Increase the stack pointer by the 4-byte value `a`.
`pushsp1 a`	Increase the stack pointer by the 8-byte value `a`.
`pop1 a`	Pop 1 byte from the top of the stack and store it at `a`.
`pop2 a`	Pop 2 bytes from the top of the stack and store it at `a`.
`pop4 a`	Pop 4 bytes from the top of the stack and store it at `a`.
`pop8 a`	Pop 8 bytes from the top of the stack and store it at `a`.
`popsp a`	Decrease the stack pointer by `a`.
`popsp1 a`	Decrease the stack pointer by the 1-byte value `a`.
`popsp2 a`	Decrease the stack pointer by the 2-byte value `a`.
`popsp4 a`	Decrease the stack pointer by the 4-byte value `a`.
`popsp8 a`	Decrease the stack pointer by the 8-byte value `a`.

Flow control instructions

Instruction	Description
`jmp a`	Jump to the specified label `a`.
`jmpnz a`	Jump to the specified label `a` if `zf` = zero.
`jmpz a`	Jump to the specified label `a` if `zf` != zero.
`jmpgr a`	Jump to the specified label `a` if `sf` = `of` and `zf` = 0.
`jmpge a`	Jump to the specified label `a` if `sf` = `of`.
`jmplt a`	Jump to the specified label `a` if `sf` != `of`.
`jmple a`	Jump to the specified label `a` if `sf` != `of` or `zf` = 1.
`jmpof a`	Jump to the specified label `a` if `of` = 1.
`jmpnof a`	Jump to the specified label `a` if `of` = 0.
`jmpcr a`	Jump to the specified label `a` if `cf` = 1.
`jmpncr a`	Jump to the specified label `a` if `cf` = 0.
`jmpsn a`	Jump to the specified label `a` if `sf` = 1.
`jmpnsn a`	Jump to the specified label `a` if `sf` = 0.
`call a`	Push the current `pc` onto the stack and jump to the specified label `a`.
`ret`	Pop 8 bytes from the top of the stack and jump to the popped address.

Comparison instructions

Instruction	Description
`cmp a b`	Compare the values stored in the specified registers. If the values are equal, set register `zf` to `1`. Else, set register `zf` to `0`.
`cmp1 a b`	Compare 1 byte from `a` and `b`. If the values are equal, set register `zf` to `1`. Else, set register `zf` to `0`.
`cmp2 a b`	Compare 2 bytes from `a` and `b`. If the values are equal, set register `zf` to `1`. Else, set register `zf` to `0`.
`cmp4 a b`	Compare 4 bytes from `a` and `b`. If the values are equal, set register `zf` to `1`. Else, set register `zf` to `0`.
`cmp8 a b`	Compare 8 bytes from `a` and `b`. If the values are equal, set register `zf` to `1`. Else, set register `zf` to `0`.

Logical bitwise instructions

Instruction	Description
`and`	Perform a bitwise AND between the values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the status flags.
`or`	Perform a bitwise OR between the values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the status flags.
`xor`	Perform a bitwise XOR between the values stored in registers `r1` and `r2`. Store the result in register `r1`. Update the status flags.
`not`	Perform a bitwise NOT on the value stored in register `r1`. Store the result in register `r1`. Update the status flags.
`shl`	Perform a bitwise left shift on the value stored in register `r1` by the value stored in register `r2`. Store the result in register `r1`.
`shr`	Perform a bitwise right shift on the value stored in register `r1` by the value stored in register `r2`. Store the result in register `r1`.

Special intructions

Instruction	Description
`intr`	Trigger the interrupt with the interrupt code speficied in the `int` register.
`exit`	Exit the program with the exit code stored in the `exit` register.

Interrupts

Interrupts are defined in the archlib.asm library file.

Instruction	Description
`PRINT_SIGNED`	Print the signed integer value stored in the `print` register.
`PRINT_UNSIGNED`	Print the unsigned integer value stored in the `print` register.
`PRINT_CHAR`	Print the unicode character stored in the `print` register.
`PRINT_STRING`	Print the string at the address stored in the `print` register.
`PRINT_BYTES`	Print the bytes at the address stored in the `print` register up to the length stored in the `r1` register.
`INPUT_SIGNED`	Get the next signed integer input from the console and store it in the `input` register.
`INPUT_UNSIGNED`	Get the next unsigned integer input from the console and store it in the `input` register.
`INPUT_STRING`	Get the next string input from the console and allocate it on the heap. Store its address in the `input` register and its length in the `r1` register. The returned string is not to be considered null-terminated. If the input is not a valid string, set `error` register to `INVALID_INPUT`. If the EOF is encountered, set `error` register to `END_OF_FILE`. If another error is encountered, set `error` register to `GENERIC_ERROR`. If no error is encountered, set `error` register to `NO_ERROR`.
`RANDOM`	Generate a random 8-byte number and store it in the `r1` register.
`HOST_TIME_NANOS`	Get the current host system time in nanoseconds and store it in the `r1` register.
`ELAPSED_TIME_NANOS`	Get the elapsed time since the program started in nanoseconds and store it in the `r1` register.
`DISK_READ`	Read `r3` bytes from local storage at the address specified in `r1` and store them at the address specified in `r2`. If the read fails, set the `error` register.
`DISK_WRITE`	Write `r3` bytes from the address specified in `r2` to local storage at the address specified in `r1`. If the write fails, set the `error` register.
`FLUSH_STDOUT`	Flush the stdout buffer.
... more to be implemented

Pseudo instructions

Pseudo instructions are compile-time operators that affect the generated binary.

Instruction	Description
`dn <n> <number>`	Define number. Insert an `n`-byte number in-place into the binary. Note that the only allowed sizes are 1, 2, 4, and 8.
`ds <string>`	Define string. Insert a string in-place into the binary. Note that the assembler does not automatically a null-termination byte to string literals, so it's up to the programmer to correctly terminate the string according to the program needs.
`db <byte array>`	Define bytes. Insert a byte array in-place into the binary. This is a shorthand form for `da u8 \<byte array>`. See the Array data types section.
`da <element type> <array>`	Define array. Insert an array in-place into the binary.
`offsetfrom <label>`	Offset from label. Calculate the memory offset in bytes between the current position and `label`, and insert the result in-place into the binary. Note that the resulting offset is an 8-byte unsigned integer and `label` must be defined prior to this instruction.
`printstr <string>`	Print a string literal declared in-place in the byte code. This instruction is equivalent to defining a string literal in-place and printing it. Note that the string does not need to be null-terminated. This instruction is useful for debugging purposes.

Array data types

Data type	Description
`u8`	1-byte unsigned integer
`u16`	2-bytes unsigned integer
`u32`	4-bytes unsigned integer
`u64`	8-bytes unsigned integer
`i8`	1-byte signed integer
`i16`	2-bytes signed integer
`i32`	4-bytes signed integer
`i64`	8-bytes signed integr
`f64`	8-bytes float
`[<element type> : <length>]`	Array of `element type` with `length` elements. Example: `[i32:3]` is an array of 3 `i32` values.

Errors and error codes

When the virtual machine encounters an error, it will set the error register to a specific error code. It's the programmer's responsibility to check the error register after fallible operations and handle eventual errors.
An error code is represented as a 1-byte unsigned integer.

Error Code	Description
`NO_ERROR`	No error occurred.
`END_OF_FILE`	End of file reached while reading input.
`INVALID_INPUT`	The input from the console was not a valid integer.
`ZERO_DIVISION`	A division by zero occurred.
`STACK_OVERFLOW`	The stack overflowed.
`OUT_OF_BOUNDS`	An out of bounds memory access occurred.
`UNALIGNED_ADDRESS`	An unaligned memory access occurred.
`PERMISSION_DENIED`	A permission denied error occurred.
`TIMED_OUT`	An operation timed out.
`NOT_FOUNT`	A resource was not found.
`ALREADY_EXISTS`	A resource already exists.
`INVALID_DATA`	Provided data is invalid.
`INTERRUPTED`	The process was interrupted.
`OUT_OF_MEMORY`	The VM memory is either full or not enough to perform an operation.
`WRITE_ZERO`	The last IO operation write 0 bytes.
`MODULE_UNAVAILABLE`	The specified VM module is not available in the current context.
`GENERIC_ERROR`	A generic error occurred.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

assembler

assembler

README.MD

Assembler

Introduction

Basic usage

Assembly language

$: current address

&: unique symbol

Address literals

Labels

Sections

.text section

.include section

Function-like macros

Inline macros

Assembly instructions

Arithmetical instructions

No operation instructions

Memory instructions

Flow control instructions

Comparison instructions

Logical bitwise instructions

Special intructions

Interrupts

Pseudo instructions

Array data types

Errors and error codes

Files

assembler

Directory actions

More options

Directory actions

More options

Latest commit

History

assembler

Folders and files

parent directory

README.MD

Assembler

Introduction

Basic usage

Assembly language

$: current address

&: unique symbol

Address literals

Labels

Sections

.text section

.include section

Function-like macros

Inline macros

Assembly instructions

Arithmetical instructions

No operation instructions

Memory instructions

Flow control instructions

Comparison instructions

Logical bitwise instructions

Special intructions

Interrupts

Pseudo instructions

Array data types

Errors and error codes