WIP JIT based on #346 #350

Draft
wants to merge 51 commits into
base: master
51 commits
208f110
Change TEST_CI to TEST_VERBOSE_IMAGES and invert semantic
Baekalfen Sep 23, 2024
9c8a7ee
Try to quit debug window better
Baekalfen Sep 20, 2024
9407c38
Remove redundant definitions in sound.pxd
Baekalfen Sep 16, 2024
6e6e0d8
Restructure CPU interrupt handling
Baekalfen Sep 17, 2024
2d06208
Simplify CGB bank read in MB getitem
Baekalfen Sep 16, 2024
de50027
Fix blargg tests for CPU without is_stuck
Baekalfen Jul 29, 2024
f1faa97
Add places to bail in MB and opcodes but not CPU
Baekalfen Sep 29, 2024
b32e627
Implement cycles target for CPU with support to bail
Baekalfen Sep 29, 2024
4b6a053
Refactor LCD and Timer cycles_to_interrupts
Baekalfen Sep 29, 2024
cecef34
Implement LCD cycles to frame
Baekalfen Sep 29, 2024
4310f98
Only pre-check interrupts and halt in CPU tick
Baekalfen Sep 17, 2024
5f7fcaf
Refactor sound ticking to have a tick method
Baekalfen Sep 16, 2024
0c15c68
Fix CPU clock cycles for CB 46, CB 4E, CB 56, CB 5E, CB 66, CB 6E, CB…
Baekalfen Sep 25, 2024
118b715
Fix memory timings, supporting sub-opcode timing
Baekalfen Sep 25, 2024
b95c2cb
Fix import order in opcodes_gen.py
Baekalfen Sep 25, 2024
3292f7a
Defer post-tick and reduce time keeping on tick
Baekalfen Sep 29, 2024
0724cb0
Update whichboot pytest
Baekalfen Sep 14, 2024
5e8cf7e
Saving SameSuite and Blargg results for sound, although they are stil…
Baekalfen Sep 26, 2024
3451731
Saving Blargg interrupt time results as they depend on sound, and are…
Baekalfen Sep 26, 2024
6459839
Saving Pokemon Pinball test even though it broke because of timing ch…
Baekalfen Sep 26, 2024
a6a895a
Saving Tetris example as timings (and randomness) have changed
Baekalfen Sep 26, 2024
779aaae
Ignore SDL2 warning in pytest
Baekalfen Mar 25, 2024
1a638ef
wip jit
Baekalfen Mar 20, 2024
363f7c2
wip working compile
Baekalfen Mar 21, 2024
b29ee4b
jit default_rom working
Baekalfen Mar 22, 2024
c11cc49
CB support
Baekalfen Mar 22, 2024
9efdd96
Clearing bootrom
Baekalfen Mar 22, 2024
0da9f93
Much better, but not quite there
Baekalfen Mar 25, 2024
8ef82cf
Added bail-early for setitem, but makes little difference
Baekalfen Mar 25, 2024
1dc8c71
Working well now (all pytests cleared 2:32:00)
Baekalfen Mar 27, 2024
425bc67
Functionally sound with dlopen
Baekalfen Mar 29, 2024
e952aa5
No GIL working
Baekalfen Mar 31, 2024
aea78c1
Attempts at general improvements
Baekalfen Apr 1, 2024
773c097
Working hybrid PyPy/Cython JIT. Flushing on getitem, minimized bounda…
Baekalfen Apr 2, 2024
d00da42
minimum 4 cycles
Baekalfen Sep 17, 2024
ba2d0b7
JIT: Change streamline all to cpu.cycles and convert LCD to cpu.cycles
Baekalfen Sep 18, 2024
133b78c
Correct EXT_SUFFIX to be dynamic
Baekalfen Sep 18, 2024
921d003
JIT general clean-up
Baekalfen Sep 19, 2024
790e7a4
WIP: make JIT block aware of cycles_target
Baekalfen Sep 21, 2024
3a9a1cb
wip note about jit blocks
Baekalfen Sep 23, 2024
c37c940
wip jit_analyze account for cycles_target and master interrupt enable
Baekalfen Sep 26, 2024
d7cbb72
Cleanup and first attempt at collecting jit blocks for analysis
Baekalfen Sep 26, 2024
3db44f8
fixup
Baekalfen Sep 26, 2024
9512000
Bulk compilation working, no loading on start
Baekalfen Sep 27, 2024
0293fe5
Implement init loading of already JITed blocks
Baekalfen Sep 28, 2024
b074f86
Move JIT into its own file and run in a thread
Baekalfen Sep 28, 2024
9ccc8b9
fixup! Move JIT into its own file and run in a thread
Baekalfen Sep 30, 2024
b29ec55
Correcting boundary instructions. Several were missing
Baekalfen Oct 1, 2024
e12bf55
First steps in transforming JIT code. While loops.
Baekalfen Oct 6, 2024
d50193d
Remove bootrom clearing JIT
Baekalfen Oct 6, 2024
0247bec
setup.py enable wraparound
Baekalfen Oct 6, 2024
4 changes: 2 additions & 2 deletions .github/workflows/pr-test.yml
@@ -46,7 +46,7 @@ jobs:
- name: Run PyTest
env:
PYTEST_SECRETS_KEY: ${{ secrets.PYTEST_SECRETS_KEY }}
-TEST_CI: 1
+TEST_VERBOSE_IMAGES: 0
TEST_NO_UI: 1
run: |
python -m pytest tests/ -n auto -v
@@ -111,7 +111,7 @@ jobs:
- name: Run PyTest
env:
PYTEST_SECRETS_KEY: ${{ secrets.PYTEST_SECRETS_KEY }}
-TEST_CI: 1
+TEST_VERBOSE_IMAGES: 0
TEST_NO_UI: 1
run: |
pypy3 -m pytest tests/ -n auto -v
6 changes: 5 additions & 1 deletion Makefile
@@ -33,7 +33,11 @@ build:
cd ${ROOT_DIR}/extras/bootrom && $(MAKE)
CFLAGS=$(CFLAGS) ${PY} setup.py build_ext -j $(shell getconf _NPROCESSORS_ONLN) --inplace

-clean:
+
+clean_jit:
+find . -type f -name "*jit_*" -delete
+
+clean: clean_jit
@echo "Cleaning..."
cd ${ROOT_DIR}/extras/default_rom && $(MAKE) clean
cd ${ROOT_DIR}/extras/bootrom && $(MAKE) clean
2 changes: 1 addition & 1 deletion extras/examples/gamewrapper_tetris.py
@@ -29,7 +29,7 @@
tetris.start_game(timer_div=0x00) # The timer_div works like a random seed in Tetris

tetromino_at_0x00 = tetris.next_tetromino()
-assert tetromino_at_0x00 == "Z", tetris.next_tetromino()
+assert tetromino_at_0x00 == "O", tetris.next_tetromino()
assert tetris.score == 0
assert tetris.level == 0
assert tetris.lines == 0
3 changes: 2 additions & 1 deletion pyboy/__main__.py
@@ -87,6 +87,7 @@ def valid_file_path(path):
parser.add_argument("-s", "--scale", default=defaults["scale"], type=int, help="The scaling multiplier for the window")
parser.add_argument("--sound", action="store_true", help="Enable sound (beta)")
parser.add_argument("--no-renderer", action="store_true", help="Disable rendering (internal use)")
+parser.add_argument("--jit", action="store_true", help="Enable JIT compile (beta)")
parser.add_argument(
"--gameshark",
type=str,
@@ -169,7 +170,7 @@ def main():
pyboy.load_state(f)

render = not argv.no_renderer
-while pyboy._tick(render):
+while pyboy.tick():
pass

pyboy.stop()
2 changes: 1 addition & 1 deletion pyboy/core/cartridge/base_mbc.pxd
@@ -36,7 +36,7 @@ cdef class BaseMBC:
cdef void save_ram(self, IntIOInterface) noexcept
cdef void load_ram(self, IntIOInterface) noexcept
cdef void init_rambanks(self, uint8_t) noexcept
-cdef str getgamename(self, uint8_t[:,:]) noexcept
+cdef str getgamename(self, uint8_t[:,:])

cdef uint8_t getitem(self, uint16_t) noexcept nogil
cdef void setitem(self, uint16_t, uint8_t) noexcept nogil
12 changes: 8 additions & 4 deletions pyboy/core/cpu.pxd
@@ -4,7 +4,7 @@
#


-from libc.stdint cimport int16_t, uint8_t, uint16_t, uint32_t, uint64_t
+from libc.stdint cimport int16_t, int64_t, uint8_t, uint16_t, int64_t

cimport pyboy.core.mb
from pyboy.utils cimport IntIOInterface
@@ -26,17 +26,21 @@ cdef uint8_t INTR_VBLANK, INTR_LCDC, INTR_TIMER, INTR_SERIAL, INTR_HIGHTOLOW

cdef class CPU:
cdef bint is_stuck
-cdef bint interrupt_master_enable, interrupt_queued, halted, stopped
+cdef bint interrupt_master_enable, interrupt_queued, halted, stopped, bail
+cdef bint jit_jump

cdef uint8_t interrupts_flag, interrupts_enabled, interrupts_flag_register, interrupts_enabled_register

+cdef int64_t cycles

cdef inline int check_interrupts(self) noexcept nogil
cdef void set_interruptflag(self, int) noexcept nogil
cdef bint handle_interrupt(self, uint8_t, uint16_t) noexcept nogil

@cython.locals(opcode=uint16_t)
cdef inline uint8_t fetch_and_execute(self) noexcept nogil
-cdef int tick(self) noexcept nogil
+@cython.locals(_cycles0=int64_t)
+cdef int tick(self, int64_t) noexcept nogil
cdef void save_state(self, IntIOInterface) noexcept
cdef void load_state(self, IntIOInterface, int) noexcept

@@ -50,4 +54,4 @@ cdef class CPU:

cdef pyboy.core.mb.Motherboard mb

-cdef dump_state(self, str) noexcept with gil
+cdef void dump_state(CPU) noexcept with gil
128 changes: 65 additions & 63 deletions pyboy/core/cpu.py
@@ -5,6 +5,8 @@

import array

+import cython

from pyboy import utils

from . import opcodes
@@ -39,6 +41,9 @@ def __init__(self, mb):
self.halted = False
self.stopped = False
self.is_stuck = False
+self.cycles = 0
+
+self.jit_jump = False

def save_state(self, f):
for n in [self.A, self.F, self.B, self.C, self.D, self.E]:
@@ -53,6 +58,7 @@ def save_state(self, f):
f.write(self.interrupts_enabled_register)
f.write(self.interrupt_queued)
f.write(self.interrupts_flag_register)
+f.write_64bit(self.cycles)

def load_state(self, f, state_version):
self.A, self.F, self.B, self.C, self.D, self.E = [f.read() for _ in range(6)]
@@ -69,45 +75,48 @@ def load_state(self, f, state_version):
if state_version >= 8:
self.interrupt_queued = f.read()
self.interrupts_flag_register = f.read()
-logger.debug("State loaded: %s", self.dump_state(""))
+if state_version >= 12:
+self.cycles = f.read_64bit()
+# logger.debug("State loaded: %s", self.dump_state(""))

-def dump_state(self, sym_label):
-opcode_data = [
-self.mb.getitem(self.mb.cpu.PC + n) for n in range(3)
-] # Max 3 length, then we don't need to backtrack
+def dump_state(self):
+sym_label = ""
+opcode_data = [self.mb.getitem(self.PC + n) for n in range(3)] # Max 3 length, then we don't need to backtrack

opcode = opcode_data[0]
-opcode_length = opcodes.OPCODE_LENGTHS[opcode]
+opcode_length = opcodes.get_length(opcode)
opcode_str = f"Opcode: [{opcodes.CPU_COMMANDS[opcode]}]"
if opcode == 0xCB:
opcode_str += f" {opcodes.CPU_COMMANDS[opcode_data[1]+0x100]}"
else:
opcode_str += " " + " ".join(f"{d:02X}" for d in opcode_data[1:opcode_length])

-return (
-"\n"
-f"A: {self.mb.cpu.A:02X}, F: {self.mb.cpu.F:02X}, B: {self.mb.cpu.B:02X}, "
-f"C: {self.mb.cpu.C:02X}, D: {self.mb.cpu.D:02X}, E: {self.mb.cpu.E:02X}, "
-f"HL: {self.mb.cpu.HL:04X}, SP: {self.mb.cpu.SP:04X}, PC: {self.mb.cpu.PC:04X} ({sym_label})\n"
-f"{opcode_str} "
-f"Interrupts - IME: {self.mb.cpu.interrupt_master_enable}, "
-f"IE: {self.mb.cpu.interrupts_enabled_register:08b}, "
-f"IF: {self.mb.cpu.interrupts_flag_register:08b}\n"
-f"LCD Intr.: {self.mb.lcd.cycles_to_interrupt()}, LY:{self.mb.lcd.LY}, LYC:{self.mb.lcd.LYC}\n"
-f"Timer Intr.: {self.mb.timer.cycles_to_interrupt()}\n"
-f"halted:{self.halted}, "
-f"interrupt_queued:{self.interrupt_queued}, "
-f"stopped:{self.stopped}\n"
+print(
+# "\n"
+f"A: {self.A:02X}, F: {self.F:02X}, B: {self.B:02X}, "
+f"C: {self.C:02X}, D: {self.D:02X}, E: {self.E:02X}, "
+f"HL: {self.HL:04X}, SP: {self.SP:04X}, PC: {self.PC:04X} ({sym_label})" #\n"
+# f"{opcode_str} "
+# f"Interrupts - IME: {self.interrupt_master_enable}, "
+# f"IE: {self.interrupts_enabled_register:08b}, "
+# f"IF: {self.interrupts_flag_register:08b}\n"
+# f"LCD Intr.: {self.mb.lcd.cycles_to_interrupt()}, LY:{self.mb.lcd.LY}, LYC:{self.mb.lcd.LYC}\n"
+# f"Timer Intr.: {self.mb.timer.cycles_to_interrupt()}\n"
+# f"halted:{self.halted}, "
+# f"interrupt_queued:{self.interrupt_queued}, "
+# f"stopped:{self.stopped}\n"
)

def set_interruptflag(self, flag):
self.interrupts_flag_register |= flag

-def tick(self):
+def tick(self, cycles_target):
+_cycles0 = self.cycles
+_target = _cycles0 + cycles_target

if self.check_interrupts():
+# TODO: Cycles it took to handle the interrupt?
self.halted = False
-# TODO: We return with the cycles it took to handle the interrupt
-return 0

if self.halted and self.interrupt_queued:
# GBCPUman.pdf page 20
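The `load_state` change above gates the new `cycles` field on `state_version >= 12`, so older save states still load. A minimal sketch of that pattern follows. It assumes a plain binary stream and `struct` packing; `save_cpu_state` and `load_cpu_state` are hypothetical names, and PyBoy's own `IntIOInterface.write_64bit` / `read_64bit` may encode the value differently.

```python
import io
import struct

CYCLES_STATE_VERSION = 12  # version gate taken from the diff


def save_cpu_state(f, pc, cycles):
    """Write a toy CPU state: 16-bit PC followed by a signed 64-bit cycle counter."""
    f.write(struct.pack("<H", pc))
    f.write(struct.pack("<q", cycles))  # assumption: little-endian int64


def load_cpu_state(f, state_version):
    """Read the toy state; states older than the gate carry no cycle counter."""
    (pc,) = struct.unpack("<H", f.read(2))
    cycles = 0  # default for old states, mirroring self.cycles = 0 in __init__
    if state_version >= CYCLES_STATE_VERSION:
        (cycles,) = struct.unpack("<q", f.read(8))
    return pc, cycles


# Round-trip a current-version state.
buf = io.BytesIO()
save_cpu_state(buf, 0x0150, 123456789012)
buf.seek(0)
new_state = load_cpu_state(buf, 12)

# An old state only contains the PC.
old_state = load_cpu_state(io.BytesIO(struct.pack("<H", 0x0100)), 11)
```

Reading the field only behind the version check keeps the byte offsets of every earlier field unchanged, which is why new fields are appended at the end of the state.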
@@ -117,62 +126,55 @@ def tick(self):
self.PC += 1
self.PC &= 0xFFFF
elif self.halted:
-return 4 # TODO: Number of cycles for a HALT in effect?
-
-old_pc = self.PC # If the PC doesn't change, we're likely stuck
-old_sp = self.SP # Sometimes a RET can go to the same PC, so we check the SP too.
-cycles = self.fetch_and_execute()
-if not self.halted and old_pc == self.PC and old_sp == self.SP and not self.is_stuck and not self.mb.breakpoint_singlestep:
-logger.debug("CPU is stuck: %s", self.dump_state(""))
-self.is_stuck = True
+self.cycles += cycles_target # TODO: Number of cycles for a HALT in effect?
self.interrupt_queued = False
-return cycles

+self.bail = False
+while self.cycles < _target:
+# TODO: cpu-stuck check for blargg tests?
+self.fetch_and_execute()
+if self.bail: # Possible cycles-target changes
+break

def check_interrupts(self):
if self.interrupt_queued:
# Interrupt already queued. This happens only when using a debugger.
return False

-if (self.interrupts_flag_register & 0b11111) & (self.interrupts_enabled_register & 0b11111):
-if self.handle_interrupt(INTR_VBLANK, 0x0040):
-self.interrupt_queued = True
-elif self.handle_interrupt(INTR_LCDC, 0x0048):
-self.interrupt_queued = True
-elif self.handle_interrupt(INTR_TIMER, 0x0050):
-self.interrupt_queued = True
-elif self.handle_interrupt(INTR_SERIAL, 0x0058):
-self.interrupt_queued = True
-elif self.handle_interrupt(INTR_HIGHTOLOW, 0x0060):
-self.interrupt_queued = True
-else:
-logger.error("No interrupt triggered, but it should!")
-self.interrupt_queued = False
-return True
-else:
-self.interrupt_queued = False
-return False
-
-def handle_interrupt(self, flag, addr):
-if (self.interrupts_enabled_register & flag) and (self.interrupts_flag_register & flag):
+raised_and_enabled = (self.interrupts_flag_register & 0b11111) & (self.interrupts_enabled_register & 0b11111)
+if raised_and_enabled:
# Clear interrupt flag
if self.halted:
self.PC += 1 # Escape HALT on return
self.PC &= 0xFFFF

# Handle interrupt vectors
if self.interrupt_master_enable:
-self.interrupts_flag_register ^= flag # Remove flag
-self.mb.setitem((self.SP - 1) & 0xFFFF, self.PC >> 8) # High
-self.mb.setitem((self.SP - 2) & 0xFFFF, self.PC & 0xFF) # Low
-self.SP -= 2
-self.SP &= 0xFFFF
-
-self.PC = addr
-self.interrupt_master_enable = False

+if raised_and_enabled & INTR_VBLANK:
+self.handle_interrupt(INTR_VBLANK, 0x0040)
+elif raised_and_enabled & INTR_LCDC:
+self.handle_interrupt(INTR_LCDC, 0x0048)
+elif raised_and_enabled & INTR_TIMER:
+self.handle_interrupt(INTR_TIMER, 0x0050)
+elif raised_and_enabled & INTR_SERIAL:
+self.handle_interrupt(INTR_SERIAL, 0x0058)
+elif raised_and_enabled & INTR_HIGHTOLOW:
+self.handle_interrupt(INTR_HIGHTOLOW, 0x0060)
+self.interrupt_queued = True
return True
+else:
+self.interrupt_queued = False
return False

+def handle_interrupt(self, flag, addr):
+self.interrupts_flag_register ^= flag # Remove flag
+self.mb.setitem((self.SP - 1) & 0xFFFF, self.PC >> 8) # High
+self.mb.setitem((self.SP - 2) & 0xFFFF, self.PC & 0xFF) # Low
+self.SP -= 2
+self.SP &= 0xFFFF
+
+self.PC = addr
+self.interrupt_master_enable = False

def fetch_and_execute(self):
opcode = self.mb.getitem(self.PC)
if opcode == 0xCB: # Extension code
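Two ideas in the `cpu.py` hunk above may be easier to follow in isolation: `check_interrupts` now computes `raised_and_enabled` once and dispatches on the first set bit in priority order, and `tick` runs instructions until a cycles target is reached or `bail` is set. The sketch below is a toy model, not PyBoy's code. `MiniCPU`, `pending_vector`, the fixed cost of 4 cycles per instruction, and the `bail_after` knob are all assumptions made for illustration. The vectors and priority order are taken from the diff.

```python
# Interrupt bits in IF/IE order; vectors and priority as listed in the diff.
INTR_VBLANK, INTR_LCDC, INTR_TIMER, INTR_SERIAL, INTR_HIGHTOLOW = (1 << n for n in range(5))
VECTORS = (
    (INTR_VBLANK, 0x0040),
    (INTR_LCDC, 0x0048),
    (INTR_TIMER, 0x0050),
    (INTR_SERIAL, 0x0058),
    (INTR_HIGHTOLOW, 0x0060),
)


def pending_vector(if_reg, ie_reg):
    """Return (flag, vector) for the highest-priority raised and enabled interrupt, or None."""
    raised_and_enabled = (if_reg & 0b11111) & (ie_reg & 0b11111)
    for flag, addr in VECTORS:
        if raised_and_enabled & flag:
            return flag, addr
    return None


class MiniCPU:
    """Hypothetical stand-in for the CPU: every instruction costs 4 cycles."""

    def __init__(self, bail_after=None):
        self.cycles = 0
        self.bail = False
        self.executed = 0
        self.bail_after = bail_after  # hypothetical knob: set bail after N instructions

    def fetch_and_execute(self):
        self.cycles += 4
        self.executed += 1
        if self.bail_after is not None and self.executed >= self.bail_after:
            self.bail = True  # stands in for an event that may change the cycles target

    def tick(self, cycles_target):
        target = self.cycles + cycles_target
        self.bail = False
        while self.cycles < target:
            self.fetch_and_execute()
            if self.bail:  # leave early so the caller can recompute its target
                break
        return self.cycles
```

Because the loop only checks the target between instructions, it can overshoot by up to one instruction: `MiniCPU().tick(10)` stops at 12 cycles in this model.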
68 changes: 68 additions & 0 deletions pyboy/core/jit.pxd
@@ -0,0 +1,68 @@
#
# License: See LICENSE.md file
# GitHub: https://github.com/Baekalfen/PyBoy
#

from posix cimport dlfcn

cimport cython
from libc.stdint cimport int64_t, uint8_t, uint16_t, uint32_t, uint64_t

cimport pyboy
cimport pyboy.core.cartridge.base_mbc
cimport pyboy.core.cpu
from pyboy.logging.logging cimport Logger

from . cimport opcodes


cdef Logger logger
ctypedef int(*f_type)(pyboy.core.cpu.CPU, int64_t) noexcept nogil

cdef class JIT:
    cdef pyboy.core.cpu.CPU cpu
    cdef pyboy.core.cartridge.base_mbc.BaseMBC cartridge
    cdef dict queue
    cdef bint thread_stop
    cdef object thread_queue
    cdef object thread

    cdef f_type[0xFFFFFF] array
    cdef int[0xFFFFFF] cycles

    cdef inline int load(self, str module_name, str module_path, str file_base, list block_manifest) except -1 with gil:
        # logger.debug("JIT LOAD %d", block_id)
        cdef void* handle = dlfcn.dlopen(module_path.encode(), dlfcn.RTLD_NOW | dlfcn.RTLD_GLOBAL) # RTLD_LAZY?
        if (handle == NULL):
            return -1
        dlfcn.dlerror() # Clear error

        cdef f_type execute
        for func_name, block_id, block_max_cycles in block_manifest:
            execute = <f_type> dlfcn.dlsym(handle, func_name.encode())
            if (execute == NULL):
                print(dlfcn.dlerror())

            self.array[block_id] = execute
            self.cycles[block_id] = block_max_cycles

    cdef inline int execute(self, int block_id, int64_t cycles_target) noexcept nogil:
        # logger.debug("JIT EXECUTE %d", block_id)
        return self.array[block_id](self.cpu, cycles_target)

    cdef void stop(self) noexcept with gil

    cdef uint8_t getitem_bank(self, uint8_t, uint16_t) noexcept nogil

    cdef void _jit_clear(self) noexcept with gil
    cdef tuple get_module_name(self, str) with gil
    cdef void gen_files(self, str, str, list) noexcept with gil
    cdef void compile(self, str, str, str) noexcept with gil
    cdef object emit_code(self, object, str) with gil
    # @cython.locals(block_max_cycles=int64_t)
    # cdef bint analyze(self, int, int64_t, bint) noexcept with gil
    cdef void offload(self, int, int64_t, bint) noexcept with gil
    @cython.locals(block_id=int64_t, cycles_target=int64_t, interrupt_master_enable=bint, count=int64_t)
    cdef void process(self) noexcept with gil

    cpdef void threaded_processor(JIT) noexcept with gil
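The `load` method above resolves each compiled block by name with `dlopen`/`dlsym` and stores it in a table indexed by block id. The same idea can be sketched in plain Python with `ctypes`. This is only an illustration: it assumes a POSIX system, looks up `abs` from the already-loaded C library instead of a generated JIT module, and `load_blocks` and the manifest entries are hypothetical.

```python
import ctypes


def load_blocks(lib, manifest):
    """Build {block_id: (function, max_cycles)} from (func_name, block_id, max_cycles) entries."""
    table = {}
    for func_name, block_id, max_cycles in manifest:
        try:
            func = getattr(lib, func_name)  # ctypes performs the dlsym lookup here
        except AttributeError:
            continue  # symbol not found: skip it rather than store an unusable entry
        table[block_id] = (func, max_cycles)
    return table


# POSIX only: CDLL(None) behaves like dlopen(NULL) and exposes already-loaded symbols.
libc = ctypes.CDLL(None)
table = load_blocks(libc, [("abs", 0, 4), ("no_such_symbol_in_this_sketch", 1, 8)])
```

Unlike `load` in the diff, which prints `dlerror()` and then still stores the pointer, this sketch skips symbols that fail to resolve. Whether that is the right behaviour for the JIT is a design choice for the author.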