GH-127953: Make line number lookup O(1) regardless of the size of the code object #128350

markshannon · 2024-12-30T16:12:25Z

Changes the line data array from 2 bytes per instruction to 2-5 bytes per instruction, depending on how many lines the code object spans. The line delta grows from 1 byte to 1-4 bytes as needed.

Most code objects span fewer than 250 lines and still use 1 byte of line delta per instruction.
Code objects up to ~65k lines need 2 bytes of line delta per instruction.
3 bytes are needed for code objects up to ~16M lines long.
4 bytes are needed for code objects up to the maximum of ~1 billion lines.
I doubt 4 bytes will ever be needed, but they are supported.
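
The size tiers above can be sketched roughly as follows. This is an illustration, not the code in this PR: the helper names `delta_bytes_needed` and `entry_size` are hypothetical, and the number of reserved sentinel values is passed in as a parameter because its exact value is an assumption here.

```c
#include <stdint.h>

/* Hypothetical helper (not from the PR): number of bytes needed to store
 * the largest line delta of a code object. `reserved` is the count of
 * small sentinel values set aside for markers such as "no line"; its
 * real value is an assumption in this sketch. */
static int
delta_bytes_needed(uint32_t max_delta, uint32_t reserved)
{
    uint64_t largest = (uint64_t)max_delta + reserved;
    if (largest < (1ULL << 8)) {
        return 1;   /* typical code objects, a couple of hundred lines */
    }
    if (largest < (1ULL << 16)) {
        return 2;   /* up to ~65k lines */
    }
    if (largest < (1ULL << 24)) {
        return 3;   /* up to ~16M lines */
    }
    return 4;       /* anything larger */
}

/* Each entry also carries the original opcode, hence 2-5 bytes in total. */
static int
entry_size(uint32_t max_delta, uint32_t reserved)
{
    return 1 + delta_bytes_needed(max_delta, reserved);
}
```

Under these assumptions, a 200-line function gets 2-byte entries, while a generated 1M-line module gets 4-byte entries.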

This PR also fixes the INSTRUMENT_DEBUG mode, which was broken in 2e95c5b.

The time taken to execute the example given in #127953 (comment) is about the same as on 3.10 or 3.11. It may be a bit faster, but I didn't benchmark it rigorously.

With the script modified to generate a file of 100k lines, this PR is a bit faster than 3.11.
At 1M lines, this PR is a bit slower than 3.11.

Issue: sys.settrace suffers quadratic behavior for large dictionary literals on 3.12+ #127953

markshannon · 2024-12-30T22:08:29Z

Benchmarking shows no significant change in performance. The coverage benchmark shows a small speedup, but it is probably noise.

iritkatriel · 2025-01-03T12:37:29Z

Python/instrumentation.c

-    return COMPUTED_LINE;
+    int delta = line - code->co_firstlineno;
+    assert(delta > NO_LINE);
+    return delta;


Looks like line_delta is actually an offset from the start now.
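
To illustrate the observation, here is a minimal sketch of a delta stored as an offset from the code object's first line. The function names are hypothetical, and the value given to `NO_LINE` is an assumption made only so the sketch compiles; the diff above shows the name but not its definition.

```c
#include <assert.h>

#define NO_LINE (-2)  /* assumed sentinel value, for illustration only */

/* Store a line as its offset from the first line of the code object. */
static int
line_to_delta(int line, int first_line)
{
    int delta = line - first_line;
    assert(delta > NO_LINE);  /* mirrors the assert in the diff */
    return delta;
}

/* Recovering the line needs only the first line number, with no scan
 * over earlier entries. */
static int
delta_to_line(int delta, int first_line)
{
    return first_line + delta;
}
```

Because each entry is relative to a fixed base rather than to the previous entry, a lookup does not depend on how many entries precede it.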

iritkatriel · 2025-01-03T12:41:07Z

Python/instrumentation.c

+get_line_delta(_PyCoLineInstrumentationData *line_data, int index)
+{
+    uint8_t *ptr = &line_data->data[index*line_data->bytes_per_entry+1];
+    uint32_t value = *ptr;


Could assert here that bytes_per_entry is within range.

We can't easily assert anything here, as we don't know the max line number without traversing the entire locations table.
I've added an assert to set_line_delta instead.

I meant that bytes_per_entry is between 1 and 4.

(bytes_per_entry is between 2 and 5 because it includes the original opcode)

gaogaotiantian · 2025-01-03T13:59:20Z

I’m taking a vacation now and I’ll take a look once I’m back!

gaogaotiantian · 2025-01-16T19:18:34Z

Include/cpython/code.h

@@ -38,8 +38,8 @@ typedef struct {
   Line instrumentation creates an array of
   these. One entry per code unit.*/


Does "One entry per code unit" still hold for the new struct?

gaogaotiantian · 2025-01-16T19:29:29Z

Python/instrumentation.c

+    assert(line_delta >= NO_LINE);
+    uint32_t adjusted = line_delta - NO_LINE;
+    uint8_t *ptr = &line_data->data[index*line_data->bytes_per_entry+1];
+    assert(adjusted < (1ULL << (line_data->bytes_per_entry*8)));


This assertion should subtract the opcode byte, right? assert(adjusted < (1ULL << ((line_data->bytes_per_entry - 1)*8)))
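
A small sketch of why the two bounds differ, assuming (as stated elsewhere in this thread) that bytes_per_entry counts the opcode byte plus 1-4 delta bytes. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bound as written in the diff: counts the opcode byte as payload. */
static bool
fits_original(uint32_t adjusted, int bytes_per_entry)
{
    return adjusted < (1ULL << (bytes_per_entry * 8));
}

/* Bound as suggested in the review: only the delta bytes are payload. */
static bool
fits_suggested(uint32_t adjusted, int bytes_per_entry)
{
    return adjusted < (1ULL << ((bytes_per_entry - 1) * 8));
}
```

With 2-byte entries there is a single delta byte, so a value of 300 cannot be stored. The original bound accepts it anyway, while the suggested bound rejects it.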

gaogaotiantian · 2025-01-16T19:34:25Z

Python/instrumentation.c

+        if (line_data->bytes_per_entry > 3) {
+            if (line_data->bytes_per_entry > 4) {
+                assert(line_data->bytes_per_entry == 5);
+                *ptr = adjusted >> 24;


I think we should use little-endian byte order here so we can put this in a loop:

for (int idx = 1; idx < line_data->bytes_per_entry; idx++) { *ptr = adjusted & 0xff; ptr++; adjusted >>= 8; }

Same thing for decoding.
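
A self-contained sketch of what the little-endian encode and decode loops could look like. It operates on a single entry buffer rather than on the real line_data struct, and the function names `store_delta_le` and `load_delta_le` are hypothetical.

```c
#include <stdint.h>

/* Write `adjusted` into the delta bytes of one entry, least significant
 * byte first. entry[0] is the original opcode and is left untouched. */
static void
store_delta_le(uint8_t *entry, int bytes_per_entry, uint32_t adjusted)
{
    uint8_t *ptr = entry + 1;
    for (int idx = 1; idx < bytes_per_entry; idx++) {
        *ptr++ = adjusted & 0xff;
        adjusted >>= 8;
    }
}

/* Read the delta bytes back, starting from the most significant byte. */
static uint32_t
load_delta_le(const uint8_t *entry, int bytes_per_entry)
{
    uint32_t value = 0;
    for (int idx = bytes_per_entry - 1; idx >= 1; idx--) {
        value = (value << 8) | entry[idx];
    }
    return value;
}
```

The same two loops cover every entry width from 2 to 5 bytes, which is what removes the nested if chain.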

gaogaotiantian · 2025-01-16T19:39:32Z

Python/instrumentation.c

+            else {
+                bytes_per_entry = 5;
+            }
+            code->_co_monitoring->lines = PyMem_Malloc(1 + code_len *bytes_per_entry);


Suggested change

code->_co_monitoring->lines = PyMem_Malloc(1 + code_len *bytes_per_entry);

code->_co_monitoring->lines = PyMem_Malloc(1 + code_len * bytes_per_entry);

Make line number lookup O(1) regardless of the size of the code object

751143f

bedevere-app bot mentioned this pull request Dec 30, 2024

sys.settrace suffers quadratic behavior for large dictionary literals on 3.12+ #127953

Open

bedevere-app bot added the awaiting core review label Dec 30, 2024

markshannon added the "needs backport to 3.12" (bug and security fixes) and "needs backport to 3.13" (bugs and security fixes) labels Dec 30, 2024

Use co_monitoring line info in PyCode_Addr2Line if it is available

262abf1

markshannon requested a review from gaogaotiantian December 30, 2024 22:08

iritkatriel reviewed Jan 3, 2025


markshannon added 2 commits January 3, 2025 14:10

Address review comments

9ea88c0

Assert that bytes_per_entry is in range(2,6)

c816f7e

gaogaotiantian reviewed Jan 16, 2025

