-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"unhalted reads" is a misfeature and needs to be reworked/removed #527
Comments
cbiffle
added a commit
to cbiffle/humility
that referenced
this issue
Dec 22, 2024
This fixes oxidecomputer#527 enough to be able to use the STM32U575. I am still testing to see whether I can also get consistent reads on the L4 now.
cbiffle
added a commit
to cbiffle/humility
that referenced
this issue
Jan 25, 2025
This fixes oxidecomputer#527 enough to be able to use the STM32U575. I am still testing to see whether I can also get consistent reads on the L4 now.
cbiffle
added a commit
to cbiffle/humility
that referenced
this issue
Jan 25, 2025
I've been tracking down reasons why Humility sometimes gives an inconsistent snapshot of system state in things like the `tasks` output, and I've found one of the culprits. There's this `halt_and_read` function that we use when we want a consistent snapshot. Uses of this function often carry comments like "We read the entire [thing being read] at a go to get as consistent a snapshot as possible." Unfortunately, despite its name, `halt_and_read` doesn't necessarily halt, and this decision is not in control of the caller or obvious to readers of the calling code. The decision to halt is a little complex and depends on... - The target triple -- after discovering an M0 that doesn't support unhalted reads to flash, we appear to have opted out the entire thumbv6m architecture from this meddling. - The exact dynamic call path to reach this point and whether anyone has called halt(), thereby incrementing the `halted` variable (the halt() call _does_ in fact halt; the `op_start` call is similarly shaped but only _maybe_ halts) This behavior has been responsible for a number of bugs I keep hitting when trying to debug _other_ bugs, which is the worst time to be hitting such bugs. This includes - Intermittent failures in archive ID matching on the STM32U5 series, where debugger flash reads are not supported while the CPU is executing (as on the STM32G0, but unlike the STM32H7). - Task status output that is inconsistent, e.g. no tasks showing as runnable or multiple tasks ready but only the idle task actually running. Both of these have had me convinced that we had scheduler bugs; so far it does not appear that we have scheduler bugs, it's just the debugger misrepresenting things. Aside: remember that whether debugger accesses to buses are allowed to race the CPU is a _vendor dependent decision_ that varies from SoC to SoC. It is not consistent on all ARMv6/7/8-M parts; it is not consistent on all parts from ST; it is not consistent on all parts built around the Cortex-M33; it is not consistent across flash, RAM, and peripherals in a single part. Because commercial tools generally don't _do_ this, vendors don't document the specific behavior, and you get to discover it by things failing. In short, doing unhalted reads of anything is asking for flake and should only be done in very specific circumstances. This commit takes a somewhat heavy-handed approach and simply removes the secret unhalted reads behind `halt_and_read`. There may be situations where we legitimately want unhalted reads -- humility dashboard comes to mind -- and IMO these commands should be using an explicit `read_without_halting` operation that is defined as racing, and therefore as "maybe doesn't work on your chip." I have not done that here. Fixes oxidecomputer#527.
cbiffle
added a commit
to cbiffle/humility
that referenced
this issue
Jan 25, 2025
I've been tracking down reasons why Humility sometimes gives an inconsistent snapshot of system state in things like the `tasks` output, and I've found one of the culprits. There's this `halt_and_read` function that we use when we want a consistent snapshot. Uses of this function often carry comments like "We read the entire [thing being read] at a go to get as consistent a snapshot as possible." Unfortunately, despite its name, `halt_and_read` doesn't necessarily halt, and this decision is not in control of the caller or obvious to readers of the calling code. The decision to halt is a little complex and depends on... - The target triple -- after discovering an M0 that doesn't support unhalted reads to flash, we appear to have opted out the entire thumbv6m architecture from this meddling. - The exact dynamic call path to reach this point and whether anyone has called halt(), thereby incrementing the `halted` variable (the halt() call _does_ in fact halt; the `op_start` call is similarly shaped but only _maybe_ halts) This behavior has been responsible for a number of bugs I keep hitting when trying to debug _other_ bugs, which is the worst time to be hitting such bugs. This includes - Intermittent failures in archive ID matching on the STM32U5 series, where debugger flash reads are not supported while the CPU is executing (as on the STM32G0, but unlike the STM32H7). - Task status output that is inconsistent, e.g. no tasks showing as runnable or multiple tasks ready but only the idle task actually running. Both of these have had me convinced that we had scheduler bugs; so far it does not appear that we have scheduler bugs, it's just the debugger misrepresenting things. Aside: remember that whether debugger accesses to buses are allowed to race the CPU is a _vendor dependent decision_ that varies from SoC to SoC. It is not consistent on all ARMv6/7/8-M parts; it is not consistent on all parts from ST; it is not consistent on all parts built around the Cortex-M33; it is not consistent across flash, RAM, and peripherals in a single part. Because commercial tools generally don't _do_ this, vendors don't document the specific behavior, and you get to discover it by things failing. In short, doing unhalted reads of anything is asking for flake and should only be done in very specific circumstances. This commit takes a somewhat heavy-handed approach and simply removes the secret unhalted reads behind `halt_and_read`. There may be situations where we legitimately want unhalted reads -- humility dashboard comes to mind -- and IMO these commands should be using an explicit `read_without_halting` operation that is defined as racing, and therefore as "maybe doesn't work on your chip." I have not done that here. Fixes oxidecomputer#527.
cbiffle
added a commit
to cbiffle/humility
that referenced
this issue
Jan 25, 2025
I've been tracking down reasons why Humility sometimes gives an inconsistent snapshot of system state in things like the `tasks` output, and I've found one of the culprits. There's this `halt_and_read` function that we use when we want a consistent snapshot. Uses of this function often carry comments like "We read the entire [thing being read] at a go to get as consistent a snapshot as possible." Unfortunately, despite its name, `halt_and_read` doesn't necessarily halt, and this decision is not in control of the caller or obvious to readers of the calling code. The decision to halt is a little complex and depends on... - The target triple -- after discovering an M0 that doesn't support unhalted reads to flash, we appear to have opted out the entire thumbv6m architecture from this meddling. - The exact dynamic call path to reach this point and whether anyone has called halt(), thereby incrementing the `halted` variable (the halt() call _does_ in fact halt; the `op_start` call is similarly shaped but only _maybe_ halts) This behavior has been responsible for a number of bugs I keep hitting when trying to debug _other_ bugs, which is the worst time to be hitting such bugs. This includes - Intermittent failures in archive ID matching on the STM32U5 series, where debugger flash reads are not supported while the CPU is executing (as on the STM32G0, but unlike the STM32H7). - Task status output that is inconsistent, e.g. no tasks showing as runnable or multiple tasks ready but only the idle task actually running. Both of these have had me convinced that we had scheduler bugs; so far it does not appear that we have scheduler bugs, it's just the debugger misrepresenting things. Aside: remember that whether debugger accesses to buses are allowed to race the CPU is a _vendor dependent decision_ that varies from SoC to SoC. It is not consistent on all ARMv6/7/8-M parts; it is not consistent on all parts from ST; it is not consistent on all parts built around the Cortex-M33; it is not consistent across flash, RAM, and peripherals in a single part. Because commercial tools generally don't _do_ this, vendors don't document the specific behavior, and you get to discover it by things failing. In short, doing unhalted reads of anything is asking for flake and should only be done in very specific circumstances. This commit takes a somewhat heavy-handed approach and simply removes the secret unhalted reads behind `halt_and_read`. There may be situations where we legitimately want unhalted reads -- humility dashboard comes to mind -- and IMO these commands should be using an explicit `read_without_halting` operation that is defined as racing, and therefore as "maybe doesn't work on your chip." I have not done that here. Fixes oxidecomputer#527.
cbiffle
added a commit
to cbiffle/humility
that referenced
this issue
Jan 25, 2025
I've been tracking down reasons why Humility sometimes gives an inconsistent snapshot of system state in things like the `tasks` output, and I've found one of the culprits. There's this `halt_and_read` function that we use when we want a consistent snapshot. Uses of this function often carry comments like "We read the entire [thing being read] at a go to get as consistent a snapshot as possible." Unfortunately, despite its name, `halt_and_read` doesn't necessarily halt, and this decision is not in control of the caller or obvious to readers of the calling code. The decision to halt is a little complex and depends on... - The target triple -- after discovering an M0 that doesn't support unhalted reads to flash, we appear to have opted out the entire thumbv6m architecture from this meddling. - The exact dynamic call path to reach this point and whether anyone has called halt(), thereby incrementing the `halted` variable (the halt() call _does_ in fact halt; the `op_start` call is similarly shaped but only _maybe_ halts) This behavior has been responsible for a number of bugs I keep hitting when trying to debug _other_ bugs, which is the worst time to be hitting such bugs. This includes - Intermittent failures in archive ID matching on the STM32U5 series, where debugger flash reads are not supported while the CPU is executing (as on the STM32G0, but unlike the STM32H7). - Task status output that is inconsistent, e.g. no tasks showing as runnable or multiple tasks ready but only the idle task actually running. Both of these have had me convinced that we had scheduler bugs; so far it does not appear that we have scheduler bugs, it's just the debugger misrepresenting things. Aside: remember that whether debugger accesses to buses are allowed to race the CPU is a _vendor dependent decision_ that varies from SoC to SoC. It is not consistent on all ARMv6/7/8-M parts; it is not consistent on all parts from ST; it is not consistent on all parts built around the Cortex-M33; it is not consistent across flash, RAM, and peripherals in a single part. Because commercial tools generally don't _do_ this, vendors don't document the specific behavior, and you get to discover it by things failing. In short, doing unhalted reads of anything is asking for flake and should only be done in very specific circumstances. This commit takes a somewhat heavy-handed approach and simply removes the secret unhalted reads behind `halt_and_read`. There may be situations where we legitimately want unhalted reads -- humility dashboard comes to mind -- and IMO these commands should be using an explicit `read_without_halting` operation that is defined as racing, and therefore as "maybe doesn't work on your chip." I have not done that here. Fixes oxidecomputer#527.
cbiffle
added a commit
to cbiffle/humility
that referenced
this issue
Jan 27, 2025
I've been tracking down reasons why Humility sometimes gives an inconsistent snapshot of system state in things like the `tasks` output, and I've found one of the culprits. There's this `halt_and_read` function that we use when we want a consistent snapshot. Uses of this function often carry comments like "We read the entire [thing being read] at a go to get as consistent a snapshot as possible." Unfortunately, despite its name, `halt_and_read` doesn't necessarily halt, and this decision is not in control of the caller or obvious to readers of the calling code. The decision to halt is a little complex and depends on... - The target triple -- after discovering an M0 that doesn't support unhalted reads to flash, we appear to have opted out the entire thumbv6m architecture from this meddling. - The exact dynamic call path to reach this point and whether anyone has called halt(), thereby incrementing the `halted` variable (the halt() call _does_ in fact halt; the `op_start` call is similarly shaped but only _maybe_ halts) This behavior has been responsible for a number of bugs I keep hitting when trying to debug _other_ bugs, which is the worst time to be hitting such bugs. This includes - Intermittent failures in archive ID matching on the STM32U5 series, where debugger flash reads are not supported while the CPU is executing (as on the STM32G0, but unlike the STM32H7). - Task status output that is inconsistent, e.g. no tasks showing as runnable or multiple tasks ready but only the idle task actually running. Both of these have had me convinced that we had scheduler bugs; so far it does not appear that we have scheduler bugs, it's just the debugger misrepresenting things. Aside: remember that whether debugger accesses to buses are allowed to race the CPU is a _vendor dependent decision_ that varies from SoC to SoC. It is not consistent on all ARMv6/7/8-M parts; it is not consistent on all parts from ST; it is not consistent on all parts built around the Cortex-M33; it is not consistent across flash, RAM, and peripherals in a single part. Because commercial tools generally don't _do_ this, vendors don't document the specific behavior, and you get to discover it by things failing. In short, doing unhalted reads of anything is asking for flake and should only be done in very specific circumstances. This commit takes a somewhat heavy-handed approach and simply removes the secret unhalted reads behind `halt_and_read`. There may be situations where we legitimately want unhalted reads -- humility dashboard comes to mind -- and IMO these commands should be using an explicit `read_without_halting` operation that is defined as racing, and therefore as "maybe doesn't work on your chip." I have not done that here. Fixes oxidecomputer#527.
cbiffle
added a commit
to cbiffle/humility
that referenced
this issue
Jan 27, 2025
I've been tracking down reasons why Humility sometimes gives an inconsistent snapshot of system state in things like the `tasks` output, and I've found one of the culprits. There's this `halt_and_read` function that we use when we want a consistent snapshot. Uses of this function often carry comments like "We read the entire [thing being read] at a go to get as consistent a snapshot as possible." Unfortunately, despite its name, `halt_and_read` doesn't necessarily halt, and this decision is not in control of the caller or obvious to readers of the calling code. The decision to halt is a little complex and depends on... - The target triple -- after discovering an M0 that doesn't support unhalted reads to flash, we appear to have opted out the entire thumbv6m architecture from this meddling. - The exact dynamic call path to reach this point and whether anyone has called halt(), thereby incrementing the `halted` variable (the halt() call _does_ in fact halt; the `op_start` call is similarly shaped but only _maybe_ halts) This behavior has been responsible for a number of bugs I keep hitting when trying to debug _other_ bugs, which is the worst time to be hitting such bugs. This includes - Intermittent failures in archive ID matching on the STM32U5 series, where debugger flash reads are not supported while the CPU is executing (as on the STM32G0, but unlike the STM32H7). - Task status output that is inconsistent, e.g. no tasks showing as runnable or multiple tasks ready but only the idle task actually running. Both of these have had me convinced that we had scheduler bugs; so far it does not appear that we have scheduler bugs, it's just the debugger misrepresenting things. Aside: remember that whether debugger accesses to buses are allowed to race the CPU is a _vendor dependent decision_ that varies from SoC to SoC. It is not consistent on all ARMv6/7/8-M parts; it is not consistent on all parts from ST; it is not consistent on all parts built around the Cortex-M33; it is not consistent across flash, RAM, and peripherals in a single part. Because commercial tools generally don't _do_ this, vendors don't document the specific behavior, and you get to discover it by things failing. In short, doing unhalted reads of anything is asking for flake and should only be done in very specific circumstances. This commit takes a somewhat heavy-handed approach and simply removes the secret unhalted reads behind `halt_and_read`. There may be situations where we legitimately want unhalted reads -- humility dashboard comes to mind -- and IMO these commands should be using an explicit `read_without_halting` operation that is defined as racing, and therefore as "maybe doesn't work on your chip." I have not done that here. Fixes oxidecomputer#527.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I spent the morning trying to bring up Hubris on the STM32U575 for funsies. The port was easy enough, but I kept having issues with Humility reading the archive ID. Or sometimes the tasks table. Or sometimes something else.
I noticed that it was complaining that the archive ID read as zeros, which is suspicious. By going in behind its back with openocd and forcing a halt, I was able to get it to read the archive ID correctly. I verified through openocd that reading flash while the processor is running returns zeros.
Why would Humility be reading flash while the processor is running? Haven't I filed a dozen bugs over the years about inconsistent reads racing the processor?
I dug in and found this:
humility/humility-core/src/hubris.rs
Lines 3682 to 3688 in c0b8876
In other words, the core is silently turning
halt_and_read
intojust_read_racing_the_processor
on anything that isn't a v6m part. This is presumably because the first place halting reads loudly broke was on STM32G0, but they're subtly broken on all of our processors. Whether an unhalted read to an address range might succeed depends on several factors:This feature needs to be rethought or, more likely, removed. I have disabled it locally and have the STM32U5 working on my fork. I suspect this exists exclusively to allow
humility dashboard
to appear to run unintrusively, but IMO ifhumility dashboard
really wants inconsistent reads, it should be using an inconsistent read operation and spare all the other commands.I suspect that this feature is responsible for #526 but I haven't proven it yet.
The text was updated successfully, but these errors were encountered: