Use reverse postorder in `non_ssa_locals` #96601

tmiasko · 2022-05-01T13:00:25Z

The reverse postorder, unlike preorder, is now cached inside the MIR
body. Code generation uses reverse postorder anyway, so it might be
a small perf improvement to use it here as well.

rust-highfive · 2022-05-01T13:00:28Z

r? @davidtwco

(rust-highfive has picked a reviewer for you, use r? to override)

tmiasko · 2022-05-01T13:01:09Z

@bors try @rust-timer queue

rust-timer · 2022-05-01T13:01:11Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-05-01T13:01:16Z

⌛ Trying commit fa41852 with merge 204cd52d1e796574f64ace5276cb3a794e73585f...

bors · 2022-05-01T14:37:49Z

☀️ Try build successful - checks-actions
Build commit: 204cd52d1e796574f64ace5276cb3a794e73585f (204cd52d1e796574f64ace5276cb3a794e73585f)

rust-timer · 2022-05-01T14:37:50Z

Queued 204cd52d1e796574f64ace5276cb3a794e73585f with parent f75d884, future comparison URL.

rust-timer · 2022-05-01T15:55:12Z

Finished benchmarking commit (204cd52d1e796574f64ace5276cb3a794e73585f): comparison url.

Summary:

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: no relevant changes found

| | Regressions 😿 (primary) | Regressions 😿 (secondary) | Improvements 🎉 (primary) | Improvements 🎉 (secondary) | All 😿 🎉 (primary) |
|---|---|---|---|---|---|
| count¹ | 0 | 0 | 1 | 0 | 1 |
| mean² | N/A | N/A | -0.3% | N/A | -0.3% |
| max | N/A | N/A | -0.3% | N/A | -0.3% |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

¹ number of relevant changes
² the arithmetic mean of the percent change

nagisa · 2022-05-01T16:47:49Z

Does iterating in this order retain the benefits of pre-order as described in #85741?

tmiasko · 2022-05-02T11:05:17Z

> Does iterating in this order retain the benefits of pre-order as described in #85741?

Yes. If x dominates y, then in any depth first walk of the control flow graph, x must be before y in a pre-order, and x must be before y in reverse post-order.
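
The claim above can be checked on a small example. The following is a minimal sketch, not rustc's `traversal` API: the toy CFG, the naive dominance test, and all function names (`example_cfg`, `orders`, `dominates`, `respects_dominance`) are assumptions made for illustration, and every block is assumed reachable from the entry.

```rust
// Toy CFG (an assumption for illustration): successor lists indexed by block
// id, with block 0 as the entry. Shape: 0 -> {1, 2}, 1 -> 3, 2 -> 3, 3 -> 4,
// plus a back edge 4 -> 1.
fn example_cfg() -> Vec<Vec<usize>> {
    vec![vec![1, 2], vec![3], vec![3], vec![4], vec![1]]
}

// Depth-first walk that records both the preorder and the postorder.
fn dfs(
    cfg: &[Vec<usize>],
    bb: usize,
    seen: &mut Vec<bool>,
    pre: &mut Vec<usize>,
    post: &mut Vec<usize>,
) {
    seen[bb] = true;
    pre.push(bb);
    for &succ in &cfg[bb] {
        if !seen[succ] {
            dfs(cfg, succ, seen, pre, post);
        }
    }
    post.push(bb);
}

/// Returns (preorder, reverse postorder) of the blocks reachable from block 0.
fn orders(cfg: &[Vec<usize>]) -> (Vec<usize>, Vec<usize>) {
    let mut seen = vec![false; cfg.len()];
    let (mut pre, mut post) = (Vec::new(), Vec::new());
    dfs(cfg, 0, &mut seen, &mut pre, &mut post);
    post.reverse();
    (pre, post)
}

/// Naive dominance test: `x` dominates `y` if `y` cannot be reached from the
/// entry once `x` is removed. Assumes every block is reachable from block 0.
fn dominates(cfg: &[Vec<usize>], x: usize, y: usize) -> bool {
    if x == y || x == 0 {
        return true;
    }
    let mut seen = vec![false; cfg.len()];
    seen[x] = true; // treat `x` as removed
    seen[0] = true;
    let mut stack = vec![0];
    while let Some(bb) = stack.pop() {
        for &succ in &cfg[bb] {
            if !seen[succ] {
                seen[succ] = true;
                stack.push(succ);
            }
        }
    }
    !seen[y]
}

/// True if every block appears in `order` after all of its strict dominators.
fn respects_dominance(cfg: &[Vec<usize>], order: &[usize]) -> bool {
    let mut pos = vec![0; cfg.len()];
    for (i, &bb) in order.iter().enumerate() {
        pos[bb] = i;
    }
    (0..cfg.len()).all(|x| {
        (0..cfg.len()).all(|y| x == y || !dominates(cfg, x, y) || pos[x] < pos[y])
    })
}
```

On this CFG the two orders differ (`[0, 1, 3, 4, 2]` for preorder, `[0, 2, 1, 3, 4]` for reverse postorder), yet `respects_dominance` holds for both. A plain, unreversed postorder fails the check because the entry block comes last.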

davidtwco · 2022-05-03T01:25:24Z

@bors r+

bors · 2022-05-03T01:25:25Z

📌 Commit fa41852 has been approved by davidtwco

bors · 2022-05-03T03:06:08Z

⌛ Testing commit fa41852 with merge 9aeba99dc26b95ab9a9d737df6352a197e436845...

rust-log-analyzer · 2022-05-03T05:20:36Z

A job failed! Check out the build log: (web) (plain)

bors · 2022-05-03T05:21:06Z

💔 Test failed - checks-actions

tmiasko · 2022-05-03T07:19:32Z

@bors retry spurious network error

bors · 2022-05-03T09:01:08Z

⌛ Testing commit fa41852 with merge fe21c30fb9b85db46cd807a0a345bb06bee90882...

bors · 2022-05-03T09:27:55Z

💔 Test failed - checks-actions

tmiasko · 2022-05-03T09:29:38Z

@bors retry #93784

rust-log-analyzer · 2022-05-03T09:54:40Z

The job x86_64-gnu-tools failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

.......... (60/63)
..        (63/63)


/checkout/src/test/rustdoc-gui/search-tab-selection-if-current-is-empty.goml search-tab-selection-if-current-is-empty... FAILED
[ERROR] (line 6) TimeoutError: waiting for selector "#titles" failed: timeout 30000ms exceeded: for command `wait-for: "#titles"`
Build completed unsuccessfully in 0:00:41

bors · 2022-05-03T12:16:08Z

⌛ Testing commit fa41852 with merge e1df625...

bors · 2022-05-03T14:56:47Z

☀️ Test successful - checks-actions
Approved by: davidtwco
Pushing e1df625 to master...

rust-timer · 2022-05-03T16:14:30Z

Finished benchmarking commit (e1df625): comparison url.

Summary: This benchmark run did not return any relevant results.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

eddyb · 2022-08-24T13:26:05Z

compiler/rustc_codegen_ssa/src/mir/analyze.rs

    // If there exists a local definition that dominates all uses of that local,
-    // the definition should be visited first. Traverse blocks in preorder which
+    // the definition should be visited first. Traverse blocks in an order that
    // is a topological sort of dominance partial order.
-    for (bb, data) in traversal::preorder(&mir) {
+    for (bb, data) in traversal::reverse_postorder(&mir) {
        analyzer.visit_basic_block_data(bb, data);


Huh, a bit surprised this was using preorder. I thought it was RPO since that's the correct def-before-use order (but I guess we have to check for dominance anyway, so this can't go wrong, just be unnecessarily conservative?).

EDIT: ah I see, it went through these steps:

visit_body -> inlined loop over blocks: Remove dead code from LocalAnalyzer #85965

loop over blocks -> preorder: Use preorder traversal when checking for SSA locals #85741

preorder -> reverse_postorder (this PR)

Would it make sense to force visit_body to use RPO? Or have two forms, visit_body_unordered and visit_body_rpo?

tmiasko · 2022-08-24

Note that preorder and reverse postorder give identical end results, since either visits a definition before a use, when the definition dominates the use (and the order is irrelevant otherwise).

eddyb · 2022-08-25

Note: you can mostly ignore the rant below, it's me reasoning about RPO to myself, really

Hmm, I think it depends on what kind of "definition" we're talking about - I think I overapproximated what RPO actually did, and how it's stronger than just giving you "dominator before dominated".

I guess that implies a fun iteration algorithm like this:

    fn visit_block(&mut self, bb: Block) {
        if self.visited[bb] { return; }
        // Mark the block up front so that it is processed at most once.
        self.visited[bb] = true;
        if let Some(dom_bb) = self.doms[bb] {
            self.visit_block(dom_bb);
        }
        // ... guaranteed to get here only once and *after* all dominators ...
    }
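
A self-contained version of that idea might look like the sketch below. It assumes an immediate-dominator table is already available. The `DomFirstVisitor` type, its fields, and `visit_all` are hypothetical names for illustration, not anything in rustc.

```rust
/// Hypothetical visitor: `idom[bb]` is the immediate dominator of block `bb`
/// (`None` for the entry block). Block ids are plain indices.
struct DomFirstVisitor {
    idom: Vec<Option<usize>>,
    visited: Vec<bool>,
    order: Vec<usize>,
}

impl DomFirstVisitor {
    fn new(idom: Vec<Option<usize>>) -> Self {
        let n = idom.len();
        Self { idom, visited: vec![false; n], order: Vec::with_capacity(n) }
    }

    /// Visits the chain of dominators of `bb` first, then `bb` itself.
    fn visit_block(&mut self, bb: usize) {
        if self.visited[bb] {
            return;
        }
        self.visited[bb] = true;
        if let Some(dom_bb) = self.idom[bb] {
            self.visit_block(dom_bb);
        }
        // Reached exactly once per block, after all of its dominators.
        self.order.push(bb);
    }

    /// Requests the blocks in index order and returns the order in which
    /// they were actually processed.
    fn visit_all(mut self) -> Vec<usize> {
        for bb in 0..self.idom.len() {
            self.visit_block(bb);
        }
        self.order
    }
}
```

Consider a diamond numbered M = 0, a = 1, b = 2, S = 3, with S as the immediate dominator of the other three. `DomFirstVisitor::new(vec![Some(3), Some(3), Some(3), None]).visit_all()` yields `[3, 0, 1, 2]`. Every block follows its dominator, but the merge block M is processed before its predecessors a and b, so this ordering guarantees less than reverse postorder does.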

More seriously though, I guess RPO is important in SSA for merges (let's ignore cycles for now):

      S(tart)
      /    \
     a      b
      \    /
      M(erge)

| | | flatten | dedup (keep first) | dedup (keep last) |
|---|---|---|---|---|
| preorder | `S->{a->M, b->M}` | `SaMbM` | `SaMb` | `SabM` |
| | reverse (per-column) | `MbMaS` | `bMaS` | `MbaS` |
| postorder | `{M<-a, M<-b}<-S` | `MaMbS` | `MabS` | `aMbS` |
| | reverse (per-column) | `SbMaM` | `SbaM` | `SbMa` |

And RPO is usually the reversed "dedup (keep first)" postorder, which ends up as SbaM (though siblings can be reversed in the initial postorder visit to get SabM if that's aesthetically preferred for e.g. an IR dump - we should do this for --emit=mir IMO).

What we're looking for in this example is S->{a,b}->M which is a bit like structured control-flow, and for SSA in particular it allows seeing all the definitions of φ nodes (or "BB args" values etc.) before a merge (not sure if this logic works for backedges, but it partially might?).
In MIR we don't have φ nodes but we still want to see all "sources" of a merge before the merge for dataflow algorithms, for pretty much the same reason SSA IRs use φ/BB args, the difference being that a fixpoint dataflow algorithm is only slowed down by the suboptimal order (whereas SSA IR passes may have bigger issues with lacking definitions used by φ nodes).
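
To illustrate how the visiting order affects a fixpoint dataflow algorithm, here is a rough sketch. It uses a deliberately simple forward analysis, where each block's fact is the set of blocks that may have executed up to and including it, and a round-robin solver. The analysis, the block numbering, and the names `diamond_preds` and `passes_to_fixpoint` are all assumptions for illustration. rustc's dataflow framework works differently.

```rust
// Predecessor lists for the S/a/b/M diamond, numbered S = 0, a = 1, b = 2,
// M = 3 (a numbering chosen for this sketch).
fn diamond_preds() -> Vec<Vec<usize>> {
    vec![vec![], vec![0], vec![0], vec![1, 2]]
}

/// Round-robin forward dataflow over `order`. The fact for a block is a
/// bitset: its own bit unioned with the facts of all its predecessors.
/// Returns the number of full passes, including the final pass that only
/// confirms that nothing changed.
fn passes_to_fixpoint(preds: &[Vec<usize>], order: &[usize]) -> usize {
    let mut out = vec![0u64; preds.len()];
    let mut passes = 0;
    loop {
        passes += 1;
        let mut changed = false;
        for &bb in order {
            let mut fact = 1u64 << bb;
            for &p in &preds[bb] {
                fact |= out[p];
            }
            if fact != out[bb] {
                out[bb] = fact;
                changed = true;
            }
        }
        if !changed {
            return passes;
        }
    }
}
```

On the diamond, the reverse postorder `[0, 2, 1, 3]` (SbaM) converges in 2 passes. The keep-first preorder `[0, 1, 3, 2]` (SaMb) needs 3, because M is processed before b has produced its fact, and the postorder `[3, 1, 2, 0]` (MabS) needs 4. All three reach the same fixpoint. Only the amount of work differs.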

Also, you may have noticed in the above table that the preorder "dedup (keep last)" is SabM without any reversals (but "keep last" is more expensive computationally, since we tend not to have the "flatten" form at all but instead skip visiting eagerly, which naturally results in "keep first").

So really what "RPO" does is a more efficient way to get "preorder but keep only the last visit instead of the usual first" (isomorphic up to sibling order, but I think you can get them perfectly equal if you make the "aesthetic fix" to RPO).
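
The entries of the table can be reproduced with a short sketch. It assumes the diamond CFG from above, and the helper names (`diamond`, `flatten`, `keep_first`, `keep_last`, `reversed`) are made up for illustration. The "flatten" walk deliberately has no visited set, so it only terminates on acyclic graphs and can take exponential time in general.

```rust
// The diamond: S -> {a, b}, a -> M, b -> M. Each block is (label, successors).
// `swap_siblings` lists S's successors as [b, a] instead of [a, b].
fn diamond(swap_siblings: bool) -> Vec<(char, Vec<usize>)> {
    let succs_of_s = if swap_siblings { vec![2, 1] } else { vec![1, 2] };
    vec![('S', succs_of_s), ('a', vec![3]), ('b', vec![3]), ('M', vec![])]
}

/// "Flatten" column: a depth-first walk with no visited set, so shared blocks
/// appear once per path. Only valid for acyclic graphs.
fn flatten(cfg: &[(char, Vec<usize>)], bb: usize, preorder: bool, out: &mut String) {
    if preorder {
        out.push(cfg[bb].0);
    }
    for &succ in &cfg[bb].1 {
        flatten(cfg, succ, preorder, out);
    }
    if !preorder {
        out.push(cfg[bb].0);
    }
}

/// Keeps only the first occurrence of each label.
fn keep_first(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        if !out.contains(c) {
            out.push(c);
        }
    }
    out
}

fn reversed(s: &str) -> String {
    s.chars().rev().collect()
}

/// Keeps only the last occurrence of each label
/// (reverse, keep first, reverse back).
fn keep_last(s: &str) -> String {
    reversed(&keep_first(&reversed(s)))
}
```

With `diamond(false)`, the flattened preorder is `SaMbM` and the flattened postorder is `MaMbS`. Reversing `keep_first` of the postorder gives `SbaM`, the usual RPO. With `diamond(true)`, the same computation gives `SabM`, which equals `keep_last` of the original preorder and matches the "aesthetic fix" described above.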

Use reverse postorder in non_ssa_locals

fa41852

rust-highfive assigned davidtwco May 1, 2022

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label May 1, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 1, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 1, 2022

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 1, 2022

davidtwco approved these changes May 3, 2022

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 3, 2022

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 3, 2022

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 3, 2022

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 3, 2022

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 3, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label May 3, 2022

bors merged commit e1df625 into rust-lang:master May 3, 2022

rustbot added this to the 1.62.0 milestone May 3, 2022

tmiasko deleted the ssa-rpo branch May 3, 2022 15:11

eddyb reviewed Aug 24, 2022

tmiasko Aug 24, 2022

eddyb Aug 25, 2022
