Use reverse postorder in non_ssa_locals #96601

Merged
merged 1 commit into from
May 3, 2022
4 changes: 2 additions & 2 deletions compiler/rustc_codegen_ssa/src/mir/analyze.rs
@@ -40,9 +40,9 @@ pub fn non_ssa_locals<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
     }
 
     // If there exists a local definition that dominates all uses of that local,
-    // the definition should be visited first. Traverse blocks in preorder which
+    // the definition should be visited first. Traverse blocks in an order that
     // is a topological sort of dominance partial order.
-    for (bb, data) in traversal::preorder(&mir) {
+    for (bb, data) in traversal::reverse_postorder(&mir) {
         analyzer.visit_basic_block_data(bb, data);
Comment on lines 42 to 46
Member

@eddyb eddyb Aug 24, 2022

Huh, a bit surprised this was using preorder. I thought it was RPO (reverse postorder), since that's the correct def-before-use order (but I guess we have to check for dominance anyway, so this can't go wrong, it can only be unnecessarily conservative?).

EDIT: ah I see, it went through these steps:

Would it make sense to force visit_body to use RPO? Or have two forms, visit_body_unordered and visit_body_rpo?

Contributor Author

Note that preorder and reverse postorder give identical end results, since either visits a definition before a use, when the definition dominates the use (and the order is irrelevant otherwise).
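To make that claim concrete, here is a minimal sketch on a hypothetical four-block diamond CFG (block 0 branches to 1 and 2, which both jump to 3). The `preorder`, `reverse_postorder` and `respects_dominance` helpers and the hard-coded immediate-dominator table are assumptions written for this illustration; they are plain recursive DFS orders, not rustc's `traversal` module, whose sibling order may differ.

```rust
/// Recursive DFS that records both the preorder and the postorder.
fn dfs(succs: &[Vec<usize>], bb: usize, seen: &mut [bool],
       pre: &mut Vec<usize>, post: &mut Vec<usize>) {
    if seen[bb] {
        return;
    }
    seen[bb] = true;
    pre.push(bb);
    for &succ in &succs[bb] {
        dfs(succs, succ, seen, pre, post);
    }
    post.push(bb);
}

fn preorder(succs: &[Vec<usize>], entry: usize) -> Vec<usize> {
    let (mut seen, mut pre, mut post) = (vec![false; succs.len()], Vec::new(), Vec::new());
    dfs(succs, entry, &mut seen, &mut pre, &mut post);
    pre
}

fn reverse_postorder(succs: &[Vec<usize>], entry: usize) -> Vec<usize> {
    let (mut seen, mut pre, mut post) = (vec![false; succs.len()], Vec::new(), Vec::new());
    dfs(succs, entry, &mut seen, &mut pre, &mut post);
    post.reverse();
    post
}

/// True if every block appears after its immediate dominator in `order`.
/// By transitivity this means it appears after all of its dominators.
fn respects_dominance(order: &[usize], idom: &[Option<usize>]) -> bool {
    let mut pos = vec![usize::MAX; idom.len()];
    for (i, &bb) in order.iter().enumerate() {
        pos[bb] = i;
    }
    idom.iter().enumerate().all(|(bb, d)| match d {
        Some(d) => pos[*d] < pos[bb],
        None => true, // the entry block has no dominator
    })
}

fn main() {
    // Hypothetical diamond: 0 -> {1, 2}, 1 -> 3, 2 -> 3.
    let succs = vec![vec![1, 2], vec![3], vec![3], vec![]];
    // Immediate dominators, written out by hand for this CFG.
    let idom: [Option<usize>; 4] = [None, Some(0), Some(0), Some(0)];

    let pre = preorder(&succs, 0);
    let rpo = reverse_postorder(&succs, 0);
    println!("preorder = {:?}, dominators first: {}", pre, respects_dominance(&pre, &idom));
    println!("rpo      = {:?}, dominators first: {}", rpo, respects_dominance(&rpo, &idom));
}
```

Both orders place block 0 ahead of everything it dominates. They disagree only on where the merge block 3 lands relative to its non-dominating predecessors.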

Member

Note: you can mostly ignore the rant below, it's me reasoning about RPO to myself, really


Hmm, I think it depends on what kind of "definition" we're talking about - I think I overapproximated what RPO actually did, and how it's stronger than just giving you "dominator before dominated".

I guess that implies a fun iteration algorithm like this:

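fn visit_block(&mut self, bb: Block) {
    if self.visited[bb] { return; }
    // Visit the immediate dominator first (the entry block has none).
    if let Some(dom_bb) = self.doms[bb] {
        self.visit_block(dom_bb);
    }
    // Mark the block so that later calls return early.
    self.visited[bb] = true;

    // ... guaranteed to get here only once and *after* all dominators ...
}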
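A self-contained version of the same idea, as a sketch only: `DomFirstVisitor`, its fields, and the hand-written immediate-dominator table are hypothetical names for this illustration, and blocks are plain `usize` indices rather than MIR `BasicBlock`s.

```rust
/// Visits blocks so that each block comes after all of its dominators,
/// regardless of the order in which `visit_block` is called.
struct DomFirstVisitor {
    idom: Vec<Option<usize>>, // immediate dominator of each block
    visited: Vec<bool>,
    order: Vec<usize>, // blocks in the order their bodies were processed
}

impl DomFirstVisitor {
    fn new(idom: Vec<Option<usize>>) -> Self {
        let n = idom.len();
        DomFirstVisitor { idom, visited: vec![false; n], order: Vec::new() }
    }

    fn visit_block(&mut self, bb: usize) {
        if self.visited[bb] {
            return;
        }
        // Walk up the dominator tree first; it is a tree, so this terminates.
        if let Some(dom_bb) = self.idom[bb] {
            self.visit_block(dom_bb);
        }
        self.visited[bb] = true;
        // Reached exactly once per block, after all of its dominators.
        self.order.push(bb);
    }
}

/// Feeds the blocks in `worklist` order and returns the resulting visit order.
fn visit_all(idom: Vec<Option<usize>>, worklist: &[usize]) -> Vec<usize> {
    let mut visitor = DomFirstVisitor::new(idom);
    for &bb in worklist {
        visitor.visit_block(bb);
    }
    visitor.order
}

fn main() {
    // Diamond CFG with blocks 0 (entry), 1, 2, 3 (merge); 0 dominates the rest.
    let idom = vec![None, Some(0), Some(0), Some(0)];
    // Deliberately feed the blocks in an unhelpful order.
    println!("{:?}", visit_all(idom, &[3, 2, 1, 0]));
}
```

Even with the merge block requested first, the entry block is processed before it, because the recursion climbs the dominator chain before handling the block itself.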

More seriously though, I guess RPO is important in SSA for merges (let's ignore cycles for now):

  S(tart)
 / \
a   b
 \ /
  M(erge)
| | | flatten | dedup (keep first) | dedup (keep last) |
|---|---|---|---|---|
| preorder | S->{a->M, b->M} | SaMbM | SaMb | SabM |
| reverse (per-column) | | MbMaS | bMaS | MbaS |
| postorder | {M<-a, M<-b}<-S | MaMbS | MabS | aMbS |
| reverse (per-column) | | SbMaM | SbaM | SbMa |
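The table entries can be reproduced mechanically. The sketch below hard-codes the diamond CFG in a hypothetical `successors` function, produces the "flatten" sequences by traversing without a visited set (fine here because the graph is acyclic), and then deduplicates them both ways. All function names are made up for this illustration.

```rust
/// Successor lists for the diamond CFG: S -> {a, b}, a -> M, b -> M.
fn successors(block: char) -> &'static [char] {
    match block {
        'S' => &['a', 'b'],
        'a' | 'b' => &['M'],
        _ => &[],
    }
}

/// Preorder without a visited set: shared blocks are emitted once per path.
fn flat_preorder(block: char, out: &mut String) {
    out.push(block);
    for &succ in successors(block) {
        flat_preorder(succ, out);
    }
}

/// Postorder without a visited set.
fn flat_postorder(block: char, out: &mut String) {
    for &succ in successors(block) {
        flat_postorder(succ, out);
    }
    out.push(block);
}

/// Keeps only the first occurrence of each block.
fn dedup_keep_first(seq: &str) -> String {
    let mut out = String::new();
    for c in seq.chars() {
        if !out.contains(c) {
            out.push(c);
        }
    }
    out
}

/// Keeps only the last occurrence of each block: keep-first on the
/// reversed sequence, reversed back.
fn dedup_keep_last(seq: &str) -> String {
    reversed(&dedup_keep_first(&reversed(seq)))
}

fn reversed(seq: &str) -> String {
    seq.chars().rev().collect()
}

fn main() {
    let (mut pre, mut post) = (String::new(), String::new());
    flat_preorder('S', &mut pre);
    flat_postorder('S', &mut post);
    for (name, flat) in [("preorder", pre), ("postorder", post)] {
        let (first, last) = (dedup_keep_first(&flat), dedup_keep_last(&flat));
        println!("{name}: {flat} {first} {last}");
        println!("reverse: {} {} {}", reversed(&flat), reversed(&first), reversed(&last));
    }
}
```

Reversing the keep-first postorder gives `SbaM`, while the keep-last preorder gives `SabM`: the same order up to swapping the siblings `a` and `b`.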

And RPO is usually the reversed "dedup (keep first)" postorder, which ends up as SbaM (though siblings can be reversed in the initial postorder visit to get SabM if that's aesthetically preferred for e.g. an IR dump - we should do this for --emit=mir IMO).

What we're looking for in this example is S->{a,b}->M which is a bit like structured control-flow, and for SSA in particular it allows seeing all the definitions of φ nodes (or "BB args" values etc.) before a merge (not sure if this logic works for backedges, but it partially might?).
In MIR we don't have φ nodes but we still want to see all "sources" of a merge before the merge for dataflow algorithms, for pretty much the same reason SSA IRs use φ/BB args, the difference being that a fixpoint dataflow algorithm is only slowed down by the suboptimal order (whereas SSA IR passes may have bigger issues with lacking definitions used by φ nodes).
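The "all sources before the merge" property is easy to check for a given order. As a rough sketch (the `preds_come_first` helper and the edge list are assumptions for this example, and back-edges are not handled), the following tests whether every edge of the diamond points forward in an order written as a string of block names:

```rust
/// True if, for every CFG edge, the source block appears before the target
/// block in `order`. Only meaningful for acyclic CFGs.
fn preds_come_first(order: &str, edges: &[(char, char)]) -> bool {
    edges.iter().all(|&(pred, succ)| {
        match (order.find(pred), order.find(succ)) {
            (Some(p), Some(s)) => p < s,
            _ => false, // a block missing from the order fails the check
        }
    })
}

fn main() {
    // Edges of the diamond: S -> a, S -> b, a -> M, b -> M.
    let edges = [('S', 'a'), ('S', 'b'), ('a', 'M'), ('b', 'M')];

    // Usual (keep-first) preorder reaches M before b.
    println!("SaMb: {}", preds_come_first("SaMb", &edges));
    // Reverse postorder sees both a and b before M.
    println!("SbaM: {}", preds_come_first("SbaM", &edges));
}
```

In the preorder `SaMb` the edge from `b` to `M` points backwards, so a single forward pass would process `M` with incomplete information about `b`; in `SbaM` every edge points forward.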

Also, you may have noticed in the table above that the preorder "dedup (keep last)" is SabM without any reversals (but "keep last" is more expensive computationally, since we tend not to have the "flatten" form at all but instead skip visiting eagerly, which naturally results in "keep first").

So really what "RPO" does is a more efficient way to get "preorder but keep only the last visit instead of the usual first" (isomorphic up to sibling order, but I think you can get them perfectly equal if you make the "aesthetic fix" to RPO).

}
