-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add the option to use the helix slope to decide the LoopHelixFit direction hypothesis #1401
base: main
Are you sure you want to change the base?
Conversation
Hi @michaelmackenzie,
which require these tests: build. @Mu2e/write, @Mu2e/fnalbuild-users have access to CI actions on main. ⌛ The following tests have been triggered for db6b6b9: build (Build queue has 1 jobs) |
☀️ The build tests passed at db6b6b9.
N.B. These results were obtained from a build of this Pull Request at db6b6b9 after being merged into the base branch at df7782c. For more information, please check the job page here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Michael this is good and important work.
As I see it there are 3 separate functionalities introduced in this PR. I would like to see them separated more in code. The separable functions I see are:
- A function to decide if a helix fit direction measurement is significant. This has nothing directly to do with LoopHelixFit and should be implemented external to the module.
- compare 2 LoopHelix fits of the same hits with opposite physical direction and classify them as clearly upstream, clearly downstream, or ambiguous. This could be implemented as an external helper function, making it easy to try different algorithms.
- a logical function to decide which fits to create and which to keep the using the above functions, plus the option of forcing the direction by hand. Probably this works best as a LoopHelixFit member function.
I think splitting these functions out will make the code clearer, more flexible and more maintainable.
Hi Dave, Thank you for the feedback! I agree we should aim for cleanliness and modularity here.
This is definition of fit direction ambiguity is defined by the
By this do you mean add a helper function outside of LoopHelixFit that compares two KKTrackKinKal::LoopHelix objects and decides which is the better fit? I think this makes sense and I'll get started, though I would still advocate for the preference of tracks with calo-hits as it recovers a large fraction of incorrectly fit conversion track directions, but this may not be as general if comparing two downstream or two upstream tracks in a different use case. Where is the best place for this helper function to live?
By this do you mean essentially moving the code within the for-loop to a member function that is called to make a specific track fit hypothesis (upstream vs. downstream in this case, but this could be extended to mass hypothesis etc.)? I can do this, and I agree separating out this logic should be cleaner.
I agree, I'll work on splitting and cleaning the logic I've added here to be clearer and more maintainable (and more flexible if I can!). |
📝 The HEAD of |
Yes. If you first define a virtual function it could have several implementaitons. Parameters should be passed via fcl. There are several examples how to do this in LoopHelixModule/KKFit. |
Yes. |
Even if currently this selection is just a comparison with a data member, it's good to buffer the selection through a function, to allow you to possibly extend the test in future without changing the code structure. |
db6b6b9
to
53cf22e
Compare
Hi Dave, Thank you for the replies and advice! I restructured the code a bit, where I moved the track fitting to a member function as well as the helix direction test and the track comparison test. I can move the track comparison to an outside function/class, but I'm not sure where would be best to put this. Where should this check live? I also rebased at a point, to not fall too far behind. I think this is a bit cleaner of a setup and it allows more straightforward maintenance as different steps are more decoupled. Let me know what you think about this and your opinion on where to move the track comparison code to when you have a chance! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Michael, thanks for factorizing out the functions. I have some further questions and suggestions for clarifying the new functionality, please have a look.
fhicl::Atom<int> fitDirection { Name("FitDirection"), Comment("Particle direction to fit, either upstream or downstream") }; | ||
fhicl::OptionalAtom<double> slopeSigThreshold{ Name("SlopeSigThreshold"), Comment("Helix slope significance threshold to assume the direction")}; | ||
fhicl::OptionalAtom<double> slopeSigCut{ Name("SlopeSigCut"), Comment("Helix slope significance cut when assuming a fit direction")}; | ||
fhicl::Atom<int> fitDirection { Name("FitDirection"), Comment("Particle direction to fit, either upstream or downstream"), [this](){ return !slopeSigThreshold(); }}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest making this parameter optional, with the meaning that, if it is set, the fit direction will be rigidly enforced, but if not, it will be dynamically decided. If unset, and if the Slope* parameters are also not set, the module init should throw. That will make the intent explicit, and simplify downstream logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is similar to the current setup, so I don't think the downstream logic changes. In this setup, it throws an error if one tries to set SlopeSigThreshold
and FitDirection
saying FitDirection
isn't a supported parameter. I updated to an OptionalAtom so now we can better control the error message.
@@ -155,7 +156,9 @@ namespace mu2e { | |||
fhicl::OptionalTable<KKFinalConfig> finalSettings { Name("FinalSettings") }; | |||
fhicl::OptionalTable<KKExtrapConfig> Extrapolation { Name("Extrapolation") }; | |||
// LoopHelix module specific config | |||
fhicl::Atom<int> fitDirection { Name("FitDirection"), Comment("Particle direction to fit, either upstream or downstream") }; | |||
fhicl::OptionalAtom<double> slopeSigThreshold{ Name("SlopeSigThreshold"), Comment("Helix slope significance threshold to assume the direction")}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it confusing that both of these parameters use 'Helix' in the name and description, while one applies to the helix seed, and the other to the loop helix fit. Please rename these to make the distinction clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the SlopeSigCut name to SlopeSigVetoCut to make it clearer that this parameter is to cut out helix seeds. I've also updated the comments to say "Input helix seed..." to be clear that these values are applied to the inputs.
@@ -318,12 +339,43 @@ namespace mu2e { | |||
kkbf_ = std::move(std::make_unique<KKBField>(*bfmgr,*det)); | |||
} | |||
if(print_ > 0) kkbf_->print(std::cout); | |||
|
|||
// Print the fit direction configuration | |||
if(print_ > -1) printf("[LoopHelixFit::%s::%s] Fit dz/dt direction = %.0f, PDG = %i, use helix slope = %o with threshold %.1f, use helix slope cut = %o with theshold %.1f\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the first case I've seen of modules using negative print thresholds, which may be confusing. The standard interpretation of '0' has been 'no printout', but that hasn't been strictly enforced.
I suggest bringing this up for wider discussion at the next sim/reco meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed this to print_ > 0
, as I only set this to -1
to get this summary info for my larger tests. In general I use print_ > -1
checks for printout that should be there in normal running, but can be silenced if needed for some reason (but I don't think this is the case here).
if(useHelixSlope_) return hseed.recoDir().predictDirection(slopeSigThreshold_); | ||
if(fdir_.fitDirection() == TrkFitDirection::FitDirection::upstream ) return HelixRecoDir::PropDir::upstream; | ||
if(fdir_.fitDirection() == TrkFitDirection::FitDirection::downstream) return HelixRecoDir::PropDir::downstream; | ||
return HelixRecoDir::PropDir::ambiguous; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how this statement can ever be reached, since fdir_ must be either upstream or downstream.
I think this can be addressed by making fdir_ optiona, as discussed above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it can ever reach the final return statement, I just wanted to protect against any unexpected cases (perhaps a forced TrkFitDirection that is ambiguous and forces the track fit to do both fits each time?). I can remove this case or throw an error here instead if this is better.
const bool goodfit_1(goodFit(ktrk_1)), goodfit_2(goodFit(ktrk_2)); | ||
if(print_>1) printf(" [LoopHelixFit::%s]: status 1 = %o, status 2 = %o\n", __func__, goodfit_1, goodfit_2); | ||
|
||
if(!goodfit_2) return true; //if the second is bad or if both are bad, default to the first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the function returns on this statement the first fit could still be bad.
Returning a more complicated return value than 'bool' would address this, by having 3 states (1 is better, 2 is better, neither are good)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If both fits are bad, this is caught outside this comparison in the requirement that the accepted track is a good fit. In the case all fits are meant to be saved, this defaults to saving the first track given to the comparison. I updated to return 1 if the first is better, 0 if the second is better, and -1 if both are bad fits, still defaulting to the first track if both are bad.
if(print_>1) printf(" [LoopHelixFit::%s]: p 1 = %.4f, n(calo-hits) 1 = %lu, p 2 = %.4f, n(calo-hits) 2 = %lu\n", | ||
__func__, p_1, n_calo_1, p_2, n_calo_2); | ||
|
||
if(n_calo_1 == n_calo_2) return p_1 >= p_2; //if they both have the same number of calo-hits, use p(chi^2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest making the calo hit comparison optional and configurable. It's not clear to me how this responds to true upstream tracks for instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Processing cosmic events with upstream and downstream tracks, including this prioritization of tracks with calo-hits also improved the true selection efficiency for both true propagation directions (but not by as much as for CEs). I've made this a configurable flag, defaulting to true since it improves the selection efficiency for all cases I've checked so far (CE, downstream cosmics, and upstream cosmics).
const float slopeSig = (slopeErr > 0.f) ? slope/slopeErr : 0.f; | ||
if(slopeSig*fdir_.dzdt() < slopeSigCut_) { //slope is below the given threshold relative to the given direction | ||
if(print_ > 0) printf("[LoopHelixFit::%s] Skipping helix seed with slope significance %.1f\n", __func__, slopeSig); | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused by the logic: if the slope isn't significant, don't we still want to fit the track? Especially if the direction is hard-coded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a request from Giani, that for the trigger we may want to skip fitting helix seeds that are very likely upstream tracks when only looking for downstream tracks. In the Offline processing, I imagine there's no harm in fitting these downstream as well since we're forcing a downstream fit for all tracks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the intent is:
- if the slope is significant, only fit that one (significant) direction
- if the slope isn't significant fit both directions
Maybe I've misunderstood the logic, but it seems the fit is skipped entirely if the slope isn't significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would only be used if using a forced fit direction, and throws an error if used in the helix slope-based fit direction mode. The intent with this is in the Online processing, only fit with the forced downstream hypothesis and skip events that have a significant upstream slope to save processing time with minimal efficiency loss for downstream tracks. This update is essentially separate from the rest of the updates as it's only relevant to the forced fit direction processing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm starting to think this test belongs outside loopHelixFit, either in the helix fitter itself, or (after merging) in a dedicated module that would only be run in the trigger reco sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. Perhaps this should be a new producer module that reads in a helix seed collection and produces a filtered helix collection? This way you can tailor the collection for the LoopHelixFit input, but not lose these helices if another path may want to use them in a separate LoopHelixFit module instance (e.g. an upstream trigger that only looks at significantly upstream-oriented helices using the same initial helix collection).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this filtering to the HelixFilter module, which seemed a more appropriate place for this type of selection.
fitpart = static_cast<PDGCode::type>(-1*fitpart); // reverse sign | ||
|
||
// determine the fit direction hypothesis | ||
auto helix_dir = chooseHelixDir(hseed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend combining the 2 hseed functions into 1, returning a vector of fit directions. That could be empty (bad helix), have 1 entry, or have 2 entries. Then you don't have to worry about interactions between these functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not positive I follow this suggestion exactly. Is it that I return a vector of directions, loop over this filling a vector of track hypotheses, and then choose the best hypothesis from this list (which ideally should be length 1 or 2, where 0 means the helix seed was a bad seed)? If my understanding is right I can start making this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If my understanding is right I can start making this change.
That's what I meant.
The goal is to make the code executed in produce as simple as possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this update, where I return a vector of length 0, 1, or 2 and loop over this vector and perform a track fit for each hypothesis. If ever extended, it will keep running compare_tracks
taking the best result from the loop for any number of hypotheses (but of course there are only two at maximum as of now).
const bool use_downstream = !ktrk_up || (ktrk_down && compare_tracks(*ktrk_down, *ktrk_up)); //if one is null, default to the other (use down if both are null, doesn't matter) | ||
if(print_ > 0) | ||
printf("[LoopHelixFit::%s]: Fit both direction options, slope = %9.2e, sig = %.2f, status down = %i, status up = %i, p(chi^2) down = %.4f, p(chi^2) up = %.4f: chose %s\n", | ||
__func__, hseed.recoDir().slope(), hseed.recoDir().slopeSig(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This statement is too complex. Please recode it as a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated to pass the pointers rather than the references, so the compare_tracks
function can also check if the pointer is null (though I'm not as used to working with std::unique_ptr
so let me know if I haven't used it correctly)
if(print_>1) ktrk->printFit(std::cout,print_-1); | ||
|
||
// save the fit result | ||
if(goodfit || saveall_){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since goodfit has been tested several times already before getting to this statement I think the saveall_ flag no longer does anything. We should discuss whether to remove it or redefine it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with saveall_
set it will still save the track bad fit track (possibly without the drift fit or extrapolation performed) as was true before. In the case both fits are performed and both are bad, the downstream is saved with this flag. I don't know what this feature is used for, so I'm open to removing or redefining it in anyway that is suggested!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
saveall_ is for debugging failed fits. It's never used in production.
… update flags and printout
…helix fit directions
228c58e
to
1d973f7
Compare
I rebased and force-pushed the updates to not conflict with the HelixFilter updates at the HEAD of Offline. I apologize if this causes any issues with the review comments! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for moving the helix direction FP testing outside this class.
This is getting closer, but I still think there are places where the code needs to be simplified and clarified, see the comments below.
@@ -253,6 +273,10 @@ namespace mu2e { | |||
fixedfield_ = true; | |||
kkbf_ = std::move(std::make_unique<KKConstantBField>(VEC3(0.0,0.0,bz))); | |||
} | |||
|
|||
if(useFitDir_ && useHelixSlope_) //can either force the direction or determine it from the helix slope |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought having both of these true was a valid configuration, meaning that LoopHelixFit should fit the helix IFF it has a valid slope in the prescribed direction (including ambiguous).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are two different configuration options: force the fit direction to be a given direction for all helices or decide the fit direction on a helix-by-helix basis. In the latter option, it looks at the helix slope to decide if the direction is clear or if it's ambiguous and needs to be fit twice to decide. One could adjust the helix slope threshold to either always guess a direction (0 threshold) or never guess and always fit twice (very large sigma threshold), but these are both in the latter fit mode option.
} | ||
|
||
int LoopHelixFit::compare_tracks(const std::unique_ptr<KKTRK>& ktrk_1, const std::unique_ptr<KKTRK>& ktrk_2) const { | ||
// return 1 to select the first track, 0 to select the second, -1 to indicate both are bad tracks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please code these return values as a (private) enum, so that downstream interpretation will be self-evident;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this comparison to use the private enums kFirstPass
, kSecondPass
, and kBothFail
to be more clear.
ktrk = std::move(ktrk_i); | ||
helix_dir = dir; | ||
} else { | ||
const bool accept = compare_tracks(ktrk, ktrk_i) == 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic would be clearer if you started with an array<std::unique_ptr,2>, filled that in the loop over helix_dirs, and then called compare_tracks on the array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated to this style, passing the array to the compare tracks call.
bool goodHelix(HelixSeed const& hseed) const; | ||
std::vector<HelixRecoDir::PropDir> chooseHelixDir(HelixSeed const& hseed) const; | ||
int compare_tracks(const std::unique_ptr<KKTRK>& ktrk_1, const std::unique_ptr<KKTRK>& ktrk_2) const; | ||
std::unique_ptr<KKTRK> fitTrack(art::Event& event, HelixSeed const& hseed, const float dir, const PDGCode::type fitpart); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unclear why you pass dir as a float. Using TrkFitDirection would make the interface unambiguous. If you need the FP direction (line 413) you can get that from TrkFitDirection::dzdt()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated to passing the HelixRecoDir::PropDir enum directly, and then checking this enum to decide on the slope direction.
if(print_>1) ktrk->printFit(std::cout,print_-1); | ||
|
||
// save the fit result | ||
if(goodfit || saveall_){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
saveall_ is for debugging failed fits. It's never used in production.
fitflag.clear(TrkFitFlag::FitOK); | ||
if(undefined_dir) fitflag.merge(TrkFitFlag::AmbFitDir); | ||
auto kkseed = kkfit_.createSeed(*ktrk,fitflag,*calo_h); | ||
if(print_>0) printf("[LoopHelixFit::%s::%s] Create seed : fitcon = %.4f, nHits = %2lu, seedActiveHits = %2u, %lu calo-hits\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these additional print lines make it hard to read the code, for instance finding the sampleFit() call. Please move the prints into a single function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I was adding these mostly for the debugging process to understand if I was change any track information by fitting once vs. twice. I've moved these to a print_track_info
function, which I can remove or cleanup more now that the debugging has settled.
It will be faster to talk on zoom to resolve this.
…On Fri, Feb 7, 2025 at 10:09 AM michaelmackenzie ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In Mu2eKinKal/src/LoopHelixFit_module.cc
<#1401 (comment)>:
> @@ -253,6 +273,10 @@ namespace mu2e {
fixedfield_ = true;
kkbf_ = std::move(std::make_unique<KKConstantBField>(VEC3(0.0,0.0,bz)));
}
+
+ if(useFitDir_ && useHelixSlope_) //can either force the direction or determine it from the helix slope
These are two different configuration options: force the fit direction to
be a given direction for all helices or decide the fit direction on a
helix-by-helix basis. In the latter option, it looks at the helix slope to
decide if the direction is clear or if it's ambiguous and needs to be fit
twice to decide. One could adjust the helix slope threshold to either
always guess a direction (0 threshold) or never guess and always fit twice
(very large sigma threshold), but these are both in the latter fit mode
option.
—
Reply to this email directly, view it on GitHub
<#1401 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABAH575Y2FRCRNVATJYXSPL2OTZHPAVCNFSM6AAAAABVYNBKMKVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDMMBSGQ3TENJZG4>
.
You are receiving this because you are on a team that was mentioned.Message
ID: ***@***.***>
--
David Brown ***@***.***
Office Phone (510) 486-7261
Lawrence Berkeley National Lab
M/S 50R5008 (50-6026C) Berkeley, CA 94720
|
This adds the option to use the helix seed slope's significance to decide the fit direction (downstream vs. upstream) to assume in the LoopHelixFit. In the case the slope isn't very significant, both fit directions are fit and the p(chi^2) results are compared. The goal of this option is to fit downstream and upstream tracks in a single LoopHelixFit instance, producing a single output track collection where each helix seed corresponds to at most one track direction hypothesis (there can still be different track mass hypothesis collections).
In the ambiguous fit direction case where a calo-hit is included in one fit result but not the other hypothesis fit result, the fit that included the calo-hit is taken instead. This recovers a significant fraction of incorrectly selected CE- and CE+ direction tracks without additional loss.
I additionally added the option to cut on the helix slope significance when fitting a fixed direction hypothesis, skipping helix seeds with a helix slope that is unlikely to correspond to the fit direction hypothesis. This configuration can then be used in the trigger, to not waste time fitting tracks that are unlikely to match the fit hypothesis.
The current update does not change and default behavior. The next step would be investigating making CPR more fit hypothesis agnostic, which a Yale undergrad will look into, at which point we would be able to use this update to merge the De (Dmu) and Ue (Umu) reconstruction paths into a single Electron (Muon) reconstruction path.
An example fcl file to process events with this configuration is here:
/exp/mu2e/app/users/mmackenz/Yale/trigger/helix_slope/test.fcl