Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track Processing update #285

Merged
merged 14 commits into from
Dec 13, 2024
Merged

Track Processing update #285

merged 14 commits into from
Dec 13, 2024

Conversation

tschuh
Copy link

@tschuh tschuh commented Aug 1, 2024

This PR updates mainly the modules TrackerTFP, TrackFindingTracklet and TrackTrigger.
It brings the Kalman Filter, Track Multiplexer, Duplicate Removal and Track Quality steps up to date (https://indico.cern.ch/event/1449862/contributions/6116147/attachments/2924526/5133676/thomas.pdf)

Detailed list for Ian:

  1. DataFormats/L1TrackTrigger
  • TTBV update
  • added kf chi2 output product (pair of doubles)
  1. L1Trigger/TrackFindingTracklet
  • HitPatternHelper updated to cope changes, likely broken
  • KFin deleted
  • KalmanFilter added
  • 5 parameter fit option added (https://indico.cern.ch/event/1449862/contributions/6116147/attachments/2924526/5133676/thomas.pdf)
  • ProducerKalmanFilter added
  • State added
  • DRin renamed in TrackMultiplexer
  • L1FPGATrackProducer updated
  • DR renamed in DuplicateRemoval and updated
  • KFout removed
  • TBout removed
  • ChannelAssignment updated
  • FitTrack updated
  • L1TrackNtupleMaker updated with layer encoding specifics
  1. L1Trigger/TrackTrigger
  • SensorModule updated
  • Setup updated
  • L1TrackQuality moved to trackerTFP
  1. L1Trigger/TrackerDTC
  • DTC updated
  • LayerEncoding updated
  • Stub updated
  1. L1Trigger/TrackerTFP
  • L1TrackQuality renamed in TrackQuality and turned into ESProduct
  • TrackFindingProcessor added.
  • CleanTrackBuilder added
  • DataFormats updated
  • Demonstrator updated
  • Duplicate Removal added
  • GeometricProcessor updated
  • HoughTransform updated
  • KalmanFilter updated
  • LayerEncoding updated
  • MiniHoughTransform deleted
  • State updated
  • ZHoughTransform deleted
  • KFin deleted

@tschuh tschuh requested a review from tomalin August 1, 2024 00:01
@tschuh tschuh self-assigned this Aug 1, 2024
@tomalin
Copy link
Collaborator

tomalin commented Aug 5, 2024

As you're changing 160 files, the text describing this PR needs to be longer and more detailed. What are the changes that have been implemented? And how do they affect tracking performance? Have you given a talk which describes in detail the changes and the effect on performance. If so, please link it from here.

@tomalin
Copy link
Collaborator

tomalin commented Aug 5, 2024

This fails CI quite spectacularly https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7858190 . e.g. For HYBRID_NEWKF, the tracking efficiency is up by 0.3% (good), but the number of reconstructed tracks is down by 80% (very weird, given that no efficiency is lost!), and the z0 resolution is a factor 5 worse (bad).

For comparison, this is the git CI for a dummy PR that changed no code https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7924082 .

@tomalin
Copy link
Collaborator

tomalin commented Aug 13, 2024

This fails CI quite spectacularly https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7858190 . e.g. For HYBRID_NEWKF, the tracking efficiency is up by 0.3% (good), but the number of reconstructed tracks is down by 80% (very weird, given that no efficiency is lost!), and the z0 resolution is a factor 5 worse (bad).

For comparison, this is the git CI for a dummy PR that changed no code https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7924082 .

==== Thomas says CI failure explained by poor z0 resolution. He's retuning the digitisation ranges to fix this. But he believes that the tracklet helix parameters are wrong (inconsistent with seed positions by ~1cm) for a small fraction of the tracklets, which is also contributing to poor z0 resolution.

@tschuh
Copy link
Author

tschuh commented Aug 14, 2024

new KF maths are fixed now, also passing CI using new KF. However, CI using old KF fails due to slightly lower z0 resolution. My PR should not impact those...

@tomalin
Copy link
Collaborator

tomalin commented Sep 2, 2024

I understand that the ProducerTM usually orders tracks according to their seed type. This is important for the functioning of the DR, so should be described in the comment at the top of the class header file, which explains what the class does. And at the line of the code that causes this ordering to happen, an additional comment should appear to highlight it. A comment mentioning this inside ProducerDR would also be wise.

@tschuh
Copy link
Author

tschuh commented Sep 2, 2024

I understand that the ProducerTM usually orders tracks according to their seed type. This is important for the functioning of the DR, so should be described in the comment at the top of the class header file, which explains what the class does. And at the line of the code that causes this ordering to happen, an additional comment should appear to highlight it. A comment mentioning this inside ProducerDR would also be wise.

done

@tomalin
Copy link
Collaborator

tomalin commented Sep 3, 2024

@Chriisbrown & @cgsavard -- Thomas's overview description of this PR says "L1TrackQuality renamed in TrackQuality and turned into ESProduct". Are you both and Andreas happy with this change?

@tomalin
Copy link
Collaborator

tomalin commented Sep 3, 2024

At the top of the header of the KF class where you do the maths (KalmanFilter.h?), add the comments copied from L7-31 of https://github.com/cms-L1TK/cmssw/blob/tschuh/L1Trigger/TrackFindingTMTT/interface/KFParamsComb.h#L7 , so the maths is documented. Although as you treat r-phi and r-z planes independently, you may need to modify a few lines of this.

@tomalin
Copy link
Collaborator

tomalin commented Sep 3, 2024

Overview description of this PR should mention that new class TrackFindingTracklet/ProducerKF has been added.

@@ -9,42 +9,39 @@

namespace trklet {

/*! \class trklet::DR
/*! \class trklet::DuplicateRemoval
* \brief Class to bit- and clock-accurate emulate duplicate removal
* DR identifies duplicates based on pairs of tracks that share stubs in at least 3 layers.
* It keeps the first such track in each pair.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modify this comment to make clear that the track order is determined by TrackMultiplexer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modify comment to mention ProducerTM, since it's not obvious that the Track Multiplexer can be found there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link

@cgsavard cgsavard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the comments I could in the parts that I am familiar with. Unfortunately there are a lot of files I am unfamiliar with and so I couldn't be of much help there. I am not opposed to changing tracker quality to its own EDProducer if you feel that it is a better setup. Although right now the HYBRID version still uses it as a function and not EDProducer. Can we make that consistent?

There are three main concerns I brought up:

  1. The change in naming scheme from chi2rz and chi2rphi to chi20 and chi21 only adds confusion to any others who look in. I think it's better to change these to the naming scheme that is generally accepted.
  2. The MVA1 variable is not being calculated correctly as many of the BDT inputs are wrong. The correct variables that should be input can be seen further down in the TrackQuality.cc file and I made notes on changes need to each variable.
  3. The tttracks being created in TrackFindingProcessor.cc have incorrect chi2 and mva variables. The chi2rz and chi2rphi should not be pdof and the mva should be in the 0-1 range and not binned upon tttrack initialization.

L1Trigger/TrackFindingTracklet/src/HitPatternHelper.cc Outdated Show resolved Hide resolved
L1Trigger/TrackFindingTracklet/src/TrackMultiplexer.cc Outdated Show resolved Hide resolved
L1Trigger/TrackerTFP/python/Demonstrator_cfi.py Outdated Show resolved Hide resolved
Comment on lines 9 to 11
InputLabelKF = cms.string( "ProducerCTB" ), #
InputLabelDR = cms.string( "ProducerKF" ), #
InputLabelTQ = cms.string( "ProducerKF" ), #
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the order was DR -> KF -> TQ -> TFP, is there a mistake here? Is the DR output being used anywhere?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done and that is the tmtt track reco chain which runs KF -> DR -> TQ

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused about this other chain we have. Why is this in the TFP folder if it's only for TMTT? And is this actually ever used then?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not good in naming stuff. I put my dtc emulator in TrackerDTC and my tfp emulator in TrackerTFP.

L1Trigger/TrackerTFP/src/TrackQuality.cc Outdated Show resolved Hide resolved
L1Trigger/TrackerTFP/src/TrackQuality.cc Outdated Show resolved Hide resolved
L1Trigger/TrackerTFP/src/TrackFindingProcessor.cc Outdated Show resolved Hide resolved
Comment on lines +44 to +52
// TQ MVA bin conversion LUT
constexpr array<double, numBinsMVA_> TrackQuality::mvaPreSigBins() const {
array<double, numBinsMVA_> lut = {};
lut[0] = -16.;
for (int i = 1; i < numBinsMVA_; i++)
lut[i] = invSigmoid(TTTrack_TrackWord::tqMVABins[i]);
return lut;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a discussion in PR #272 about how to set things up such that we avoid ever having to do the inverse sigmoid. I believe this was mainly to better compare with the firmware as we will not do this conversion there. Since it was recently decided there, I think we should stay true to that.

conifer::BDT<ap_fixed<10, 5>, ap_fixed<10, 5>> bdt(tq->model().fullPath());
// collect features and classify using bdt
const vector<ap_fixed<10, 5>>& output = bdt.decision_function({cot, zT, chi2B, nstub, ninterior, chi20, chi21});
const float mva = output[0].to_float();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the above changes to the input values are fixed, it should be checked that the MVA distribution still has a peak of fake tracks at 0 and real tracks at 1 to make sure it's working again. Right not the variable is nonsense.

Copy link
Author

@tschuh tschuh Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as long as it is not retrained to the current z0, cot and chi2s digitization we should not expect great performance. When we retrain we should change the input variables to zT(layer encoding granularity), chi2B, hitPattern, chi2rphi, chi2rz. And maybe create a version without chi2B to have something we can test in f/w.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I do think the performance won't be as good as before, but we should still expect a fake track peak about 0 and real track peak around 1 so it would be nice to double check this is the case and nothing else is way off. I agree on the retraining and will pass this info along.

tt::StreamsTrack& acceptedTracks,
tt::StreamsStub& lostStubs,
tt::StreamsTrack& lostTracks);
void produce(tt::StreamsTrack& streamsTrack, tt::StreamsStub& streamsStub);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that consumes, produces & produce are key functions in CMSSW, common to all EDProducers, with definate meaning, I'm not keen on seeing the same function names being used in code that is not an EDProducer. e.g. If one searches the code for "produce" to find where EDProducts are being created, one will get false matches in TrackMultiplexer etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any progress on this comment?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like using those names, I would recommend to let cmssw reviewer decide.

@@ -21,13 +21,13 @@ namespace hph {
oldKFPSet_(iConfig.getParameter<edm::ParameterSet>("oldKFPSet")),
setupTT_(setupTT),
dataFormats_(dataFormats),
dfcot_(dataFormats_.format(trackerTFP::Variable::cot, trackerTFP::Process::kfin)),
dfzT_(dataFormats_.format(trackerTFP::Variable::zT, trackerTFP::Process::kfin)),
dfcot_(dataFormats_.format(trackerTFP::Variable::cot, trackerTFP::Process::gp)),
Copy link
Collaborator

@tomalin tomalin Sep 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the HitPatternHelper, which is only being used in the Hybrid (or Tracklet) algo, being made dependent on the GP, which is only used in the TMTT algo. What is going on here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KF depends on GP, Hybrid uses KF.

@tomalin
Copy link
Collaborator

tomalin commented Sep 8, 2024

All your code uses a few classes a lot, such as TTBV, Frame, FrameStub & FrameTrack, StreamStub & StreamTrack, DataFormats & DataFormat, Setup. But they're only documented in a few sentences in /L1Trigger/TrackerTFP/README.md . I suggest adding a dedicated section to this README, explaining these things and giving examples of common manipulations, such as extracting a float from the digi data.

@@ -151,6 +149,8 @@

# HYBRID: prompt tracking
if (L1TRKALGO == 'HYBRID'):
process.TrackTriggerSetup.GeometricProcessor.ChosenRofZ = 50.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the Hybrid depending on the TMTT GP? And why is the eta range of the Hybrid being set to 2.4? I seem tp recall the L1TrkNtuple_eff_eta.pdf plot showing non-zero efficiency up to eta 2.5 in the past, although that currently doesn't seem to be the case. Is the eta cut in TrackTriggerSetup used only in the DTC & KF? It's also dodgy changing the default value of parameters for the baseline algo in a cfg.py, as anyone else running this algo will have to copy these parameter mods.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

performing those changes (restoring the past configuration) is dodgy to begin with, they are just made to pass CI.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make the CI more reluctant so that we don't need these lines of code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skinnari do you remember if the Hybrid eta range for track reco was 2.5 or 2.4? It seems to be 2.4 now, but I recall it being 2.5.

@tomalin
Copy link
Collaborator

tomalin commented Sep 8, 2024

Please improve the comments at the top of the various .h and _cfg.py files with "Demonstrator" in the title, so that anyone wishing to use this code to check SW vs FW would understand what the various modules and doing and how they need to set up Vivado etc. to work with them.
e.g. TrackerTFP/test/AnalyzerDemonstrator.cc & TrackerTFP/plugins/ProducerDemonstrator.cc both have the same very short comment, despite the fact that one is an ESProducer and the other is an EDAnalyzer. Please make clear what the difference between them is. e.g. What is the ESProducer for? What is the EDAnalyzer doing?

################################################################################################
# Run bit-accurate TMTT L1 tracking emulation.
#
# To run execute do
Copy link
Collaborator

@tomalin tomalin Sep 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this job called CleanRelVal if it runs the TMTT tracking? Also, the comment inside it says it is called test_cfg.py.
In any case. we already have TrackerTFP/test/test_cfg.py to run the TMTT tracking, so why add this second job?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be obsolete soon

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will it be obsolete?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already is.

@tomalin
Copy link
Collaborator

tomalin commented Sep 8, 2024

Expand existing comment at start of LayerEncoding.h, to explain what the encoding actually is. i.e. How are the layers encoded?


namespace trackerTFP {

// Class to format final tfp output
Copy link
Collaborator

@tomalin tomalin Sep 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name TrackFindingProcessor suggests that this class runs the entire TFP chain. But the comment here says that it only does some formatting of the TFP output, (which if true means it should be renamed to something suggesting that). I think it actually makes the TTTracks and the stream objects.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any progress on this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no

@cgsavard
Copy link

cgsavard commented Oct 1, 2024

I am wondering if one can solves this simply by increasing the number of comparison modules

The increase in duplicate high eta tracks does not go away when doubling the number of DR comparison modules to 64, which I just tried.

@tomalin
Copy link
Collaborator

tomalin commented Oct 2, 2024

Your code uses MessageLogger via edm::LogPrint("L1Trigger/TrackerTFP"). I believe the category has to be a simple string, so a forward slash is not allowed.
e.g. https://gitlab.cern.ch/ejclemen/cmssw_trackfinding_hlsframework/-/blob/master/TrackFindingTrackletHLS/test/L1tracking_cfg.py?ref_type=heads#L34 gives an example of how the level of printout from the "IRProducer" category is controlled. You can see that the category corresponds to the python variable name, and python variable names cant include forward slash. (Also note that whenever the MessageLogger prints the category name, it automatically also prints the name of the EDProducer that called it). Also LogPrint is intended for "Warnings" https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMessageLogger , so should only be used when something has gone wrong.

tomalin and others added 3 commits November 22, 2024 15:43
* Make combined modules default

* tweak

* Improve USEHYBRID ifdef range

* Fix compiler error for pure Tracklet algo

* Move fitpattern.txt refs, so only used for pure Tracklet algo

* code format
* add TQ MVA output binning scheme

* remove code no longer used

* move logistic sigmoid into TTTrack class

* fix for proper precision after MVA conversion in track word

* label all MVA variables before logistic sigmoid with 'Pre'

* formatting

* fix filling of track word before running TQ MVA

* switch to post-sigmoid mva bins

* remove any onnx instances

* fix MVA1 initialization

* bin TQ MVA in pre-sigmoided bins in KFout

* move pre-sigmoid bins to track quality class

* change minimum bin
@tschuh tschuh changed the base branch from L1TK-dev-14_0_0_pre2 to L1TK-dev-14_2_0_pre4 November 27, 2024 16:26
@tomalin
Copy link
Collaborator

tomalin commented Dec 4, 2024

@tschuh as this looks a significant improvement on the previous HYBRID_NEWKF results, I'm inclined to merge this as soon as it passes git CI. However, can you say here if you've managed to solve any of the issues with the HYBRID_NEWKF performance described by @cgsavard in her talk https://indico.cern.ch/event/1467716/contributions/6179940/attachments/2948559/5182327/L1TK_10_16_24_newKF.pdf ?

@tomalin
Copy link
Collaborator

tomalin commented Dec 5, 2024

I think @cgsavard 's comments here #285 (review) have not yet been addressed?

@tschuh
Copy link
Author

tschuh commented Dec 5, 2024

I think @cgsavard 's comments here #285 (review) have not yet been addressed?

those got addressed.

@tschuh
Copy link
Author

tschuh commented Dec 5, 2024

@tschuh as this looks a significant improvement on the previous HYBRID_NEWKF results, I'm inclined to merge this as soon as it passes git CI. However, can you say here if you've managed to solve any of the issues with the HYBRID_NEWKF performance described by @cgsavard in her talk https://indico.cern.ch/event/1467716/contributions/6179940/attachments/2948559/5182327/L1TK_10_16_24_newKF.pdf ?

The issues described by her talk seems to correspond to this issue. I am not a Tracklet expert and not able to solve it.

@tomalin
Copy link
Collaborator

tomalin commented Dec 6, 2024

@tschuh as this looks a significant improvement on the previous HYBRID_NEWKF results, I'm inclined to merge this as soon as it passes git CI. However, can you say here if you've managed to solve any of the issues with the HYBRID_NEWKF performance described by @cgsavard in her talk https://indico.cern.ch/event/1467716/contributions/6179940/attachments/2948559/5182327/L1TK_10_16_24_newKF.pdf ?

The issues described by her talk seems to correspond to this issue. I am not a Tracklet expert and not able to solve it.

I see that this PR changes the python cfg parameter "UseTTStubResiduals" mentioned in issue from False to True. So the KF is no longer by default using the digital output of the Tracklet stage, but instead recalculating the residuals itself. I assume your hypothesis is that the recalculated residuals are still wrong because they are using the helix parameters of the Tracklet seeds, and these you believe are wrong. If we looked at the TTTrack collection produced by the Tracklet part of the chain (i.e. by L1FPGATrackProducer.cc), should we see this? e.g. Should we see the bias in z0 reported in Claire's talk?

@tschuh
Copy link
Author

tschuh commented Dec 6, 2024

The old KF in the Hybrid simulation is not fed with stub parameters calculated by the MatchProcessor. It is fed by stub collections determined by the TrackBuilder, but the stub parameters are taken from DTC stubs. Therefore you can't see the corruption in the hybrid s/w chain.

@cgsavard
Copy link

I think that the issues surrounding the TQ and the resolution shifts are ok to be left as an issue and not addressed here as they are separate from the motivation of this PR.

@tschuh Have you looked into the increase in duplicate tracks at high eta that is a result of the changes made in this PR? This is shown in slide 6 here. I do think this should at be looked into and addressed a little before this PR is merged as this is a direct result of the PR, whereas the other issues were issues that existed beforehand.

@tschuh
Copy link
Author

tschuh commented Dec 11, 2024

I will make the mux order of seed types programmable in the TrackMultiplexer, that may give us a handle to improve the DR.

@tschuh
Copy link
Author

tschuh commented Dec 11, 2024

I am not sure if that changed a lot, summary printout sees now difference.
In order to judge if the higher duplicate rate at high eta is a good or bad thing we need to understand how efficiency and fake rate has changed at high eta. It could be that the KF improved at high eta picking stub combinations which are more likely to match TPs. That will decrease fake rate but also increase efficiency and potentially duplicate rate.

Copy link
Collaborator

@tomalin tomalin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this represents a significant improvement in tracking performance over the old KF, I'm merging it. The tracking performance still has clear evidence of bugs, as shown in @cgsavard talk https://indico.cern.ch/event/1467716/contributions/6179940/attachments/2948559/5182327/L1TK_10_16_24_newKF.pdf . Thomas suspects these are caused by this issue #150 , which needs debugging. Claire also suspects the TQ is buggy, but that is unrelated to this PR. She says she's OK with merging this PR too.

@tomalin tomalin merged commit a3400d3 into L1TK-dev-14_2_0_pre4 Dec 13, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants