fix: store systematic in array option #113

Merged 5 commits into cms-btv-pog:master on Jan 30, 2025

Ming-Yan
Collaborator

  • store the systematic variations in the array option
    => up/down variations of each systematic
['nominal', 'UEPS_ISRUp', 'PDFaS_weightDown', 'UEPS_ISRDown', 'PDFaS_weightUp', 'scalevar_muRDown', 'UEPS_FSRDown', 'aS_weightDown', 'ele_RecoDown', 'scalevar_muFDown', 
'aS_weightUp', 'scalevar_muR_muFDown', 'mu_IDDown', 'ele_RecoUp', 'UEPS_FSRUp', 'scalevar_muR_muFUp', 'puweightUp', 'ele_IDUp', 'PDF_weightUp', 'mu_IsoUp', 'scalevar_muRUp',
'puweightDown', 'ele_IDDown', 'PDF_weightDown', 'mu_IDUp', 'mu_IsoDown', 'scalevar_muFUp']
Branches to write: ['PDF_weight_weight' 'PDFaS_weight_weight' 'SelElectron' 'SelJet'
 'SelMuon' 'UEPS_FSR_weight' 'UEPS_ISR_weight' 'aS_weight_weight'
 'dr_mujet0' 'dr_mujet1' 'ele_ID_weight' 'ele_Reco_weight'
 'genweight_weight' 'mu_ID_weight' 'mu_Iso_weight' 'njet'
 'puweight_weight' 'scalevar_muF_weight' 'scalevar_muR_muF_weight'
 'scalevar_muR_weight' 'weight']
  • change the default CI checks
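
A minimal sketch of the naming pattern visible in the log above, assuming each systematic contributes one `Up` and one `Down` variation next to `nominal`, and that its weight branch is the systematic name plus `_weight`. The helper names `variation_names` and `weight_branch` are hypothetical and are not taken from the repository:

```python
# Systematic names read off the variation list printed in the log above.
SYSTEMATICS = [
    "UEPS_ISR", "UEPS_FSR", "PDF_weight", "aS_weight", "PDFaS_weight",
    "scalevar_muR", "scalevar_muF", "scalevar_muR_muF",
    "puweight", "ele_Reco", "ele_ID", "mu_ID", "mu_Iso",
]


def variation_names(systematics):
    """Return 'nominal' followed by the Up/Down name of every systematic."""
    names = ["nominal"]
    for syst in systematics:
        names.append(f"{syst}Up")
        names.append(f"{syst}Down")
    return names


def weight_branch(systematic):
    """Return the weight branch name, e.g. 'PDF_weight' -> 'PDF_weight_weight'."""
    return f"{systematic}_weight"


if __name__ == "__main__":
    print(len(variation_names(SYSTEMATICS)))
    print(weight_branch("PDF_weight"))
```

The log lists the variations in arbitrary order, so only the set of names is expected to match, not the ordering.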

@mondalspandan
Contributor

Hi, it looks like the output array_... directory contains two subdirectories: nominal and n. I did not check where this n is coming from, but is this expected?

Comment on lines 198 to 202
self,
events[event_level],
events,
None,
"nominal",
Collaborator Author

Hi, it looks like the output array_... directory contains two subdirectories: nominal and n. I did not check where this n is coming from, but is this expected?

Hi @mondalspandan, thanks for checking. No, it wasn't expected. I think it happens because the nominal entry passed for empty events was not a list, so it needs a quick fix :p
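
One plausible way a stray `n` directory can appear, sketched under the assumption that the writer indexes or loops over its systematics argument: a bare string behaves like a sequence of characters, so its first element is `"n"`. The functions `first_variation` and `as_list` are hypothetical illustrations, not the repository's code:

```python
def first_variation(systematics):
    """Return the first entry, as code expecting a list of names might do."""
    return systematics[0]


def as_list(systematics):
    """Wrap a bare string in a list so it is treated as a single name."""
    if isinstance(systematics, str):
        return [systematics]
    return list(systematics)


if __name__ == "__main__":
    print(first_variation("nominal"))           # n
    print(first_variation(as_list("nominal")))  # nominal
```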

@mondalspandan
Contributor

I see an issue with the JME shifts when using isSyst all:

  File "/isilon/data/users/smondal5/btvnano/250128_BTVNanoCommissioning/src/BTVNanoCommissioning/utils/correction.py", line 618, in JME_shifts
    jesuncmap = correct_map["JME"][f"{jecname}_Total_AK4PFPuppi"]
  File "/isilon/data/users/smondal5/miniconda3/envs/btv_coffea_3/lib/python3.10/site-packages/correctionlib/highlevel.py", line 394, in __getitem__
    corr = self._base.__getitem__(key)
IndexError: map::at

This happens with 2022 data, specifically when jecname is Summer22_22Sep2023_RunCD_V2_DATA.
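
As the traceback shows, a missing key surfaces as `IndexError: map::at` rather than `KeyError`. A minimal sketch of a guarded lookup follows, assuming the correction set only needs to support `[]` access. `FakeCorrectionSet` is a stand-in written for this example so it runs without correctionlib, `lookup_correction` is a hypothetical helper, and the MC key is an assumed name used purely for illustration:

```python
class FakeCorrectionSet:
    """Stand-in that mimics the failure mode seen in the traceback."""

    def __init__(self, corrections):
        self._corrections = dict(corrections)

    def __getitem__(self, key):
        if key not in self._corrections:
            # Same exception type and message as in the traceback above.
            raise IndexError("map::at")
        return self._corrections[key]


def lookup_correction(correction_set, key):
    """Return the correction for key, or None when the set has no such entry."""
    try:
        return correction_set[key]
    except (KeyError, IndexError):
        return None


if __name__ == "__main__":
    # Assumed key name, for illustration only.
    jme = FakeCorrectionSet({"Summer22_22Sep2023_V2_MC_Total_AK4PFPuppi": "unc"})
    jecname = "Summer22_22Sep2023_RunCD_V2_DATA"
    print(lookup_correction(jme, f"{jecname}_Total_AK4PFPuppi"))  # None
```

Returning `None` lets the caller decide whether to skip the shift or fail with a clearer message; whether skipping is physically correct is a separate question.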

@mondalspandan
Contributor

Something also seems wrong specifically with the DY workflow's histogram writing:

  File "/isilon/data/users/smondal5/btvnano/250128_BTVNanoCommissioning/src/BTVNanoCommissioning/workflows/ctag_DY_valid_sf.py", line 267, in process_shift
    output = histo_writter(
  File "/isilon/data/users/smondal5/btvnano/250128_BTVNanoCommissioning/src/BTVNanoCommissioning/utils/histogrammer.py", line 968, in histo_writter
    output["dr_posljet"].fill(
  File "/isilon/data/users/smondal5/miniconda3/envs/btv_coffea_3/lib/python3.10/site-packages/hist/basehist.py", line 263, in fill
    return super().fill(*args, *data, weight=weight, sample=sample, threads=threads)
  File "/isilon/data/users/smondal5/miniconda3/envs/btv_coffea_3/lib/python3.10/site-packages/boost_histogram/_internal/hist.py", line 511, in fill
    self._hist.fill(*args_ars, weight=weight_ars, sample=sample_ars)  # type: ignore[arg-type]
ValueError: All arrays must be 1D
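
The error means the fill received a nested (per-event, per-object) array where a flat one is required. A pure-Python sketch of the shape requirement follows, assuming one weight per event that must be repeated for every object in that event. `flatten_for_fill` is a hypothetical helper and not necessarily the fix applied in this PR; with Awkward Array one would typically reach for `ak.broadcast_arrays` and `ak.flatten` instead:

```python
def flatten_for_fill(per_event_values, event_weights):
    """Flatten per-event lists to 1D and repeat each event weight per entry."""
    if len(per_event_values) != len(event_weights):
        raise ValueError("need exactly one weight per event")
    flat_values, flat_weights = [], []
    for values, weight in zip(per_event_values, event_weights):
        for value in values:
            flat_values.append(value)
            flat_weights.append(weight)
    return flat_values, flat_weights


if __name__ == "__main__":
    # Three events with 2, 0 and 1 jets; the values are made-up examples.
    dr_values = [[0.4, 1.2], [], [2.5]]
    weights = [1.0, 0.5, 2.0]
    print(flatten_for_fill(dr_values, weights))
    # ([0.4, 1.2, 2.5], [1.0, 1.0, 2.0])
```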

@Ming-Yan
Collaborator Author

I see an issue with the JME shifts when using isSyst all

For JERC, you don't need to run the variations on data, only on MC. @nurfikri89, please correct me if I am wrong: https://cms-nanoaod-integration.web.cern.ch/commonJSONSFs/summaries/JME_2022_Summer22_jet_jerc.html

And the reason you see them in Summer23 is that we are using the txt-file version there, so they show up...

Something also seems wrong specifically with the DY workflow's histogram writing

Can you please point me to a file with which I can reproduce the error? I cannot reproduce it on my side.

@mondalspandan
Contributor

for JERC you don't need to run them in data

I think JER is MC-only while JES is for both data and MC. So when isSyst is set to all, it should ideally skip JER when it's data. Can we add an automatic flag somewhere?

can you please point me to the file that I can reproduce the error...?

I get this with python runner.py --workflow ctag_DY_sf --json jsonfile --campaign Summer22 --year 2022 --outputdir 2022_Summer22 --executor dask/condor/brux -j 1 -s 1 --isArray --skipbadfiles --chunk 25000 --retries 5

json file is here:
https://gist.github.com/mondalspandan/3c83d5b5abf690c1d2d6cb2243778289

@Ming-Yan
Collaborator Author

Done!

@mondalspandan
Contributor

I can confirm that --isArray mode works as expected. And the DY arrays now contain all SelJets and not just the leading one.

Only one last concern: for data there should still be JES_up and JES_down variations inside the array dir (maybe @nurfikri89 can confirm). I see only nominal now.

@nurfikri89
Collaborator

Only one last concern: for data there should still be JES_up and JES_down variations inside the array dir (maybe @nurfikri89 can confirm). I see only nominal now.

@mondalspandan The JES variations should only be for MC.
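
A minimal sketch of the resulting rule, assuming a boolean data flag and a systematics switch: data gets only `nominal`, while MC additionally gets the JES and JER variations. The function `jerc_variations`, its parameters, and the `JER_up`/`JER_down` names are hypothetical (only `JES_up` and `JES_down` are named in this thread):

```python
def jerc_variations(is_data, run_systematics):
    """Return the jet-energy variation names to write for one dataset."""
    if is_data or not run_systematics:
        # Data carries no JES/JER variations, matching the comment above.
        return ["nominal"]
    return ["nominal", "JES_up", "JES_down", "JER_up", "JER_down"]


if __name__ == "__main__":
    print(jerc_variations(is_data=True, run_systematics=True))   # ['nominal']
    print(jerc_variations(is_data=False, run_systematics=True))
```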

@mondalspandan
Contributor

Okay thanks, then we're all set!

@Ming-Yan
Collaborator Author

Merging this branch

@Ming-Yan Ming-Yan merged commit c1e1eda into cms-btv-pog:master Jan 30, 2025
10 checks passed

3 participants