
Error messages observed with real HSI (TLU) at ICEBERG in one run of several #56

Open
bieryAtFnal opened this issue Jun 6, 2022 · 4 comments

Comments

@bieryAtFnal
Contributor

log_hsi_3333.txt:31-05-22 11:43:22.088798 [139829662848768] ERROR - Expected 28-byte UDP payload from target "ipbusudp-2.0://192.168.121.10:50001", but only received 12 bytes. Validating returned data to work out where error occurred.
log_hsi_3333.txt:31-05-22 11:43:22.089121 [139829662848768] ERROR - Bad response code (0x4 = 'bus error on read') received for Non-incrementing read at base address 0x1030 (offset 0x1, node "TOP.endpoint0.hsi.buf.data"). URI: "ipbusudp-2.0://192.168.121.10:50001". Sent/received headers: 0x22d7052f / 0x22d70124 (transaction 1/1 bytes into IPbus payload)
log_hsi_3333.txt:31-05-22 11:43:22.089482 [139829325469440] WARNING - Closing Socket since exception detected.
log_hsi_3333.txt:2022-May-31 11:43:22,089 ERROR [virtual void dunedaq::timinglibs::HSIReadout::do_hsievent_work(std::atomic&) at /tmp/root/spack-stage/spack-stage-timinglibs-N22-05-31-neo75ck4gbtr4wlo3pmjfumvo4z4p6nn/spack-src/plugins/HSIReadout.cpp:239] Failed to read HSI events.
log_hsi_3333.txt: was caused by: 2022-May-31 11:43:22,089 ERROR [ers::Issue::Issue(const ers::Context&, const std::exception&) at /tmp/root/spack-stage/spack-stage-ers-N22-05-31-4hqqotgwe7ldhrszdiqw6zil2segjapy/spack-src/src/Issue.cpp:91] Bad response code (0x4 = 'bus error on read') received for Non-incrementing read at base address 0x1030 (offset 0x1, node "TOP.endpoint0.hsi.buf.data"). URI: "ipbusudp-2.0://192.168.121.10:50001". Sent/received headers: 0x22d7052f / 0x22d70124 (transaction 1/1 bytes into IPbus payload)

@bieryAtFnal
Contributor Author

Here is another way to see the error, via the TRACE log...

  65 06-06 11:51:58.557459 2505645 2505664  28          ModuleLevelTrigger LOG do_pause: ******* Triggers PAUSED! *********
  83 06-06 11:50:58.215686 2505645 2505664  23          ModuleLevelTrigger LOG do_resume: ******* Triggers RESUMED! *********
 118 06-06 11:50:58.134404 2505645 2505664  23          ModuleLevelTrigger NFO do_start: Start of run 246
 202 06-06 11:50:54.505567 2505645 2505664  19          ModuleLevelTrigger LOG do_pause: ******* Triggers PAUSED! *********
 226 06-06 11:49:54.263633 2505645 2505664  28          ModuleLevelTrigger LOG do_resume: ******* Triggers RESUMED! *********
 255 06-06 11:49:54.172393 2505645 2505664  28          ModuleLevelTrigger NFO do_start: Start of run 245
 339 06-06 11:49:50.469787 2505645 2505664  26          ModuleLevelTrigger LOG do_pause: ******* Triggers PAUSED! *********
 357 06-06 11:48:50.229653 2505645 2505664  25          ModuleLevelTrigger LOG do_resume: ******* Triggers RESUMED! *********
 392 06-06 11:48:50.148458 2505645 2505664  24          ModuleLevelTrigger NFO do_start: Start of run 244
 476 06-06 11:48:46.499826 2505645 2505664  18          ModuleLevelTrigger LOG do_pause: ******* Triggers PAUSED! *********
 494 06-06 11:47:46.255770 2505645 2505664  17          ModuleLevelTrigger LOG do_resume: ******* Triggers RESUMED! *********
 529 06-06 11:47:46.174746 2505645 2505664  17          ModuleLevelTrigger NFO do_start: Start of run 243
 613 06-06 11:47:42.236953 2505645 2505664  31          ModuleLevelTrigger LOG do_pause: ******* Triggers PAUSED! *********
 631 06-06 11:46:41.992890 2505645 2505664  18          ModuleLevelTrigger LOG do_resume: ******* Triggers RESUMED! *********
 666 06-06 11:46:41.911961 2505645 2505664  18          ModuleLevelTrigger NFO do_start: Start of run 242
 684 06-06 11:46:41.870253 2505624 2506101  33                  HSIReadout ERR do_hsievent_work: Failed to read HSI events.;	caused by: 2022-Jun-06 11:46:41,870 ERROR [Issue(...) at Issue.cpp:91] Bad response code (0x4 = 'bus error on read') received for Non-incrementin
 751 06-06 11:46:37.950128 2505645 2505664  29          ModuleLevelTrigger LOG do_pause: ******* Triggers PAUSED! *********
 781 06-06 11:45:37.709361 2505645 2505664  28          ModuleLevelTrigger LOG do_resume: ******* Triggers RESUMED! *********
 832 06-06 11:45:37.617265 2505645 2505664  28          ModuleLevelTrigger NFO do_start: Start of run 241

@bieryAtFnal
Contributor Author

Here are the commands that I used in the tests.

Timing Partition:
-----------------
daqconf_timing_gen --host-thi iceberg01-priv --host-tmc iceberg01-priv --master-device-name BOREAS_TLU_ICEBERG --clock-speed-hz 62500000 timing_partition_config
nanotimingrc timing_partition_config boot ${USER}_timing_partition init conf wait 3600 scrap terminate

DAQ Partition:
--------------
curl -o frames.bin -O https://cernbox.cern.ch/index.php/s/0XzhExSIMQJUsp0/download
daqconf_multiru_gen --use-hsi-hw --host-hsi iceberg01-priv --control-hsi-hw \
  --hsi-device-name BOREAS_TLU_ICEBERG --hsi-source 1 --ttcm-s1 1 --hsi-re-mask 1 \
  --host-timing iceberg01-priv --control-timing-partition --host-tprtc iceberg01-priv \
  --timing-partition-master-device-name BOREAS_TLU_ICEBERG -t 1.0 \
  --hsi-trigger-type-passthrough --use-fake-data-producers --clock-speed-hz 62500000 \
  mdapp_fdp_hsipulser
runbase=220; runduration=60; waitAfterStop=2
local_backup log_*
let R1=${runbase}+1; let R2=${runbase}+2; let R3=${runbase}+3
let R4=${runbase}+4; let R5=${runbase}+5; let R6=${runbase}+6
nanorc mdapp_fdp_hsipulser/ boot ${USER}_test init conf \
  start ${R1} wait ${runduration} stop wait ${waitAfterStop} \
  start ${R2} wait ${runduration} stop wait ${waitAfterStop} \
  start ${R3} wait ${runduration} stop wait ${waitAfterStop} \
  start ${R4} wait ${runduration} stop wait ${waitAfterStop} \
  start ${R5} wait ${runduration} stop wait ${waitAfterStop} \
  start ${R6} wait ${runduration} stop wait ${waitAfterStop} \
  scrap terminate
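For what it's worth, the six hand-written start/wait/stop cycles above could equivalently be generated with a loop. This is just a sketch of the same command construction (it builds and prints the nanorc invocation rather than executing it); the nanorc subcommands and run-number scheme are exactly those used above, nothing else is assumed.

```shell
# Sketch: build the same nanorc command as above, with the six
# start/wait/stop cycles generated by a loop instead of R1..R6 by hand.
runbase=220; runduration=60; waitAfterStop=2

cmd="nanorc mdapp_fdp_hsipulser/ boot ${USER}_test init conf"
for i in 1 2 3 4 5 6; do
  run=$((runbase + i))                  # run numbers 221..226
  cmd="$cmd start ${run} wait ${runduration} stop wait ${waitAfterStop}"
done
cmd="$cmd scrap terminate"

echo "$cmd"    # inspect the assembled command before running it
```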

@strilov
Collaborator

strilov commented Jun 13, 2022

Hi @bieryAtFnal, could you please confirm the software and firmware versions used for this test? Also, are the full logs from the runs available somewhere?

@bieryAtFnal
Contributor Author

I just reproduced the issue using the N22-06-13 nightly build of the software. I believe the TLU at ICEBERG has the 6.4.0 firmware, but I'm not sure how to check that.
The logs from the test runs today are in iceberg01:/home/nfs/dunedaq/dunedaq/13JunDevTest/rundir.
