It would be beneficial to log more information about the hosts to delete/disable in Zabbix, such as their current host groups, templates, etc.
Improved logs
Initially, the focus should be on logging more information about hosts to disable. The challenge is the refactoring required to do this consistently. Whether we enter ZabbixHostUpdater.disable_host() depends on whether we hit the failsafe, so we cannot rely on adding or expanding the logging statements there (zabbix-auto-config/zabbix_auto_config/processing.py, lines 1231 to 1239 in a42e0d1).
"Too many hosts to change (failsafe=%d). Remove: %d, Add: %d. Aborting",
failsafe,
len(to_remove),
len(to_add),
)
raise ZACException("Failsafe triggered")
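One solution would be to pass more information to failsafe.check_failsafe_hosts (zabbix-auto-config/zabbix_auto_config/failsafe.py, lines 12 to 31 in a42e0d1) in the form of the zabbix_hosts mapping. However, that is not without its own set of problems. Having that function be responsible for logging information about hosts to delete even when we don't trigger the failsafe doesn't make sense.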
In all likelihood, the best solution is simply to loop through each host to delete in ZabbixHostUpdater.do_update before we check the failsafe and before we attempt to delete the hosts.
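A minimal sketch of that ordering, not the project's actual code. The `Host` record, the `log_hosts_to_remove` helper, the `FailsafeTriggered` exception, and the exact failsafe condition are all assumptions made for illustration; the real `do_update` works with its own models and raises `ZACException`. The only point is that every host slated for removal gets logged before the failsafe check can abort the run.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass, field

logger = logging.getLogger("zac.sketch")


@dataclass
class Host:
    # Hypothetical, simplified stand-in for a Zabbix host record.
    name: str
    groups: list[str] = field(default_factory=list)
    templates: list[str] = field(default_factory=list)


class FailsafeTriggered(Exception):
    """Stand-in for the project's ZACException."""


def log_hosts_to_remove(hosts: list[Host]) -> None:
    # One log line per host, so the details exist even if we abort afterwards.
    for host in hosts:
        logger.info(
            "Host to remove: %s (groups=%s, templates=%s)",
            host.name,
            ", ".join(host.groups) or "-",
            ", ".join(host.templates) or "-",
        )


def do_update(to_remove: list[Host], to_add: list[Host], failsafe: int) -> list[str]:
    # 1. Log unconditionally, before anything can abort.
    log_hosts_to_remove(to_remove)
    # 2. Only then run the failsafe check (condition assumed for this sketch).
    if len(to_remove) > failsafe or len(to_add) > failsafe:
        logger.warning(
            "Too many hosts to change (failsafe=%d). Remove: %d, Add: %d. Aborting",
            failsafe,
            len(to_remove),
            len(to_add),
        )
        raise FailsafeTriggered("Failsafe triggered")
    # 3. Placeholder for the real disable/delete calls.
    return [host.name for host in to_remove]
```

With this layout the per-host details end up in the log whether or not the failsafe triggers, and neither disable_host() nor the failsafe function has to take on the logging.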
Thoughts
I need to gather my thoughts on this for a bit. What do we want to achieve with the logging? Is it only to perform a better "post-mortem" once a host has been deleted, or is it to add better introspection before a host has been deleted (i.e. the logging needs to happen before failsafe checks)?
Adding better logging as we delete the host is easy. Adding better logging that also triggers if we hit the failsafe is harder.
More detailed JSON dumps
Another improvement would be dumping more information about the hosts to remove in the failsafe_hosts.json file. That would let us filter hosts by their Source-* host group to determine which source(s) they come from, and thus more easily troubleshoot a faulty source.
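As a rough sketch of what a richer dump could enable, assume each entry in the file is an object with `name`, `groups`, and `templates` keys. That schema and the function names `dump_hosts` and `count_by_source` are hypothetical; the current format of failsafe_hosts.json is not described in this issue.

```python
import json
from collections import Counter
from fnmatch import fnmatchcase


def dump_hosts(hosts: list, path: str) -> None:
    # Write one JSON object per host, including groups and templates.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(hosts, f, indent=2)


def count_by_source(path: str, pattern: str = "Source-*") -> Counter:
    # Count how many dumped hosts belong to each group matching the pattern.
    with open(path, encoding="utf-8") as f:
        hosts = json.load(f)
    counts = Counter()
    for host in hosts:
        for group in host.get("groups", []):
            if fnmatchcase(group, pattern):
                counts[group] += 1
    return counts
```

If one Source-* group accounts for nearly all hosts in the dump, that source is the first place to look.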
TUI
It could be very useful to add a command that displays a TUI, built with Textual, for browsing the detailed JSON dump. We could then dump a lot more information and use the TUI to visualize it in a human-readable way. At a bare minimum, the TUI should be able to filter hosts by name, templates, and host groups.
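The filtering such a TUI needs can be sketched independently of Textual. The snippet below covers only that logic and assumes each dumped host is a dict with `name`, `groups`, and `templates` keys. The schema and the `filter_hosts` name are hypothetical, and the wiring to Textual widgets is deliberately left out.

```python
def filter_hosts(hosts, name="", template="", group=""):
    """Return hosts matching every non-empty filter (case-insensitive substring)."""

    def contains(values, needle):
        # True if any value in the list contains the needle.
        needle = needle.lower()
        return any(needle in value.lower() for value in values)

    result = []
    for host in hosts:
        if name and name.lower() not in host.get("name", "").lower():
            continue
        if template and not contains(host.get("templates", []), template):
            continue
        if group and not contains(host.get("groups", []), group):
            continue
        result.append(host)
    return result
```

An input handler in the TUI could call this on every keystroke and re-render the host table with the result.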