slurm file processing not working if an entry contains a wrong network address #127

Closed
DonOtuseGH opened this issue Oct 1, 2024 · 3 comments

Comments

@DonOtuseGH

Hello,

we are running some RTRTR instances on Kubernetes clusters using a custom image:

  • Alpine 3.20.3 base image and build env
  • RTRTR 0.3.0, built with:
    • cargo 1.78.0
    • rustc 1.78.0

RTRTR usually runs against our own Routinator instance, but to demonstrate the issue, the following config can be used as well:

rtrtr.conf:

log_level = "debug"
log_target = "stderr"
http-listen = ["0.0.0.0:8323"]
[units.json]
type = "json"
uri = "https://console.rpki-client.org/vrps.json"
refresh = 60
[units.slurm]
type = "slurm"
source = "json"
files = [ "/home/rtrtr/slurm.json" ]
[targets.rtr]
type = "rtr"
listen = [ "0.0.0.0:3323" ]
unit = "slurm"
client-metrics = true
[targets.http]
type = "http"
path = "/json"
format = "json"
unit = "slurm"

We noticed that if the SLURM file contains an invalid network address in the prefix value of a prefixAssertions entry, RTRTR does not start correctly: it does not process the SLURM file at all, and it does not log an error message.

slurm.json with an invalid entry (10.10.10.164/27 is not a valid network address because host bits are set; it should be 10.10.10.160/27):

{
  "slurmVersion": 1,
  "validationOutputFilters": {
    "prefixFilters": [],
    "bgpsecFilters": []
  },
  "locallyAddedAssertions": {
    "prefixAssertions": [
      {
        "asn": 64546,
        "prefix": "192.168.255.0/24",
        "maxPrefixLength": 24,
        "comment": "RTR Health Check"
      },
      {
        "asn": 65535,
        "prefix": "10.10.10.164/27",
        "maxPrefixLength": 32
      }
    ],
    "bgpsecAssertions": []
  }
}
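
To make the difference between the two prefixes concrete, here is a small sketch of the arithmetic, using only Python's standard `ipaddress` module. It is an illustration of why 10.10.10.164/27 is not a network address, not a description of how RTRTR parses prefixes; the helper names `has_host_bits` and `network_address` are made up for this example.

```python
import ipaddress


def has_host_bits(prefix: str) -> bool:
    """Return True if the address part of `prefix` is not the network address."""
    iface = ipaddress.ip_interface(prefix)
    return iface.ip != iface.network.network_address


def network_address(prefix: str) -> str:
    """Return `prefix` with its host bits cleared."""
    # strict=False masks the host bits off instead of raising ValueError.
    return str(ipaddress.ip_network(prefix, strict=False))


# A /27 leaves 5 host bits, so the last octet is masked with 0b11100000 (224).
# 164 is 0b10100100; clearing the low 5 bits gives 0b10100000, which is 160.
last_octet_masked = 164 & 0b11100000

print(last_octet_masked)                    # 160
print(has_host_bits("10.10.10.164/27"))     # True
print(has_host_bits("10.10.10.160/27"))     # False
print(network_address("10.10.10.164/27"))   # 10.10.10.160/27
```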

The rtrtr log shows nothing about the problem: no SLURM file processing and no update information for the targets.

[DEBUG] HTTP server listening on 0.0.0.0:8323
[DEBUG] Target http: link status: healthy
[DEBUG] starting new connection: https://console.rpki-client.org/
[DEBUG] RTR: Got reset query.
[DEBUG] Unit json: successfully updated.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] Unit json: successfully updated.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] Unit json: update without changes.
...

The local RTR target does not work, which is what the missing log entries above suggest:

$ rtrclient -e -t csv -o /dev/stdout tcp 127.0.0.1 3323 2>/dev/null | wc -l
===> times out

Everything works fine once we correct the network address of the prefix to a valid one:

slurm.json with valid entries:

{
  "slurmVersion": 1,
  "validationOutputFilters": {
    "prefixFilters": [],
    "bgpsecFilters": []
  },
  "locallyAddedAssertions": {
    "prefixAssertions": [
      {
        "asn": 64546,
        "prefix": "192.168.255.0/24",
        "maxPrefixLength": 24,
        "comment": "RTR Health Check"
      },
      {
        "asn": 65535,
        "prefix": "10.10.10.160/27",
        "maxPrefixLength": 32
      }
    ],
    "bgpsecAssertions": []
  }
}

rtrtr log looks as expected:

[DEBUG] HTTP server listening on 0.0.0.0:8323
[DEBUG] Target http: link status: healthy
[DEBUG] starting new connection: https://console.rpki-client.org/
[DEBUG] Updated Slurm file /home/rtrtr/slurm.json
[DEBUG] Unit json: successfully updated.
[DEBUG] Unit slurm: file /home/rtrtr/slurm.json: added 2, removed 0.
[DEBUG] Target rtr: Got update (615244 entries)
[DEBUG] Target http: Got update (615244 entries)
[DEBUG] Target http: link status: healthy
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] Unit json: successfully updated.
[DEBUG] Unit slurm: file /home/rtrtr/slurm.json: added 2, removed 0.
[DEBUG] Target rtr: Got update (615246 entries)
[DEBUG] Target http: Got update (615246 entries)
[DEBUG] Target http: link status: healthy
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
[DEBUG] RTR: Got reset query.
...

local target gives the correct count of VRPs:

$ rtrclient -e -t csv -o /dev/stdout tcp 127.0.0.1 3323 2>/dev/null | wc -l
615248
@DonOtuseGH
Author

What we would expect

It would be great to have an error message in the log saying that something went wrong while processing the SLURM file. It would also help if the log showed the invalid entries. This would simplify troubleshooting considerably, especially if the SLURM file contains several hundred locallyAddedAssertions ;-)
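
Until such an error message exists, a file can be checked before it is handed to RTRTR. The following is a minimal sketch of such a pre-flight check, assuming that a prefix with host bits set is the only problem being looked for; it is a standalone script, not part of RTRTR, and it does not reproduce RTRTR's own validation. The function name `find_invalid_prefixes` is hypothetical, and the sample data is the invalid file from this issue.

```python
import ipaddress
import json


def find_invalid_prefixes(slurm: dict) -> list:
    """Return (section, index, prefix, reason) for every prefix that is
    not a valid network address."""
    problems = []
    sections = (
        ("validationOutputFilters", "prefixFilters"),
        ("locallyAddedAssertions", "prefixAssertions"),
    )
    for outer, inner in sections:
        for index, entry in enumerate(slurm.get(outer, {}).get(inner, [])):
            prefix = entry.get("prefix")
            if prefix is None:
                # Entries without a prefix are skipped by this sketch.
                continue
            try:
                # strict=True rejects prefixes that have host bits set.
                ipaddress.ip_network(prefix, strict=True)
            except ValueError as err:
                problems.append((inner, index, prefix, str(err)))
    return problems


sample = json.loads("""
{
  "slurmVersion": 1,
  "validationOutputFilters": {"prefixFilters": [], "bgpsecFilters": []},
  "locallyAddedAssertions": {
    "prefixAssertions": [
      {"asn": 64546, "prefix": "192.168.255.0/24", "maxPrefixLength": 24,
       "comment": "RTR Health Check"},
      {"asn": 65535, "prefix": "10.10.10.164/27", "maxPrefixLength": 32}
    ],
    "bgpsecAssertions": []
  }
}
""")

problems = find_invalid_prefixes(sample)
for section, index, prefix, reason in problems:
    print(f"{section}[{index}]: {prefix}: {reason}")
```

For the sample above this reports the second prefixAssertions entry (index 1) and nothing else. To check a real file, replace the inline sample with `json.load(open(path))` for the path listed under `files` in rtrtr.conf.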

@partim
Member

partim commented Jan 3, 2025

Apologies for the very late response. I had notifications for new PRs turned off during my vacation and forgot to check afterwards.

Currently, the slurm unit delays any processing until the first successful load of the SLURM set and, for some reason, just ignores any error. I was going to release 0.3.1 today, but I am instead going to add error logging and release another RC so that this gets into 0.3.1.

@partim
Member

partim commented Jan 6, 2025

Error logging got added in 40ee2ee – I accidentally committed to main instead of making a PR. This commit is part of the 0.3.1-rc3 release.

@partim closed this as completed Jan 6, 2025