PostgreSQL and Redis existingSecret not respected on ArgoCD sync #450

Open
salcinad opened this issue Dec 16, 2024 · 13 comments
Labels
bug Something isn't working more info More information required from the reporter

Comments

@salcinad

salcinad commented Dec 16, 2024

The Helm chart version

5.0.0-beta.167

Environment Versions

Kubernetes: 1.28.8
ArgoCD: v2.13.0+347f221

Custom chart values

        superuser:
          name: admin
          email: [email protected]
          existingSecret: "netbox-superuser"
        existingSecret: "netbox-config"
        postgresql:
          auth:
            existingSecret: "netbox-postgresql"
            secretKeys:
              adminPasswordKey: "postgres-password"
              userPasswordKey: "password"
        redis:
          auth:
            existingSecret: "netbox-redis"
            existingSecretPasswordKey: "redis-password"

Current Behavior & Steps to Reproduce

I tested different options in values.yaml for existing secrets, but it just does not work. Once the secret is overwritten, the worker complains that the password is incorrect, and the whole NetBox pod does not start.
django.db.utils.OperationalError: connection failed: connection to server at "10.95.18.196", port 5432 failed: FATAL: password authentication failed for user "netbox"

Even if we do not use existingSecrets and let ArgoCD create secrets, they are regenerated on every ArgoCD sync. We see no such issue with other applications.

Expected Behavior

The existingSecrets should not be overwritten by the NetBox chart.

NetBox Logs

Not a NetBox issue.
@salcinad salcinad added the bug Something isn't working label Dec 16, 2024
@LeoColomb
Member

Thanks for filing this issue, @salcinad
I see this is a direct follow-up of #420.

Did you manage to get one deployment working? If not, I'd recommend completely uninstalling the chart and trying again with a clean deployment.

Even if we do not use existingSecrets and let ArgoCD create secrets, they are regenerated on every ArgoCD sync.

That is very surprising, as password management is handled by the chart itself and should keep any previously defined/generated value.
And this is probably an issue on ArgoCD or on your configuration.

The existingSecrets should not be overwritten by the NetBox chart.

Can you share the secrets as defined in your Kubernetes cluster, before and after an upgrade or a sync? You can, of course, change the actual values before posting in here, but please keep a way to see if the values are the same or different between versions of the secrets.
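
One way to make the before/after snapshots comparable without posting the real values is to share a short hash of each value instead. The following is only a sketch, assuming `sha256sum` from GNU coreutils is available; the `fingerprint` helper name is hypothetical, and the `kubectl` line in the comment reuses the secret name and key from this issue.

```shell
#!/usr/bin/env bash
# Print a short SHA-256 fingerprint of whatever arrives on stdin.
# Identical input always gives an identical fingerprint, so two snapshots of a
# secret can be compared without revealing the value itself.
fingerprint() {
  sha256sum | cut -c1-12
}

# Hypothetical usage against a cluster (names taken from this issue):
#   kubectl get secret netbox-redis -n netbox \
#     -o jsonpath='{.data.redis-password}' | fingerprint
```

Running the same command before and after a sync and comparing the two fingerprints shows whether the stored value changed.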

@LeoColomb LeoColomb added the more info More information required from the reporter label Dec 16, 2024
@salcinad
Author

salcinad commented Dec 16, 2024

I see this is a direct follow-up of #420.

Indeed. As that issue is closed, I did not realize that. Sorry for bumping a closed issue.

Did you manage to get one deployment working? If not, I'd recommend completely uninstalling the chart and trying again with a clean deployment.

The first deployment is always OK. But on any later update, ArgoCD wants to sync the netbox-postgresql and netbox-redis secrets; the other two secrets (netbox-config, netbox-superuser) are fine. When testing, I always delete the Application and the namespace on the k8s cluster, so everything is clean.

That is very surprising, as password management is handled by the chart itself and should keep any previously defined/generated value.
And this is probably an issue on ArgoCD or on your configuration.

This can of course be an ArgoCD issue on our side, but we also use the same Bitnami chart for a standalone PostgreSQL and do not have this issue there. Harbor, for example, also uses Redis as a subchart (the same Bitnami chart), with no issue either. We are still looking, but nothing obvious is popping up.

Can you share the secrets as defined in your Kubernetes cluster, before and after an upgrade or a sync? You can, of course, change the actual values before posting in here, but please keep a way to see if the values are the same or different between versions of the secrets.

Sure, I will test tomorrow in a different cluster and provide the info here.

Edit # 1
When I delete the Application in ArgoCD and then also the namespace where NetBox is deployed, I see the following in ArgoCD, even though we have the Secrets (sealed-secrets, but also tested without sealed-secrets). And even if I did not have the secrets, with values.yaml as in the first post, the pod should complain about missing secrets rather than the chart creating its own.

/Secret/netbox/netbox-postgresql
apiVersion: v1
data:
  password: ++++++++
  postgres-password: ++++++++
kind: Secret
metadata:
  labels:
    app.kubernetes.io/instance: netbox
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: postgresql
    app.kubernetes.io/version: 17.2.0
    argocd.argoproj.io/instance: netbox-dev
    helm.sh/chart: postgresql-16.3.1
  name: netbox-postgresql
  namespace: netbox
type: Opaque


/Secret/netbox/netbox-redis
apiVersion: v1
data:
  redis-password: ++++++++
kind: Secret
metadata:
  labels:
    app.kubernetes.io/instance: netbox
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: redis
    app.kubernetes.io/version: 7.4.1
    argocd.argoproj.io/instance: netbox-dev
    helm.sh/chart: redis-20.5.0
  name: netbox-redis
  namespace: netbox
type: Opaque

@LeoColomb LeoColomb changed the title from "bundled postgres and radis existingSecret are changed on every ArgoCD Sync" to "PostgreSQL and Redis existingSecret not respected on ArgoCD sync" Dec 16, 2024
@LeoColomb
Member

Thanks for your input.
I'm going a little ahead and suggest some potential root causes.

  • If you first generate secrets with the chart and then reference them with existingSecret, this will not work: the generated secrets are no longer rendered and thus not kept in the cluster, so the previous versions are deleted.
  • If the auto-management inherited from the chart is not working as expected (i.e. passwords are not kept between upgrades/syncs, which is already surprising but might happen), then you must first declare the secrets manually before referencing them in the values.
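
As an illustration of declaring a secret manually before referencing it, here is a minimal sketch that renders a plain Secret manifest on stdout. The `render_secret` helper is hypothetical and not part of the chart; it assumes the `base64` utility is available and encodes each value exactly once.

```shell
#!/usr/bin/env bash
# Render a minimal Opaque Secret manifest.
# Usage: render_secret NAME NAMESPACE key=value [key=value ...]
render_secret() {
  local name="$1" namespace="$2"
  shift 2
  printf 'apiVersion: v1\nkind: Secret\nmetadata:\n  name: %s\n  namespace: %s\ntype: Opaque\ndata:\n' \
    "$name" "$namespace"
  local pair
  for pair in "$@"; do
    # Split on the first "=", then base64-encode the value once, without line wrapping.
    printf '  %s: %s\n' "${pair%%=*}" "$(printf '%s' "${pair#*=}" | base64 | tr -d '\n')"
  done
}

# Hypothetical usage (placeholder password):
#   render_secret netbox-redis netbox redis-password=changeme | kubectl apply -f -
```

The secret would need to exist in the namespace before the chart is installed with `existingSecret` pointing at it.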

@salcinad
Author

salcinad commented Dec 17, 2024

  • If you first generate secrets with the chart and then reference them with existingSecret, this will not work: the generated secrets are no longer rendered and thus not kept in the cluster, so the previous versions are deleted.

This is how we did the initial setup: we let the chart deploy without setting existingSecret or setting passwords manually. We then exported these secrets to YAML, deleted the unneeded data from the YAML (see below), deleted the Application, and deleted the namespace where it was installed. After that we used existingSecret to reference the exported secrets, added their YAML to values.yaml under extraDeploy, and updated the ArgoCD Application.

$ cat netbox-postgresql-dev.yaml
apiVersion: v1
data:
  password: bXB5bXSd05TVlid05TRQ==
  postgres-password: WG40FpbSlW9YWG40TQ==
kind: Secret
metadata:
  name: netbox-postgresql
  namespace: netbox
type: Opaque

$ cat netbox-redis-dev.yaml
apiVersion: v1
data:
  redis-password: TKZGbjVMW2mpQQ==
kind: Secret
metadata:
  name: netbox-redis
  namespace: netbox
type: Opaque

$

In my comment above, the application deployed successfully: ArgoCD deployed the sealed-secrets, which are applied to the correct namespace in the correct k8s cluster. NetBox is available over Ingress, login works just fine (superuser, which is also an existing secret), all pods are up (see below), and LDAPS works as well.

$ k get pods -n netbox
NAME                                 READY   STATUS      RESTARTS      AGE
netbox-869547468c-6mdqv              1/1     Running     5 (9h ago)    10h
netbox-housekeeping-28906560-qnc9v   0/1     Completed   0             7h6m
netbox-postgresql-0                  1/1     Running     0             10h
netbox-redis-master-0                1/1     Running     0             10h
netbox-redis-replicas-0              1/1     Running     0             10h
netbox-redis-replicas-1              1/1     Running     0             10h
netbox-redis-replicas-2              1/1     Running     0             10h
netbox-worker-7cc9d556d7-5pffx       1/1     Running     2 (10h ago)   10h
$

But now ArgoCD shows this Application as OutOfSync because of these two secrets, which we did not sync in the first place. We also compared the secrets in the namespace with the passwords in the YAML files, and they are identical.
This is somewhat weird, and we are out of ideas about what could be wrong.

@LeoColomb
Member

Thanks for the details.
Can you try following the same procedure but changing the name of the manual secrets?

  1. Raw install
  2. Secrets exports
  3. Clear labels and change the name (e.g. with a -2 suffix)
  4. Add existingSecret with the new names
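
Steps 2 and 3 can be sketched as a small filter over an exported manifest. This is an illustration only: the `strip_and_rename` helper is hypothetical, and it assumes the block-style YAML layout that `kubectl get secret -o yaml` normally prints (two-space indentation, unquoted `name:` value).

```shell
#!/usr/bin/env bash
# Read a Secret manifest on stdin, drop everything under "metadata:" except
# name and namespace, and append a suffix to the name.
# Usage: strip_and_rename SUFFIX < exported.yaml
strip_and_rename() {
  awk -v suffix="$1" '
    # Top-level key: remember whether we are entering the metadata block.
    /^[^ ]/          { in_meta = ($0 == "metadata:"); print; next }
    # Outside metadata (for example entries under data:), keep lines as-is.
    !in_meta         { print; next }
    # Inside metadata: keep a renamed name and the namespace, drop the rest.
    /^  name: /      { print $0 suffix; next }
    /^  namespace: / { print; next }
  '
}

# Hypothetical usage:
#   kubectl get secret netbox-postgresql -n netbox -o yaml | strip_and_rename -2
```

Labels, annotations, and server-generated fields such as the UID are dropped, so the result no longer looks like a Helm-managed object.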

If this still does not work, can you confirm the secrets values do not get base64 encoded twice?
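
A rough way to check for double encoding is to decode a stored value once and see whether the result still looks like base64. The sketch below is a heuristic, not a definitive test: `looks_double_encoded` is a hypothetical helper, it assumes a `base64` that accepts `-d`, and a real password that happens to use only base64 characters with a length divisible by four would be a false positive.

```shell
#!/usr/bin/env bash
# Succeed (exit 0) if the given base64 string decodes to something that itself
# still looks like base64, which hints at a value that was encoded twice.
looks_double_encoded() {
  local decoded
  decoded="$(printf '%s' "$1" | base64 -d 2>/dev/null)" || return 1
  # Base64 text has a length divisible by 4 and a restricted alphabet.
  [ "${#decoded}" -ge 8 ] || return 1
  [ $(( ${#decoded} % 4 )) -eq 0 ] || return 1
  printf '%s' "$decoded" | grep -Eq '^[A-Za-z0-9+/]+={0,2}$'
}

# Hypothetical usage with the value stored under .data in the cluster:
#   value="$(kubectl get secret netbox-redis -n netbox -o jsonpath='{.data.redis-password}')"
#   looks_double_encoded "$value" && echo "possibly encoded twice"
```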

@Delta1977

If I read it right, the first line of the secret template ({{- if not .Values.existingSecret }}) means
all Secrets are auto-rendered when existingSecret is not set.

@LeoColomb
Member

LeoColomb commented Dec 20, 2024

@Delta1977 Not all secrets: each secret has its own condition for being rendered.
In any case, the secrets template in the NetBox chart is unrelated to the bundled Bitnami charts (PostgreSQL / Redis).

@LeoColomb
Member

@salcinad Have you been able to resolve the issue?
If not, have you tried the suggested steps?

@salcinad
Author

salcinad commented Jan 9, 2025

Thanks for the details. Can you try following the same procedure but changing the name of the manual secrets?

1. Raw install
2. Secrets exports
3. Clear labels and change the name (e.g. with a `-2` suffix)
4. Add `existingSecret` with the new names

If this still does not work, can you confirm the secrets values do not get base64 encoded twice?

Yesterday I deployed with Helm only and, without changing anything, just ran helm upgrade. Comparing what changed with the helm-diff plugin, I see that the checksum is different for Redis:

$ previous_release="$(helm history -n netbox-dev netbox | tail -2 | head -1 | cut -f 1 -d' ')"
$ helm diff -n netbox-dev revision netbox "$previous_release"
netbox-dev, netbox-config, Secret (v1) has changed:
..
    email_password: 'REDACTED # (0 bytes)'
-   secret_key: '-------- # (60 bytes)'
+   secret_key: '++++++++ # (60 bytes)'

...
netbox-dev, netbox-redis-master, StatefulSet (apps) has changed:
..
-         checksum/secret: 7baef362ee982134ba8753456e11074766c9757df9fdd3edf991c121b98d09c2
+         checksum/secret: a9c6331012a7379282f97f715c4e18b77794d282502dbe19cb86b2837279e586

...
netbox-dev, netbox-redis-replicas, StatefulSet (apps) has changed:
..
-         checksum/secret: bc3c09cdec721b1bb1288642521702ab3cec072e6d8e1905937ae067c8449ea8
+         checksum/secret: a9c6331012a7379282f97f715c4e18b77794d282502dbe19cb86b2837279e586

..

netbox-dev, netbox-superuser, Secret (v1) has changed:
..
-   api_token: '-------- # (36 bytes)'
+   api_token: '++++++++ # (36 bytes)'
    email: 'REDACTED # (18 bytes)'
    password: 'REDACTED # (10 bytes)'
    username: 'REDACTED # (5 bytes)'

...

Edit 1:
Also, every time we run diff upgrade, both with autogenerated secrets and with exported and re-imported secrets named with the -2 suffix, we see this from the Redis subchart:

$ helm diff upgrade netbox oci://ghcr.io/netbox-community/netbox-chart/netbox -f values.yaml --version 5.0.9 -n netbox-dev
Error: Failed to render chart: exit status 1: Pulled: ghcr.io/netbox-community/netbox-chart/netbox:5.0.9
Digest: sha256:fadf6bf3368b58a0fe2e7a3318f060e67777b114fc735b659c0f5aebbd0327ec
Error: execution error at (netbox/charts/redis/templates/replicas/application.yaml:52:35):
PASSWORDS ERROR: You must provide your current passwords when upgrading the release.
                 Note that even after reinstallation, old credentials may be needed as they may be kept in persistent volume claims.
                 Further information can be obtained at https://docs.bitnami.com/general/how-to/troubleshoot-helm-chart-issues/#credential-errors-while-upgrading-chart-releases

    'global.redis.password' must not be empty, please add '--set global.redis.password=$REDIS_PASSWORD' to the command. To get the current value:

        export REDIS_PASSWORD=$(kubectl get secret --namespace "netbox-dev" netbox-redis -o jsonpath="{.data.redis-password}" | base64 -d)


Use --debug flag to render out invalid YAML

Error: plugin "diff" exited with error
$

Without helm diff upgrade, we do not see this Redis error, so we hope it is safe to ignore. We only see this warning:

WARNING: There are "resources" sections in the chart not set. Using "resourcesPreset" is not recommended for production. For production installations, please set the following values according to your workload needs:
  - resources
  - worker.resources
+info https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

I am now testing point 3 of your recommendation.

@LeoColomb
Member

@salcinad Thanks for your replies, but I'm afraid none of this makes sense.
Everything is pointing to a misconfiguration on your side, though, and the only possible last step would be dumping your complete values.yaml and the output of a helm template with these values.
Could you do this into a Gist or similar?
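
Before posting rendered output publicly, secret values can be masked with a small filter. This is a sketch under assumptions: `redact_passwords` is a hypothetical helper that only catches single-line `key: value` pairs whose key contains `password`, `secret_key`, or `token`, so the result should still be reviewed by hand.

```shell
#!/usr/bin/env bash
# Replace the value of any "key: value" line whose key contains
# "password", "secret_key", or "token" with the word REDACTED.
redact_passwords() {
  sed -E 's/^([[:space:]]*[A-Za-z0-9_.-]*(password|secret_key|token)[A-Za-z0-9_.-]*:[[:space:]]+).+$/\1REDACTED/'
}

# Hypothetical usage, reusing the chart reference and version from this thread:
#   helm template netbox oci://ghcr.io/netbox-community/netbox-chart/netbox \
#     --version 5.0.9 -n netbox-dev -f values.yaml | redact_passwords > rendered.yaml
```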

@salcinad
Author

salcinad commented Jan 20, 2025

Today I tested on production with the -2 suffix and pushed this to our GitLab repository, but Argo CD only shows me that the existingSecret values for superuser and config changed, and not the ones for the PostgreSQL and Redis subcharts, which is not true. I exported all secrets, changed their names, and removed the metadata, similar to what I did in this comment.


$ k get secrets -n netbox
NAME                                 TYPE                       DATA   AGE
netbox-config                        Opaque                     3      34d
netbox-config-2                      Opaque                     3      7h24m
netboxXXXXXXXXXXXX-tls   kubernetes.io/tls          2      34d
netbox-postgresql                    Opaque                     2      34d
netbox-postgresql-2                  Opaque                     2      7h24m
netbox-redis                         Opaque                     1      34d
netbox-redis-2                       Opaque                     1      7h24m
netbox-superuser                     kubernetes.io/basic-auth   4      34d
netbox-superuser-2                   kubernetes.io/basic-auth   4      7h24m
$

and this is part from values.yaml

        superuser:
          name: admin
          email: XXXXXX
          existingSecret: "netbox-superuser-2"
        existingSecret: "netbox-config-2"
        postgresql:
          auth:
            existingSecret: "netbox-postgresql-2"
            secretKeys:
              adminPasswordKey: "postgres-password"
              userPasswordKey: "password"
        redis:
          auth:
            existingSecret: "netbox-redis-2"
            existingSecretPasswordKey: "redis-password"

It looks as if the postgresql and redis parts are just ignored or not formatted properly, but the indentation is the same for superuser, existingSecret (config), postgresql, and redis.

@LeoColomb
Member

Thank you for your reply, @salcinad.
I'm afraid, again, it's nearly impossible to extract any valuable information from this to diagnose the problem and provide support.
This Helm chart is reaching 10k installations, and yours is the only report so far about this fundamental problem…
Everything is pointing to a misconfiguration on your side, though, and the only possible last step would be dumping your complete values.yaml and the output of a helm template with these values.
Could you do this into a Gist or similar?
