Garbage collection does not seem to work #1485

Closed
sebasmannem opened this issue Jun 6, 2023 · 7 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@sebasmannem

Hi,
we are building an operator with a higher-level object called Paas, which in turn creates Namespaces, ClusterResourceQuotas, Groups, etc. Most of it works as expected:

  • Creating a Paas also creates Namespaces, ClusterResourceQuotas and Groups
  • Removing a Paas also cleans up the Namespaces (we use controllerutil.SetControllerReference(paas, ns, r.Scheme) for that)

But

  • Removing a Paas does not clean up the ClusterResourceQuotas and Groups

Which is weird, because

  • we also use controllerutil.SetControllerReference(paas, quota, r.Scheme) for quotas, and the same for Groups (a minimal sketch of this follows below)
  • The expected .metadata.ownerReferences entries appear as they should
    It is just that the garbage collector does not seem to act on objects that come from the OpenShift/api project
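To make the setup concrete, here is a minimal sketch of how the controller reference gets set on one of the cluster-scoped quotas. The PaasReconciler fields, the annotation selector and the generic client.Object owner are assumptions for illustration, not our actual operator code; the controllerutil call is the one referred to above.

```go
package controllers

import (
	"context"

	quotav1 "github.com/openshift/api/quota/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// PaasReconciler is a typical controller-runtime reconciler.
type PaasReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}

// ensureQuota creates a cluster-scoped ClusterResourceQuota owned by the Paas.
// The owner is passed as a generic client.Object only to keep the sketch
// self-contained; in the real operator it is the Paas CRD type.
func (r *PaasReconciler) ensureQuota(ctx context.Context, paas client.Object) error {
	quota := &quotav1.ClusterResourceQuota{
		ObjectMeta: metav1.ObjectMeta{Name: paas.GetName()},
		Spec: quotav1.ClusterResourceQuotaSpec{
			Selector: quotav1.ClusterResourceQuotaSelector{
				// Selecting namespaces by annotation is an assumption here.
				AnnotationSelector: map[string]string{"paas": paas.GetName()},
			},
		},
	}
	// This adds the Paas to the quota's .metadata.ownerReferences with
	// controller=true; it is the same call we use for the Namespaces,
	// which are garbage collected correctly.
	if err := controllerutil.SetControllerReference(paas, quota, r.Scheme); err != nil {
		return err
	}
	return r.Create(ctx, quota)
}
```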

My questions:

  • Is this expected behavior, or considered a bug?
  • Is there any way to resolve this, and how?
    Note, by the way, that the quota names are derived from the Paas name, so I was able to write a hack.
    But that hack will not work for Groups, and I don't like it (even for the quotas)...
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 5, 2023
@sebasmannem
Author

So, since I did not receive any updates on this, my assumption is that this is either a low-priority bug or expected behaviour.
I investigated how this is supposed to work (looking at the k8s code) and how it is implemented within OpenShift.
Without help I cannot proceed with the investigation or propose fixes.
I can only update this issue with my own fixes and how I implemented them.

  1. Namespaces are automatically cleaned up, no surprises there
  2. ArgoCD objects, RoleBindings and Secrets reside within the namespace and are cleaned up together with it. I have not put any effort into investigating whether these are also cleaned up without deleting the namespace itself.
  3. OpenShift objects (ClusterResourceQuotas, Users, Groups) are not garbage collected; I clean them up with custom code (see my next comment)
  4. Other items live inside objects (such as entries in ArgoCD ApplicationSet list generators, and lines in ConfigMap data) and are managed by custom finalizer code in our operator

@sebasmannem
Author

For the OpenShift objects (item 3) I have currently implemented the following custom code:

  • the controller detects that the Paas is being removed and runs generic finalize code
  • the finalize code runs specific finalize routines for specific items (items 3 and 4: Quotas, Groups, Users, and entries inside other resources)
  • the finalize code for Groups deduces which Groups were created by our operator (from the main resource body) and removes them as required; similar for the other OpenShift objects (a sketch of the pattern follows below)
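A rough sketch of that finalizer flow, building on the PaasReconciler from the earlier snippet. The finalizer name and the derived object names ("-view" suffix, quota named after the Paas) are assumptions for illustration; the real operator reads them from the Paas spec. Only the controllerutil finalizer helpers and the overall pattern are what we actually rely on.

```go
package controllers

import (
	"context"

	quotav1 "github.com/openshift/api/quota/v1"
	userv1 "github.com/openshift/api/user/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// Hypothetical finalizer name; in practice pick one under your own API group.
const paasFinalizer = "paas.example.com/finalizer"

// handleLifecycle is called from Reconcile. It registers the finalizer while
// the Paas is alive and, on deletion, removes the cluster-scoped objects the
// garbage collector does not clean up before releasing the finalizer.
func (r *PaasReconciler) handleLifecycle(ctx context.Context, paas client.Object) error {
	if paas.GetDeletionTimestamp().IsZero() {
		// Not being deleted: make sure our finalizer is registered so we
		// get a chance to clean up later.
		if controllerutil.AddFinalizer(paas, paasFinalizer) {
			return r.Update(ctx, paas)
		}
		return nil
	}
	if controllerutil.ContainsFinalizer(paas, paasFinalizer) {
		// Only one Group and one quota are shown; the names are assumptions,
		// the real operator derives them from the Paas resource body.
		group := &userv1.Group{ObjectMeta: metav1.ObjectMeta{Name: paas.GetName() + "-view"}}
		if err := r.Delete(ctx, group); err != nil && !apierrors.IsNotFound(err) {
			return err
		}
		quota := &quotav1.ClusterResourceQuota{ObjectMeta: metav1.ObjectMeta{Name: paas.GetName()}}
		if err := r.Delete(ctx, quota); err != nil && !apierrors.IsNotFound(err) {
			return err
		}
		// Cleanup done: drop the finalizer so Kubernetes can finish the delete.
		controllerutil.RemoveFinalizer(paas, paasFinalizer)
		return r.Update(ctx, paas)
	}
	return nil
}
```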

I would have loved for this to be taken care of out of the box.
That would have saved time and been less error prone, but we now have something that works for us.
I hope this helps anyone else who runs into this.
If you have questions, or an ambition to actually fix this, just ping us with comments in this issue.

@sebasmannem
Author

I've decided to leave this issue stale. Auto-closing when it goes rotten is fine...

@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 6, 2023
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this as completed Nov 5, 2023

openshift-ci bot commented Nov 5, 2023

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
