This repository has been archived by the owner on Oct 20, 2022. It is now read-only.

Cluster scaling issue #129

Open
riccardo-salamanna opened this issue Sep 2, 2021 · 2 comments

Comments

@riccardo-salamanna

Bug Report

What did you do?
Tried to scale a cluster up and down.
What did you expect to see?
The cluster scaling up and scaling down when I add or remove nodes.
What did you see instead? Under which circumstances?
Scaling down never happens; only scaling up works (every other configuration change also triggers a refresh). The operator's logs are also filled with errors, and its CPU usage spikes and stays high.

Environment

  • nifikop version: 0.6.3-release

  • go version:

  • Kubernetes version information: 1.19

  • Kubernetes cluster kind: EKS

  • NiFi version: tried multiple, 1.11.3, 1.12.1, 1.13.2

Possible Solution
I sincerely do not know.

Additional context
Here's the log output from the operator pod:

2021-09-02T17:21:20.541Z	ERROR	nifi_client	Error during preparing the request	{"error": "The target node id doesn't exist in the cluster", "errorVerbose": "The target node id doesn't exist in the cluster\ngithub.com/Orange-OpenSource/nifikop/pkg/nificlient.init\n\t/workspace/pkg/nificlient/common.go:27\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5652\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:191\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132
github.com/Orange-OpenSource/nifikop/pkg/nificlient.(*nifiClient).GetClusterNode
	/workspace/pkg/nificlient/system.go:49
github.com/Orange-OpenSource/nifikop/pkg/clientwrappers/scale.CheckIfNCActionStepFinished
	/workspace/pkg/clientwrappers/scale/scale.go:166
github.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).checkNCActionStep
	/workspace/controllers/nificlustertask_controller.go:324
github.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).handlePodRunningTask
	/workspace/controllers/nificlustertask_controller.go:251
github.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).Reconcile
	/workspace/controllers/nificlustertask_controller.go:89
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:198
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.UntilWithContext
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99
@riccardo-salamanna
Author

Actually, even after a simple deployment of a simple cluster example, the CPU spikes and the logs start filling with these errors in a constant loop:

2021-09-02T17:34:38.407Z	INFO	controllers.NifiClusterTask	Nifi cluster task is still running	{"actionStep": "CONNECTED"}
2021-09-02T17:34:38.407Z	INFO	controllers.NifiClusterTask	nc action step: CONNECTED: Nifi cluster task is still running
2021-09-02T17:34:38.407Z	ERROR	controller-runtime.manager.controller.nificluster	Reconciler error	{"reconciler group": "nifi.orange.com", "reconciler kind": "NifiCluster", "name": "nifikop-dev", "namespace": "nifi", "error": "nc action step: CONNECTED: Nifi cluster task is still running", "errorVerbose": "Nifi cluster task is still running\ngithub.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).checkNCActionStep\n\t/workspace/controllers/nificlustertask_controller.go:389\ngithub.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).handlePodRunningTask\n\t/workspace/controllers/nificlustertask_controller.go:251\ngithub.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).Reconcile\n\t/workspace/controllers/nificlustertask_controller.go:89\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:198\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/[email 
protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374\nnc action step: CONNECTED"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:267
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:198
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.UntilWithContext
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99

@bruckwubete

I am also seeing this on an EKS deployment.

2021-10-18T04:58:10.955Z	DEBUG	controllers.NifiCluster	resource is in sync	{"component": "nifi", "clusterName": "simplenifi", "clusterNamespace": "nifi", "kind": "*v1.Pod"}
2021-10-18T04:58:10.956Z	ERROR	nifi_client	Error during preparing the request	{"error": "The target node id doesn't exist in the cluster", "errorVerbose": "The target node id doesn't exist in the cluster\ngithub.com/Orange-OpenSource/nifikop/pkg/nificlient.init\n\t/workspace/pkg/nificlient/common.go:27\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5652\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:5647\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:191\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1374"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132
github.com/Orange-OpenSource/nifikop/pkg/nificlient.(*nifiClient).GetClusterNode
	/workspace/pkg/nificlient/system.go:65
github.com/Orange-OpenSource/nifikop/pkg/clientwrappers/scale.CheckIfNCActionStepFinished
	/workspace/pkg/clientwrappers/scale/scale.go:166
github.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).checkNCActionStep
	/workspace/controllers/nificlustertask_controller.go:367
github.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).handlePodRunningTask
	/workspace/controllers/nificlustertask_controller.go:280
github.com/Orange-OpenSource/nifikop/controllers.(*NifiClusterTaskReconciler).Reconcile
	/workspace/controllers/nificlustertask_controller.go:91
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:198
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.UntilWithContext
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99
2021-10-18T04:58:10.956Z	INFO	controllers.NifiClusterTask	Nifi cluster task is still running	{"actionStep": "CONNECTED"}
2021-10-18T04:58:10.956Z	INFO	controllers.NifiClusterTask	nc action step: CONNECTED: Nifi cluster task is still running
