unknown func bpf_perf_event_read_value#55 in eBPF module since 0.7.10 #1483

Open
rootfs opened this issue May 31, 2024 · 3 comments
Labels
kind/bug report bug issue

Comments

@rootfs
Contributor

rootfs commented May 31, 2024

What happened?

In 0.7.10 and latest, loading the eBPF module crashes with the following trace. The root cause appears to be: unknown func bpf_perf_event_read_value#55

This doesn't happen in 0.7.8 and older.

libbpf: prog 'kepler_sched_switch_trace': -- BEGIN PROG LOAD LOG --
; if (SAMPLE_RATE > 0) {
0: (18) r2 = 0xffff9fa324c082d0
2: (61) r2 = *(u32 *)(r2 +0)
 R1=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=12,imm=0) R10=fp0
3: (67) r2 <<= 32
4: (c7) r2 s>>= 32
5: (b7) r4 = 1
; if (SAMPLE_RATE > 0) {
6: (6d) if r4 s> r2 goto pc+13
last_idx 6 first_idx 0
regs=10 stack=0 before 5: (b7) r4 = 1
last_idx 6 first_idx 0
regs=4 stack=0 before 5: (b7) r4 = 1
regs=4 stack=0 before 4: (c7) r2 s>>= 32
regs=4 stack=0 before 3: (67) r2 <<= 32
regs=4 stack=0 before 2: (61) r2 = *(u32 *)(r2 +0)
; prev_pid = ctx->prev_pid;
20: (61) r1 = *(u32 *)(r1 +24)
21: (7b) *(u64 *)(r10 -160) = r1
; prev_pid = ctx->prev_pid;
22: (63) *(u32 *)(r10 -20) = r1
; pid_tgid = bpf_get_current_pid_tgid();
23: (85) call bpf_get_current_pid_tgid#14
24: (bf) r6 = r0
; cur_pid = pid_tgid & 0xffffffff;
25: (63) *(u32 *)(r10 -28) = r6
; cgroup_id = bpf_get_current_cgroup_id();
26: (85) call bpf_get_current_cgroup_id#80
27: (7b) *(u64 *)(r10 -184) = r0
; cpu_id = bpf_get_smp_processor_id();
28: (85) call bpf_get_smp_processor_id#8
29: (bf) r9 = r0
; cpu_id = bpf_get_smp_processor_id();
30: (63) *(u32 *)(r10 -24) = r9
; cur_ts = bpf_ktime_get_ns();
31: (85) call bpf_ktime_get_ns#5
32: (bf) r8 = r0
33: (b7) r7 = 0
; struct bpf_perf_event_value c = {};
34: (7b) *(u64 *)(r10 -128) = r7
last_idx 34 first_idx 32
regs=80 stack=0 before 33: (b7) r7 = 0
35: (7b) *(u64 *)(r10 -136) = r7
36: (7b) *(u64 *)(r10 -144) = r7
; &cpu_cycles_event_reader, *cpu_id, &c, sizeof(c));
37: (67) r9 <<= 32
38: (77) r9 >>= 32
39: (bf) r3 = r10
; prev_pid = ctx->prev_pid;
40: (07) r3 += -144
; error = bpf_perf_event_read_value(
41: (18) r1 = 0xffff9f8bacc80000
43: (bf) r2 = r9
44: (b7) r4 = 24
45: (85) call bpf_perf_event_read_value#55
unknown func bpf_perf_event_read_value#55
processed 31 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
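
For long load logs like the one above, the rejected helper can be pulled out mechanically. The following is a minimal sketch, assuming the verifier reports the failure on its own line in the form `unknown func <name>#<id>`, as it does in this trace; the function name `find_unknown_helpers` is hypothetical and not part of Kepler or libbpf.

```python
import re

# Matches verifier lines such as "unknown func bpf_perf_event_read_value#55".
# The line format is taken from the trace in this issue.
UNKNOWN_FUNC = re.compile(r"^unknown func (\w+)#(\d+)[ \t]*$", re.MULTILINE)


def find_unknown_helpers(log: str) -> list[tuple[str, int]]:
    """Return (helper_name, helper_id) pairs the verifier rejected as unknown."""
    return [(name, int(helper_id)) for name, helper_id in UNKNOWN_FUNC.findall(log)]


if __name__ == "__main__":
    sample = (
        "45: (85) call bpf_perf_event_read_value#55\n"
        "unknown func bpf_perf_event_read_value#55\n"
        "processed 31 insns (limit 1000000)\n"
    )
    for name, helper_id in find_unknown_helpers(sample):
        print(f"helper {name} (id {helper_id}) is not available to this program")
```

The `call` instruction line is deliberately not matched; only the verifier's own error line is reported.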

What did you expect to happen?

This happens in 0.7.10 and latest.

How can we reproduce it (as minimally and precisely as possible)?

It happens on Ubuntu with 5.4 kernels.

Anything else we need to know?

No response

Kepler image tag

0.7.10

Kubernetes version

v1.27.3

Cloud provider or bare metal

kind

OS version

5.4.0-164-generic #181-Ubuntu SMP


@rootfs rootfs added the kind/bug report bug issue label May 31, 2024
@rootfs
Contributor Author

rootfs commented Jun 1, 2024

here is the explanation

@sthaha
Collaborator

sthaha commented Jun 2, 2024

I will take this up.

@dave-tucker
Collaborator

See: #1398

With these changes applied the minimum supported kernel version for Kepler is 5.12 due to:

bpf_perf_event_read_value - which is available in tracepoint contexts in 5.12
bpf fentry/fexit programs - which were added in 5.11

I think this is a pretty reasonable trade-off if you read the man page of bpf_perf_event_read_value.

If you really want 5.4 then we can discuss that as it's not trivial.
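
For anyone who wants to check a node before deploying, here is a minimal sketch of a pre-flight test against the 5.12 minimum quoted above. The names `MIN_KERNEL`, `parse_kernel_release` and `kernel_supported` are hypothetical and not part of Kepler. Comparing version numbers is only a heuristic, since distribution kernels may backport or omit individual helpers.

```python
import platform
import re

# Minimum kernel taken from the comment above (5.12); adjust if that changes.
MIN_KERNEL = (5, 12)


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string like '5.4.0-164-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supported(release: str, minimum: tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True if the release is at or above the assumed minimum version."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # e.g. '5.4.0-164-generic' on the reporter's node
    status = "meets" if kernel_supported(release) else "is below"
    print(f"kernel {release} {status} the assumed minimum {MIN_KERNEL[0]}.{MIN_KERNEL[1]}")
```

With the kernel from this report, `kernel_supported("5.4.0-164-generic")` evaluates to `False`, which matches the load failure described in the issue.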
