High CPU usage - #120
Hi @Anto79-ops, thanks for opening this issue. Let me have a look. :)

First of all, thanks for the detailed report. That always helps. 👍 However, in order to really dig into this issue, I need a stack trace from within the Python part of your executable. Something like py-spy will do (also available through piwheels). Like your existing suite of tools, py-spy attaches to a running process. This way, we can connect the dots between the busy-looping syscalls seen in your strace output and the Python code.

Edit: I reread your issue and saw that you already tried py-spy. I'd like to see the stack trace dumps from that. 👍

~Frederik
Hi, and thanks @frederikaalund! Somehow I forgot to include the py-spy file here, oops. This run caught the high (stuck) CPU that I mention above; py-spy said it had collected over 4 million data points and 700 errors when I stopped it.
Thanks! Maybe it's just me, but the SVG seems to have lost its interactivity. In any case, the output of |
Alternatively, try uploading the SVG to a file-sharing service. I think the SVG loses its interactivity when uploaded directly as an image on GitHub. 👍
Hi @frederikaalund, While @Anto79-ops gathers data, chiming in to say that I'm the owner of |
Thanks @frederikaalund and @bachya! Try getting it from my Google Drive, here: https://drive.google.com/file/d/1xxhPbsG7cMJY3ORv5DwnWSWy4c-PjcMv/view?usp=sharing Does the above work for you?

As for py-spy: can I run it once the stuck CPU happens, or do I have to start recording before the issue happens? I ask because the PID of the process is unknown until I start the script, and then I wait (hours to days) for the stuck CPU to appear.
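For reference, py-spy attaches to an already-running process (as noted above), so a dump can be taken after the hang appears rather than recording from the start. A hedged sketch of the commands, with `<pid>` standing in for the stuck process's PID:

```shell
# py-spy can attach after the fact, so you can wait for the stuck CPU
# and only then capture a stack dump of the live process:
sudo py-spy dump --pid <pid>

# Or record a flame graph over a window of the stuck behaviour:
sudo py-spy record --pid <pid> --output profile.svg --duration 60
```

Finding the PID after the fact is also possible, e.g. with `pgrep -f ecowitt2mqtt`.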
Thank you. 👍 The SVG from your Google Drive link worked. I'll have a look at the SVG later.

@bachya Thanks! I'll let you know once I've had a closer look at the stack traces / SVGs. Hope you find asyncio-mqtt useful, btw. 👍
Without any doubt – when I started my project, I thought I'd have to wade into the depths of |
@bachya Maybe you'll question that joy in a bit. 😅
Ah, okay – I didn't know that. I would expect that when the context manager ends, it closes everything nicely so the same object can be used again... Does something happen during Client
I did this for a couple of reasons:
(2) is ultimately irrelevant, but what about (1)? Do I need to implement my own reconnection logic?
Ah, great point – I can absolutely afford to include |
@frederikaalund One additional question re: ^^^. I noticed your advanced example uses an |
In general, context managers are single-use unless otherwise specified. As for
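To make the "single-use" point concrete, here is a minimal, self-contained sketch. This is an illustration only, not asyncio-mqtt's actual internals: a typical single-use async context manager refuses re-entry, so callers must construct a fresh object for every connection attempt.

```python
import asyncio

class SingleUseConnection:
    """Toy single-use async context manager (illustrative, not asyncio-mqtt)."""

    def __init__(self) -> None:
        self._used = False

    async def __aenter__(self) -> "SingleUseConnection":
        if self._used:
            raise RuntimeError("single-use: create a new instance instead")
        self._used = True
        return self

    async def __aexit__(self, *exc_info) -> bool:
        return False  # never swallow exceptions

async def main() -> list:
    events = []
    conn = SingleUseConnection()
    async with conn:
        events.append("first entry ok")
    try:
        async with conn:  # re-entering the same instance fails
            events.append("unreachable")
    except RuntimeError:
        events.append("second entry rejected")
    return events

print(asyncio.run(main()))  # → ['first entry ok', 'second entry rejected']
```

The practical upshot: instead of reusing one client object across reconnects, construct a new one each time around the retry loop.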
Yes, for now you would have to. My suggestion is to use a retry loop similar to that found in the advanced example in the readme file. For this to work, you must ensure that exceptions (e.g., due to network errors) propagate up to the retry loop. E.g., via an
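The retry-loop shape being suggested can be sketched as follows. The MQTT specifics are replaced by a generic coroutine so the loop itself is testable; `run_with_retry`, `flaky_connect`, and the delay values are illustrative names, not part of asyncio-mqtt's API.

```python
import asyncio

async def run_with_retry(connect_and_listen, *, delay=0.01, max_attempts=5):
    """Re-run `connect_and_listen` whenever it fails with a transient error."""
    attempts = 0
    while True:
        try:
            # Exceptions (e.g. network errors) must propagate up to this
            # loop, otherwise the retry never happens.
            return await connect_and_listen()
        except ConnectionError:
            attempts += 1
            if attempts >= max_attempts:
                raise
            await asyncio.sleep(delay)  # back off before reconnecting

# Simulated connection that fails twice before succeeding.
calls = {"count": 0}

async def flaky_connect():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("broker unreachable")
    return "connected"

print(asyncio.run(run_with_retry(flaky_connect)))  # → connected
```

In the real thing, the body of `connect_and_listen` would be the `async with Client(...)` block plus the message loop, so that a fresh client is constructed on every attempt.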
You can stick with the built-in primitives. 👍 That's what I do in asyncio-mqtt. That being said, if I had known about anyio (or structured concurrency in general) back when I created this library, then I would have used anyio. No doubt about that. anyio's task groups make everything much easier to reason about. I stick to raw asyncio for now to maintain backwards compatibility (and avoid too many dependencies). There was a discussion about this in the past: #44. Let me know if you two have any other questions. Also, I'd still like to see the log files from a CPU-bound run. Until then, the above is just speculation. EDIT: Fixed link and typo. |
For what it's worth, it does "work" in that most users successfully publish multiple messages with the same client (re-entered). Whether that's correct practice (or causing issues under the surface) is obviously a different matter.
Yep, got it.
Got it. I'm looking for less work, so I'll check out |
@frederikaalund I'd be happy to provide the logs; as much as I enjoy troubleshooting, I'm not good at it. What do I have to do to generate or get those logs for you? Thanks!
FYI, digging in and keeping a single connection open won't work with EDIT: I lied. 😂 bachya/ecowitt2mqtt#236 |
Sorry about the silence; I was on a short vacation. Glad that you figured it out. Feel free to open new issues/discussions/PRs if you find something in asyncio-mqtt that you would like to add/change/fix. 👍 |
Hi,
I'm trying to get some help figuring out a problem I'm having, and have landed here. I'm running a script (called ecowitt2mqtt) on an RPi 4 (Bullseye) that publishes data from my local weather station to my MQTT broker, which HA then discovers. The script host (RPi 4), the broker (Ubuntu 22.04), and HA are 3 different machines on the same network. After hours or days, one of the cores on my RPi 4 that is running the script chokes and runs at 100% (or ~30% total CPU).
I ran py-spy on the instance and caught 700 errors, but nothing too conclusive. Running:
$ strace -p <pid> -f -s 4096
on the stuck process yields this:
and then
$ lsof -p <pid> -n
yields these file descriptors:
FYI:
192.168.1.130:55201 (the RPi running the script)
192.168.1.139:1883 (my mqtt broker)
It has been suggested by someone much more knowledgeable than me that the issue could be here:
https://github.com/sbtinstruments/asyncio-mqtt/blob/6b02071227635fa532698b55c5159755f4e411b2/asyncio_mqtt/client.py#L524
I am running the latest version of asyncio-mqtt on the RPi.
Does anybody know why this resource becomes unavailable and chokes my RPi?
Thanks!