[DRAFT]: add support for some Tegra NVDEC/NVENC/VIC monitoring, generic GPU usage, and EMC usage #1322

Draft: wants to merge 4 commits into base: master
19 changes: 12 additions & 7 deletions data/MangoHud.conf
@@ -79,7 +79,10 @@
gpu_stats
# gpu_temp
# gpu_junction_temp
# gpu_core_clock
gpu_core_clock
gpu_nvdec_clock
gpu_nvenc_clock
gpu_vic_clock
Owner

can't these just fit under the gpu_core_clock param?

Author

What do you mean? Enable all of them whenever the gpu_core_clock param is enabled? I thought the granularity would be nice, since not every user will want to spend space on all of them when they only want one.

Owner

I might not be understanding the structure correctly.
Can all 3-4 of these be monitored at the same time?

Author

These are different hardware units; their clocks are separate and independent.

Owner

Just so I fully understand: one device will only have one of these?
In that case I don't see the point of the granularity.

Author (@theofficialgman, May 28, 2024)

you do not understand. no

These are different hardware engines on the same SoC. NVDEC (NVIDIA decode engine), NVENC (NVIDIA encode engine), and VIC (NVIDIA video image compositor) are all present on the same SoC.

Example during Chromium hardware-accelerated video decode and playback:

[Screenshot: Screenshot_20240528_133546]

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for clarifying. It seems fine to have different params for this.
As for how it's displayed, this seems a bit clunky but I also don't know a better way to do it
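
To make the point of this thread concrete, here is a minimal sketch of reading one engine clock at a time. It follows the `clk_state` / `clk_rate` pairs that the gpu.cpp hunk of this PR reads under `/sys/kernel/debug/clk/`. The helper names `first_line`, `engine_clock_mhz` and `read_engine_clock_mhz` are illustrative and not part of MangoHud, and reporting a gated or unreadable clock as 0 MHz is an assumption.

```cpp
#include <cstdlib>
#include <fstream>
#include <string>

// Read the first line of a file; returns an empty string if it cannot be opened.
static std::string first_line(const std::string& path) {
    std::ifstream f(path);
    std::string line;
    if (f) std::getline(f, line);
    return line;
}

// Convert a clk_state / clk_rate pair (rate in Hz) to MHz.
// A gated engine (state != "1") or an unparsable rate is reported as 0 MHz.
int engine_clock_mhz(const std::string& state, const std::string& rate_hz) {
    if (state != "1") return 0;
    char* end = nullptr;
    long long hz = std::strtoll(rate_hz.c_str(), &end, 10);
    if (end == rate_hz.c_str() || hz < 0) return 0;
    return static_cast<int>(hz / 1000000);
}

// Hypothetical helper: read one engine's clock from a debugfs-style directory
// such as /sys/kernel/debug/clk/nvdec.
int read_engine_clock_mhz(const std::string& clk_dir) {
    return engine_clock_mhz(first_line(clk_dir + "/clk_state"),
                            first_line(clk_dir + "/clk_rate"));
}
```

Calling `read_engine_clock_mhz` once each for the `nvdec`, `nvenc` and `vic03` directories yields three unrelated values, which is the case for keeping three separate params.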

# gpu_mem_temp
# gpu_mem_clock
# gpu_power
@@ -94,7 +97,7 @@ gpu_stats

### Display the current CPU information
cpu_stats
# cpu_temp
cpu_temp
# cpu_power
# cpu_text=
# cpu_mhz
@@ -103,7 +106,7 @@ cpu_stats
# cpu_load_color=39F900,FDFD09,B22222

### Display the current CPU load & frequency for each core
# core_load
core_load
# core_load_change

### Display IO read and write for the app (not system)
@@ -112,7 +115,9 @@ cpu_stats

### Display system vram / ram / swap space usage
# vram
# ram
ram
ram_clock
ram_bandwidth
# swap

### Display per process memory usage
@@ -147,9 +152,9 @@ throttling_status
#throttling_status_graph

### Display miscellaneous information
# engine_version
engine_version
# engine_short_names
# gpu_name
gpu_name
# vulkan_driver
# wine
# exec_name
@@ -196,7 +201,7 @@ frame_timing
# show_fps_limit

### Display the current resolution
# resolution
resolution

### Display custom text
# custom_text=
25 changes: 23 additions & 2 deletions src/cpu.cpp
@@ -533,8 +533,29 @@ bool CPUStats::GetCpuFile() {
}
}
if (path.empty() || (!file_exists(input) && !find_fallback_input(path, "temp", input))) {
SPDLOG_ERROR("Could not find cpu temp sensor location");
return false;
std::string type, path, input;
std::string thermal = "/sys/class/thermal/";

auto dirs = ls(thermal.c_str());
for (auto& dir : dirs) {
path = thermal + dir;
type = read_line(path + "/type");
SPDLOG_DEBUG("thermal: sensor type: {}", type);

if (type == "CPU-therm") {
input = path + "/temp";
break;
} else {
path.clear();
}
}
if (path.empty() || (!file_exists(input) && !find_fallback_input(path, "temp", input))) {
SPDLOG_ERROR("Could not find cpu temp sensor location");
return false;
} else {
SPDLOG_DEBUG("thermal: using input: {}", input);
m_cpuTempFile = fopen(input.c_str(), "r");
}
} else {
SPDLOG_DEBUG("hwmon: using input: {}", input);
m_cpuTempFile = fopen(input.c_str(), "r");
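
The fallback in this hunk picks a thermal zone by the contents of its `type` file. A minimal sketch of that selection step, kept free of file I/O so it can be exercised on its own: the zone list is passed in as (directory name, type string) pairs. `ThermalZone`, `find_thermal_input` and `millidegrees_to_celsius` are hypothetical names, not MangoHud functions, and treating unparsable input as 0 is an assumption.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// One entry per directory under /sys/class/thermal/:
// first = directory name, second = contents of its "type" file.
using ThermalZone = std::pair<std::string, std::string>;

// Return the path of the "temp" file for the first zone whose type matches,
// or an empty string when no zone matches.
std::string find_thermal_input(const std::string& base,
                               const std::vector<ThermalZone>& zones,
                               const std::string& wanted_type) {
    for (const auto& zone : zones) {
        if (zone.second == wanted_type)
            return base + zone.first + "/temp";
    }
    return std::string();
}

// sysfs thermal zones report millidegrees Celsius; convert to whole degrees.
int millidegrees_to_celsius(const std::string& raw) {
    char* end = nullptr;
    long value = std::strtol(raw.c_str(), &end, 10);
    if (end == raw.c_str()) return 0;  // unparsable input, treated as "no reading"
    return static_cast<int>(value / 1000);
}
```

The same lookup with `"GPU-therm"` as `wanted_type` covers the GPU temperature path used in the gpu.cpp hunk.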
45 changes: 45 additions & 0 deletions src/gpu.cpp
@@ -15,6 +15,8 @@

#include "amdgpu.h"

#include "file_utils.h"

using namespace std::chrono_literals;

struct gpuInfo gpu_info {};
@@ -83,6 +85,49 @@ void getNvidiaGpuInfo(const struct overlay_params& params){
#ifdef _WIN32
nvapi_util();
#endif

// FIXME: generic GPU sensor data
// load
gpu_info.load = std::stoi(read_line("/sys/devices/gpu.0/load")) / 10;

// temporary strings
std::string type, path, input;

// temperature
// this runs every reading. needs to be changed to be like cpu.cpp where the location is stored
std::string thermal = "/sys/class/thermal/";
for (auto& dir : ls(thermal.c_str())) {
path = thermal + dir;
type = read_line(path + "/type");
if (type == "GPU-therm") {
input = path + "/temp";
gpu_info.temp = std::stoi(read_line(input)) / 1000;
break;
} else {
path.clear();
}
}

// gpu clocks
// this runs every reading. needs to be changed to be like cpu.cpp where the location is stored
std::string devfreq = "/sys/devices/gpu.0/devfreq/";
for (auto& dir : ls(devfreq.c_str())) {
path = devfreq + dir;
input = path + "/cur_freq";
gpu_info.CoreClock = std::stoi(read_line(input)) / 1000000 ;
break;
}

// nvdec/nvenc/vic clocks
if (file_exists("/sys/kernel/debug/clk/nvdec/clk_rate")) {
if (read_line("/sys/kernel/debug/clk/nvdec/clk_state") == "1") gpu_info.NVDECClock = std::stoi(read_line("/sys/kernel/debug/clk/nvdec/clk_rate")) / 1000000 ; else gpu_info.NVDECClock = 0 ;
}
if (file_exists("/sys/kernel/debug/clk/nvenc/clk_rate")) {
if (read_line("/sys/kernel/debug/clk/nvenc/clk_state") == "1") gpu_info.NVENCClock = std::stoi(read_line("/sys/kernel/debug/clk/nvenc/clk_rate")) / 1000000 ; else gpu_info.NVENCClock = 0 ;
}
if (file_exists("/sys/kernel/debug/clk/vic03/clk_rate")) {
if (read_line("/sys/kernel/debug/clk/vic03/clk_state") == "1") gpu_info.VICClock = std::stoi(read_line("/sys/kernel/debug/clk/vic03/clk_rate")) / 1000000 ; else gpu_info.VICClock = 0 ;
}
}

void getAmdGpuInfo(){
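
Two comments in the hunk above note that the thermal and devfreq directories are rescanned on every reading and that the location should be stored instead. One possible shape for that, as a sketch only: resolve the path once through a caller-supplied function and reuse it afterwards. `CachedPath` is a hypothetical class, not something that exists in MangoHud, and it assumes the resolved path does not change while the process runs.

```cpp
#include <functional>
#include <string>
#include <utility>

// Resolves a sensor path on first use and returns the stored result afterwards.
class CachedPath {
public:
    explicit CachedPath(std::function<std::string()> resolver)
        : resolver_(std::move(resolver)) {}

    // The resolver runs only on the first call, even if it returned "".
    const std::string& get() {
        if (!resolved_) {
            path_ = resolver_();
            resolved_ = true;
        }
        return path_;
    }

private:
    std::function<std::string()> resolver_;
    std::string path_;
    bool resolved_ = false;
};
```

The resolver would wrap the directory scan; each later reading then opens the stored path directly rather than listing the directory again.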
3 changes: 3 additions & 0 deletions src/gpu.h
@@ -34,6 +34,9 @@ struct gpuInfo{
float memoryTotal;
int MemClock;
int CoreClock;
int NVDECClock;
int NVENCClock;
int VICClock;
float powerUsage;
float apu_cpu_power;
int apu_cpu_temp;
85 changes: 85 additions & 0 deletions src/hud_elements.cpp
@@ -297,6 +297,39 @@ void HudElements::gpu_stats(){
HUDElements.TextColored(HUDElements.colors.text, "mV");
ImGui::PopFont();
}
if (HUDElements.params->enabled[OVERLAY_PARAM_ENABLED_gpu_nvdec_clock]){
ImGui::TableNextColumn();
ImGui::TextColored(HUDElements.colors.gpu, "%s", "NVDEC");
ImGui::TableNextColumn();
ImguiNextColumnOrNewRow();
right_aligned_text(text_color, HUDElements.ralign_width, "%i", gpu_info.NVDECClock);
ImGui::SameLine(0, 1.0f);
ImGui::PushFont(HUDElements.sw_stats->font1);
ImGui::Text("MHz");
ImGui::PopFont();
}
if (HUDElements.params->enabled[OVERLAY_PARAM_ENABLED_gpu_nvenc_clock]){
ImGui::TableNextColumn();
ImGui::TextColored(HUDElements.colors.gpu, "%s", "NVENC");
ImGui::TableNextColumn();
ImguiNextColumnOrNewRow();
right_aligned_text(text_color, HUDElements.ralign_width, "%i", gpu_info.NVENCClock);
ImGui::SameLine(0, 1.0f);
ImGui::PushFont(HUDElements.sw_stats->font1);
ImGui::Text("MHz");
ImGui::PopFont();
}
if (HUDElements.params->enabled[OVERLAY_PARAM_ENABLED_gpu_vic_clock]){
ImGui::TableNextColumn();
ImGui::TextColored(HUDElements.colors.gpu, "%s", "VIC");
ImGui::TableNextColumn();
ImguiNextColumnOrNewRow();
right_aligned_text(text_color, HUDElements.ralign_width, "%i", gpu_info.VICClock);
ImGui::SameLine(0, 1.0f);
ImGui::PushFont(HUDElements.sw_stats->font1);
ImGui::Text("MHz");
ImGui::PopFont();
}
}
}

@@ -545,6 +578,22 @@ void HudElements::ram(){
}
}

if (HUDElements.params->enabled[OVERLAY_PARAM_ENABLED_ram_clock]){
ImguiNextColumnOrNewRow();
right_aligned_text(HUDElements.colors.text, HUDElements.ralign_width, "%i", memclock);
ImGui::SameLine(0, 1.0f);
ImGui::PushFont(HUDElements.sw_stats->font1);
ImGui::Text("MHz");
ImGui::PopFont();
}

if (HUDElements.params->enabled[OVERLAY_PARAM_ENABLED_ram_bandwidth]){
ImguiNextColumnOrNewRow();
right_aligned_text(HUDElements.colors.text, HUDElements.ralign_width, "%i", membandwidth);
ImGui::SameLine(0, 1.0f);
ImGui::TextColored(HUDElements.colors.text,"%%");
}

if (HUDElements.params->enabled[OVERLAY_PARAM_ENABLED_ram] && HUDElements.params->enabled[OVERLAY_PARAM_ENABLED_swap]){
ImguiNextColumnOrNewRow();
right_aligned_text(HUDElements.colors.text, HUDElements.ralign_width, "%.1f", swapused);
@@ -1322,6 +1371,42 @@ void HudElements::graphs(){
HUDElements.TextColored(HUDElements.colors.engine, "%s", "GPU Core Clock");
}

if (value == "gpu_nvdec_clock"){
for (auto& it : graph_data){
arr.push_back(float(it.gpu_nvdec_clock));
}
if (int(arr.back()) > HUDElements.gpu_nvdec_max)
HUDElements.gpu_nvdec_max = arr.back();

HUDElements.max = HUDElements.gpu_nvdec_max;
HUDElements.min = 0;
ImGui::TextColored(HUDElements.colors.engine, "%s", "NVDEC Clock");
}

if (value == "gpu_nvenc_clock"){
for (auto& it : graph_data){
arr.push_back(float(it.gpu_nvenc_clock));
}
if (int(arr.back()) > HUDElements.gpu_nvenc_max)
HUDElements.gpu_nvenc_max = arr.back();

HUDElements.max = HUDElements.gpu_nvenc_max;
HUDElements.min = 0;
ImGui::TextColored(HUDElements.colors.engine, "%s", "NVENC Clock");
}

if (value == "gpu_vic_clock"){
for (auto& it : graph_data){
arr.push_back(float(it.gpu_vic_clock));
}
if (int(arr.back()) > HUDElements.gpu_vic_max)
HUDElements.gpu_vic_max = arr.back();

HUDElements.max = HUDElements.gpu_vic_max;
HUDElements.min = 0;
ImGui::TextColored(HUDElements.colors.engine, "%s", "VIC Clock");
}

if (value == "gpu_mem_clock"){
for (auto& it : graph_data){
arr.push_back(float(it.gpu_mem_clock));
4 changes: 2 additions & 2 deletions src/hud_elements.h
@@ -42,9 +42,9 @@ class HudElements{
std::vector<Function> ordered_functions;
std::vector<float> gamescope_debug_latency {};
std::vector<float> gamescope_debug_app {};
int min, max, gpu_core_max, gpu_mem_max, cpu_temp_max, gpu_temp_max;
int min, max, gpu_core_max, gpu_nvdec_max, gpu_nvenc_max, gpu_vic_max, gpu_mem_max, cpu_temp_max, gpu_temp_max;
const std::vector<std::string> permitted_params = {
"gpu_load", "cpu_load", "gpu_core_clock", "gpu_mem_clock",
"gpu_load", "cpu_load", "gpu_core_clock", "gpu_nvdec_clock", "gpu_nvenc_clock", "gpu_vic_clock", "gpu_mem_clock",
"vram", "ram", "cpu_temp", "gpu_temp"
};
std::vector<exec_entry> exec_list;
3 changes: 3 additions & 0 deletions src/logging.h
@@ -21,6 +21,9 @@ struct logData{
int cpu_temp;
int gpu_temp;
int gpu_core_clock;
int gpu_nvdec_clock;
int gpu_nvenc_clock;
int gpu_vic_clock;
int gpu_mem_clock;
int gpu_power;
float gpu_vram_used;
9 changes: 8 additions & 1 deletion src/memory.cpp
@@ -7,8 +7,11 @@
#include <thread>
#include <unistd.h>

#include "file_utils.h"

struct memory_information mem_info;
float memused, memmax, swapused, swapmax;
int memclock, membandwidth;
struct process_mem proc_mem {};

FILE *open_file(const char *file, int *reported) {
@@ -97,7 +100,11 @@ void update_meminfo(void) {

memused = (float(mem_info.memmax) - float(mem_info.memeasyfree)) / (1024 * 1024);
memmax = float(mem_info.memmax) / (1024 * 1024);


if (file_exists("/sys/kernel/debug/clk/emc/clk_rate")) {
memclock = std::stoi(read_line("/sys/kernel/debug/clk/emc/clk_rate")) / 1000000 ;
// guard against dividing by zero when the EMC clock reads as 0 MHz
membandwidth = memclock > 0 ? std::stoi(read_line("/sys/kernel/actmon_avg_activity/mc_all")) / memclock / 10 : 0 ;
}
swapused = (float(mem_info.swapmax) - float(mem_info.swapfree)) / (1024 * 1024);
swapmax = float(mem_info.swapmax) / (1024 * 1024);
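
The bandwidth figure in this hunk is the activity counter divided by the EMC clock in MHz and then by 10. That arithmetic yields a percentage if the `mc_all` counter is in kHz, since kHz / (MHz × 1000) × 100 reduces to kHz / MHz / 10. The unit is inferred from the PR's own division and is not confirmed anywhere in the diff. A small sketch with a zero-clock guard, where `emc_utilisation_percent` is a hypothetical name:

```cpp
// Percent utilisation following the PR's arithmetic: activity / clock_mhz / 10.
// Assumes avg_activity is in kHz (an inference, see above).
// Returns 0 when the clock is not positive, to avoid dividing by zero.
int emc_utilisation_percent(long long avg_activity, int clock_mhz) {
    if (clock_mhz <= 0) return 0;
    return static_cast<int>(avg_activity / clock_mhz / 10);
}
```

With an activity value of 800000 and a 1600 MHz clock this gives 800000 / 1600 / 10 = 50.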

1 change: 1 addition & 0 deletions src/memory.h
@@ -6,6 +6,7 @@
#include <thread>

extern float memused, memmax, swapused, swapmax, rss;
extern int memclock, membandwidth;

struct memory_information {
/* memory information in kilobytes */
3 changes: 3 additions & 0 deletions src/overlay.cpp
@@ -164,6 +164,9 @@ void update_hw_info(const struct overlay_params& params, uint32_t vendorID)
currentLogData.gpu_load = gpu_info.load;
currentLogData.gpu_temp = gpu_info.temp;
currentLogData.gpu_core_clock = gpu_info.CoreClock;
currentLogData.gpu_nvdec_clock = gpu_info.NVDECClock;
currentLogData.gpu_nvenc_clock = gpu_info.NVENCClock;
currentLogData.gpu_vic_clock = gpu_info.VICClock;
currentLogData.gpu_mem_clock = gpu_info.MemClock;
currentLogData.gpu_vram_used = gpu_info.memoryUsed;
currentLogData.gpu_power = gpu_info.powerUsage;
15 changes: 12 additions & 3 deletions src/overlay_params.cpp
@@ -700,16 +700,25 @@ parse_overlay_env(struct overlay_params *params,
static void set_param_defaults(struct overlay_params *params){
params->enabled[OVERLAY_PARAM_ENABLED_fps] = true;
params->enabled[OVERLAY_PARAM_ENABLED_frame_timing] = true;
params->enabled[OVERLAY_PARAM_ENABLED_core_load] = false;
params->enabled[OVERLAY_PARAM_ENABLED_core_load] = true;
params->enabled[OVERLAY_PARAM_ENABLED_core_bars] = false;
params->enabled[OVERLAY_PARAM_ENABLED_cpu_temp] = false;
params->enabled[OVERLAY_PARAM_ENABLED_cpu_temp] = true;
params->enabled[OVERLAY_PARAM_ENABLED_cpu_power] = false;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_temp] = false;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_junction_temp] = false;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_mem_temp] = false;
params->enabled[OVERLAY_PARAM_ENABLED_cpu_stats] = true;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_stats] = true;
params->enabled[OVERLAY_PARAM_ENABLED_ram] = false;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_core_clock] = true;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_nvdec_clock] = true;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_nvenc_clock] = true;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_vic_clock] = true;
params->enabled[OVERLAY_PARAM_ENABLED_engine_version] = true;
params->enabled[OVERLAY_PARAM_ENABLED_gpu_name] = true;
params->enabled[OVERLAY_PARAM_ENABLED_resolution] = true;
params->enabled[OVERLAY_PARAM_ENABLED_ram] = true;
params->enabled[OVERLAY_PARAM_ENABLED_ram_clock] = true;
params->enabled[OVERLAY_PARAM_ENABLED_ram_bandwidth] = true;
params->enabled[OVERLAY_PARAM_ENABLED_swap] = false;
params->enabled[OVERLAY_PARAM_ENABLED_vram] = false;
params->enabled[OVERLAY_PARAM_ENABLED_read_cfg] = false;
5 changes: 5 additions & 0 deletions src/overlay_params.h
@@ -40,6 +40,8 @@ typedef unsigned long KeySym;
OVERLAY_PARAM_BOOL(cpu_stats) \
OVERLAY_PARAM_BOOL(gpu_stats) \
OVERLAY_PARAM_BOOL(ram) \
OVERLAY_PARAM_BOOL(ram_clock) \
OVERLAY_PARAM_BOOL(ram_bandwidth) \
OVERLAY_PARAM_BOOL(swap) \
OVERLAY_PARAM_BOOL(vram) \
OVERLAY_PARAM_BOOL(procmem) \
@@ -52,6 +54,9 @@ typedef unsigned long KeySym;
OVERLAY_PARAM_BOOL(io_write) \
OVERLAY_PARAM_BOOL(gpu_mem_clock) \
OVERLAY_PARAM_BOOL(gpu_core_clock) \
OVERLAY_PARAM_BOOL(gpu_nvdec_clock) \
OVERLAY_PARAM_BOOL(gpu_nvenc_clock) \
OVERLAY_PARAM_BOOL(gpu_vic_clock) \
OVERLAY_PARAM_BOOL(gpu_power) \
OVERLAY_PARAM_BOOL(arch) \
OVERLAY_PARAM_BOOL(media_player) \