Fixes and Improvements #64

Open: Martinski4GitHub wants to merge 7 commits into develop

Conversation

@Martinski4GitHub Martinski4GitHub commented Mar 4, 2025

  1. IMPROVED: Modified all SQLite3 calls to capture and log errors in the system log.

  2. IMPROVED: Modified SQLite3 configuration parameters to improve the trimming of records from the database and then perform "garbage collection" of deleted entries to reclaim unused space & avoid excessive fragmentation.

  3. IMPROVED: Modified SQLite3 configuration parameters to improve the processing of database records.

  4. IMPROVED: Modified code to set the corresponding priority level of log entries when calling the built-in logger utility.

  5. IMPROVED: Modified the startup call made in the post-mount script to check if the USB-attached disk partition passed as argument has indeed Entware installed.

  6. IMPROVED: Added code to show the current database file size information on the CLI menu and the webGUI page.

  7. IMPROVED: Added code to show the "JFFS Available" space information for the "Data Storage Location" option on the CLI menu and the webGUI page.

  8. IMPROVED: Added code to check if sufficient JFFS storage space is available before moving database-related files/folders from the USB location to the JFFS partition. An error message is reported if not enough space is available, and the move request is aborted.

  9. IMPROVED: Added code to check if the available JFFS storage space falls below 20% of total space or 10MB (whichever is lower) and report a warning when it does. A warning message is also shown on the SSH CLI menu and WebGUI page.

  10. IMPROVED: Added and modified code so that every time the SSH CLI menu is run, it checks if the WebGUI page has already been mounted. If not found mounted, the script will run the code to remount the WebGUI.

  11. IMPROVED: Improved code that creates (during installation) and removes (during uninstallation) the "AddOns" menu tab entry for the WebGUI to make sure it checks for and takes into account other add-ons that may have been installed before or were later installed after the initial installation.

  12. IMPROVED: Added "export PATH" statement to give the built-in binaries higher priority than the equivalent Entware binaries.

  13. FIXED: Modified code to correctly detect when a WireGuard interface is up (connected) or down (disconnected) in addition to being not enabled. Since a WireGuard connection is essentially stateless, the new code improves on the detection method to provide the correct status so the user can select the desired interface to run a speed test.
    NOTE: This fix was provided by @ExtremeFiretop.

  14. Miscellaneous code improvements & fine-tuning.
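
For readers following along, the threshold rule in item 9 (warn when available JFFS space drops below 20% of total or 10MB, whichever is lower) can be sketched roughly as below. This is only an illustration, assuming both sizes are given in KB; the function name jffs_space_is_low is hypothetical and is not taken from the PR's code.

```shell
#!/bin/sh
# Hypothetical helper, not the PR's actual implementation.
# Args: total and available JFFS space, both in KB (assumption).
# Returns 0 (true) when a low-space warning should be shown.
jffs_space_is_low()
{
    total_kb="$1" ; avail_kb="$2"
    pct_limit_kb=$(( total_kb * 20 / 100 ))   # 20% of total space
    abs_limit_kb=$(( 10 * 1024 ))             # 10MB expressed in KB

    # Use whichever of the two thresholds is lower.
    if [ "$pct_limit_kb" -lt "$abs_limit_kb" ]
    then limit_kb="$pct_limit_kb"
    else limit_kb="$abs_limit_kb"
    fi

    [ "$avail_kb" -lt "$limit_kb" ]
}
```

With a 48MB partition the 20% rule gives the lower limit (about 9.6MB); with a 64MB partition the fixed 10MB limit applies instead.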

@Martinski4GitHub
Author

@jackyaz & @ExtremeFiretop,

This PR includes UI improvements that were made to better handle the additional Wireguard interfaces that now show up on the CLI menu and the WebGUI page.

For example, the CLI menu used to look like this:

spdMerlin_v4 4 6_CLI_BEFORE

Now with the changes, it's shown like this:

spdMerlin_v4 4 6_CLI_AFTER

On the WebGUI page, the longer list of interfaces was split into 2 lines but at the wrong places:

spdMerlin_v4 4 6_WebGUI_BEFORE

Now with my changes, the list gets split into groups like this:

spdMerlin_v4 4 6_WebGUI_AFTER_01

I also added some tooltips to help indicate why an interface may not be selected:

spdMerlin_v4 4 6_WebGUI_AFTER_02

spdMerlin_v4 4 6_WebGUI_AFTER_03

There are other improvements as well; the ones above are those related to the additional WireGuard interfaces.

@ExtremeFiretop

Oh man, you went hard! Love it!

Let me know if you need any assistance in testing it and reporting feedback.

I already quickly updated to the latest code from your repo and noticed the improvements to the new interface selection so everything lines up nicely now! Love this improvement! Looks cleaner!

Fine-tuned the margins on the WebGUI page to make the interface checkboxes and radio buttons align vertically as needed.
@Martinski4GitHub
Author

FYI,

Made a few changes to better align the interface checkboxes and radio buttons on the WebGUI.
This looks better, IMO, than it was before:

spdMerlin_v4 4 6_WebGUI_AFTER_01

@Martinski4GitHub
Author

Oh man, you went hard! Love it!

Let me know if you need any assistance in testing it and reporting feedback.

Yes, thanks. It would be good to have more validation with Wireguard. I ran tests with OpenVPN only but not for long. A friend of mine will be testing this latest version with Wireguard sometime this week too.

I already quickly updated to the latest code from your repo and noticed the improvements to the new interface selection so everything lines up nicely now! Love this improvement! Looks cleaner!

Yeah, I made changes on the CLI as well to align the menu options more nicely (at least the ones I found).

Last night, I went to bed and had a nagging itch that I needed to scratch: on the WebGUI, the radio buttons and checkboxes for the VPN interfaces were not aligned nicely :>). At first, I was OK with that, and already too tired by then to continue. But the more I thought about it, the more it was nagging at me :>). So this morning I woke up and decided to make the changes (You know that's how I roll!!! LOL!!!).

@ExtremeFiretop

Oh man, you went hard! Love it!
Let me know if you need any assistance in testing it and reporting feedback.

Yes, thanks. It would be good to have more validation with Wireguard. I ran tests with OpenVPN only but not for long. A friend of mine will be testing this latest version with Wireguard sometime this week too.

I'm off this week for my birthday and mostly sitting at home bored since my girl didn't manage to get time off her work. Which means I basically have a free week of scripting MerlinAU and testing whatever you need. I'll report back shortly, as I run my entire /24 subnet through WireGuard.

I already quickly updated to the latest code from your repo and noticed the improvements to the new interface selection so everything lines up nicely now! Love this improvement! Looks cleaner!

Yeah, I made changes on the CLI as well to align the menu options more nicely (at least the ones I found).

Last night, I went to bed and had a nagging itch that I needed to scratch: on the WebGUI, the radio buttons and checkboxes for the VPN interfaces were not aligned nicely :>). At first, I was OK with that, and already too tired by then to continue. But the more I thought about it, the more it was nagging at me :>). So this morning I woke up and decided to make the changes (You know that's how I roll!!! LOL!!!).

It's funny because last night when I went to bed I had a very similar itch. Like you, I was happy to leave it; I told myself it's functional, and that's all that matters when I spent a grand total of about an hour implementing it between the 2 PRs.

But I wake up to find out you had the itch as well! 😀 Nothing wrong with making things look pretty to the eyes.

@ExtremeFiretop

ExtremeFiretop commented Mar 5, 2025

@Martinski4GitHub and @jackyaz

I am happy to report I have completed the testing of this PR on the latest firmware for the WiFi 7 models (3006.102.3)
Everything is looking good and running well!

Here are the results:
image
image

  2. IMPROVED: Modified SQLite3 configuration parameters to improve the trimming of records from the database and then perform "garbage collection" of deleted entries to reclaim unused space & avoid excessive fragmentation.
  3. IMPROVED: Modified SQLite3 configuration parameters to improve the processing of database records.

Unfortunately I don't have a month's worth of data since the last time I reset the database was while testing my PR in an upgrade situation.

The script's default minimum is 15 days, and I only have 2 days of data...
But for the fun of it, I lowered the local MINvalue=15 MAXvalue=365 #Days# in the script to 1 day, tried to purge old data with /jffs/scripts/spdmerlin trimdb, and had success!
image
image
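
A minimal sketch of what a trim followed by space reclamation could look like, with the SQL generated by a small shell helper. The table name, the [Timestamp] column (assumed to hold epoch seconds), and the use of VACUUM are assumptions for illustration only; the PR's actual statements and PRAGMA settings may well differ.

```shell
#!/bin/sh
# Hypothetical sketch; table/column names are assumptions, not the PR's schema.
# Prints SQL that deletes rows older than N days and then compacts the file.
build_trim_sql()
{
    table="$1" ; days="$2"

    # Accept only a plain positive integer for the number of days.
    case "$days" in
        ''|*[!0-9]*) return 1 ;;
    esac

    # Cutoff as epoch seconds, N days before "now".
    cutoff="CAST(strftime('%s','now','-${days} day') AS INTEGER)"

    printf 'DELETE FROM [%s] WHERE [Timestamp] < %s;\n' "$table" "$cutoff"
    # VACUUM rebuilds the database file so freed pages are returned to the filesystem.
    printf 'VACUUM;\n'
}
```

The generated text could then be piped into sqlite3 against the database file, checking the exit status and passing any error output to logger, along the lines of items 1 and 4 in the PR description.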

  6. IMPROVED: Added code to show the current database file size information on the CLI menu and the webGUI page.

Looks beautiful! Working as intended!
image
image

  7. IMPROVED: Added code to show the "JFFS Available" space information for the "Data Storage Location" option on the CLI menu and the webGUI page.

Looks beautiful! Working as intended!
image
image

  8. IMPROVED: Added code to check if sufficient JFFS storage space is available before moving database-related files/folders from the USB location to the JFFS partition. An error message is reported if not enough space is available, and the move request is aborted.

I can tell more is happening when I attempt to move the database: it pauses for longer while it says "please wait". I can't fill my USB or JFFS to test this, but what I can do is fake the numbers to the script by removing the : return 0 in the new Check_JFFS_SpaceAvailable function, which appears to work as designed:

image
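
As a rough illustration of the kind of pre-move check being tested here, the sketch below compares the size of the source directory with the free space on the destination. It is not the PR's Check_JFFS_SpaceAvailable function; the name has_room_for_move and its arguments are hypothetical, and it assumes du and df report sizes in KB via the -k option.

```shell
#!/bin/sh
# Hypothetical sketch, not the PR's Check_JFFS_SpaceAvailable function.
# Returns 0 only if the destination has more free KB than the source uses.
has_room_for_move()
{
    src_dir="$1" ; dst_dir="$2"

    # Space used by the source tree, in KB.
    need_kb="$(du -sk "$src_dir" 2>/dev/null | awk '{print $1}')"
    # Space available on the destination filesystem, in KB
    # (-P keeps each filesystem on one line, so column 4 is "Available").
    free_kb="$(df -kP "$dst_dir" 2>/dev/null | awk 'NR==2 {print $4}')"

    # Treat any failure to read either number as "not enough space".
    [ -n "$need_kb" ] && [ -n "$free_kb" ] && [ "$free_kb" -gt "$need_kb" ]
}
```

A caller would abort the move and report an error when the function returns non-zero, which matches the behaviour described in item 8.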

  10. IMPROVED: Added and modified code so that every time the SSH CLI menu is run, it checks if the WebGUI page has already been mounted. If not found mounted, the script will run the code to remount the WebGUI.

Confirmed this is functional by removing the .asp file from the www directory; and it automatically remounted the WebUI:
image

This is preemptively addressing a possible scenario described with scMerlin on this SNB Forums post:

https://www.snbforums.com/threads/scmerlin-2-5-9-service-and-script-control-menu-for-asuswrt-merlin-feb-12-2025.89224/page-5#post-944870

I see this is the same fix we used for MerlinAU? NICE! PR: ExtremeFiretop/MerlinAutoUpdate-Router#403

  11. IMPROVED: Improved code that creates (during installation) and removes (during uninstallation) the "AddOns" menu tab entry for the WebGUI to make sure it checks for and takes into account other add-ons that may have been installed before or were later installed after the initial installation.

I noticed this myself actually while testing the other day; I was having odd behavior at times around the addons tab when trying to uninstall spdMerlin; happy you improved this!

  12. IMPROVED: Added "export PATH" statement to give the built-in binaries higher priority than the equivalent Entware binaries.

The added export PATH is a nice fix to resolve any Entware binaries that conflict with the built-in binaries (parameters may be passed differently between them, for example). I see this is the same fix we used for MerlinAU? NICE!
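
For context, the general idea is to put the firmware's own binary directories ahead of everything else in PATH. The exact directory list below is an assumption for illustration and may not match the statement added in the PR.

```shell
#!/bin/sh
# Assumption: these are the firmware's built-in binary directories.
# Listing them first means commands such as grep or awk resolve to the
# built-in versions before any same-named Entware binaries under /opt.
export PATH="/sbin:/bin:/usr/sbin:/usr/bin:${PATH}"
```

Entware binaries remain reachable because the previous PATH is kept at the end; they just lose priority when a built-in equivalent exists.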

@ExtremeFiretop

In my humble opinion buddy; all appears functional and "improved" and is ready for merging :)
As you know; I'm just your level 3 support; but all appears clean and passing the environment tests on the Gnuton and 3006 router.

@Martinski4GitHub
Author

Excellent testing & validation!!!
I appreciate your taking the time to do this and being another pair of eyes to take a look. You pretty much touched on all the key changes & improvements. Thanks, buddy!!

@Martinski4GitHub
Author

In my humble opinion buddy; all appears functional and "improved" and is ready for merging :) As you know; I'm just your level 3 support; but all appears clean and passing the environment tests on the Gnuton and 3006 router.

Thanks again bud for the additional testing on Gnuton & the latest 3006 F/W.
Now it's up to @jackyaz to finish his own review and merge the PR into the 'develop' branch whenever he can.

@ExtremeFiretop

ExtremeFiretop commented Mar 5, 2025

Excellent testing & validation!!! I appreciate your taking the time to do this and being another pair of eyes to take a look. You pretty much touched on all the key changes & improvements. Thanks, buddy!!

It's not our project, but I figured I'd spend some hours today doing code review and testing of the new code since I had the free time. I hope this helps!

@Martinski4GitHub
Author

It's not our project, but I figured I'd spend some hours today doing code review and testing of the new code since I had the free time. I hope this helps!

It definitely helps to get additional validation & more mileage with the latest changes. A few relatives and friends use this add-on so they also appreciate the improvements made and the additional Wireguard support. Thanks, bud!!

@ExtremeFiretop

ExtremeFiretop commented Mar 5, 2025

It definitely helps to get additional validation & more mileage with the latest changes. A few relatives and friends use this add-on so they also appreciate the improvements made and the additional Wireguard support. Thanks, bud!!

I got the WireGuard support idea because I recently changed to ProtonVPN as my VPN provider and set up WireGuard, and noticed spdMerlin didn't support it. ☹️

Then I went to the forums and saw @jackyaz mention that it should be easy to add, but he doesn't have the hardware to test anymore. So I figured, why not learn what's under the hood of spdMerlin? As you know, I usually avoid opening others' code unless I absolutely have to. But it sounded like he was giving the green light, more or less.

At this point I've reviewed enough of the code to say it's truly a great addon and well coded like all his addons! I hope we continue to get it for many firmware versions to come!

@Martinski4GitHub
Author

I got the Wireguard support idea because I recently changed to ProtonVPN as my VPN provider and noticed it didn't support it. ☹️

Then I went to the forums and saw @jackyaz mention that it should be easy to add, but he doesn't have the hardware to test anymore. So I figured, why not learn what's under the hood of spdMerlin? As you know, I usually avoid opening others' code unless I absolutely have to. But it sounded like he was giving the green light, more or less.

At this point I've reviewed enough of the code to say it's truly a great addon and well coded like all his addons! I hope we continue to get it for many firmware versions to come!

I saw the initial requests on the forums to get Wireguard support, but my only ASUS router at the time didn't have Wireguard, I don't use commercial VPN providers at all, and I didn't even have the add-on installed, so I was not going to try to modify the code without being able to do any "in-house" test & validation on my own router first. I'm glad you took the idea and ran with it!!

Yes, @jackyaz's code is fairly easy to read & follow because it's modular & well-structured. As long as the APIs for the Ookla speedtest executable do not change too drastically, it should be straightforward to maintain compatibility with future F/W versions.

@moonbuggy
Contributor

On my RT-BE88U, /sys/class/net/wgcX/operstate seems to always contain "unknown". This means I hit the "$IFACE not up, please check. Skipping speedtest for $IFACE_NAME" warning, and also miss on the " #excluded - interface not up#" bits when it checks if operstate is "down".

It looks like you're seeing appropriate values in operstate on the AC86U, based on the screenshots. I don't know why it's differing for me, but there's a bunch of things I'm noticing are different in this 3006.102.3 firmware (having come directly from 384.13 on an AC3200). So far I've just been assuming that's to blame by default when something amtm installs doesn't quite work right.

It's possible I've screwed up the WireGuard client somehow though and it should be putting useful information into /sys/class/net. I don't know how I might have done that, but I've only had the router for a week, so I'm still poking at it to see how it works.

I had started to patch WireGuard in myself before I noticed this PR. I ended up going with:

if [ ! -f "/sys/class/net/wgc$index/operstate" ] || [ ! "$(ip -br a show wgc$index 2>/dev/null)" ]; then

I didn't spend a huge amount of time on it, but I didn't see anything I could easily and usefully pull out of /sys/class/net. The whole /sys/class/net/wgcX/ folder disappears when I take a WireGuard interface down, so [ ! -f "/sys/class/net/wgc$index/operstate" ] is fine, just operstate is useless.

If it's useful, a running interface looks like this:

moonbuggy@router:/tmp/home/root$ tail -n +1 /sys/class/net/wgc1/*
==> /sys/class/net/wgc1/addr_assign_type <==
0

==> /sys/class/net/wgc1/addr_len <==
0

==> /sys/class/net/wgc1/address <==


==> /sys/class/net/wgc1/bcm_mcastrouter <==
0

==> /sys/class/net/wgc1/broadcast <==


==> /sys/class/net/wgc1/carrier <==
1

==> /sys/class/net/wgc1/carrier_changes <==
0

==> /sys/class/net/wgc1/carrier_down_count <==
0

==> /sys/class/net/wgc1/carrier_up_count <==
0

==> /sys/class/net/wgc1/dev_id <==
0x0

==> /sys/class/net/wgc1/dev_port <==
0

==> /sys/class/net/wgc1/dormant <==
0

==> /sys/class/net/wgc1/duplex <==
tail: read error: Invalid argument

==> /sys/class/net/wgc1/flags <==
0x91

==> /sys/class/net/wgc1/gro_flush_timeout <==
0

==> /sys/class/net/wgc1/ifalias <==

==> /sys/class/net/wgc1/ifindex <==
39

==> /sys/class/net/wgc1/iflink <==
39

==> /sys/class/net/wgc1/link_mode <==
0

==> /sys/class/net/wgc1/mtu <==
1390

==> /sys/class/net/wgc1/name_assign_type <==
3

==> /sys/class/net/wgc1/netdev_group <==
0

==> /sys/class/net/wgc1/operstate <==
unknown

==> /sys/class/net/wgc1/phys_port_id <==
tail: read error: Operation not supported

==> /sys/class/net/wgc1/phys_port_name <==
tail: read error: Operation not supported

==> /sys/class/net/wgc1/phys_switch_id <==
tail: read error: Operation not supported

==> /sys/class/net/wgc1/proto_down <==
0

==> /sys/class/net/wgc1/queues <==
tail: read error: Is a directory

==> /sys/class/net/wgc1/speed <==
tail: read error: Invalid argument

==> /sys/class/net/wgc1/statistics <==
tail: read error: Is a directory

==> /sys/class/net/wgc1/subsystem <==
tail: read error: Is a directory

==> /sys/class/net/wgc1/tx_queue_len <==
1000

==> /sys/class/net/wgc1/type <==
65534

==> /sys/class/net/wgc1/uevent <==
DEVTYPE=wireguard
INTERFACE=wgc1
IFINDEX=39

Otherwise, things seem to generally work.
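
One possible alternative to operstate, sketched under assumptions: the dump above shows flags as 0x91, and the lowest bit (0x1) is the kernel's IFF_UP flag, so the flags file could be tested instead. The function name iface_is_up and the optional sysfs root argument (there only to make the check testable) are hypothetical, and this reflects administrative state only, not whether the WireGuard tunnel has completed a handshake.

```shell
#!/bin/sh
# Hypothetical sketch: treat an interface as "up" when its sysfs entry
# exists and the IFF_UP bit (0x1) is set in its flags file.
iface_is_up()
{
    iface="$1" ; sysfs_root="${2:-/sys/class/net}"
    flags_file="${sysfs_root}/${iface}/flags"

    # No sysfs entry means the interface does not exist (or was taken down).
    [ -f "$flags_file" ] || return 1

    flags="$(cat "$flags_file")"
    # Expect a hex value such as 0x91; anything else counts as "not up".
    case "$flags" in
        0x[0-9a-fA-F]*) ;;
        *) return 1 ;;
    esac

    [ $(( ${flags} & 0x1 )) -ne 0 ]
}
```

Usage would be something like iface_is_up "wgc$index" inside the existing loop; whether this behaves consistently across router models and firmware versions would still need testing.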

@moonbuggy
Contributor

Actually, I just realised..

The operstate issue isn't what triggered the "$IFACE not up, please check. Skipping speedtest for $IFACE_NAME" warning. I was busy looking at the " #excluded - interface not up#" issue and saw the if statement next to the "not up" warning and assumed it was the same deal.

$IFACE is empty, for some reason, when it does an automated scan. Sorry, my bad.

Mar  7 14:12:01 spdMerlin: Starting speedtest using <server> for WAN interface
Mar  7 14:12:29 spdMerlin: Speedtest results - Download: 91.51 Mbps (data used: 111.7 MB) - Upload: 18.13 Mbps (data used: 22.6 MB)
Mar  7 14:12:29 spdMerlin: Connection quality - Idle Latency: 10.01 ms (jitter: 0.66ms, low: 8.91ms, high: 10.40ms) - Packet Loss: 0.0%
Mar  7 14:12:29 spdMerlin: Starting speedtest using <server> for VPNC2 interface
Mar  7 14:12:53 spdMerlin: Speedtest results - Download: 90.18 Mbps (data used: 108.3 MB) - Upload: 4.56 Mbps (data used: 5.6 MB)
Mar  7 14:12:53 spdMerlin: Connection quality - Idle Latency: 63.48 ms (jitter: 2.05ms, low: 61.82ms, high: 64.42ms) - Packet Loss: 0.0%
Mar  7 14:12:53 spdMerlin:  not up, please check. Skipping speedtest for WGVPN1
Mar  7 14:12:54 spdMerlin:  not up, please check. Skipping speedtest for WGVPN2
Mar  7 14:12:54 spdMerlin:  not up, please check. Skipping speedtest for WGVPN3
Mar  7 14:12:54 spdMerlin:  not up, please check. Skipping speedtest for WGVPN4
Mar  7 14:12:54 spdMerlin:  not up, please check. Skipping speedtest for WGVPN5

It goes alright when I scan all interfaces from the shell though:

Mar  7 14:22:16 spdMerlin: Starting speedtest using <server> for WAN interface
Mar  7 14:22:38 spdMerlin: Speedtest results - Download: 89.28 Mbps (data used: 108.8 MB) - Upload: 18.17 Mbps (data used: 22.6 MB)
Mar  7 14:22:38 spdMerlin: Connection quality - Idle Latency: 9.42 ms (jitter: 0.78ms, low: 8.57ms, high: 10.37ms) - Packet Loss: 0.0%
Mar  7 14:22:38 spdMerlin: Starting speedtest using <server> for VPNC2 interface
Mar  7 14:23:02 spdMerlin: Speedtest results - Download: 85.24 Mbps (data used: 97.6 MB) - Upload: 3.21 Mbps (data used: 4.2 MB)
Mar  7 14:23:02 spdMerlin: Connection quality - Idle Latency: 62.70 ms (jitter: 0.92ms, low: 61.87ms, high: 64.08ms) - Packet Loss: 0.0%
Mar  7 14:23:03 spdMerlin: Starting speedtest using <server> for WGVPN1 interface
Mar  7 14:23:26 spdMerlin: Speedtest results - Download: 88.23 Mbps (data used: 107.4 MB) - Upload: 17.42 Mbps (data used: 21.7 MB)
Mar  7 14:23:26 spdMerlin: Connection quality - Idle Latency: 9.41 ms (jitter: 0.16ms, low: 9.36ms, high: 9.93ms) - Packet Loss: 0.0%
Mar  7 14:23:26 spdMerlin: Starting speedtest using <server> for WGVPN2 interface
Mar  7 14:23:55 spdMerlin: Speedtest results - Download: 85.91 Mbps (data used: 80.3 MB) - Upload: 13.83 Mbps (data used: 15.7 MB)
Mar  7 14:23:55 spdMerlin: Connection quality - Idle Latency: 293.06 ms (jitter: 2.26ms, low: 290.06ms, high: 294.86ms) - Packet Loss: Not available.
Mar  7 14:23:55 spdMerlin: Starting speedtest using <server> for WGVPN3 interface
Mar  7 14:24:24 spdMerlin: Speedtest results - Download: 79.08 Mbps (data used: 79.6 MB) - Upload: 11.79 Mbps (data used: 12.9 MB)
Mar  7 14:24:24 spdMerlin: Connection quality - Idle Latency: 274.91 ms (jitter: 1.86ms, low: 272.39ms, high: 277.84ms) - Packet Loss: 0.0%
<etc..>

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

@moonbuggy

This PR has not been merged into the development branch yet. Are you testing with this PR specifically, or with the code from the development branch?

When did you last update to 4.4.6?
I would run a forceupdate and try again. Using:

/jffs/scripts/spdmerlin develop
/jffs/scripts/spdmerlin forceupdate

This PR doesn't specifically add WireGuard support; that was already merged in develop 4 days ago on March 3rd.
You'll notice in my screenshots I am running a GT-BE98 Pro with 3006 firmware and don't have any of these issues.

@moonbuggy
Contributor

moonbuggy commented Mar 7, 2025

I pulled the code from this PR and pasted it in. I'd just finished adding support for WireGuard myself and was about to start looking at what I needed to do for the web UI when I came across this PR, so I did a reinstall of the official package and stuck the PR on top of it because the Addons menu was being weird for me, and hoped it might fix it.

But yeah, maybe I've screwed something up. I'll do the re-install again, and see what happens.

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

I pulled the code from this PR and pasted it in. I'd just finished adding support for WireGuard myself and was about to start looking at what I needed to do for the web UI when I came across this PR, so I did a reinstall of the official package and stuck your PR on top of it because the Addons menu was being weird for me, and hoped it might fix it.

But yeah, maybe I've screwed something up. I'll do the re-install again, and see what happens.

To be of assistance; I would put whatever work you've done aside somewhere and just not touch it for now so we don't mix anything up.

Do this and report back:

  1. Uninstall spdmerlin (or whatever is left)

  2. Reinstall through AMTM

  3. Run the following commands in order:

/jffs/scripts/spdmerlin develop
/jffs/scripts/spdmerlin forceupdate

  4. Load spdmerlin and configure the interfaces. Take note of the results in the shell script and the WebUI (functional? still issues?)

Because this PR isn't merged yet; we can test it manually; but as you mentioned you need to replace the correct files (.sh and .asp) in the correct directories

being: /www/user

and: /jffs/addons/spdmerlin.d

And of course the .sh file in: /jffs/scripts/

@moonbuggy
Contributor

Oh, yeah, that's why I did a reinstall before I pasted this PR in - to remove my own code. I'd only touched spdmerlin.sh, which in turn changed config slightly, but my changes were definitely gone.

I definitely made sure to paste the ASP code in as well, when I added this PR.

Anyway, full reinstall now - an actual uninstall first, not just a forceupdate, and I haven't stuck this PR back in (so this is probably not the appropriate place to continue this discussion :)).

(Something had definitely gone screwy somewhere before I did the uninstall, because the forceupdate alone had the webUI and shell script with opposing settings - webUI said 'usb' storage, shell script said 'jffs' and when I toggled it on either it toggled on both, so I could change it to 'jffs' in the webUI but it would change the shell script to 'usb' when I did. :))

The operstate is still "unknown" for wgcX interfaces, so that issue is still present for me, although by the sounds of it it's not a problem for you on the firmware..? So I don't know what's up with that.

The automated speedtests do seem to run now though.

So yeah, in summary, it looks like I screwed up the install and then commented on the wrong PR. :) Sorry.

And if you do definitely have something other than "unknown" in your operstate on the same firmware, it looks like I've screwed up WireGuard too. :) Would you mind just confirming that for me?

@ExtremeFiretop

Oh, yeah, that's why I did a reinstall before I pasted this PR in - to remove my own code. I'd only touched spdmerlin.sh, which in turn changed config slightly, but my changes were definitely gone.

I definitely made sure to paste the ASP code in as well, when I added this PR.

Fair enough, it's just we aren't aware of what changes may or may not have been done beforehand, and without a PR of your own for us to review we would just be guessing, so best we start with a clean slate and we are all on the same page. (Or code) Haha.

So yeah, in summary, it looks like I screwed up the install and then commented on the wrong PR. :) Sorry.

And if you do definitely have something other than "unknown" in your operstate on the same firmware, it looks like I've screwed up WireGuard too. :) Would you mind just confirming that for me?

No worries, I'm happy to hear it's functioning better now, I'm actually just sitting down for dinner but give me about 15 minutes and I'll be able to get back to you on this last point.

@ExtremeFiretop

Oh, yeah, that's why I did a reinstall before I pasted this PR in - to remove my own code. I'd only touched spdmerlin.sh, which in turn changed config slightly, but my changes were definitely gone.

I definitely made sure to paste the ASP code in as well, when I added this PR.

Anyway, full reinstall now - an actual uninstall first, not just a forceupdate, and I haven't stuck this PR back in (so this is probably not the appropriate place to continue this discussion :)).

(Something had definitely gone screwy somewhere before I did the uninstall, because the forceupdate alone had the webUI and shell script with opposing settings - webUI said 'usb' storage, shell script said 'jffs' and when I toggled it on either it toggled on both, so I could change it to 'jffs' in the webUI but it would change the shell script to 'usb' when I did. :))

The operstate is still "unknown" for wgcX interfaces, so that issue is still present for me, although by the sounds of it it's not a problem for you on the firmware..? So I don't know what's up with that.

The automated speedtests do seem to run now though.

So yeah, in summary, it looks like I screwed up the install and then commented on the wrong PR. :) Sorry.

And if you do definitely have something other than "unknown" in your operstate on the same firmware, it looks like I've screwed up WireGuard too. :) Would you mind just confirming that for me?

Interestingly; no I also get operstate "unknown" on this firmware.
As seen below:

image

And I've confirmed that interface is up. So it seems to be a generic issue right now; or something has changed with 3006 F/W
Keep in mind 3006 is still in heavy flux, lots of issues being found and reported all the time.

That being said; I don't see anything different between my wgc1 and yours when it comes to operstate which means we may want to consider an alternative before it goes to production.

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

@moonbuggy

Thank you for reporting; so based on the current code; it's assumed up, but if it goes down; it doesn't properly detect the interface going down; I've confirmed that on my side as well. (This is with the interface down)

image

This seems to be for 3006 firmware; we can poke at an alternative to detecting the interface going down.

Edit:

The reason it currently "works" is this check:

if [ ! -f "/sys/class/net/wgc$index/operstate" ]

image

Which gives you the message you saw.

@moonbuggy
Contributor

Fair enough, it's just we aren't aware of what changes may or may not have been done beforehand, and without a PR of your own for us to review we would just be guessing, so best we start with a clean slate and we are all on the same page.

I was mentioning that I'd started adding WireGuard myself mostly as an "I'm an idiot for not checking the dev branch first" sort of deal. :) But also because it gave some context for the alternative conditional statement I provided.

It's been a fairly hectic week. AC3200->BE88U because the AC3200 started randomly losing the WAN link and refusing SSH connections and a re-flash+reset didn't make a difference. Decided I might as well migrate dhcpd->Kea while I was at it, which turned into having to build a hook library to let Kea talk to dnsmasq via a script. Then I had to update a bunch of my own stuff to handle WireGuard, because VPN Director doesn't prioritise IP rules in a sensible way.

Meanwhile, my family just want to watch Netflix and Prime, and I'm forcing them out a VPN tunnel in another country because it's the best I can manage with my half-broken adhoc setup, and neither Netflix, Prime nor my family are happy about it :)

Interestingly; no I also get operstate "unknown" on this firmware.

Thanks for confirming that for me. I'm relieved that at least one of the issues in my comment is a real thing and not something I've accidentally done to myself somehow. :) (Although even that small victory is tainted by my commenting on the wrong PR. Apologies again. I need more sleep.)

For what it's worth, your screenshots of the issue match my screen.

@ExtremeFiretop

It's been a fairly hectic week. AC3200->BE88U because the AC3200 started randomly losing the WAN link and refusing SSH connections and a re-flash+reset didn't make a difference. Decided I might as well migrate dhcpd->Kea while I was at it, which turned into having to build a hook library to let Kea talk to dnsmasq via a script. Then I had to update a bunch of my own stuff to handle WireGuard, because VPN Director doesn't prioritise IP rules in a sensible way.

Meanwhile, my family just want to watch Netflix and Prime, and I'm forcing them out a VPN tunnel in another country because it's the best I can manage with my half-broken adhoc setup, and neither Netflix, Prime nor my family are happy about it :)

. Apologies again. I need more sleep.)

For what it's worth, your screenshots of the issue match my screen.

No worries I've been there done that.

We all have oversights and sometimes lack of sleep! I'll poke at a different solution myself tomorrow and see if we can find something slick that works with the current limitations of the WireGuard client on 3006.

I appreciate you bringing this up to our attention even if at the wrong PR haha. I'm sure Martinski will probably poke at this some more as well now that we know about it.

@moonbuggy
Contributor

moonbuggy commented Mar 7, 2025

I just had a look around, a little bit deeper than my previous investigation when I was monkey patching. I see people saying the unpopulated /sys/class/net/wgcX is intentional, with WireGuard generally wanting to be stealthy and stateless.

WireGuard has a PDF that says things like:

Key exchanges, connections, disconnections, reconnections, discovery, and so forth happen behind the scenes transparently and reliably, and the administrator does not need to worry about these details. In other words, from the perspective of administration, the WireGuard interface appears to be stateless

I can't test it myself, it doesn't run on the AC3200 afaik, but I'd suggest you might want @Martinski4GitHub to check the operstate on their (presumably) 388.8 firmware. It looks like it's a WireGuard issue, not a 3006.102.3 issue as I had been assuming.

Looking for the interface with ip <whatever> or similar may be the only viable way to do that test.

@ExtremeFiretop

I can't test it myself, it doesn't run on the AC3200 afaik, but I'd suggest you might want @Martinski4GitHub to check the operstate on their (presumably) 388.8 firmware. It looks like it's a WireGuard issue, not a 3006.102.3 issue as I had been assuming.

My plan to test this myself today was to reinstall it on my Gnuton router. It's still running 3004 firmware.

While I did install it once and test it for Martinski a few days ago, I just made sure the interface ran a speed test and the database was updating, I didn't go as far as disconnecting the interface on Gnuton which is what we are talking about today.

I'll report back more later today.

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

I just had a look around, a little bit deeper than my previous investigation when I was monkey patching. I see people saying the unpopulated /sys/class/net/wgcX is intentional, with WireGuard generally wanting to be stealthy and stateless.

WireGuard has a PDF that says things like:

Key exchanges, connections, disconnections, reconnections, discovery, and so forth happen behind the scenes transparently and reliably, and the administrator does not need to worry about these details. In other words, from the perspective of administration, the WireGuard interface appears to be stateless

I can't test it myself, it doesn't run on the AC3200 afaik, but I'd suggest you might want @Martinski4GitHub to check the operstate on their (presumably) 388.8 firmware. It looks like it's a WireGuard issue, not a 3006.102.3 issue as I had been assuming.

Looking for the interface with ip <whatever> or similar may be the only viable way to do that test.

Okay; so final result is the same on 3004 firmware. That is regardless of the WireGuard configuration itself (NAT, no NAT, etc)

Seems you're spot on with this WireGuard client just being stateless regardless of firmware version, as all initial impressions suggested. (This was not just your impression!) That means we will need to tweak the detection method for when the interface goes down in comparison to OpenVPN. Great catch!

Considering this limitation with the WireGuard client, there are 4 ways to go about it as I see it. Let me know what you think:

  1. We can lean on the nvram values for wgc(X)_enable=0 or wgc(X)_enable=1; this is similar to how other addons work.

image

  2. We can lean on some built-in binary commands as you mentioned; you identified ip -br a show, and we can also use wg or ifconfig, which also identify whether the interface is UP or DOWN based on its configuration.

image

When it comes to commands using built-in binaries, I would lean towards "wg" myself, since it's purpose-built for this. I would be open to suggestions.

  3. We can lean on real-world tests, using ping for example:

image

However, I dislike relying on external factors to determine configuration when possible; while technically Google at 8.8.8.8 or Cloudflare should always be up, that doesn't mean something unexpected can't happen; we've seen it before.
There is a benefit to this method, which is that we can tell the difference between the interface being "configured/enabled" and "enabled but disconnected due to x reason".

  4. Finally, we can lean on the interface's directory itself to determine the up state; as you noticed, if you disable WireGuard the entire directory becomes empty.

@moonbuggy
Contributor

moonbuggy commented Mar 7, 2025

We can lean on the nvram values for wgc(X)_enable=0 or wgc(X)_enable=1; this is similar to how other addons work.

I actually already tested that on my BE88U. Sorry, I should have mentioned it but I was worried I'd already spammed up this PR. :)

The wgcX_enable variable only really indicates whether the interface is on/off in the webUI. It doesn't indicate an up/down state. I threw a random digit into the IP for the WireGuard server, the tunnel definitely wasn't "up", wgcX_enable was still 1.

We can lean on some built-in binary commands as you mentioned; you identified ip -br a show, and we can also use wg or ifconfig, which also identify whether the interface is UP or DOWN based on its configuration.

Upon reflection, I don't know if ip a gives us any more information than we get from the NVRAM. ip a still shows the interface even when I have a random IP in there for the server address, just like the NVRAM. Likewise in wg show I don't see a difference.

If you went this route, I would suggest that ip is more mature and unlikely to change significantly in future. I don't know either way for wg specifically, but I'd assume newer software is generally more likely to have an unstable interface and thus maybe require attention again in future.

We can lean on real world tests using ping for example

I'd agree that it's generally better to be self-contained and do as much as possible as locally as possible.

This may be the only option that gives a true up/down indication, but I wonder if it might be better to just let speedtest keep failing elegantly on broken tunnels and use one of the other three methods to exclude disabled servers only. Just to keep things self-contained. [shrug]

Finally; We can lean on the directory itself of the interface to determine up state; as you noticed if you disable WireGuard the entire directory becomes empty.

This is equivalent to using wgcX_enable from the NVRAM too. I don't know if it's cheaper to do a test -d /sys/class/net/wgcX, to run ip a /wg or to pull a value from the NVRAM, but as per above I'd default to whatever is more efficient.
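As a rough sketch of the cheap "configured/enabled" check discussed above: the helper below looks for the interface's operstate file under sysfs, similar in spirit to the existing file test in the script. The function name `wg_iface_configured` and the overridable sysfs root are hypothetical, added only so the check can be tried off-router. It does not indicate whether the tunnel is actually connected.

```shell
#!/bin/sh
# Hypothetical helper; the name and the second parameter are illustrative only.
# Returns 0 when <sysfs_root>/<iface>/operstate exists, meaning the client is
# configured/enabled. It says nothing about whether the tunnel passes traffic.
wg_iface_configured() {
    iface="$1"
    sysfs_root="${2:-/sys/class/net}"   # overridable so it can be tested off-router
    [ -f "$sysfs_root/$iface/operstate" ]
}

# On the router, reading the NVRAM flag (e.g. `nvram get wgc1_enable`) is assumed
# to give the same on/off information, as noted in the discussion above.
if wg_iface_configured "wgc1"; then
    echo "wgc1 is configured"
else
    echo "wgc1 is not configured"
fi
```

Whether this is cheaper than an NVRAM lookup or running ip/wg would need measuring on the router itself.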

@ExtremeFiretop

We can lean on the nvram values for wgc(X)_enable=0 or wgc(X)_enable=1; this is similar to how other addons work.

I actually already tested that on my BE88U. Sorry, I should have mentioned it but I was worried I'd already spammed up this PR. :)

The wgcX_enable variable only really indicates whether the interface is on/off in the webUI. It doesn't indicate an up/down state. I threw a random digit into the IP for the WireGuard server, the tunnel definitely wasn't "up", wgcX_enable was still 1.

Based on all the options I just pointed out; none but a real world test gives you actual "connected" status.
You'll notice in every instance I said "configured" or "enabled"

Because that's all we will be able to get with the limitations of the WireGuard client without doing real world tests sadly.

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

but I wonder if it might be better to just let speedtest keep failing elegantly on broken tunnels and use one of the other three methods to exclude disabled servers only. Just to keep things self-contained. [shrug]

I'm leaning towards this myself. Just let it fail silently if it's configured, but "disconnected".

But maybe @Martinski4GitHub or @jackyaz has opinions as well?

@moonbuggy
Contributor

You'll notice in every instance I said "configured" or "enabled"

Yes, sorry, I didn't mean to suggest you thought they were real up/down indicators. I was just thinking aloud as I went through the four options.

I'm leaning towards this myself. Just let it fail silently if it's configured, but "disconnected".

I agree. It seems cleaner to let it fail than to start sending ICMP packets around. I wonder about the robustness of the ping option as well. Routing and firewall configs will vary from case to case; I assume pinging won't work in at least some of those cases.

But maybe @Martinski4GitHub or @jackyaz has opinions as well?

I agree again. I'm just some dude rocking up late to the party, sleep deprived, and pasting a mix of real and self-inflicted problems in the completely wrong PR. :) I've shared my thoughts, but I'll be happy with whatever others decide.

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

You'll notice in every instance I said "configured" or "enabled"

Yes, sorry, I didn't mean to suggest you thought they were real up/down indicators. I was just thinking aloud as I went through the four options.

No worries; just wanted to be clear that in testing all of them in that order; all failed to give actual connected status, except for option 3. Maybe I didn't make that clear originally; but my intention was to say really none of the 4 options are ideal... Sadly. It's why I was throwing it back to you/the group for opinions lol!

I'm leaning towards this myself. Just let it fail silently if it's configured, but "disconnected".

I agree. It seems cleaner to let it fail than to start sending ICMP packets around. I wonder about the robustness of the ping option as well. Routing and firewall configs will vary from case to case; I assume pinging won't work in at least some of those cases.

ping in my eyes isn't very robust and hasn't been since what? the early 2000s maybe? lol. I had to throw it out there as the only option that worked for me to determine connected status. But I also pointed out I am not a fan of the option for those reasons as well.

But maybe @Martinski4GitHub or @jackyaz has opinions as well?

I agree again. I'm just some dude rocking up late to the party, sleep deprived, and pasting a mix of real and self-inflicted problems in the completely wrong PR. :) I've shared my thoughts, but I'll be happy with whatever others decide.

I am basically the same guy; so no worries there.
I mostly test for Martinski and also develop with him as his padawan learner for MerlinAU. But mostly testing since I have a slew of routers now; Gnuton/Merlin, 3004/3006; setup and nodes and parent.
Sometimes it's hard to keep track of the tests I've done!

This was a good catch on your part; because it will require some rethinking about how we want to detect disabled and disconnected WireGuard interfaces.

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

@moonbuggy

So. In the WebUI. It actually gives us a "Connected" Status.
Maybe it's worthwhile reverse engineering where that status is coming from. Is it the WireGuard client doing its own live testing? If so, where are those results stored? Is that something we can reference?

@ExtremeFiretop

So the status appears to be populated by: /www/ajax_vpn_status.asp

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

Okay so based on this file and function: https://github.com/RMerl/asuswrt-merlin.ng/blob/1d784fe35b8a78ebf15a6cf5f7ef880ff0e53b86/release/src/router/shared/vpn_utils.c#L398

This code below checks to see the last timestamp. That means if the last handshake was 3 minutes ago or less, it’s considered “connected.” Otherwise, it’s considered offline (return 0).

if (f_read_string(filename, buf, sizeof(buf)) > 0) {
    char *p = strstr(buf, "sec:");
    unsigned long long t = (p) ? strtoull (p + 4, NULL, 0) : 999;

    if (strstr(buf, "Now"))
        return 1;
    else if (t <= 180)
        return 1;
    else
        return 0;
}
else
    return 0;

We can replicate something like this:

IFACE="wgc1" 
THRESHOLD=180             # Seconds cutoff for “connected”

# Grab the first line from `wg show IFACE latest-handshakes`
HANDSHAKE_LINE="$(wg show "$IFACE" latest-handshakes 2>/dev/null | head -n1)"

# If there's no handshake line at all, consider disconnected
[ -z "$HANDSHAKE_LINE" ] && echo 0 && exit 0

# Extract the numeric epoch timestamp (the second column)
TIMESTAMP="$(echo "$HANDSHAKE_LINE" | awk '{print $2}')"

# If no valid timestamp was found, treat as disconnected
[ -z "$TIMESTAMP" ] && echo 0 && exit 0

# Get the current epoch time
NOW="$(date +%s)"

# Calculate how many seconds ago the handshake occurred
ELAPSED=$((NOW - TIMESTAMP))

# If handshake was within THRESHOLD seconds, return 1; else 0
if [ "$ELAPSED" -le "$THRESHOLD" ] && [ "$ELAPSED" -ge 0 ]; then
  echo 1
else
  echo 0
fi

And that would be basically a match for match with what the WireGuard client uses to determine connection status.

@ExtremeFiretop

ExtremeFiretop commented Mar 7, 2025

Another alternative I just thought of would be:

rx_before=$(cat /sys/class/net/wgc1/statistics/rx_bytes 2>/dev/null)
tx_before=$(cat /sys/class/net/wgc1/statistics/tx_bytes 2>/dev/null)

# Wait a few seconds between the two samples
sleep 5

rx_after=$(cat /sys/class/net/wgc1/statistics/rx_bytes 2>/dev/null)
tx_after=$(cat /sys/class/net/wgc1/statistics/tx_bytes 2>/dev/null)

# Default to 0 so the comparison still works if a counter could not be read
if [ "${rx_after:-0}" -gt "${rx_before:-0}" ] || [ "${tx_after:-0}" -gt "${tx_before:-0}" ]; then
    echo "WireGuard wgc1 is passing traffic => 'connected'"
else
    echo "No traffic on wgc1 => might not be connected"
fi

But it's technically possible for there to be minimal to no traffic for some period of time.

@ExtremeFiretop

ExtremeFiretop commented Mar 8, 2025

Here is my recommendation on proper detection of the interfaces going up and down live:

develop...ExtremeFiretop:spdMerlin-Jack:develop

I tested and confirmed it works:

  • Interface Up:

image

  • Interface "DISCONNECTED"

image

  • Finally interface "DISABLED"

image

I'll now let @Martinski4GitHub respond; because I know he is online at this moment.
If he wants to include that in his PR, great; otherwise I can make my own based on his instead of Jack's dev branch.

@moonbuggy
Contributor

moonbuggy commented Mar 8, 2025

Just thought I'd throw another of my two cents in..

Good call on the handshake time, I reckon. A fuzzy up/down is probably the best we'll get out of WireGuard, and it's just sitting there ripe for the getting. Avoids screwing about with some external test, and speedtest still fails cleanly if it all goes wrong.

ping in my eyes isn't very robust and hasn't been since what? the early 2000s maybe?

Back in the good old days.. I'd finally gotten a modem fast enough that the internet was more than a telnet console with a MUD in it, Lynx was no longer my default browser, ping was still useful and Luba was the hotness. :)

@ExtremeFiretop

Just thought I'd throw another of my two cents in..

Good call on the handshake time, I reckon. A fuzzy up/down is probably the best we'll get out of WireGuard, and it's just sitting there ripe for the getting. Avoids screwing about with some external test, and speedtest still fails cleanly if it all goes wrong.

ping in my eyes isn't very robust and hasn't been since what? the early 2000s maybe?

Back in the good old days.. I'd finally gotten a modem fast enough that the internet was more than a telnet console with a MUD in it, Lynx was no longer my default browser, ping was still useful and Luba was the hotness. :)

It's actually instantly live, while testing I realized the handshake time goes to zero when you "disconnect".

Which means basically live we can get a "down" status if the interface is enabled in nvram but the handshake is equal to 0!

I think this is the best it gets using a combination of the wg output and nvram.
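A minimal sketch of how the two signals above could be combined into the three states shown in the screenshots. This is an illustration under assumptions, not the code from the linked branch: the function name `wg_state`, its argument order, and the reuse of the 180-second cutoff are all hypothetical choices.

```shell
#!/bin/sh
# Hypothetical sketch: classify a WireGuard client from two inputs.
#   $1 = enable flag (e.g. the value of the wgcX_enable NVRAM variable)
#   $2 = latest-handshake epoch (second column of `wg show <iface> latest-handshakes`)
#   $3 = current epoch time (defaults to `date +%s`)
#   $4 = maximum handshake age in seconds (assumed default: 180)
wg_state() {
    enabled="$1"
    handshake="$2"
    now="${3:-$(date +%s)}"
    max_age="${4:-180}"

    # Not enabled in the configuration at all.
    [ "$enabled" = "1" ] || { echo "DISABLED"; return 0; }

    # Empty or non-numeric handshake value: treat as disconnected.
    case "$handshake" in
        ''|*[!0-9]*) echo "DISCONNECTED"; return 0 ;;
    esac

    # A handshake of 0 means no session is currently established.
    [ "$handshake" -gt 0 ] || { echo "DISCONNECTED"; return 0; }

    age=$((now - handshake))
    if [ "$age" -ge 0 ] && [ "$age" -le "$max_age" ]; then
        echo "CONNECTED"
    else
        echo "DISCONNECTED"
    fi
}

# Assumed usage on the router (not run here):
#   wg_state "$(nvram get wgc1_enable)" \
#            "$(wg show wgc1 latest-handshakes 2>/dev/null | awk 'NR==1 {print $2}')"
```

Passing the enable flag and timestamp in as arguments keeps the classification logic separate from the nvram and wg calls, so it can be checked without a router.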

@ExtremeFiretop

ExtremeFiretop commented Mar 8, 2025

@moonbuggy

BTW, I realize this is basically over, but in the future feel free to message me directly on snbforums if you have an account. :) I'm not against a chat, especially if you think you found something of value like you did today.

That's where Martinski and I usually chat so we don't spam a PR.

@Martinski4GitHub
Author

Here is my recommendation on proper detection of the interfaces going up and down live:

develop...ExtremeFiretop:spdMerlin-Jack:develop

I tested and confirmed it works:

* **Interface Up:**

...

* **Interface "DISCONNECTED"**

...

* **Finally interface "DISABLED"**

...

I'll now let @Martinski4GitHub respond, because I know he is online at this moment. If he wants to include that in his PR, great; otherwise I can make my own based on his instead of Jack's dev branch.

Yeah, that sounds good. I can add those changes to my 'develop' branch which will add them to my current PR for JackYaz's repo. I'll do that before I go to sleep tonight.

BTW, working on MerlinAU now to add a password verification test...

@ExtremeFiretop

ExtremeFiretop commented Mar 8, 2025

Here is my recommendation on proper detection of the interfaces going up and down live:
develop...ExtremeFiretop:spdMerlin-Jack:develop
I tested and confirmed it works:

* **Interface Up:**

...

* **Interface "DISCONNECTED"**

...

* **Finally interface "DISABLED"**

...
I'll now let @Martinski4GitHub respond, because I know he is online at this moment. If he wants to include that in his PR, great; otherwise I can make my own based on his instead of Jack's dev branch.

Yeah, that sounds good. I can add those changes to my 'develop' branch which will add them to my current PR for JackYaz's repo. I'll do that before I go to sleep tonight.

Sounds good!

Thanks @Martinski4GitHub, I'll let you run with it and I'll validate afterwards.

I just made another update to my develop branch; because I realized I didn't commit the right change, I didn't mean to keep the code modified in Create_Symlinks; that was just while testing; instead I just made it call Create_Symlinks force when you load the menu to make it reflect any changes to the interface detection when the menu is loaded.

BTW, working on MerlinAU now to add a password verification test...

Sounds good my friend! I will look forward to testing that in the new WebUI for the addon as well!
Keep me posted!

@Martinski4GitHub
Author

Here is my recommendation on proper detection of the interfaces going up and down live:
develop...ExtremeFiretop:spdMerlin-Jack:develop
I tested and confirmed it works:

* **Interface Up:**

...

* **Interface "DISCONNECTED"**

...

* **Finally interface "DISABLED"**

...
I'll now let @Martinski4GitHub respond, because I know he is online at this moment. If he wants to include that in his PR, great; otherwise I can make my own based on his instead of Jack's dev branch.

Yeah, that sounds good. I can add those changes to my 'develop' branch which will add them to my current PR for JackYaz's repo. I'll do that before I go to sleep tonight.

Sounds good!

Thanks @Martinski4GitHub I'll let you run with it and i'll validate afterwards.

I just made another update to my develop branch; because I realized I didn't commit the right change, I didn't mean to keep the code modified in Create_Symlinks; that was just while testing; instead I just made it call Create_Symlinks force when you load the menu to make it reflect any changes to the interface detection when the menu is loaded.

Got it.

I'm almost done - just doing some more testing and making some final touches before submitting the PR.

Modified code to correctly detect when a WireGuard interface is up (connected) or down (disconnected) in addition to being not enabled. Since a WireGuard connection is essentially stateless, the new code improves on the detection method to provide the correct status so the user can select the desired interface to run a speed test.

NOTE: This fix was provided by @ExtremeFiretop.
@Martinski4GitHub
Author

@jackyaz & @ExtremeFiretop,

I just submitted more changes to this PR to fix the WireGuard connection status detection.

NOTE: This fix was provided by @ExtremeFiretop.

Code was modified to correctly detect when a WireGuard interface is up (connected) or down (disconnected) in addition to being not enabled. Since a WireGuard connection is essentially stateless, the new code improves on the detection method to provide the correct status so the user can select the desired interface to run a speed test.
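For readers wondering how a "stateless" tunnel can be judged up or down, here is a minimal sketch of one common approach: checking the age of the peer's latest handshake. This is an illustration only, not the code in this PR. The helper name `WG_Handshake_Is_Recent`, the 180-second default threshold, and the `wgc1` interface name in the usage comment are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper (NOT from this PR): treat a WireGuard peer as
# "connected" only if its latest handshake is recent enough.
#   $1 = latest-handshake time in epoch seconds (0 means "never")
#   $2 = current time in epoch seconds
#   $3 = maximum allowed age in seconds (assumed default: 180)
WG_Handshake_Is_Recent()
{
    handshake="$1"
    now="$2"
    max_age="${3:-180}"

    # An empty, non-numeric, or zero timestamp means no handshake was seen.
    [ "$handshake" -gt 0 ] 2>/dev/null || return 1

    # Succeed only when the handshake is no older than the threshold.
    [ $((now - handshake)) -le "$max_age" ]
}

# Possible use where the 'wg' tool is installed (interface name assumed):
#   ts="$(wg show wgc1 latest-handshakes | awk '{print $2; exit}')"
#   WG_Handshake_Is_Recent "$ts" "$(date +%s)" && echo "up" || echo "down"
```

An idle tunnel with no keepalive configured can have an old handshake and still be usable, so the threshold would need tuning for a real deployment.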

Fixed the correct capital letters for WireGuard.
@ExtremeFiretop

@Martinski4GitHub

Thanks for this!
Will test and advise, as I had another possible "eureka" moment last night while catching up on some sleep.

@ExtremeFiretop

ExtremeFiretop commented Mar 8, 2025

@Martinski4GitHub

I found a bug/oversight with using force. It came to me while dreaming: I realized why we weren't using the force parameter previously. Essentially, doing it this way wipes out any specific configuration the user might have set up (for example, if they set up specific excludes for some reason).

But the good news is I found an alternative method which is more specific to WireGuard and its limitations around states.

We will need to update the Set_Interface_State function to include some additional logic, but that will preserve the user's configuration if they set an interface to be excluded.

Find a working example here: Martinski4GitHub#1

Set_Interface_State(){
    interfaceline="$(sed "$1!d" "$SCRIPT_INTERFACES_USER" | awk '{$1=$1};1')"
    
    # Only process lines that begin with VPNC or WGVPN
    if echo "$interfaceline" | grep -qE "^(VPNC|WGVPN)"; then
        IFACE_NAME="$(echo "$interfaceline" | cut -f1 -d"#" | sed 's/ *$//')"
        IFACE_LOWER="$(Get_Interface_From_Name "$IFACE_NAME" | tr 'A-Z' 'a-z')"

        # Check if the interface is up vs down
        # For WGVPN => use Check_WG_Interface(index)
        # For VPNC => check /sys/class/net/<interface>/operstate
        interface_is_up=false
        if echo "$IFACE_NAME" | grep -q "WGVPN"; then
            # Extract the numeric suffix (e.g. WGVPN3 => index=3)
            index="$(echo "$IFACE_NAME" | sed 's/[^0-9]//g')"
            if Check_WG_Interface "$index"; then
                interface_is_up=true
            fi
        else
            # This is an OpenVPN client
            if [ -f "/sys/class/net/$IFACE_LOWER/operstate" ] &&
               [ "$(cat "/sys/class/net/$IFACE_LOWER/operstate")" = "up" ]; then
                interface_is_up=true
            fi
        fi

        # Decide how to update the #excluded marker based on up/down
        if echo "$interfaceline" | grep -q "#excluded"; then
            # The user has explicitly excluded this interface at some point
            if [ "$interface_is_up" = true ]; then
                # If it was #excluded - interface not up#, strip off the suffix
                sed -i "$1 s/#excluded - interface not up#/#excluded#/" "$SCRIPT_INTERFACES_USER"
            else
                # The interface is down: ensure we have "#excluded - interface not up#"
                if echo "$interfaceline" | grep -q "#excluded#$"; then
                    sed -i "$1 s/#excluded#$/#excluded - interface not up#/" "$SCRIPT_INTERFACES_USER"
                fi
            fi

        else
            # No "#excluded" marker => user wanted it included
            if [ "$interface_is_up" = false ]; then
                # If it's down, automatically exclude it with "- interface not up#"
                sed -i "$1 s/$/ #excluded - interface not up#/" "$SCRIPT_INTERFACES_USER"
            fi
        fi
    fi
}
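
To make the marker transitions easier to follow, here is a small standalone sketch that applies the same kind of `sed` substitutions to a temporary file rather than the real interfaces file. The sample file contents (`WAN`, `VPNC1 #excluded#`, `WGVPN1`) are assumed for illustration, and the up/down decisions are hard-coded instead of being detected.

```shell
#!/bin/sh
# Standalone illustration only: exercises the marker transitions on a
# temporary file. The file contents below are assumed sample data.
tmpfile="$(mktemp)"
printf '%s\n' 'WAN' 'VPNC1 #excluded#' 'WGVPN1' > "$tmpfile"

# Line 2: user-excluded interface found down => annotate the user marker.
sed -i '2 s/#excluded#$/#excluded - interface not up#/' "$tmpfile"

# Line 3: included interface found down => append the automatic marker.
sed -i '3 s/$/ #excluded - interface not up#/' "$tmpfile"
state_down="$(cat "$tmpfile")"

# Line 2 comes back up => restore the plain user marker, so the
# user's original exclusion is preserved.
sed -i '2 s/#excluded - interface not up#/#excluded#/' "$tmpfile"
state_up="$(cat "$tmpfile")"

rm -f "$tmpfile"
printf 'While down:\n%s\n\nAfter VPNC1 is up again:\n%s\n' "$state_down" "$state_up"
```

The point of the design is that a user's own `#excluded#` choice survives a down/up cycle, which is what calling Create_Symlinks with force would have lost.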

@Martinski4GitHub
Author

Ah, yes. Good catch!!

Update spdmerlin.sh Set_Interface_State
Fixed 5 more places that check the interface connection status so that they also cover WireGuard interfaces.
@ExtremeFiretop

@Martinski4GitHub

Just tested your PR and all is golden, no errors in the debug console, and it's running all the tests perfectly and correctly detecting interfaces that go offline "disconnect" for me! :)

@Martinski4GitHub
Author

OK, great. Thanks a lot for double-checking & testing. I appreciate your taking the time.

@Martinski4GitHub
Author

@jackyaz,

After the initial merge of @ExtremeFiretop's PR to support WireGuard, several more fixes and improvements have been made to correctly detect the connection status ("up" vs "down") of the WireGuard interfaces.

Without these new fixes, some users running the currently available 'develop' branch version are getting false negatives where the WireGuard interface is being reported as "not up" even though it's both enabled and actively connected.

FYI.
