-
-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes and Improvements #64
base: develop
Are you sure you want to change the base?
Conversation
1) IMPROVED: Modified all SQLite3 calls to capture and log errors in the system log. 2) IMPROVED: Modified SQLite3 configuration parameters to improve the trimming of records from the database and then perform "garbage collection" of deleted entries to reclaim unused space & avoid excessive fragmentation. 3) IMPROVED: Modified SQLite3 configuration parameters to improve the processing of database records. 4) IMPROVED: Modified code to set the corresponding priority level of log entries when calling the built-in logger utility. 5) IMPROVED: Modified the startup call made in the post-mount script to check if the USB-attached disk partition passed as argument has indeed Entware installed. 6) IMPROVED: Added code to show the current database file size information on the CLI menu and the webGUI page. 7) IMPROVED: Added code to show the "JFFS Available" space information for the "Data Storage Location" option on the CLI menu and the webGUI page. 8) IMPROVED: Added code to check if sufficient JFFS storage space is available before moving database-related files/folders from USB location to JFFS partition. An error message is reported if not enough space is available, and the move request is aborted. 9) IMPROVED: Added code to check if the available JFFS storage space falls below 20% of total space or 10MB (whichever is lower) and report a warning when it does. A warning message is also shown on the SSH CLI menu and WebGUI page. 10) IMPROVED: Added and modified code so that every time the SSH CLI menu is run, it checks if the WebGUI page has already been mounted. If not found mounted, the script will run the code to remount the WebGUI. 11) IMPROVED: Improved code that creates (during installation) and removes (during uninstallation) the "AddOns" menu tab entry for the WebGUI to make sure it checks for and takes into account other add-ons that may have been installed before or were later installed after the initial installation. 12) IMPROVED: Added "export PATH" statement to give the built-in binaries higher priority than the equivalent Entware binaries. 13) Miscellaneous code improvements & fine-tuning.
This PR includes UI improvements that were made to better handle the additional Wireguard interfaces that now show up on the CLI menu and the WebGUI page. For example, in the CLI menu was this: Now with the changes, it's shown like this: On the WebGUI page, the longer list of interfaces was split into 2 lines but at the wrong places: Now with my changes, the list gets split into groups like this: I also added some tooltips to help indicate why an interface may not be selected: There are other improvements and the above are the ones related to the additional Wireguard interfaces. |
On man, you went hard! Love it! Let me know if you need any assistance in testing it and reporting feedback. I already quickly updated to the latest code from your repo and noticed the improvements to the new interface selection so everything lines up nicely now! Love this improvement! Looks cleaner! |
Fine-tuned the margins on the WebGUI page to make interfaces checkboxes and radio button align vertically as needed.
Yes, thanks. It would be good to have more validation with Wireguard. I ran tests with OpenVPN only but not for long. A friend of mine will be testing this latest version with Wireguard sometime this week too.
Yeah, I made changes on the CLI as well to align the menu options more nicely (at least the ones I found). Last night, I went to bed and had a nagging itch that I needed to scratch: on the WebGUI, the radio buttons and checkboxes for the VPN interfaces were not aligned nicely :>). At first, I was OK with that, and already too tired by then to continue. But the more I thought about it, the more it was nagging at me :>). So this morning I woke up and decided to make the changes (You know that's how I roll!!! LOL!!!). |
I'm off this week for my birthday and mostly sitting at home bored since my girl didn't manage to get time off her work. Which means I basically have a free week of scripting MerlinAU and testing whatever you need. I'll report back shortly as I run my entire /24 subnet through Wireguard
It's funny because last night when I went to bed I had a very similar itch, like you, I was happy to leave it, I told myself, it's functional and that's all that matters when i spent a grant total of about an hour implementing it between the 2 PRs. But I wake up to find out you had the itch as well! 😀 Nothing wrong with making things look pretty to the eyes. |
@Martinski4GitHub and @jackyaz I am happy to report I have completed the testing of this PR on the latest firmware for the WiFi 7 models (3006.102.3)
Unfortunately I don't have a month's worth of data since the last time I reset the database was while testing my PR in an upgrade situation. The minimum number of days in the script by default is 15 days minimum; and I only have 2 days of data...
Looks beautiful! Working as intended!
Looks beautiful! Working as intended!
I can tell more is happening when I attempt to move the database; it pauses for longer while it says "please wait" but I can't fill my USB or JFFs to test this, what I can do is fake the numbers to the script by removing the :
Confirmed this is functional by removing the .asp file from the www directory; and it automatically remounted the WebUI: This is preemptively addressing a possible scenario described with scMerlin on this SNB Forums post: I see this is the same fix we used for MerlinAU? NICE! PR: ExtremeFiretop/MerlinAutoUpdate-Router#403
I noticed this myself actually while testing the other day; I was having odd behavior at times around the addons tab when trying to uninstall spdMerlin; happy you improved this!
The added export path is a nice fix to resolve any entware binaries that conflict with the built in binaries (parameters may be passed differently between them for example) I see this is the same fix we used for MerlinAU? NICE! |
In my humble opinion buddy; all appears functional and "improved" and is ready for merging :) |
Excellent testing & validation!!! |
Thanks again bud for the additional testing on Gnuton & the latest 3006 F/W. |
It's not our project, but I figured I'd spend some hours today doing code review and testing of the new code since I had the free time. I hope this helps! |
It definitely helps to get additional validation & more mileage with the latest changes. A few relatives and friends use this add-on so they also appreciate the improvements made and the additional Wireguard support. Thanks, bud!! |
I got the Wireguard support idea because I recently changed to ProtonVPN as my VPN provider and setup Wireguard, and noticed spdermlin didn't support it. Then I went to the forums and saw @jackyaz mention that it should be easy to add; but he doesn't have the hardware to test anymore. So figured why not learn under the hood of spdmerlin. As you know I usually avoid opening others code unless I absolutely have too. But it sounded like he was giving the green light more or less. At this point I've reviewed enough of the code to say it's truly a great addon and well coded like all his addons! I hope we continue to get it for many firmware versions to come! |
I saw the initial requests on the forums to get Wireguard support, but my only ASUS router at the time didn't have Wireguard, I don't use commercial VPN providers at all, and I didn't even have the add-on installed, so I was not going to try to modify the code without being able to do any "in-house" test & validation on my own router first. I'm glad you took the idea and ran with it!! Yes, @jackyaz's code is fairly easy to read & follow because it's modular & well-structured. As long as the APIs for the Ookla speedtest executable do not change too drastically, it should be straightforward to maintain compatibility with future F/W versions. |
On my RT-BE88U, /sys/class/net/wgcX/operstate seems to always contain "unknown". This means I hit the "$IFACE not up, please check. Skipping speedtest for $IFACE_NAME" warning, and also miss on the " #excluded - interface not up#" bits when it checks if operstate is "down". It looks like you're seeing appropriate values in operstate on the AC86U, based on the screenshots. I don't know why it's differing for me, but there's a bunch of things I'm noticing are different in this 3006.102.3 firmware (having come directly from 384.13 on an AC3200). So far I've just been assuming that's to blame by default when something amtm installs doesn't quite work right. It's possible I've screwed up the WireGuard client somehow though and it should be putting useful information into /sys/class/net. I don't know how I might have done that, but I've only had the router for a week, so I'm still poking at it to see how it works. I had started to patch WireGuard in myself before I noticed this PR. I ended up going with:
I didn't spend a huge amount of time on it, but I didn't see anything I could easily and usefully pull out of /sys/class/net. The whole /sys/class/net/wgcX/ folder disappears when I take a WireGuard interface down, so If it's useful, a running interface looks like this:
Otherwise, things seem to generally work. |
Actually, I just realised.. The operstate issue isn't what triggered the "$IFACE not up, please check. Skipping speedtest for $IFACE_NAME" warning. I was busy looking at the " #excluded - interface not up#" issue and saw the if statement next to the "not up" warning and assumed it was the same deal. $IFACE is empty, for some reason, when it does an automated scan. Sorry, my bad.
It goes alright when I scan all interfaces from the shell though:
|
This PR has not been merged yet into development branch. Are you testing with this PR specifically or the code from the development branch? When did you last update to 4.4.6?
This PR doesn't specifically add WireGuard support; that was already merged in develop 4 days ago on March 3rd. |
I pulled the code from this PR and pasted it in. I'd just finished adding support for WireGuard myself and was about to start looking at what I needed to do for the web UI when I came across this PR, so I did a reinstall of the official package and stuck the PR on top of it because the Addons menu was being weird for me, and hoped it might fix it. But yeah, maybe I've screwed something up. I'll do the re-install again, and see what happens. |
To be of assistance; I would put whatever work you've done aside somewhere and just not touch it for now so we don't mix anything up. Do this and report back:
Because this PR isn't merged yet; we can test it manually; but as you mentioned you need to replace the correct files (.sh and .asp) in the correct directories being: and: And of course the .sh file in: |
Oh, yeah, that's why I did a reinstall before I pasted this PR in - to remove my own code. I'd only touched spdmerlin.sh, which in turn changed config slightly, but my changes were definitely gone. I definitely made sure to paste the ASP code in as well, when I added this PR. Anyway, full reinstall now - an actual uninstall first, not just a forceupdate, and I haven't stuck this PR back in (so this is probably not the appropriate place to continue this discussion :)). (Something had definitely gone screwy somewhere before I did the uninstall, because the forceupdate alone had the webUI and shell script with opposing settings - webUI said 'usb' storage, shell script said 'jffs' and when I toggled it on either it toggled on both, so I could change it to 'jffs' in the webUI but it would change the shell script to 'usb' when I did. :)) The operstate is still "unknown" for wgcX interfaces, so that issue is still present for me, although by the sounds of it it's not a problem for you on the firmware..? So I don't know what's up with that. The automated speedtests do seem to run now though. So yeah, in summary, it looks like I screwed up the install and then commented on the wrong PR. :) Sorry. And if you do definitely have something other than "unknown" in your operstate on the same firmware, it looks like I've screwed up WireGuard too. :) Would you mind just confirming that for me? |
Fair enough, it's just we aren't aware of what changes may or may not have been done before hand, and without a PR of your own for us to review we would just be guessing so best we start with a clean slate and we are all on the same page. (Or code) Haha.
No worries, I'm happy to hear it's functioning better now, I'm actually just sitting down for dinner but give me about 15 minutes and I'll be able to get back to you on this last point. |
Interestingly; no I also get operstate "unknown" on this firmware. And I've confirmed that interface is up. So it seems to be a generic issue right now; or something has changed with 3006 F/W That being said; I don't see anything different between my wgc1 and yours when it comes to operstate which means we may want to consider an alternative before it goes to production. |
Thank you for reporting; so based on the current code; it's assumed up, but if it goes down; it doesn't properly detect the interface going down; I've confirmed that on my side as well. (This is with the interface down) This seems to be for 3006 firmware; we can poke at an alternative to detecting the interface going down. Edit: The reason it currently "works" around is due to: if [ ! -f "/sys/class/net/wgc$index/operstate" ] Which gives you your message seen. |
I was mentioning that I'd started adding WireGuard myself mostly as an "I'm an idiot for not checking the dev branch first" sort of deal. :) But also because it gave some context for the alternative conditional statement I provided. It's been a fairly hectic week. AC3200->BE88U because the AC3200 started randomly losing the WAN link and refusing SSH connections and a re-flash+reset didn't make a difference. Decided I might as well migrate dhcpd->Kea while I was at it, which turned into having to build a hook library to let Kea talk to dnsmasq via a script. Then I had to update a bunch of my own stuff to handle WireGuard, because VPN Director doesn't prioritise IP rules in a sensible way. Meanwhile, my family just want to watch Netflix and Prime, and I'm forcing them out a VPN tunnel in another country because it's the best I can manage with my half-broken adhoc setup, and neither Netflix, Prime nor my family are happy about it :)
Thanks for confirming that for me. I'm relieved that at least one of the issues in my comment is a real thing and not something I've accidentally done to myself somehow. :) (Although even that small victory is tainted by my commenting on the wrong PR. Apologies again. I need more sleep.) For what it's worth, your screenshots of the issue match my screen. |
No worries I've been there done that. We all have oversights and sometimes lack of sleep! I'll poke at a different solution myself tomorrow and see if we can find something slick that works with the current limitations of the Wireguard client on 3006. I appreciate you bringing this up to our attention even if at the wrong PR haha. I'm sure Martinski will probably poke at this some more as well now that we know about it. |
I just had a look around, a little bit deeper than my previous investigation when I was monkey patching. I see people saying the unpopulated /sys/class/net/wgcX is intentional, with WireGuard generally wanting to be stealthy and stateless. WireGuard has a PDF that says things like:
I can't test it myself, it doesn't run on the AC3200 afaik, but I'd suggest you might want @Martinski4GitHub to check the operstate on their (presumably) 388.8 firmware. It looks like it's a WireGuard issue, not a 3006.102.3 issue as I had been assuming. Looking for the interface with |
My plan to test this myself today was to reinstall it on my Gnuton router. It's still running 3004 firmware. While I did install it once and test it for Martinski a few days ago, I just made sure the interface ran a speed test and the database was updating, I didn't go as far as disconnecting the interface on Gnuton which is what we are talking about today. I'll report back more later today. |
Okay; so final result is the same on 3004 firmware. That is regardless of the WireGuard configuration itself (NAT, no NAT, etc) Seems your spot on with this WireGuard client just being stateless regardless of firmware version as all initial impressions suggested. (This was not just as you impression!) That means we will need to tweak the detection method for when the interface goes down in comparison to OpenVPN. Great catch! Considering this limitation with the WireGuard client; there's 4 ways to go about it how I see it let me know what you think:
When it comes to commands using built in binaries; I would lean towards "wg" since it's purpose built for this myself. I would be open to suggestions.
However I dislike relying on external factors to determine configuration when possible; while technically google at 8.8.8.8 or Cloudflare should always be up, that doesn't mean something unexpected can't happen; we've seen it before in the past.
|
I actually already tested that on my BE88U. Sorry, I should have mentioned it but I was worried I'd already spammed up this PR. :) The wgcX_enable variable only really indicates whether the interface is on/off in the webUI. It doesn't indicate an up/down state. I threw a random digit into the IP for the WireGuard server, the tunnel definitely wasn't "up", wgcX_enable was still 1.
Upon reflection, I don't know if If you went this route, I would suggest that ip is more mature and unlikely to change significantly in future. I don't know either way for wg specifically, but I'd assume newer software is generally more likely to have an unstable interface and thus maybe require attention again in future.
I'd agree that it's generally better to be self-contained and do as much as possible as locally as possible. This may be the only option that gives a true up/down indication, but I wonder if it might be better to just let
This is equivalent to using wgcX_enable from the NVRAM too. I don't know if it's cheaper to do a |
Based on all the options I just pointed out; none but a real world test gives you actual "connected" status. Because that's all we will be able to get with the limitations of the WireGuard client without doing real world tests sadly. |
I'm leaning towards this myself. Just let it fail silently if it's configure, but "disconnected". But maybe @Martinski4GitHub or @jackyaz has opinions as well? |
Yes, sorry, I didn't mean to suggest you thought they were real up/down indicators. I was just thinking aloud as I went through the four options.
I agree. It seems cleaner to let it fail than to start sending ICMP packets around. I wonder about the robustness of the
I agree again. I'm just some dude rocking up late to the party, sleep deprived, and pasting a mix of real and self-inflicted problems in the completely wrong PR. :) I've shared my thoughts, but I'll be happy with whatever others decide. |
No worries; just wanted to be clear that in testing all of them in that order; all failed to give actual connected status, except for option 3. Maybe I didn't make that clear originally; but my intention was to say really none of the 4 options are ideal... Sadly. It's why I was throwing it back to you/the group for opinions lol!
ping in my eyes isn't very robust and hasn't been since what? the early 2000s maybe? lol. I had to throw it out there as the only option that worked for me to determine connected status. But I also pointed out I am not a fan of the option for those reasons as well.
I am basically the same guy; so no worries there. This was a good catch on your part; because it will require some rethinking about how we want to detect disabled and disconnected WireGuard interfaces. |
So. In the WebUI. It actually gives us a "Connected" Status. |
So the status appears to be populated by: |
Okay so based on this file and function: https://github.com/RMerl/asuswrt-merlin.ng/blob/1d784fe35b8a78ebf15a6cf5f7ef880ff0e53b86/release/src/router/shared/vpn_utils.c#L398 This code below checks to see the last timestamp. That means if the last handshake was 3 minutes ago or less, it’s considered “connected.” Otherwise, it’s considered offline (return 0).
We can replicate something like this:
And that would be basically a match for match with what the WireGuard client uses to determine connection status. |
Another alternative I just thought of with would be:
But it's technically possible for there to be minimal to no traffic for some period of time. |
Here is my recommendation on proper detection of the interfaces going up and down live: develop...ExtremeFiretop:spdMerlin-Jack:develop I tested and and confirmed it works:
I'll now let @Martinski4GitHub respond; because I know he is online at this moment. |
Just thought I'd throw another of my two cents in.. Good call on the handshake time, I reckon. A fuzzy up/down is probably the best we'll get out of WireGuard, and it's just sitting there ripe for the getting. Avoids screwing about with some external test, and speedtest still fails cleanly if it all goes wrong.
Back in the good old days.. I'd finally gotten a modem fast enough that the internet was more than a telnet console with a MUD in it, Lynx was no longer my default browser, ping was still useful and Luba was the hotness. :) |
It's actually instantly live, while testing I realized the handshake time goes to zero when you "disconnect". Which means basically live we can get a "down" status if the interface is enabled in nvram but the handshake is equal to 0! I think this is the best it gets using a combination of the wg output and nvram. |
btw I realize this is basically over, but in the future feel free to message me directly on snbforms if you have an account. :) I'm not against a chat, especially if you think you found something of value like you did today. thats where me and Martinski usually chat to not spam a PR. |
Yeah, that sounds good. I can add those changes to my 'develop' branch which will add them to my current PR for JackYaz's repo. I'll do that before I go to sleep tonight. BTW, working on MerlinAU now to add a password verification test... |
Sounds good! Thanks @Martinski4GitHub I'll let you run with it and i'll validate afterwards. I just made another update to my develop branch; because I realized I didn't commit the right change, I didn't mean to keep the code modified in Create_Symlinks; that was just while testing; instead I just made it call
Sounds good my friend! I will look forwards to testing that in the new WebUI for the addon as well! |
Got it.
I'm almost done - just doing some more testing and making some final touches before submitting the PR. |
Modified code to correctly detect when a WireGuard interface is up (connected) or down (disconnected) in addition to being not enabled. Since a WireGuard connection is essentially stateless, the new code improves on the detection method to provide the correct status so the user can select the desired interface to run a speed test. NOTE: This fix was provided by @ExtremeFiretop.
I just submitted more changes to this PR to fix the WireGuard connection status detection. NOTE: This fix was provided by @ExtremeFiretop. Code was modified to correctly detect when a WireGuard interface is up (connected) or down (disconnected) in addition to being not enabled. Since a WireGuard connection is essentially stateless, the new code improves on the detection method to provide the correct status so the user can select the desired interface to run a speed test. |
Fixed the correct capital letters for WireGuard.
Thanks for this! |
I found a bug/oversight with using force; it came to me while while dreaming I realized why we weren't using the force parameter previously; essentially we are wiping out any specific user configuration they might of setup doing it this way (if they set up specific excludes for some reason) But the good news is I found an alternative method which is more specific to WireGuard and it's limitations around states. We will need to update the: Set_Interface_State function so include some new additional logic; but that will preserve the user configurations if they set it to exclude. Find a working example here: Martinski4GitHub#1
|
Ah, yes. Good catch!! |
Update spdmerlin.sh Set_Interface_State
Fixed 5 more places where checking for interface connection status to include WireGuard interfaces.
Just tested your PR and all is golden, no errors in the debug console, and it's running all the tests perfectly and correctly detecting interfaces that go offline "disconnect" for me! :) |
OK, great. Thanks a lot for double-checking & testing. I appreciate your taking the time. |
After the initial merge of @ExtremeFiretop's PR to support WireGuard, several more fixes and improvements have been made to correctly detect the connection status ("up" vs "down") of the WireGuard interfaces. Without these new fixes, some users running the currently available 'develop' branch version are getting false negatives where the WireGuard interface is being reported as "not up" even though it's both enabled and actively connected. FYI. |
IMPROVED: Modified all SQLite3 calls to capture and log errors in the system log.
IMPROVED: Modified SQLite3 configuration parameters to improve the trimming of records from the database and then perform "garbage collection" of deleted entries to reclaim unused space & avoid excessive fragmentation.
IMPROVED: Modified SQLite3 configuration parameters to improve the processing of database records.
IMPROVED: Modified code to set the corresponding priority level of log entries when calling the built-in logger utility.
IMPROVED: Modified the startup call made in the post-mount script to check if the USB-attached disk partition passed as argument has indeed Entware installed.
IMPROVED: Added code to show the current database file size information on the CLI menu and the webGUI page.
IMPROVED: Added code to show the "JFFS Available" space information for the "Data Storage Location" option on the CLI menu and the webGUI page.
IMPROVED: Added code to check if sufficient JFFS storage space is available before moving database-related files/folders from the USB location to the JFFS partition. An error message is reported if not enough space is available, and the move request is aborted.
IMPROVED: Added code to check if the available JFFS storage space falls below 20% of total space or 10MB (whichever is lower) and report a warning when it does. A warning message is also shown on the SSH CLI menu and WebGUI page.
IMPROVED: Added and modified code so that every time the SSH CLI menu is run, it checks if the WebGUI page has already been mounted. If not found mounted, the script will run the code to remount the WebGUI.
IMPROVED: Improved code that creates (during installation) and removes (during uninstallation) the "AddOns" menu tab entry for the WebGUI to make sure it checks for and takes into account other add-ons that may have been installed before or were later installed after the initial installation.
IMPROVED: Added "export PATH" statement to give the built-in binaries higher priority than the equivalent Entware binaries.
FIXED: Modified code to correctly detect when a WireGuard interface is up (connected) or down (disconnected) in addition to being not enabled. Since a WireGuard connection is essentially stateless, the new code improves on the detection method to provide the correct status so the user can select the desired interface to run a speed test.
NOTE: This fix was provided by @ExtremeFiretop.
Miscellaneous code improvements & fine-tuning.