forked from torvalds/linux
[6.12] Track ClearLinux kernel performance patches #34
Draft: kakra wants to merge 38 commits into base-6.12 from rebase-6.12/clearlinux-tweaks
Author: Arjan van de Ven <[email protected]> Signed-off-by: Miguel Bernal Marin <[email protected]> Signed-off-by: Jose Carlos Venegas Munoz <[email protected]>
Both the VM and EXT4 have a "commit to disk after X seconds" time. Currently the EXT4 time is shorter than our VM time, which is a bit suboptimal: it's better for performance to let the VM do the writeouts in bulk rather than something deep in the journalling layer. (DISTRO TWEAK -- NOT FOR UPSTREAM) Signed-off-by: Arjan van de Ven <[email protected]> Signed-off-by: Jose Carlos Venegas Munoz <[email protected]>
Reduce wakeups for PME checks, which are a workaround for miswired boards (sadly, too many of them) in laptops. Signed-off-by: Kai Krakow <[email protected]>
Increase target_residency in cpuidle cstate. Tune intel_idle to be a bit less aggressive; Clear Linux is cleaner in hygiene (wakeups) than the average Linux, so we can afford changing these in a way that increases performance while keeping power efficiency. Signed-off-by: Kai Krakow <[email protected]>
No point recalibrating for a known-constant TSC; this saves 200ms+ of boot time. Signed-off-by: Kai Krakow <[email protected]>
…default Signed-off-by: Kai Krakow <[email protected]>
Because Clear Linux boots fast, the device may not be ready when the mounting code is reached, so a device scan is retried every 0.5 sec for at least 40 sec and the async task is synchronized. Signed-off-by: Miguel Bernal Marin <[email protected]>
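The retry schedule described in this commit message can be modelled in a few lines. This is a userspace Python sketch, not the kernel code: the function name and the injected `device_ready` and `sleep` hooks are hypothetical, and only the 0.5 sec interval and the 40 sec window come from the message.

```python
RETRY_INTERVAL_S = 0.5   # from the commit message
RETRY_WINDOW_S = 40.0    # from the commit message ("at least 40 sec")


def wait_for_root_device(device_ready, sleep=lambda seconds: None):
    """Poll device_ready() every RETRY_INTERVAL_S until it returns True
    or the retry window is exhausted.

    Returns the number of failed polls before the device appeared,
    or None if it never appeared within the window.

    device_ready and sleep are hypothetical hooks so the schedule can be
    exercised without real hardware or real delays.
    """
    max_retries = int(RETRY_WINDOW_S / RETRY_INTERVAL_S)  # 80 attempts
    for attempt in range(max_retries):
        if device_ready():
            return attempt
        sleep(RETRY_INTERVAL_S)
    return None
```

With these constants the loop gives up after 80 polls, which is where the 40 sec bound comes from.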
Add module.sig_unenforce boot parameter to allow loading unsigned kernel modules. Parameter is only effective if CONFIG_MODULE_SIG_FORCE is enabled and system is *not* SecureBooted. Signed-off-by: Brett T. Warden <[email protected]> Signed-off-by: Miguel Bernal Marin <[email protected]>
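The condition under which the parameter takes effect can be written as a small truth function. This is a sketch in Python rather than the patch itself; the function and argument names are hypothetical, and it deliberately ignores any other way of switching enforcement on or off.

```python
def module_sig_enforced(config_sig_force, sig_unenforce, secure_boot):
    """Return True if unsigned modules would be rejected.

    Models only the rule stated in the commit message:
    module.sig_unenforce matters only when CONFIG_MODULE_SIG_FORCE
    is enabled and the system is not SecureBooted.
    """
    if not config_sig_force:
        # Without CONFIG_MODULE_SIG_FORCE the parameter has no effect;
        # this simplified model treats enforcement as off.
        return False
    if sig_unenforce and not secure_boot:
        return False  # the parameter lifts enforcement
    return True       # SecureBoot, or no parameter: stay enforced
```

The point of the second branch is that a SecureBooted system keeps enforcement even when the parameter is passed.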
Prefer the order of specific version before generic and /etc before /lib to enable the user to give specific overrides for generic firmware and distribution firmware. Signed-off-by: Kai Krakow <[email protected]>
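One way to read this ordering rule is sketched below in Python. The directory names `/etc/firmware` and `/lib/firmware` and the per-release subdirectory layout are assumptions for illustration, as is the choice to apply "/etc before /lib" as the outer rule; the commit message does not spell out the exact paths.

```python
def firmware_search_paths(kernel_release,
                          roots=("/etc/firmware", "/lib/firmware")):
    """Build a firmware lookup order: within each root, the
    release-specific directory comes before the generic one, and
    roots are tried in the order given (/etc before /lib).

    Both the roots and the nesting are illustrative assumptions.
    """
    paths = []
    for root in roots:
        paths.append(f"{root}/{kernel_release}")  # specific version first
        paths.append(root)                        # then the generic directory
    return paths
```

Under these assumptions a file dropped into `/etc/firmware` overrides the distribution copy in `/lib/firmware`, which is the user-override behaviour the message asks for.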
These settings are needed to prevent networking issues when the networking modules come up by default without explicit settings, which breaks some cases. We don't want the modprobe settings to be read at boot time if we're not going to do anything else ever. Signed-off-by: Kai Krakow <[email protected]>
Kvmtool and clear containers support using user attributes to label host files with the virtual uid/gid of the file in the container. This allows an end user to manage their files and a complete uid space without all the ugly namespace stuff. The one gap in the support is symlinks because an end user can change the ownership of a symbolic link. We support attributes on these files as you can already (as root) set security attributes on them. The current rules seem slightly over-paranoid and as we have a use case this patch enables updating the attributes on a symbolic link IFF you are the owner of the symlink (as permissions are not usually meaningful on the link itself). Signed-off-by: Alan Cox <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
tweak rwsem owner spinning a bit Signed-off-by: Kai Krakow <[email protected]>
Change libahci to ignore firmware's staggered spin-up flag. End-users who wish to honor firmware's SSS flag can add the following kernel parameter to a new file at /etc/kernel/cmdline.d/ignore_sss.conf:

    libahci.ignore_sss=0

And then run:

    sudo clr-boot-manager update

Signed-off-by: Joe Konno <[email protected]>
print cpu number when we print a crash Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
On systems with overclocking enabled, CPPC Highest Performance can be hard coded to 0xff. In this case even if we have cores with different highest performance, ITMT can't be enabled as the current implementation depends on CPPC Highest Performance. On such systems we can use MSR_HWP_CAPABILITIES maximum performance field when CPPC.Highest Performance is 0xff. Due to legacy reasons, we can't solely depend on MSR_HWP_CAPABILITIES as in some older systems CPPC Highest Performance is the only way to identify different performing cores. Signed-off-by: Srinivas Pandruvada <[email protected]>
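The fallback described here amounts to a per-core selection between two values. The Python sketch below models that decision only; the function names are hypothetical and the inputs stand in for values the kernel would read from CPPC and from the maximum performance field of MSR_HWP_CAPABILITIES.

```python
CPPC_HARD_CODED = 0xFF  # value seen on overclocking-enabled systems


def core_priority(cppc_highest_perf, hwp_max_perf):
    """Pick the value used to rank a core.

    CPPC Highest Performance stays the primary source (some older
    systems only expose core differences there); the HWP maximum
    performance field is used only when CPPC reports 0xff.
    """
    if cppc_highest_perf == CPPC_HARD_CODED:
        return hwp_max_perf
    return cppc_highest_perf


def cores_differ(priorities):
    """True if at least two cores rank differently, i.e. there is
    something for ITMT to prioritise."""
    return len(set(priorities)) > 1
```

For example, two cores that both report 0xff through CPPC but 68 and 50 through HWP would rank identically without the fallback and differently with it.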
Make sure there are at least 1024 per-CPU pages, a reasonably small amount for today's systems. Signed-off-by: Kai Krakow <[email protected]>
gcc12/build workarounds Signed-off-by: Kai Krakow <[email protected]>
Instead of using jiffies and waiting for jiffies to wrap before measuring use the higher precision local_time for benchmarking. Measure 2500 loops, which works out to be accurate enough for benchmarking the raid algo data rates. Also add division by zero checking in case timing measurements are bogus. Speeds up raid benchmarking from 48,000 usecs to 4000 usecs, saving 0.044 seconds on boot. Signed-off-by: Colin Ian King <[email protected]>
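The measurement approach (a fixed loop count, a high-precision clock, and a guard against a zero elapsed time) can be sketched in userspace Python. This is not the kernel benchmark: `time.perf_counter_ns` stands in for the kernel clock, and the function name, the `work` callback and the 4096-byte buffer size are assumptions for illustration.

```python
import time


def benchmark_throughput(work, loops=2500, bytes_per_loop=4096,
                         clock_ns=time.perf_counter_ns):
    """Time a fixed number of calls to work() and return bytes/second,
    or None if the timing is unusable.

    loops=2500 follows the commit message; bytes_per_loop and the
    injectable clock are illustrative assumptions.
    """
    start = clock_ns()
    for _ in range(loops):
        work()
    elapsed_ns = clock_ns() - start
    if elapsed_ns <= 0:
        # Bogus measurement: avoid dividing by zero.
        return None
    return loops * bytes_per_loop * 1_000_000_000 // elapsed_ns
```

The boot-time figure in the message follows directly from the two timings: 48,000 usecs minus 4,000 usecs is 44,000 usecs, or 0.044 seconds.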
Printing initcall timings that successfully return after 0 usecs provides not much useful information and takes a small amount of time to do so. Disable the initcall timings for these specific cases. On an Alderlake i9-12900 this reduces kernel boot time by 0.67% (timed up to the invocation of systemd starting) based on 10 boot measurements. Signed-off-by: Colin Ian King <[email protected]>
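The filtering rule is simple enough to state as code. The Python sketch below is a model, not the patch: the function name is hypothetical and the message text is only loosely modelled on the kernel's initcall debug output.

```python
def initcall_report(name, ret, duration_usecs):
    """Return the log line for an initcall, or None when the call
    succeeded (ret == 0) in 0 usecs and the line would carry no
    useful information."""
    if ret == 0 and duration_usecs == 0:
        return None
    return f"initcall {name} returned {ret} after {duration_usecs} usecs"
```

Failures are still reported even when they take 0 usecs, since only successful returns are described as uninformative.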
Author: Intel ClearLinux <unknown> Place libraries right below the binary for PIE binaries; this helps code locality (and thus performance). Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Some misguided apps hammer sched_yield() in a tight loop (they should be using futexes instead), which causes massive lock contention even if there is little work to do or to yield to. Rate limit yielding, since the base scheduler already does a pretty good job of just running the right things. Signed-off-by: Colin Ian King <[email protected]>
Author: Intel ClearLinux <unknown> Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Author: Intel ClearLinux <unknown> Signed-off-by: Kai Krakow <[email protected]>
Enabling SLAB_HWCACHE_ALIGN for the ACPI object caches improves boot speed in the ACPICA core for object allocation and free'ing especially in the AML parsing and execution phases in boot. Testing with 100 boots shows an average boot saving in acpi_init of ~35000 usecs compared to the unaligned version. Most of the ACPI objects being allocated and free'd are of very short life times in the critical paths for parsing and execution, so the extra memory used for alignment isn't too onerous. Signed-off-by: Colin Ian King <[email protected]>
… pr_err For x86 targets it's more pertinent to check for lack of MWAIT than for AMD-specific CPUs, so swap the order of tests. Also make the pr_err a pr_warn to align with other ENODEV warning messages. Signed-off-by: Colin Ian King <[email protected]>
Making vmx_init a late initcall improves QEMU kernel boot times to get to the init process. Average of 100 boots, QEMU boot average reduced from 0.776 seconds to 0.622 seconds (~19.8% faster) on Alderlake i9-12900 and ~0.5% faster for non-QEMU UEFI boots. Signed-off-by: Colin Ian King <[email protected]>
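The quoted percentage can be checked from the two averages. This is a minimal Python sketch with a hypothetical helper name; the only inputs are the 0.776 and 0.622 second figures from the message.

```python
def percent_faster(before_s, after_s):
    """Relative reduction in boot time, as a percentage of the
    original time."""
    return (before_s - after_s) / before_s * 100.0


# QEMU boot averages from the commit message
qemu_gain = percent_faster(0.776, 0.622)
```

Rounded to one decimal place, `qemu_gain` agrees with the ~19.8% stated above.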
Author: Intel ClearLinux <unknown> Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
scale these by a factor of 4 to improve socket performance Signed-off-by: Colin Ian King <[email protected]>
Conflicts: Gentoo kernel 6.12.1 with BMQ/PDS patches, see description
Export patch series: https://github.com/kakra/linux/pull/34.patch
ClearLinux performance patches: a selected set of ClearLinux kernel patches which are supposed to improve performance, gaming experience, or general compatibility with latest Intel CPUs (e.g. asymmetric CPU cores of 12th gen or later)
Conflicts: Gentoo kernel 6.12.1 with BMQ/PDS patches (USE="experimental"); use UNIPATCH_EXCLUDE="5020 5021" via package.env to exclude those conflicting patches