forked from torvalds/linux
[6.6] Track ClearLinux kernel performance patches #28
Closed
Author: Arjan van de Ven <[email protected]> Signed-off-by: Miguel Bernal Marin <[email protected]> Signed-off-by: Jose Carlos Venegas Munoz <[email protected]>
Both the VM and EXT4 have a "commit to disk after X seconds" time. Currently the EXT4 time is shorter than our VM time, which is a bit suboptimal; it is better for performance to let the VM do the writeouts in bulk rather than something deep in the journalling layer. (DISTRO TWEAK -- NOT FOR UPSTREAM) Signed-off-by: Arjan van de Ven <[email protected]> Signed-off-by: Jose Carlos Venegas Munoz <[email protected]>
Reduce wakeups for PME checks, which are a workaround for miswired boards (sadly, too many of them) in laptops.
Increase target_residency in cpuidle cstate. Tune intel_idle to be a bit less aggressive; Clear Linux is cleaner in hygiene (wakeups) than the average Linux, so we can afford changing these in a way that increases performance while keeping power efficiency.
Few distro-tweaks to add printk's to visualize boot time better Author: Arjan van de Ven <[email protected]> Signed-off-by: Miguel Bernal Marin <[email protected]>
No point recalibrating for known-constant tsc ... saves 200ms+ of boot time.
ATA init is the long pole in the boot process, and it's asynchronous. Move the graphics init after it so that ATA and graphics initialize in parallel.
As Clear Linux boots fast, the device is not ready when the mounting code is reached, so a device scan is retried every 0.5 sec for at least 40 sec, and the async task is synchronized. Signed-off-by: Miguel Bernal Marin <[email protected]>
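The retry behaviour described above can be pictured as a simple polling loop. This is a minimal userspace sketch, not the kernel code: `wait_for_root_device`, `probe`, and the injectable `sleep` are hypothetical names, only the 0.5 sec interval and 40 sec budget come from the commit message, and the synchronization of the async task is not modelled.

```python
import time


def wait_for_root_device(probe, interval_s=0.5, timeout_s=40.0, sleep=time.sleep):
    """Poll probe() until it returns True or the time budget is used up.

    Returns the number of retries that were needed, or None on timeout.
    """
    max_retries = int(timeout_s / interval_s)  # 80 retries with the defaults
    for attempt in range(max_retries + 1):
        if probe():
            return attempt
        if attempt < max_retries:
            sleep(interval_s)  # wait before scanning for the device again
    return None
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting 40 seconds.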
Add module.sig_unenforce boot parameter to allow loading unsigned kernel modules. Parameter is only effective if CONFIG_MODULE_SIG_FORCE is enabled and system is *not* SecureBooted. Signed-off-by: Brett T. Warden <[email protected]> Signed-off-by: Miguel Bernal Marin <[email protected]>
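The conditions under which the parameter takes effect can be written as a small truth function. This is a sketch under stated assumptions: `build_time_force_in_effect` and its argument names are hypothetical, and other ways of enabling enforcement at runtime are deliberately ignored.

```python
def build_time_force_in_effect(sig_force_config, sig_unenforce_param, secure_boot):
    """Return True if the CONFIG_MODULE_SIG_FORCE requirement still applies.

    sig_force_config:    kernel built with CONFIG_MODULE_SIG_FORCE
    sig_unenforce_param: module.sig_unenforce given on the command line
    secure_boot:         system was booted with Secure Boot
    """
    if not sig_force_config:
        return False  # nothing is forced at build time, so nothing to lift
    if secure_boot:
        return True   # the parameter is ignored on Secure Boot systems
    return not sig_unenforce_param
```

Only the combination "forced at build time, parameter set, not SecureBooted" lifts the requirement.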
Prefer the order of specific version before generic and /etc before /lib to enable the user to give specific overrides for generic firmware and distribution firmware.
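One possible reading of that ordering is sketched below. The directory names `/etc/firmware` and `/lib/firmware` and the helper names are assumptions for illustration; the commit message only states the two preferences (versioned before generic, /etc before /lib), not the exact path list.

```python
def firmware_search_dirs(kernel_release, roots=("/etc/firmware", "/lib/firmware")):
    """List candidate directories, most preferred first (assumed layout)."""
    dirs = []
    for root in roots:  # /etc is listed before /lib
        dirs.append(f"{root}/{kernel_release}")  # version-specific first
        dirs.append(root)                        # then the generic directory
    return dirs


def find_firmware(name, kernel_release, exists):
    """Return the first candidate path for which exists(path) is true."""
    for directory in firmware_search_dirs(kernel_release):
        path = f"{directory}/{name}"
        if exists(path):
            return path
    return None
```

With this ordering, a file dropped into the `/etc` tree overrides the distribution copy of the same name under `/lib`.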
These settings are needed to prevent networking issues when the networking modules come up by default without explicit settings, which breaks some cases. We don't want the modprobe settings to be read at boot time if we're not going to do anything else ever.
Kvmtool and clear containers support using user attributes to label host files with the virtual uid/gid of the file in the container. This allows an end user to manage their files and a complete uid space without all the ugly namespace stuff. The one gap in the support is symlinks because an end user can change the ownership of a symbolic link. We support attributes on these files as you can already (as root) set security attributes on them. The current rules seem slightly over-paranoid and as we have a use case this patch enables updating the attributes on a symbolic link IFF you are the owner of the symlink (as permissions are not usually meaningful on the link itself). Signed-off-by: Alan Cox <[email protected]>
tweak rwsem owner spinning a bit
Change libahci to ignore firmware's staggered spin-up flag. End-users who wish to honor firmware's SSS flag can add the following kernel parameter to a new file at /etc/kernel/cmdline.d/ignore_sss.conf: libahci.ignore_sss=0 And then run sudo clr-boot-manager update Signed-off-by: Joe Konno <[email protected]>
print cpu number when we print a crash
On systems with overclocking enabled, CPPC Highest Performance can be hard coded to 0xff. In this case even if we have cores with different highest performance, ITMT can't be enabled as the current implementation depends on CPPC Highest Performance. On such systems we can use MSR_HWP_CAPABILITIES maximum performance field when CPPC.Highest Performance is 0xff. Due to legacy reasons, we can't solely depend on MSR_HWP_CAPABILITIES as in some older systems CPPC Highest Performance is the only way to identify different performing cores. Signed-off-by: Srinivas Pandruvada <[email protected]>
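The fallback described above amounts to a two-way selection. The following is a minimal sketch, not the driver code: the function names are hypothetical, and reading the maximum performance field from the low 8 bits of the MSR_HWP_CAPABILITIES value is an assumption made for illustration.

```python
CPPC_HARD_CODED = 0xFF  # value the commit message says overclocked systems report


def highest_perf(cppc_highest_perf, hwp_capabilities):
    """Pick the per-core highest performance value used for ranking cores."""
    if cppc_highest_perf == CPPC_HARD_CODED:
        # CPPC is uninformative here, so fall back to the MSR field
        # (assumed to be bits 7:0 of the raw MSR value).
        return hwp_capabilities & 0xFF
    return cppc_highest_perf  # legacy path: trust CPPC


def cores_differ(per_core_values):
    """Core prioritisation is only useful if cores report different values."""
    return len(set(per_core_values)) > 1
```

For example, two cores that both report 0xff through CPPC but differ in the MSR field would still be ranked differently.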
make sure there's at least 1024 per cpu pages... a reasonably small amount for today's systems
Instead of using jiffies and waiting for jiffies to wrap before measuring use the higher precision local_time for benchmarking. Measure 2500 loops, which works out to be accurate enough for benchmarking the raid algo data rates. Also add division by zero checking in case timing measurements are bogus. Speeds up raid benchmarking from 48,000 usecs to 4000 usecs, saving 0.044 seconds on boot. Signed-off-by: Colin Ian King <[email protected]>
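The measurement approach can be sketched in userspace as a fixed number of iterations timed with a high-resolution clock, plus a guard against a zero interval. `benchmark_mb_per_s` and its parameters are hypothetical names; only the 2500-loop count and the division-by-zero check come from the commit message.

```python
import time


def benchmark_mb_per_s(func, bytes_per_call, loops=2500, clock_ns=time.perf_counter_ns):
    """Time `loops` calls of func() and return throughput in MB/s."""
    start = clock_ns()
    for _ in range(loops):
        func()
    elapsed_ns = clock_ns() - start
    if elapsed_ns <= 0:
        return 0.0  # bogus timing: avoid dividing by zero
    total_bytes = loops * bytes_per_call
    return total_bytes / (elapsed_ns / 1e9) / 1e6
```

Because the loop count is fixed, the run time no longer depends on waiting for a coarse tick to elapse, which is where the boot-time saving in the commit message comes from.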
Printing initcall timings that successfully return after 0 usecs provides not much useful information and takes a small amount of time to do so. Disable the initcall timings for these specific cases. On an Alderlake i9-12900 this reduces kernel boot time by 0.67% (timed up to the invocation of systemd starting) based on 10 boot measurements. Signed-off-by: Colin Ian King <[email protected]>
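The filtering rule is small enough to show directly. This is an illustrative sketch with a hypothetical `format_initcall` helper; the message text is modelled on the usual initcall debug line and is an assumption here.

```python
def format_initcall(name, ret, duration_us):
    """Return the timing line to print, or None when it should be skipped."""
    if ret == 0 and duration_us == 0:
        return None  # successful and instantaneous: nothing worth printing
    return f"initcall {name} returned {ret} after {duration_us} usecs"
```

Failures are still reported even when they take 0 usecs, since the return code is the useful part of that line.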
Place libraries right below the binary for PIE binaries, this helps code locality (and thus performance). Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Some misguided apps hammer sched_yield() in a tight loop (they should be using futexes instead), which causes massive lock contention even if there is little work to do or to yield to. Rate limit yielding, since the base scheduler already does a pretty good job of just running the right things. Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Enabling SLAB_HWCACHE_ALIGN for the ACPI object caches improves boot speed in the ACPICA core for object allocation and freeing, especially in the AML parsing and execution phases in boot. Testing with 100 boots shows an average boot saving in acpi_init of ~35000 usecs compared to the unaligned version. Most of the ACPI objects being allocated and freed have very short lifetimes in the critical paths for parsing and execution, so the extra memory used for alignment isn't too onerous. Signed-off-by: Colin Ian King <[email protected]>
… pr_err For x86 targets it's more pertinent to check for lack of MWAIT than AMD specific cpus, so swap the order of tests. Also make the pr_err a pr_warn to align with other ENODEV warning messages. Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Obsolete, see #34 instead.
Export patch series: https://github.com/kakra/linux/pull/28.patch
ClearLinux performance patches: a selected set of ClearLinux kernel patches which are supposed to improve performance, gaming experience, or general compatibility with latest Intel CPUs (e.g. asymmetric CPU cores of 12th gen or later)