Update Optimizer (float arithmetic and more) #472
base: master
News:
- Now only works on JIT without debug.
- Using SSE for float arithmetic.
- Change Helpers.asm to C++ code.
fix include name
Shouldn't amxexecn.asm also be updated?
It'd be best if @Arkshine compiled the obj files himself.
There isn't a short correct way of rounding to floor/ceil without setting the MXCSR register (slow) using SSE. You can check the code popular compilers emit for such functions.
This review covers only the asm parts; I didn't check the rest.
jne .Ceil
;else if (arg2 == 1) FLOOR
;{
cvttss2si eax, dword [esi+4]
Incorrect code. floor(-10.0) results in -11.
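For illustration, one common way to get a correct floor without touching MXCSR is to truncate and then correct the result. The sketch below is portable C++ rather than the PR's assembly; `floor_via_truncation` is a hypothetical name, and the code assumes the input fits in a 32-bit int.

```cpp
// Hypothetical helper, not taken from the PR.
// Assumes x is finite and within the range of int; other inputs are not handled.
inline int floor_via_truncation(float x) {
    // Converting float to int truncates toward zero, which is the same
    // behaviour cvttss2si has on x86.
    int t = static_cast<int>(x);
    // Truncation rounds negative non-integers up (-10.5 -> -10), so step
    // down by one only when the truncated value is above the input.
    return t - (static_cast<float>(t) > x ? 1 : 0);
}
```

With this correction, exact integers such as -10.0 are left alone, and only negative values with a fractional part are adjusted.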
jne .Zero
;else if (arg2 == 2) CEIL
;{
movss xmm0, dword [esi+4]
Also incorrect. You're basically doing round(x + 0.5), which is not equivalent to ceil rounding. This way ceil(1.0f) results in 2.
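A small sketch of the difference, in portable C++ rather than the PR's assembly. Both function names are hypothetical; the first mimics the round(x + 0.5) approach under the default round-to-nearest-even mode, and the second is one possible truncate-and-correct ceil that assumes the input fits in a 32-bit int.

```cpp
#include <cmath>

// Hypothetical: mimics "round(x + 0.5)" with the default rounding mode
// (round to nearest, ties to even), which is what cvtss2si uses by default.
inline int ceil_by_round_plus_half(float x) {
    return static_cast<int>(std::lrint(x + 0.5f));
}

// Hypothetical: truncate toward zero, then step up by one only when the
// truncated value is below the input. Assumes x is finite and fits in int.
inline int ceil_via_truncation(float x) {
    int t = static_cast<int>(x);
    return t + (static_cast<float>(t) < x ? 1 : 0);
}
```

For an input of 1.0f the first function rounds 1.5 to the even neighbour 2, while the second returns 1.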
jne .Floor
;if (arg2 == 0) ROUND
;{
cvtss2si eax, dword [esi+4]
This is correct but breaks compatibility. Oddly, amxmodx doesn't do regular "banker's rounding"; it basically does floor(x + 0.5). Therefore, without this change, rounding 2.5 gives you 3, and with your change rounding 2.5 gives you 2.
Personally, I wouldn't mind breaking compatibility here, but this could easily break some plugins, so Arkshine and others might not agree with the change, and they'd probably be right.
See this PR for some extra info.
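To make the compatibility difference concrete, here is a minimal C++ sketch of the two rounding rules. The function names are hypothetical and this is not the amxmodx source; `round_half_even` relies on the default floating-point rounding mode being round-to-nearest-even.

```cpp
#include <cmath>

// Hypothetical: the floor(x + 0.5) rule described above (ties go up).
inline int round_half_up(float x) {
    return static_cast<int>(std::floor(x + 0.5f));
}

// Hypothetical: banker's rounding, as cvtss2si performs with the default
// MXCSR setting (ties go to the nearest even integer).
inline int round_half_even(float x) {
    return static_cast<int>(std::lrint(x));
}
```

The two rules agree except on exact ties: 2.5 becomes 3 under the first and 2 under the second, while 3.5 becomes 4 under both.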
@@ -10,6 +10,7 @@
#ifndef __AMXXLOG_H__
#define __AMXXLOG_H__
This boosts performance by 1000%
Do you guys plan to merge this? A performance boost is always good news.
Like @mo0nsniper said, it'd be good if it's merged. I really wanna try it. Hope this gets merged along with the other pending pull requests.
There are no answers from the author.
@Destro- do you plan to update and commit your optimizations?
Can't it still be merged?
I think that's a pretty clear sign that it can't be merged yet. You should chase the author of this PR to complete it.
Random builds of obj files definitely shouldn't be included in the tree. They should be built by a trusted AM peer, or the build should be automated (running these through nasm/yasm wouldn't be hard).
@Destro- Are you still around?
Average speed improvement:
float arithmetic: 45%
int to float: 25%
float to int: 400%
float compare: 2% (same shit)
I'm also testing optimizations to the xs_vec functions; the performance improvement in heavy plugins is very noticeable.