Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Optimizer (float arithmetic and more) #472

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Update Optimizer (float arithmetic and more) #472

wants to merge 3 commits into from

Conversation

Destro-
Copy link

@Destro- Destro- commented Dec 6, 2017

  • Now only works on JIT without debug.
  • Using SSE for float arithmetic.
  • Change Helpers.asm to C++ code.

Average speed improvement:
float arithmetic: 45%
int to float: 25%
float to int: 400%
float compare: 2% (same shit)


I'm also testing to optimize the xs_vec functions, it is very remarkable the performance improvement in heavy plugins.

Destro- and others added 2 commits December 6, 2017 12:41
News:
- Now only works on JIT without debug.
- Using SSE for float arithmetic.
- Change Helpers.asm to C++ code.
fix include name
Copy link
Contributor

@IgnacioFDM IgnacioFDM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Shouldn't amxexecn.asm also be updated?

  2. It'd be best if @Arkshine compiled himself the obj files

  3. There isn't a short correct way of rounding to floor/ceil without setting the MXCSR register (slow) using SSE. You can check the code popular compilers emit for such functions.

  4. This review only includes the asm parts, I didn't check the rest

jne .Ceil
;else if (arg2 == 1) FLOOR
;{
cvttss2si eax, dword [esi+4]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect code. floor(-10.0) results in -11.

jne .Zero
;else if (arg2 == 2) CEIL
;{
movss xmm0, dword [esi+4]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also incorrect. You're basically doing round(x + 0.5) which is not equivalent to ceil rounding. This way ceil(1.0f) results in 2.

jne .Floor
;if (arg2 == 0) ROUND
;{
cvtss2si eax, dword [esi+4]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct but breaks compatibility. Oddly, amxmodx doesn't do regular "banker's rounding", but instead does basically floor(x + 0.5). Therefore, without this change, rounding 2.5 gives you 3, and with your change rounding 2.5 gives you 2.

Personally, I wouldn't mind breaking compatibility here, but considering this could easily break some plugins, Arkshine and others might not agree with this change, and they'd probably be right.

See this PR for some extra info.

@@ -10,6 +10,7 @@
#ifndef __AMXXLOG_H__
#define __AMXXLOG_H__


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This boosts performance by 1000%

@mo0nsniper
Copy link

Do you guys plan to merge this? A performance boost is always good news.

@renderer5819
Copy link

Like @mo0nsniper said. It'll be good if it's merged. I really wanna try it. Hope this is merged with other pending pull requests.

@Arkshine
Copy link
Member

There no answers from the author.

@mo0nsniper
Copy link

@Destro- do you plan to update and commit your optimizations?

@renderer5819
Copy link

Can't it be still merged ?
Also I tried compiling the merge with the latest amx build. The server crashes with the Segmentation Fault Core dumped error. I had tried with ReHLDS before it crashed. Did with normal HLDS ( the latest build) still crashes. Can anyone help?

@rsKliPPy
Copy link
Contributor

I think that's a pretty clear sign that it can't be merged yet. You should chase the author of this PR to complete it.

@dvander
Copy link
Member

dvander commented Feb 18, 2018

Definitely random builds of obj files shouldn’t be included in the tree. They should be trusted by an AM peer or the build automated (running these through nasm/yasm wouldn’t be hard).

@Arkshine
Copy link
Member

@Destro- Are you still around?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants