Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve set_task tasks timing and performance #380

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Improve set_task tasks timing and performance #380

wants to merge 1 commit into from

Conversation

IgnacioFDM
Copy link
Contributor

@IgnacioFDM IgnacioFDM commented Aug 9, 2016

Half-rewrite of task system. The highlights are improved performance due to use of std::vector instead of a linked list, which is faster due to locality, and ordering tasks by execution time so it's not necessary to iterate over the entire task list every frame.

In addition, code is now more modern (C++11 stuff) and should be safer.

Tasks before would be executed in batches every 0.1 seconds, I guess so as to avoid checking every frame if a certain task was due or not.
This however leads to bad performance and many issues.

Bad performance because

  • If there are many tasks, instead of them executing in a spread fashion, they will all or most execute together in a single frame. This might lead to the frame not meeting it's allocated time (if sys_ticrate 1000, if a single frame takes more than 1ms this will result in server fps drop of course)
  • Because of the limitation and poor timing of tasks (can't have a 0.05 task, or a 0.15 task will be really imprecise), it's a widespread recommendation/workaround to use entities and hook think. This method is obviously slower and has more overhead than a plain task, but was the only valid alternative often.

Because of this, although it might seem counterintuitive, checking for due tasks every frame actually improves performance, since it's all about not having any slow frames and not average CPU usage (besides the fact that it takes negligible CPU time)

Besides, it fixes the following issues
*Need to use entities because of the known limitations
*Huge timing drifts (imagine how much will a repetitive 0.15s task drift over time).

I've tested it with a monster size plugin where the slightest compatibility issue would arise, and seems to be fine.

@Nextra
Copy link
Contributor

Nextra commented Aug 9, 2016

I think a more thorough improvement of internal task handling could push timing and performance further. If the TaskList managed by the CTaskMgr was split up into free and used tasks (similar to how other handles are managed), the used task descriptors could actually be sorted by execution time. That means only checking the very first task suffices on a majority of server frames, making the switch to "frame-perfect" checks have no downside at all. There would have to be a separate handling of tasks (ab)using the task flags, but a cleanup of executeIfRequired would be a very welcomed side-effect.

Also as a side note, the impact of lifting the timer resolution restrictions might be a bcompat concern, as the current system interferes with the timer intervals quite a bit.

@IgnacioFDM
Copy link
Contributor Author

@Nextra
I'll see if I can squeeze in some time this week to further optimize task system. I thought about doing something like that, however I was a bit lazy since CTask code is a bit messy and I would probably end up rewriting most of it. I also wanted to do some "non-strict" mode (probably a set_task flag) for situations where perfect timing isn't critical (the non-strict mode would anyway be more precise than current situation), so the scheduler would push a frame or a few (depending maybe on tickrate) some task if many are to be executed this frame, in other words the scheduler would spread tasks.

This would be handy since it's very common to have many tasks scheduled to a same time (imagine plugin_init scheduling many infinite tasks 0.1s tasks, they will coincide forever), and it would also allow plugins to use set_task as a way to spread out some heavy computation into many steps .

Also as a side note, the impact of lifting the timer resolution restrictions might be a bcompat concern, as the current system interferes with the timer intervals quite a bit.

What do you mean by this? Can you expand please

@Nextra
Copy link
Contributor

Nextra commented Aug 9, 2016

What do you mean by this? Can you expand please.

I said this because of your mentioning of "poor timing". You haven't changed this in the code (yet), but with your current changes a 0.05 timer is still impossible. This is because set_task forces a minimum interval of 0.1 seconds.

I also wanted to do some "non-strict" mode

I don't think you should stress out too much over many tasks coinciding. You can not safe-guard against plugins shooting themselves in the foot, and introducing additional hard to document API options into the already messy task system is not a good idea in my opinion. The task system should actually be simplified drastically were it not for backwards compatibility. You should take a look at SourceMods timers to see a clean system without all of the weirdly specific behaviors of AMXX.

@IgnacioFDM
Copy link
Contributor Author

Lol, I forgot to git add amxmodx.cpp, I did intend to allow arbitrarily small delays. Luckily you noticed. Yes, I thought about the fact that this would change behavior of plugins with delays <0.1, although don't think it's an issue, since it's probably very rare to have plugins with delays <0.1 (since they didn't "work"), and even then, I can't imagine in these rare instances a plugin to break because of faster execution (in fact it would probably work "better" if <0.1s is what the author intended). Anyway, I'm always strongly against breaking bcompat, but I believe in this case the potential issues are outweighed by not needing entities as workaround and improved timing overall.

If you think it's better to have set_task2 or something to fix bcompat, let's do it that way then, but either way I'd say it's kind of needed being able to have <0.1s tasks.

About the non-strict mode, I don't think adding another flag

"e" - low priority task, might fire up to 0.1s later in order to improve performance

is that complicated or really allows people to shoot themselves in the foot in some new, non obvious way.

@WPMGPRoSToTeMa
Copy link
Contributor

Will be good to create player tasks (specific tasks only for players), because tasks are often used for players. They can be automatically killed on player disconnect and there can be some useful flags, like KillOnDeath, KillOnSpawn etc. I think it would be very convenient.

@IgnacioFDM
Copy link
Contributor Author

It's a nice idea, however I don't have time right now to implement it. I'll be working this thursday/friday on optimizing this PR as nextra said using std::priority_queue and whatnot

@Nextra
Copy link
Contributor

Nextra commented Aug 10, 2016

About the non-strict mode, I don't think adding another flag is that complicated or really allows people to shoot themselves in the foot in some new, non obvious way.

That's not what I'm getting at: The new flag won't prevent plugins shooting themselves in the foot either, and it tackles an issue that I'd view as fairly minor.

As I said above the task system is already filled with a ton of rarely-used special-case stuff that shouldn't be there in the first place. Adding more flags is pretty much the last thing I personally would want to do, even disregarding the fact that documenting anything that is not a definitive upgrade (such as pcvars) with "this might improve performance" can be considered a mistake in this community.

Will be good to create player tasks

This is what I think shouldn't happen at all. The task system is perfectly capable of doing this with minimal bookkeeping on the plugin's side. This doesn't need to be core functionality.

@rsKliPPy
Copy link
Contributor

rsKliPPy commented Aug 10, 2016

A complete overhaul of task/timer system and deprecation of the old one is needed in my opinion. Task IDs are awfully bad, and what's even worse, you can look for tasks outside your plugin and destroy or change them. Task/timers should be handles, right now tasks can't be truly unique.

Will be good to create player tasks

It's a nice idea for an API for plugin makers out there, it could help other scripters deal with tasks easier and with less errors if done right. However, as Nextra said, it's not something that should be core functionality.

@WPMGPRoSToTeMa
Copy link
Contributor

Task IDs are awfully bad, and what's even worse, you can look for tasks outside your plugin and destroy or change them.

Who's using the outside parameter?

@rsKliPPy
Copy link
Contributor

rsKliPPy commented Aug 10, 2016

I hope nobody is, but you never know. Its presence isn't good.
Still, it's not just about the outside parameter, there's other stuff Nextra and I talked about too.

@Nextra
Copy link
Contributor

Nextra commented Aug 10, 2016

Ideally we could retrofit a handle system into the tasks, but I have no idea how sane that would be. A new timer system would be a bit odd I think. The benefits of a cleaner task implementation would be mostly internal, which is defeated by the fact that AMXX would then contain two competing systems that each don't have obvious benefits from an API standpoint.

@rsKliPPy
Copy link
Contributor

rsKliPPy commented Aug 10, 2016

First point is that each task would be truly unique. The way it is currently, two or more tasks can have same IDs. I think it's proper to refer to a task as an object (a handle), making it unique.
It's also ugly to deal with. Having to do stuff like

#define MY_TASK 720
#define MY_OTHER_TASK 752
// ...
set_task(42.0, "MyTaskCallback", id + MY_TASK);
set_task(69.0, "MyOtherTaskCallback", id + MY_OTHER_TASK);

just to make them unique is really bad in my opinion, and still isn't safe - if id was any entity other than a player, you could get into other task's "range". You also deal too much with "magic numbers".

And, as we now have DataPacks, we can transfer data to the callback function with them. Yes, it is possible right now as it is, but you have to deal with arrays, while passing a single handle would be much better. That is what DataPacks should be used for.

I think there are many benefits, starting off from uniqueness of every task, which is the most important. So changes would not only be internal, as plugins will have to treat tasks like handles.

EDIT:
And we could have proper flags (if there will be any) instead of just using letters. A named constant is much easier to remember than what 'a' flag does.

Half-rewrite of task system. The highlights are improved performance due to use of std::vector instead of a linked list, which is faster due to locality, and ordering tasks by execution time so it's not necessary to iterate over the entire task list every frame.

In addition, code is now more modern (C++11 stuff) and should be safer.

Tasks before would be executed in batches every 0.1 seconds, I guess so as to avoid checking every frame if a certain task was due or not.
This however leads to bad performance and many issues.

Bad performance because
- If there are many tasks, instead of them executing in a spread fashion, they will all or most execute together in a single frame. This might lead to the frame not meeting it's allocated time (if sys_ticrate 1000, if a single frame takes more than 1ms this will result in server fps drop of course)
- Because of the limitation and poor timing of tasks (can't have a 0.05 task, or a 0.15 task will be really imprecise), it's a widespread recommendation/workaround to use entities and hook think. This method is obviously slower and has more overhead than a plain task, but was the only valid alternative often.

Because of this, although it might seem counterintuitive, checking for due tasks every frame actually improves performance, since it's all about not having any slow frames and not average CPU usage (besides the fact that it takes negligible CPU time)

Besides, it fixes the following issues
*Need to use entities because of the known limitations
*Huge timing drifts (imagine how much will a repetitive 0.15s task drift over time).

I've tested it with a monster size plugin where the slightest compatibility issue would arise, and seems to be fine.
@IgnacioFDM
Copy link
Contributor Author

@Nextra @WPMGPRoSToTeMa @rsKliPPy

So, I ended half-rewriting the thing. Check out the commit message for more info. I'd like your input.

typedef TaskList::iterator TaskListIter;

TaskList m_Tasks;
std::vector<CTask> m_Tasks;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://forums.alliedmods.net/showpost.php?p=1143191&postcount=8
I don't know if these statements still hold true, but I would advise against using std::vector - that's why we have AMTL. Use ke::Vector instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea about the compatibility part, but comparing to ke::Vector, from some quick benchmarking on MSVC, ke::Vector is consistently slower (around 50% slower for appends and 30% for some mixed workload), there doesn't seem to be a sort algorithm (maybe I'm blind though) which would be necessary here (I think std::sort should work, but then again, you'd be using std), there is no efficient erase remove operation (if you want to remove 100 objects, it's going to be incredibly slow because it will shift everything a 100 times), no move semantics which helps quite a bit with performance too, and whatnot, that's why I went with std.

Anyway, I'd understand if you want to reject this PR because the std dependency, and don't take this comparison personal or as a std vs amtl war, I'm probably wrong in many of the points I mentioned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can include <amtl/am-shared-ptr.h> and use ke::SharedPtr and ke::MakeShared instead of std::shared_ptr and std::make_shared.

As for std::vector, it might be better to just create a priority queue type of container. It should actually make the code simpler.
There's ke::PriorityQueue<T> in AMTL but it doesn't suit our needs as you can't remove items from it. What you would have to do is either expand that one or create a new more specific priority queue sort of container class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I currently don't have time nor need (since this patch already fills my needs), but your suggested changes sound good and I welcome anybody to PR them.

@gtteamamxx
Copy link

Why this PR is waiting so long?

@Arkshine
Copy link
Member

PR has been abandonned. Someone needs to take over to finish it. and due to the important changes, this needs to be carefully reviewed as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants