Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scale itp calculation based on total throughput of user #1831

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

XInTheDark
Copy link

@XInTheDark XInTheDark commented Oct 8, 2023

Scale itp calculation based on total throughput instead of active runs count.

This distributes the number of cores more fairly between users, and allows for an effective way to transfer/split TP between runs. (See #1747 (comment))

e.g. It makes sense that 2 STCs, each at 50% TP, is now considered equivalent to 1 at 100% TP.

Refer to the table here for some examples: #1831 (comment)

This distributes the number of cores more fairly between users, and allows for an effective way to transfer/split TP between runs. (See official-stockfish#1747 (comment))
@vondele
Copy link
Member

vondele commented Oct 8, 2023

don't quite agree with this. A user that has 6 successful LTC tests running should be getting more resources, in the interest of the project. The goal is not divide resources as evenly as possible, but such that SF progress is fastest.

@XInTheDark
Copy link
Author

A user that has 6 successful LTC tests running should be getting more resources, in the interest of the project.

I disagree with this - if this was the case we should not have introduced #1747 at all and stick to LLR and TC-based distribution only.

And comparing this PR to current master, there is in fact an advantage - current master encourages users to only submit a few tests, and concentrate resources on these few tests only. But many times the user doesn't know which tests will succeed, so is inclined to first submit a number of variations, let them run for a bit, then choose the best variation and possibly improve on it. Each test, then, can be set to eg. 50% TP.

Let's say we have 5 STCs each at 100TP: Currently each test will get 100 * 36/(36+5*5) = 59 itp, for a total of 295 itp. But say we have 10 STCs each at 50TP: Each test will get 100 * 36/(36+10*10) = 26 itp, for a total of 260 itp only. Without regard for TP, the user is not incentivized to submit more tests. This is counterproductive to the goal of elo gain IMO.

This PR benefits this type of users without impacting the others - eg. for the example you gave, if the user has 6 LTC tests running at high LR, the same number of cores would still be given compared to master.

@vdbergh
Copy link
Contributor

vdbergh commented Oct 8, 2023

I think what Vondele wants to say is that since you reduce based on total itp, a user with many LTC's will get fewer resources (since they have higher itp).

@cngstefan
Copy link

cngstefan commented Oct 8, 2023

Also another case would be odd with this patch:
Consider i have running 6 tests with 100% TP. If i now set 50% TP for all of this tests i would expect that overall half the cores than before are used. But if we do the math its comes up that each tests runs with a 60% higher itp than before.

The main problem is the new quadratic count formula which decreases total itp for more then 6 tests. We should fix the total itp for count >= 6 at the maximum at count=6 and simply divide by count. Then par example if you have 6 tests running and add 6 more the have the same number of cores (as at count=6) but each single test have half the speed as @XInTheDark wants. And this has the advantage that you doesn't have to fiddle with the TP of the tests because it simple works automatically.

I think this code can do it (untested):

        # Malus for too many active runs
        # We scale based on total TP across all runs from this user
        capped_count = min(count, 6)
        itp *= 36.0 / (36.0 + capped_count * capped_count ) * capped_count  / count

@locutus2
Copy link
Member

locutus2 commented Oct 8, 2023

Sorry, wrong account again. I have to look if i can open links from an external application like discord in firefox in the correct container (i have different one for personal use and business).

@cngstefan
Copy link

cngstefan commented Oct 8, 2023

For visual validation i have plotted old and new total itp for n tests:
grafik

@locutus2
Copy link
Member

locutus2 commented Oct 8, 2023

Damn again. But i have no found how you can setup this. But my problem this reacts only to the hostname and i need to differentiate between my private and business repos at github and that seems not to work.

@XInTheDark
Copy link
Author

XInTheDark commented Oct 8, 2023

I think what Vondele wants to say is that since you reduce based on total itp, a user with many LTC's will get fewer resources (since they have higher itp).

Using TP would solve this problem I think? I realise that the calculation here does not really require itp since it's only a multiplier for the original itp.

@XInTheDark
Copy link
Author

The main problem is the new quadratic count formula which decreases total itp for more then 6 tests.

I agree with this, using a quadratic formula is weird IMO, as it instead decreases total itp when number of tests is large. This doesn't make sense - the most we should do is cap the total itp at 300, not decrease it beyond that.

@XInTheDark
Copy link
Author

I think this code can do it (untested):

        # Malus for too many active runs
        # We scale based on total TP across all runs from this user
        capped_count = min(count, 6)
        itp *= 36.0 / (36.0 + capped_count * capped_count ) * capped_count  / count

I think this works, updated!

@locutus2
Copy link
Member

locutus2 commented Oct 8, 2023

@XInTheDark
You miss understand. for my formula we doesn't use the total itp (its not even needed to get the desired effect). Simply use number of running tests. Else we will have again the odd problem i pointed out (and perhaps vondeles).

@XInTheDark
Copy link
Author

XInTheDark commented Oct 8, 2023

To summarise with a few examples:

Scenario PR Master
1x50% STC 50*0.97 = 49 itp per test same
6x100% STCs 100x0.5 = 50 itp per test, 300 itp total same
6x50% STCs 50x0.4 = 20 itp per test, 240 itp total 50x0.5 = 25 itp per test, 300 itp total
12x50% STCs 50x0.5 = 25 itp per test, 300 itp total 50x0.2 = 10 itp per test, 120 itp total
24x100% STCs 100x0.125 = 12.5 itp per test, 300 itp total 100x0.059 = 5.9 itp per test, 142 itp total
10x10% STCs 10x0.97 = 9.7 itp per test, 97 itp total 10x0.265 = 2.65 itp per test, 26.5 itp total

@XInTheDark
Copy link
Author

@XInTheDark
You miss understand. for my formula we doesn't use the total itp (its not even needed to get the desired effect). Simply use number of running tests. Else we will have again the odd problem i pointed out (and perhaps vondeles).

I don't think the problem occurs (refer to row 2 and 3 of the table above, the itp is indeed handled properly)

@XInTheDark XInTheDark changed the title Scale itp calculation based on total itp of user Scale itp calculation based on total throughput of user Oct 8, 2023
@locutus2
Copy link
Member

locutus2 commented Oct 8, 2023

@XInTheDark
But that is the point i made. If par example 6 STC 100% have total itp of 300 then 6 STC 50% should have half of this because thats the meaning of using only 50% throughput IMO. Else it becomes a bit intransparent (and i see no good reason for this behavior). My version avoids this. As said for number of tests <= 6 no changes needed it works as expected, only > 6 is odd. Why make it more complicate for no real benefit? Do it simple.

@XInTheDark
Copy link
Author

@locutus2 The specific problem observed with the simpler version is when there is a large number of tests, each with a low TP. For example 10x10% TP - master gives 2.65 itp per test, your version gives 5% , and this PR version gives 9.7%.

@locutus2
Copy link
Member

locutus2 commented Oct 8, 2023

@XInTheDark
But why will you have all tests with reduced TP? This makes not sense. Do simply the 10 tests with 100% and with my formula to get the total 300 itp (the maximum possible). Using here TP 10% means that i explicit say that i only want 10% core usage for this tests incomparison to 100% TP (as said i see no usecase here which makes sense, only that you want gives cores give back to fishtest so that it can be used for important tests of other users but normally you doesn't manage this).

Question:
Say you have 6 tests. 3 with 100% TP and the other 3 with 50% TP. How many cores should they use? For me its clear that the second group should use only half of the cores than the first one (and cores are directly proportional to itp). And that will not happen with your version because its skews the ratios, so throughput is not easy interpretable as it designed for (a simple direct ratio of core usage).

@XInTheDark
Copy link
Author

@locutus2 In the scenario that you mentioned, the 50% test will still have half the TP of the 100% test. The multiplier is based on total TP, so it’s the same multiplier for all tests.

@XInTheDark
Copy link
Author

In fact in this specific example, my PR gives 64/32 itp respectively, for a total of 288 itp. but your version gives just 50/25, for a total of 225 itp. Can be seen that this PR makes more sense in giving close to 300 itp.

@locutus2
Copy link
Member

locutus2 commented Oct 9, 2023

@XInTheDark
I think i know understand more from where comes the confusion.

Throughput like it currently used and expected should be a scaling factor for used cores for a sinlge test without affecting other tests. Which means par example if i half the throughput of a test exactly this test will use only half the cores but nothing is changed for the other tests. This has especially use for maintainers if an important test has to be pushed (par example for a new TCEC version).

The usage especially highlights the both cases 6x 100% vs 6x 50%: here the second group should have exactly half the cores then the first group because each test should use also half the cores. And this meaning should not be changed.

But you need some other tweaking because your target is mainly to redistribute the cores between your tests but not changing the total core budget. For this a new parameters (say Weight = W) should be added as relative weight to each test (default could be 100 but this is abitrary). Then an additional modificator could be applied which correct the itp according to ratio of test weight and average test weight:

average_weight = total_weight / count
itp *= run["args"]["weight"] / average_weight 

Then we can separately control the effects and don't mix it up:

  • throughput: changes proportionally the cores for a single test (not affecting other tests). This changes total core usage
  • weight: this changes only the relative core usage between your tests and doesn't change the total core usage. Here if a test has the double weight in comparison to another test it will get the double amount of cores than the other.

@XInTheDark
Copy link
Author

The concept looks fine, but I think it's too complicated to have two parameters to calculate the TP. Also, I think the problem of not getting exactly half the cores at 50% TP is not really important - with this PR the number of total itp for the user stays around the same with a sufficiently large number of tests. i.e. as long as total TP remains the same total itp will remain the same. So TP in that case would be used to adjust the relative proportions ("weight" as mentioned in your message) of the itp of the respective tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants