
LeastExpensive Market Strategy not using number of threads of providers #1147

Open
RoyKas opened this issue Aug 26, 2023 · 7 comments

@RoyKas

RoyKas commented Aug 26, 2023

The section of leastexpensive.py below does not use the provider's number of threads when computing the expected usage. This results in incorrect selection of providers.
Both the job duration and the consumed CPU time are set to the same 'expected_usage', which is 60 seconds by default. This is only correct when the provider uses a single thread. For a higher number of threads, the consumed CPU time should be that multiple of the job duration.
Either the job duration should be divided by the number of threads, or the consumed CPU time should be multiplied by the number of threads.
The first option will favor providers with a higher number of threads; the second will favor providers with a lower number of threads.
If both are undesirable, a third option is to adjust both using the square root of the number of threads, or some other in-between solution.

    expected_usage = []

    for resource in linear.usage_vector:
        if linear.price_for[resource] > self._max_price_for[resource]:
            self._logger.debug(
                "Rejected offer %s: price for '%s' higher than price cap %f.",
                offer.id,
                resource,
                self._max_price_for[resource],
            )
            return SCORE_REJECTED

        if linear.price_for[resource] < 0:
            self._logger.debug("Rejected offer %s: negative price for '%s'", offer.id, resource)
            return SCORE_REJECTED

        expected_usage.append(self._expected_time_secs)
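A hypothetical sketch of the three adjustment options described above (the `threads` parameter, the `mode` switch, and the function itself are illustrative, not part of the actual strategy API):

```python
import math


def build_expected_usage(expected_time_secs: float, threads: int, mode: str = "none"):
    """Return (expected_duration_secs, expected_cpu_secs) for a provider
    advertising `threads` threads, assuming the task fully loads it."""
    if mode == "divide_duration":
        # Option 1: shorter expected wall-clock time; favors more threads.
        return expected_time_secs / threads, expected_time_secs
    if mode == "multiply_cpu":
        # Option 2: more expected cpu-seconds consumed; favors fewer threads.
        return expected_time_secs, expected_time_secs * threads
    if mode == "sqrt":
        # Option 3: split the difference via the square root of the thread count.
        root = math.sqrt(threads)
        return expected_time_secs / root, expected_time_secs * root
    # Current behavior: both components get the same value.
    return expected_time_secs, expected_time_secs
```

For example, with the default 60 seconds and a 4-thread provider, option 1 yields (15, 60), option 2 yields (60, 240), and option 3 yields (30, 120).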
@RoyKas RoyKas added the bug Something isn't working label Aug 26, 2023
@krunch3r76
Contributor

krunch3r76 commented Sep 18, 2023

first and foremost, thanks for flagging this - such observations are crucial for the community to evolve and refine the project.

the essence of CPU time is that it aggregates the compute cycles used, regardless of the number of cores at play. even when multiple cores are working in parallel, their individual compute efforts, when tallied, present a consistent picture, i.e. they add up to the same total CPU time. put plainly, a fixed amount of CPU time is divided across all the cores. these references can provide more granularity:

[1] CPU time formula
[2] CPU Second definition
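as a tiny numeric sketch of this point (all values hypothetical):

```python
cores = 4
wall_time_secs = 15.0  # the job runs for 15 s of wall-clock time

# cpu time aggregates the compute done on every core:
cpu_secs = cores * wall_time_secs  # 60 cpu-seconds, all cores fully loaded

# a single-core machine doing the same work would take 60 s of wall time
# and consume the same 60 cpu-seconds
single_core_wall_time_secs = cpu_secs / 1
```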

i'm just another contributor like you, aiming to bring more clarity and efficiency to the project. your insights are invaluable, and i appreciate the time you took to articulate this.

@shadeofblue
Contributor

@RoyKas well, the default strategy is by necessity a simplification, and it does take the liberty of erring on one side or the other in the specific cases you described...

Given that it's hard to arrive at an uncomplicated solution that covers all bases - as you have observed yourself - we're providing something that tries to give a more-or-less sensible score without differentiating between single-threaded and multi-threaded tasks...

As an app developer, you're free to implement a strategy that will give a better score for the scenarios you know you will be running...

@shadeofblue shadeofblue added not a bug wontfix This will not be worked on and removed bug Something isn't working wontfix This will not be worked on labels Sep 19, 2023
@krunch3r76
Contributor

krunch3r76 commented Sep 19, 2023

expected_time_secs is fundamental to the scoring function when comparing offers, as it essentially serves as the sole scaling factor for both CPU time and duration. to understand the role of expected_time_secs, let's examine the relevant code snippets:

in the code:

    for resource in linear.usage_vector:
        if linear.price_for[resource] > self._max_price_for[resource]:
            self._logger.debug(
                "Rejected offer %s: price for '%s' higher than price cap %f.",
                offer.id,
                resource,
                self._max_price_for[resource],
            )
            return SCORE_REJECTED

        if linear.price_for[resource] < 0:
            self._logger.debug("Rejected offer %s: negative price for '%s'", offer.id, resource)
            return SCORE_REJECTED

        expected_usage.append(self._expected_time_secs)

    # The higher the expected price value, the lower the score.
    # The score is always lower than SCORE_TRUSTED and is always higher than 0.
    score = SCORE_TRUSTED * 1.0 / (linear.calculate_cost(expected_usage) + 1.01)

expected_usage is populated with the value of expected_time_secs for each of the linear_coeffs usage counters - duration and cpu, in whichever order they appear. this list is then passed to the calculate_cost method of the ComLinear class:

    def calculate_cost(self, usage: List):
        usage = usage + [1.0]  # append the "usage" of the fixed component
        return sum([c * usage[i] for (i, c) in enumerate(self.linear_coeffs)])

here, each of the first two coefficients from linear_coeffs is multiplied by its corresponding value in expected_usage - by default 60 seconds for the cpu counter and 60 seconds for the duration counter - while the fixed start coefficient is weighted 1 to 1. the results are summed.
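as a small worked sketch of this arithmetic (the coefficient values and the SCORE_TRUSTED value below are illustrative assumptions, not real offer data):

```python
from typing import List

SCORE_TRUSTED = 100.0  # assumed value, for illustration only


def calculate_cost(linear_coeffs: List[float], usage: List[float]) -> float:
    # same arithmetic as ComLinear.calculate_cost: dot product of the
    # coefficients with the usage vector plus the fixed "start" component
    usage = usage + [1.0]
    return sum(c * usage[i] for i, c in enumerate(linear_coeffs))


# hypothetical offer: 0.001 GLM/cpu-second, 0.0005 GLM/duration-second,
# 0.01 GLM fixed start fee; expected_time_secs = 60 for both counters
coeffs = [0.001, 0.0005, 0.01]
expected_usage = [60.0, 60.0]

cost = calculate_cost(coeffs, expected_usage)  # 0.06 + 0.03 + 0.01 = 0.1
score = SCORE_TRUSTED / (cost + 1.01)          # 100 / 1.11 ≈ 90.09
```

a cheaper offer shrinks the denominator and so raises the score, which stays below SCORE_TRUSTED and above 0.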

after tracing the logic, i see your point: with more threads the wall-clock duration would be relatively less than the cpu time, so the expected duration should weigh less than the expected cpu time in the score. however, this strategy does not distinguish the two. as @shadeofblue said, a custom strategy, modeled after this one, would need to be developed.

and even when scoring single-core providers, clock speeds may differ. multithreaded comparisons might then need to factor in clock speed as well - would 8 threads at 3GHz finish any faster than 4 threads at 6GHz, for example? . . .

thanks for reading along, great discussion!

@RoyKas
Author

RoyKas commented Sep 23, 2023

The point I'm trying to make is that the usage vector is populated with _expected_time_secs both for duration and for cpu.seconds, which implies a single-core/single-threaded processor. For processors with more cores/threads, the pricing for duration and for cpu-seconds will be weighted differently; e.g. cpu.seconds will be approximately an integer multiple of the duration, assuming the task fully loads the processor.
I see that a solution will not be straightforward, but the code should make it clear to requestors that this default market strategy uses this simplification.

@krunch3r76
Contributor

krunch3r76 commented Sep 24, 2023

good point. the docstring could make it explicit that the parameter expected_time_secs is a scaling factor for both env/hr and cpu/hr. perhaps if someone, like you or myself, were to add that to the docstring and submit a pull request, the issue could be closed as satisfactorily resolved once merged - after comments and any subsequent revisions?
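a hypothetical sketch of such a docstring note (the class name reflects the strategy under discussion; the wording is only a suggestion):

```python
class LeastExpensiveLinearPayuMS:
    """Score offers by their expected linear cost, preferring the cheapest.

    Note: `expected_time_secs` is used as the expected usage value for *both*
    the duration counter and the cpu-seconds counter. This is a deliberate
    simplification that is only exact for a single-threaded provider: on a
    multi-threaded provider under full load, cpu-seconds accumulate faster
    than wall-clock duration, so the score does not account for thread count.
    Requestors who need that distinction should implement a custom strategy
    modeled on this one.
    """
```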

@RoyKas
Author

RoyKas commented Oct 15, 2023

I haven't done any such code changes or pull requests before - would you be willing to take this on?

@krunch3r76
Contributor

> I haven't done any such code changes or pull requests before - would you be willing to take this on?

good to know that would resolve the issue! i will see what i can come up with in the near future! 😊
