
Horizontal Scaling #159

Merged: 12 commits from feature/horizontal-scaling merged into master on Nov 14, 2016

Conversation

@inz commented Oct 10, 2016

Adds a new optimization model for horizontal scaling that is intended to replace the original MiniZinc model. The original functionality is not yet fully mirrored, however. (Closes #156. Eventually.)

While I initially wanted to have an unconstrained model that chooses the optimal combination and number of resources from all available instance types, I had to simplify the model to restrict recommendations per ingredient to one resource type and a multiplier (num_resources) representing the number of required instances. The original, unconstrained model is in the commit history for provenance. The unconstrained model unfortunately did not produce results in reasonable time, as the number of possible resource assignments grows unreasonably large for complex applications and large numbers of users.
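
For illustration, a minimal MiniZinc sketch of the simplified formulation (one resource type plus an instance count per ingredient); all identifiers and bounds here are mine, not necessarily those of the model in this PR:

```minizinc
% Illustrative sketch only; names, bounds, and units are invented.
int: num_ingredients;
int: num_types;                      % number of available instance types
set of int: Ingredients = 1..num_ingredients;
set of int: ResourceTypes = 1..num_types;

array[ResourceTypes] of int: cpu;    % in 1/100 cores
array[ResourceTypes] of int: ram;    % in MB
array[ResourceTypes] of int: cost;   % per instance

array[Ingredients] of int: cpu_req;  % aggregate requirement per ingredient
array[Ingredients] of int: ram_req;

% Decision: one resource type and a multiplier per ingredient.
array[Ingredients] of var ResourceTypes: assignment;
array[Ingredients] of var 1..1024: num_resources;

constraint forall(i in Ingredients) (
  num_resources[i] * cpu[assignment[i]] >= cpu_req[i] /\
  num_resources[i] * ram[assignment[i]] >= ram_req[i]
);

solve minimize sum(i in Ingredients) (num_resources[i] * cost[assignment[i]]);
```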

In addition to the new MiniZinc model, we now also represent partial CPU cores as reported by the providers (e.g., for the Amazon t2.* bursting instances and the Google f1- and g1-series). To accommodate this, CPU cores are represented in hundredths of a core in the model.
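
For example (with illustrative values), an f1-micro advertised with 0.2 cores would enter the model as 20, and a full core as 100:

```minizinc
% Hundredths-of-a-core encoding (values are illustrative):
% f1-micro ~0.2 cores -> 20, g1-small ~0.5 -> 50, n1-standard-1 = 1.0 -> 100
array[ResourceTypes] of int: cpu = [20, 50, 100];
```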

The output of the optimization is compatible with the original model, with the addition of a num_resources array that represents the number of instances required to fulfill the ingredient constraints.

Things left to do/discuss

  • Currently, the model will only show recommendations for horizontal scaling. As discussed in Horizontal scaling (#156), we might want to let users select whether they want vertical or horizontal scaling recommendations. For this, I see several possibilities:
    • a) At recommendation generation time (i.e., when hitting 'generate recommendation'), we add a 'vertical/horizontal' checkbox. This would restore the original functionality and allow for the generation of purely vertically scaled recommendations. (But: while I think these vertical scaling recommendations are instructive for clearly seeing the savings potential, I'm not sure we want to prominently show recommendations that basically turn every ingredient into a SPOF.)
    • b) We add a 'scale horizontally' flag to CPU and RAM workloads. With this, we can mirror Cloudorado functionality. For each ingredient, we would compute constraints as before, but add array[Ingredients] of bool: distribute_cpu; and array[Ingredients] of bool: distribute_ram; to the optimization model, and update the model to restrict recommendations to resources that fulfill CPU and/or RAM constraints on a single instance for ingredients with distribute_[cpu|ram] == false (see the sketch after this list). With this approach, we could model ingredients that need, e.g., at least 1 GB of RAM for every instance and scale according to computed CPU constraints (we would also need to set RAM growth per user to 0 then, otherwise the RAM requirement continues to grow).
    • c) We add more complex scaling policies, e.g., minimum and maximum number of instances per ingredient, or minimum RAM/CPU per instance. With this, we can model more complex deployments than currently possible with Cloudorado. We need to supply additional information to the model, such as array[Ingredients] of int: min_num_instances;, array[Ingredients] of int: max_num_instances;, array[Ingredients] of int: min_cpu_per_instance;, and array[Ingredients] of int: min_ram_per_instance;. Overall ingredient constraints are computed as before; the 'per instance' fields would not change with the number of users. This would then allow us to model constraints such as 'a PostgreSQL ingredient should have ≥2 instances, ≤5 instances, 12 MB RAM per user, 1/2500th CPU per user, min 512 MB RAM per instance, min 1 vCPU per instance'.
  • For UI adjustments to request and show horizontally scaled recommendations, see Horizontal Scaling (cloud-stove-ui#66).
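
A sketch of the distribute flags from option b), extending the illustrative model above (again, not the actual model code): if a workload may not be distributed, a single instance must satisfy it by itself; otherwise the aggregate over all instances counts.

```minizinc
array[Ingredients] of bool: distribute_cpu;
array[Ingredients] of bool: distribute_ram;

constraint forall(i in Ingredients) (
  % non-distributable workloads must fit on one instance
  if distribute_cpu[i]
    then num_resources[i] * cpu[assignment[i]] >= cpu_req[i]
    else cpu[assignment[i]] >= cpu_req[i]
  endif
  /\
  if distribute_ram[i]
    then num_resources[i] * ram[assignment[i]] >= ram_req[i]
    else ram[assignment[i]] >= ram_req[i]
  endif
);
```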

Implementation summary

  • A new MiniZinc model to generate horizontal scaling recommendations. It finds one resource type per ingredient, together with a number of instances, e.g., 4x n1-standard-2.
  • We now have scaling_constraints attached to ingredients with a max_number_of_instances attribute to restrict the number of allowed instances per resource for generated recommendations (with 0 for no restriction).
  • A new `scaling_workload` attached to an ingredient lets users decide between vertical and horizontal scaling per ingredient.
  • Partial CPU cores reported by providers are now considered in the model (this change, while not closely related to horizontal scaling, came with the new MiniZinc recommendation model).

@inz inz added the discussion label Oct 10, 2016
@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 October 10, 2016 08:31 Inactive
@joe4dev commented Oct 10, 2016

👍👍 thanks

  • Considering partial CPUs makes total sense to improve fairness.
  • a) I think a global switch doesn't make much sense. The ability to scale out is rather component- (i.e., ingredient-) based.
  • b) This makes the most sense to me. Some ingredients, such as web servers, are eligible for horizontal scaling, whereas for other ingredients, such as DBs, vertical scaling might be more appropriate. Why do we need to set the "RAM growth per user to 0"? Doesn't the workload model yield the required amount of RAM (based on min + growth), which can subsequently be used to calculate the distribution split (considering min, including the number of users it can serve, + growth)? It doesn't seem trivial to find the optimum though 🤔 (kind of a knapsack problem, as we discussed earlier)
  • c) At the current state, I don't think we gain much by demanding even more input from the user (especially regarding instance count boundaries). However, I clearly see the benefit of min_cpu_per_instance and min_ram_per_instance 😏 Can't we "reuse" the minimum values from the workloads for this purpose? 🤔

@inz commented Oct 10, 2016

  • a) agreed.
  • b) I think we have a slight misunderstanding here. If I understand correctly, you are thinking of a 'scale horizontally' flag per ingredient (which we'll call b.1), whereas I was thinking of one flag per workload, similar to Cloudorado, i.e., a 'scale CPU horizontally' and a 'scale RAM horizontally' flag.
    • b.1) I like that the UI with a single 'scale horizontally' flag per ingredient would be simpler than a more complex variant where distribution is decided per workload. We could implement this with one additional parameter for the model, e.g., array[Ingredients] of int: max_num_instances, where we set the maximum number of allowed instances to 0 for no restriction and to 1 for vertical scaling (see the sketch below). This way, we could later introduce more complex scaling rules (i.e., a maximum number of allowed instances) without changing the model.
    • b.2) My thinking here was the following (heavily inspired by Cloudorado, of course): if we have separate scaling flags (Cloudorado calls it 'distribute') for CPU and RAM, then for an ingredient with 'no distribute' for RAM and 'distribute' for CPU, each chosen resource by itself must fulfill the RAM requirement, while the aggregate CPU requirement can be fulfilled by multiple resources. For example, with min 2 GB RAM (no distribute) and 1 CPU per 1000 users, only resources with ≥2 GB RAM are eligible, but for 10k users the 10 CPUs can come from multiple resources (each with ≥2 GB RAM). Hence, RAM growth per user should be 0; otherwise the instances would have to get bigger and bigger with more users, when we basically just wanted to specify a 'min RAM per instance' of 2 GB.
  • c) I agree, we should keep additional user input to a minimum. Ad 'reusing' the minimum values from the workloads: Yes, see b.2 above.
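
A sketch of the b.1 parameter on top of the illustrative model from the PR description (names are mine, not the model's actual identifiers):

```minizinc
array[Ingredients] of int: max_num_instances;

% 0 = no restriction; 1 effectively forces vertical scaling.
constraint forall(i in Ingredients) (
  max_num_instances[i] > 0 -> num_resources[i] <= max_num_instances[i]
);
```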

@inz commented Oct 10, 2016

Ad b.1) One thing we should consider, however: for, e.g., a DB master that we only scale vertically, we would then add a separate ingredient for the horizontally scaled DB slaves.

Overall, I think this approach (b.1) is very reasonable. If you agree I will go ahead and implement the changes in the model and the UI. We can implement more complex scaling scenarios (b.2, c) later if necessary.

@joe4dev commented Oct 10, 2016

  • b.1) If it doesn't slow down MiniZinc, the "maximum number of allowed instances" approach sounds fine 👍
  • b.2) Sounds reasonable. What I meant by considering growth is the (non-trivial) optimization where one takes into account the number of additional users an instance can serve in order to find an optimal distribution. Example: distributing 12 GB using 6× 2 GB instances (min RAM) might be less optimal than using 4× 3 GB instances. To obtain how many users can be served with the additional 1 GB, one would need the RAM growth, right? (A toy calculation follows below.)
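
(To make the growth argument concrete, here is a toy calculation with invented numbers: assume 1 GB minimum RAM per instance plus 2 MB RAM growth per user. A 2 GB instance then serves (2048 − 1024) / 2 = 512 users, a 3 GB instance (3072 − 1024) / 2 = 1024 users. So 6× 2 GB and 4× 3 GB both provide 12 GB, but serve 3072 vs. 4096 users; the comparison is only possible if the RAM growth per user is known.)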

[DB master-slave] Agree, that special case would need some extra treatment.

Yes, that's fine for me 👍👍
If you have no further comments on PR #145, you could merge. I can then test on staging and migrate production too. Then I can safely run the provider updaters to resolve #160.
Afterwards: what do you think I should work on next?

@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 October 12, 2016 10:04 Inactive
@inz inz force-pushed the feature/horizontal-scaling branch from 41950ec to c1c1e8c Compare October 12, 2016 11:43
@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 October 12, 2016 11:43 Inactive
@inz commented Oct 12, 2016

A very interesting error on Wercker:

 test_hierarchical_region_constraint#DeploymentRecommendationTest (19.96s)
        --- expected
        +++ actual
        @@ -1 +1 @@
        -[2577369412, 2577369412, 4116750498]
        +[4116750498, 2577369412, 2577369412]
        test/models/deployment_recommendation_test.rb:74:in `block in <class:DeploymentRecommendationTest>'

I am moderately confused: Array#collect should execute in order. For the tested recommendation, the first two resources are from Azure and the last is from Google, so the 4116750498 region at the beginning doesn't make sense, and neither does the 2577369412 region for the third resource.

@joe4dev commented Oct 12, 2016

Mhmmm 🙃
Is the error reproducible? Have you tried to retry the build and gotten the same outcome?

@inz commented Oct 12, 2016

Hm... no: the second wercker build succeeded. But apparently it is kind of reproducible after all.

Locally, it works fine with guard, but I just ran a local wercker build and it failed with the same message. When I run the local wercker build again, it fails again. I suspect that's because `wercker build` always rebuilds the complete container, which doesn't happen when I hit 'retry' in the web app. 😕

@joe4dev commented Oct 12, 2016

My local Wercker build finally (after ages) passed on the first try 🙄:

[screenshot: wercker-local-build]

@inz commented Oct 12, 2016

Hmm. Interesting. Well I guess then we blame Wercker and assume that it'll work...

@joe4dev joe4dev mentioned this pull request Oct 12, 2016
@joe4dev commented Oct 12, 2016

Tried another time locally after deleting ~/.wercker; passed again.
After merging and rebasing on #162, #165, and #166, we'll see what happens with the build.

@inz commented Oct 12, 2016

Agreed. I'll finish the UI tomorrow, then we can rebase and roll it out.


@inz commented Oct 13, 2016

This PR is now ready for review.

@inz inz force-pushed the feature/horizontal-scaling branch from c1c1e8c to 874cec5 Compare October 13, 2016 10:27
@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 October 13, 2016 10:27 Inactive
@inz inz force-pushed the feature/horizontal-scaling branch from 874cec5 to b4a81a6 Compare October 13, 2016 13:00
@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 October 13, 2016 13:00 Inactive
@inz inz force-pushed the feature/horizontal-scaling branch from b4a81a6 to 7b74651 Compare October 13, 2016 13:01
@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 October 13, 2016 13:01 Inactive
@inz inz added this to the Sensitivity Analysis v1.0 milestone Oct 13, 2016
@inz commented Oct 14, 2016

I drafted a blog post to introduce the scaling policies: https://medium.com/cloud-stove/e1410f816b73

@inz inz force-pushed the feature/horizontal-scaling branch from 7b74651 to 32ec91c Compare October 25, 2016 10:52
@inz inz had a problem deploying to fathomless-escarpment-2-pr-159 October 25, 2016 10:52 Failure
@inz inz had a problem deploying to fathomless-escarpment-2-pr-159 October 25, 2016 11:01 Failure
@inz inz force-pushed the feature/horizontal-scaling branch from f65ac5c to 32ec91c Compare October 25, 2016 11:07
@inz inz had a problem deploying to fathomless-escarpment-2-pr-159 October 25, 2016 11:08 Failure
inz added 9 commits October 25, 2016 13:21
  • Seems to work well when I disable inter-ingredient traffic, but is really slow if traffic is considered.
  • It will now parse as a JSON object (with one empty ingredient at the top of the list).
  • It works fine for smallish RAM and CPU constraints, but takes forever with big values (e.g., 100 GB RAM).
  • Instead of finding the ideal combination among all resources per ingredient, we now search for the ideal number of resources from a single resource type to fulfill all constraints. With this simplification, recommendations are generated quickly. The new model is a drop-in replacement for the original model, except for the additional `resource_count` result. Also, adjust the model to make vCPUs integers again (i.e., vCPUs * 100).
  • Using `max_num_resources` you can limit the number of resources assigned to an ingredient. Set it to `0` for no restrictions. Currently, the model generated by the app does not yet contain a scaling constraint and just defaults to `0`.
  • The scaling constraint specifies the maximum number of instances allowed for any ingredient. The scaling workload is currently a boolean flag to indicate whether the ingredient should be scaled horizontally. The template seed ingredients already use the scaling constraint to keep the DB master as a single instance with multiple horizontally scaled DB slaves. To ease the transition, `Ingredient#update_constraints` does not fail if no scaling workload exists; a default scaling workload that activates horizontal scaling is created instead (see `Ingredient#scaling_workload`).
@inz inz force-pushed the feature/horizontal-scaling branch from 32ec91c to 1427e08 Compare October 25, 2016 11:21
@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 October 25, 2016 11:21 Inactive
```diff
  'regions' => region_codes,
  'vm_cost' => '475.42',
- 'total_cost' => 475416
+ 'total_cost' => 475419
```
@inz commented:

total_cost changed because the number of different resources in the recommendation is added to the costs as a tie-breaker.

For providers like Digital Ocean, where instance costs and specs scale exactly linearly, we have multiple "optimal" solutions and by adding the number of assigned resources to the costs we prefer recommendations with fewer resources.
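
In terms of the illustrative model sketched in the PR description, the tie-breaking objective presumably looks something like this (a sketch, not the actual model code):

```minizinc
% Add the total instance count to the cost as a tie-breaker: among
% cost-equal solutions, the solver prefers fewer resources.
solve minimize
  sum(i in Ingredients) (num_resources[i] * cost[assignment[i]])
  + sum(i in Ingredients) (num_resources[i]);
```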

@joe4dev replied:

Good point! Agreed on preferring solutions with fewer resources 👍
We should keep that in mind if we want to use total_cost one day 😉

@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 October 26, 2016 14:12 Inactive
@inz commented Oct 31, 2016

This PR should be ready. Please review and merge.

/cc @joe4dev

Copying an application set every ingredient to use the horizontal scaling scheme
instead of mirroring the scheme of the copy template.
@inz inz temporarily deployed to fathomless-escarpment-2-pr-159 November 3, 2016 13:30 Inactive
User authentication is done by the application controller by default
@joe4dev left a comment

In general it seems ok to me.

However, the Amazon recommendations for horizontal scaling do not behave the same as the Google ones with regard to partial CPU cores. Google chooses f1-micro (5x) instances with 0.2 cores, whereas Amazon always chooses m3.medium (1x), which is the smallest instance with a full CPU core.
It seems to me that Amazon instances with partial core counts are not considered 😏

Comment about instance choice:
Obviously, horizontal scaling often chooses the weakest instances (e.g., f1-micro, basic-a0). This might sometimes not be suitable for production deployment.

@@ -0,0 +1,75 @@
class ScalingWorkloadsController < ApplicationController

before_action :authenticate_user!
@joe4dev:

Not needed as already present in ApplicationController (rationale: protect every endpoint by default to avoid security breaches when adding new controllers)

Fixed in 770b13c


@inz commented Nov 7, 2016

I'm not sure there is a problem with the recommendations: Amazon t2 instances have a pretty bad price per CPU, so maybe they are not part of recommendations simply because m3 is cheaper? And the Google f1-micro is actually cheaper per full CPU than n1 (though more expensive per GB RAM). So I guess the recommendations kind of make sense.

Yes, horizontal scaling will always choose the smallest instance. I would expect recommendations to include other instance types once RAM requirements are high (since specialized instances should be cheaper per GB RAM) or CPU requirements are very high (since specialized instances should be cheaper per core). We could of course add a minimum RAM/CPU threshold to the workload to prevent these instance types from showing up in recommendations.

Nevertheless, I would merge this PR now if there are no other issues.

@joe4dev commented Nov 14, 2016

Agree 👍

@joe4dev joe4dev merged commit 15f1172 into master Nov 14, 2016
@joe4dev joe4dev deleted the feature/horizontal-scaling branch November 14, 2016 10:13