The memory store allows multiple segments to run per node for the same repair #1519
Comments
Hello, Alexander! Sometimes it would be great to repair more than just one segment per node per repair - it would be much faster if the node has enough resources. This would be useful when we need to run a full segmented repair as fast as we can in an emergency situation (for example, when consistency in a multi-DC cluster has just been broken and we know about it). Can we add a parameter to set the maximum number of segments allowed to run in parallel on a node for the current repair? Of course we must understand the risk of impacting a node if this value is set very high, but if it is less than or equal to 4, this should not cause any problem for powerful nodes. I mean, why is this limitation a dogma, and couldn't it be slightly increased with the possibility of flexible adjustment if necessary?
With the current state of the implementation, what you can do is allow n concurrent repair runs, and if you create a run per table, that will give you some concurrency. To reduce the overhead of segments and increase the pressure on your nodes, you can lower the number of segments, which will make them bigger and lead to shorter overall execution times.
Yes, that is an interesting way to reduce the number of segments. But sometimes big segments are a big problem when repairing a large node or a multi-DC cluster. Repairing a node with 1 TB of data in a single segment will be riskier and take longer, even with 4 threads, than 4 segments running in parallel. It is a pity that this feature will not be added in the future - it would be great to use on some production clusters.
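For illustration only (this is not part of Reaper), the knob requested above could be expressed as a per-node semaphore that bounds how many segments of one run may execute concurrently on the same node; the class and the `maxSegmentsPerNode` setting are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the requested setting: cap how many segments of one
// repair run may execute on the same node at the same time.
public class PerNodeSegmentLimiter {

  private final int maxSegmentsPerNode; // e.g. 4, as suggested in the comment
  private final Map<String, Semaphore> permitsByNode = new ConcurrentHashMap<>();

  public PerNodeSegmentLimiter(int maxSegmentsPerNode) {
    this.maxSegmentsPerNode = maxSegmentsPerNode;
  }

  /** Returns true if the node still has capacity for another segment of this run. */
  public boolean tryAcquire(String nodeAddress) {
    return permitsByNode
        .computeIfAbsent(nodeAddress, n -> new Semaphore(maxSegmentsPerNode))
        .tryAcquire();
  }

  /** Must be called when the segment finishes, whether it succeeded or failed. */
  public void release(String nodeAddress) {
    Semaphore permits = permitsByNode.get(nodeAddress);
    if (permits != null) {
      permits.release();
    }
  }
}
```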
It looks like the memory store doesn't honor the guarantee of scheduling a single segment per node per repair.
Instead, it's the maxParallelRepair concurrency limit that seems to be applied, as two repairs appear to run at once.
At any time, one replica should run at most one segment per allowed repair, and never multiple segments for the same repair.
In the Cassandra storage implementation, we use LWTs to guarantee that.
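As an illustration only (not the actual Reaper code), an in-memory store could mirror the guarantee that the Cassandra backend gets from LWTs (a conditional `INSERT ... IF NOT EXISTS`) by taking an atomic lease keyed on the repair run and the node; the class below is a hypothetical sketch:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: enforce "at most one segment per node per repair run"
// in memory. The atomic putIfAbsent plays the role that a conditional
// INSERT ... IF NOT EXISTS (LWT) plays in the Cassandra storage backend.
public class InMemorySegmentLeases {

  // Key: "<repairRunId>|<nodeAddress>", value: the segment currently holding the lease.
  private final Map<String, UUID> leases = new ConcurrentHashMap<>();

  /**
   * Atomically takes the lease; returns false if another segment of the same
   * run is already running on that node.
   */
  public boolean lockNodeForSegment(UUID repairRunId, String nodeAddress, UUID segmentId) {
    return leases.putIfAbsent(repairRunId + "|" + nodeAddress, segmentId) == null;
  }

  /** Releases the lease only if it is still held by the given segment. */
  public void releaseNode(UUID repairRunId, String nodeAddress, UUID segmentId) {
    leases.remove(repairRunId + "|" + nodeAddress, segmentId);
  }
}
```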
Definition of Done
┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: REAP-2