Do resolutions within a module in parallel #391
base: main
Conversation
… multi-core systems
I think this is a nice improvement
// This is required for Scala 2.12/2.13 compatibility for parallel collections
// (see https://github.com/scala/scala-parallel-collections/issues/22)
private[internal] object CompatParColls {
  val Converters = { import Compat._; { import scala.collection.parallel._; CollectionConverters } }
  object Compat { object CollectionConverters }
}
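For context, the trick from the linked issue works by shadowing: on Scala 2.13 the real scala.collection.parallel.CollectionConverters (from the scala-parallel-collections dependency) wins, providing the .par extension; on 2.12 the empty dummy object wins and .par resolves to the built-in implicit. A self-contained sketch, assuming the scala-parallel-collections dependency on 2.13 (the Demo object and the numbers are illustrative, not part of the PR):

```scala
// Same shadowing shim as in the diff, minus the access modifier so it
// compiles standalone.
object CompatParColls {
  val Converters = { import Compat._; { import scala.collection.parallel._; CollectionConverters } }
  object Compat { object CollectionConverters }
}

object Demo {
  import CompatParColls.Converters._

  def main(args: Array[String]): Unit = {
    // On 2.12, .par comes from the standard library; on 2.13, from
    // scala-parallel-collections' CollectionConverters.
    val sum = (1 to 1000).par.map(_ * 2).sum
    assert(sum == 1001000)
    println(sum)
  }
}
```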
This code already exists in sbt, but is not public: https://github.com/sbt/sbt/search?q=CollectionConverters
@alexarchambault could you take a look at this PR and see what you would like changed for it to be mergeable? (Or reject it if you think this work is no good)
Thanks @craffit. I'm not against parallelizing this code, especially since it brings performance gains. The only issue here is that the thread pool these calculations run on is not clear / not explicit. I've been bitten by this again very recently, here for example.
So either the parallel collection should be configured to run on an explicit thread pool (it seems this can be done, as shown here, although I have no experience with this API), or things should be parallelized manually using futures and Future.sequence to go from List[Future[…]] to Future[List[…]] (all of that while passing those methods a custom thread pool / execution context).
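The manual Future.sequence approach could look something like this sketch, where resolve and the module list are placeholders rather than coursier API, and the pool is sized to the available cores:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ParallelResolutionSketch {
  // Hypothetical stand-in for a per-module resolution step.
  def resolve(module: String): String = s"resolved:$module"

  def main(args: Array[String]): Unit = {
    // Explicit, named thread pool, so it is clear where the work runs.
    val pool = Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors())
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)

    val modules = List("a", "b", "c")
    // List[Future[String]] -> Future[List[String]]
    val results: Future[List[String]] =
      Future.sequence(modules.map(m => Future(resolve(m))))

    val resolved = Await.result(results, Duration.Inf)
    assert(resolved == List("resolved:a", "resolved:b", "resolved:c"))
    println(resolved)
    pool.shutdown()
  }
}
```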
As a thread pool to run those calculations, we could use either
- the thread pool used by the coursier cache (but its thread count defaults to 6, which might not correspond to the number of cores that are available…)
- a thread pool lazily started just for this via coursier.cache.internal.ThreadUtil.fixedThreadPool (which can be kept in a lazy field on an object somewhere; its threads automatically shut down after 1 minute of inactivity, so there's no need to shut them down manually)
I think both thread pools ought to do the job…
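The second option could be sketched with plain JDK executors; the ResolutionPool name is illustrative (not coursier API), and the keep-alive setup mimics the described behavior of ThreadUtil.fixedThreadPool, where idle threads expire on their own:

```scala
import java.util.concurrent.{LinkedBlockingQueue, ThreadFactory, ThreadPoolExecutor, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ResolutionPool {
  // Lazily initialized, so no threads are started until first use.
  lazy val executionContext: ExecutionContext = {
    val nThreads = Runtime.getRuntime.availableProcessors()
    val daemonFactory = new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "resolution-pool")
        t.setDaemon(true) // don't keep the JVM alive
        t
      }
    }
    val executor = new ThreadPoolExecutor(
      nThreads, nThreads,
      1L, TimeUnit.MINUTES, // idle threads die after 1 minute
      new LinkedBlockingQueue[Runnable],
      daemonFactory
    )
    executor.allowCoreThreadTimeOut(true) // core threads may time out too
    ExecutionContext.fromExecutorService(executor)
  }
}

object PoolDemo {
  def main(args: Array[String]): Unit = {
    implicit val ec: ExecutionContext = ResolutionPool.executionContext
    val n = Await.result(Future(21 * 2), Duration.Inf)
    assert(n == 42)
    println(n)
  }
}
```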
Profiling showed conflict resolution to be a performance hotspot. CPU utilization was about 2 cores on my machine; after parallelizing the top-level resolution loop, utilization went to about 4 cores for my local workload.
With this change I measured a performance improvement on sbt update from 50 seconds to 30 seconds.