Right now both flow potential and cumulative current are solved in the same loop.
e.g.:
flow_potential_storage_array = fill(0.0, (size(resistance)..., nthreads()))
cumulative_current_storage_array = fill(0.0, (size(resistance)..., nthreads()))
for i in moving_windows
    # solve flow potential for window i
    # add flow potential for window i to the flow potential storage array
    # solve current flow for window i
    # add current for window i to the cumulative current storage array
end
# sum flow potential along 3rd dim
# sum cumulative current along 3rd dim
So the storage arrays (X by Y by N_THREADS, which are then summed along the N_THREADS dimension) need to be allocated for both at the same time. If flow potential was solved first, it could be summed and stored as an X by Y array, and its storage array freed before allocating the array for storing cumulative current. It shouldn't take any longer to run, but it will be much more memory efficient.
The new code would look like:
flow_potential_storage_array = fill(0.0, (size(resistance)..., nthreads()))
for i in moving_windows
    # solve flow potential for window i
    # add flow potential for window i to the flow potential storage array
end
# sum flow potential along 3rd dim (new object of size size(resistance))
# drop the reference so the 3D array can be garbage-collected
flow_potential_storage_array = nothing
cumulative_current_storage_array = fill(0.0, (size(resistance)..., nthreads()))
for i in moving_windows
    # solve current flow for window i
    # add current for window i to the cumulative current storage array
end
# sum cumulative current along 3rd dim
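For illustration, here is a minimal runnable sketch of the two-pass idea. It is not Omniscape's actual code: `fake_solve` and `accumulate_windows` are hypothetical names, and the stand-in solver just returns a constant map. The sketch also assumes the windows are split into one chunk per storage slice, so each slice has exactly one writer.

```julia
using Base.Threads

# Hypothetical stand-in for a per-window solve (the real solver is not shown).
# Returns a map the same size as `resistance`, filled with the window index.
fake_solve(resistance, i) = fill(Float64(i), size(resistance))

# Accumulate per-window results into an X by Y by N_CHUNKS scratch array,
# then reduce over the 3rd dimension and return an X by Y array.
function accumulate_windows(solve, resistance, windows)
    nchunks = nthreads()
    storage = fill(0.0, (size(resistance)..., nchunks))
    # One chunk of windows per slice, so no two tasks write to the same slice
    chunks = [windows[c:nchunks:end] for c in 1:nchunks]
    @threads for c in 1:nchunks
        slice = view(storage, :, :, c)
        for i in chunks[c]
            slice .+= solve(resistance, i)
        end
    end
    # `storage` is local, so it becomes unreachable once this function returns
    return dropdims(sum(storage, dims = 3), dims = 3)
end

resistance = ones(4, 5)
moving_windows = 1:10

# First pass: only the flow potential scratch array is alive here
flow_potential = accumulate_windows(fake_solve, resistance, moving_windows)
# Second pass: the first scratch array can already be garbage-collected
cumulative_current = accumulate_windows(fake_solve, resistance, moving_windows)
```

Wrapping each pass in a function means the 3D array goes out of scope on return, so no explicit `= nothing` is needed. When the memory is actually reclaimed is still up to the garbage collector.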
Need to give some more thought to memory management in general, though. E.g. maybe there is a better way to do this than allocating a separate array for each thread to make it threadsafe. I just didn't want to bother with locks/unlocks, as I know that can come with a compute time penalty.
The standard question in all of parallel computing. The answer often is - implement it both ways - and use whichever one works better (for some definition of better).
With the threadsafe 3D array vs. locks/unlocks on a 2D array? Yeah, I should, and do some formal benchmarks -- won't hurt for me to get more familiar with locks/unlocks too.
Yes. But you'll need to test it with big problem sizes. You only have a handful of tasks, and I suppose they can do all their partial sums locally. I can't imagine the locking and unlocking for the final part taking more than a second. But I would love to see what you find. Without knowing the size of the 3D array and how it grows with input size, it is hard to tell.
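As a starting point for that comparison, here is a minimal sketch of the lock-based variant, where every thread adds into a single 2D array. The names `fake_solve` and `accumulate_with_lock` are hypothetical, and the stand-in solver is not the real one. Only the add into the shared array is guarded by the lock, so the solve itself still runs in parallel.

```julia
using Base.Threads

# Hypothetical stand-in for a per-window solve (the real solver is not shown).
fake_solve(resistance, i) = fill(Float64(i), size(resistance))

# Accumulate all windows into one shared X by Y array, guarded by a lock.
function accumulate_with_lock(solve, resistance, windows)
    total = zeros(size(resistance))
    lk = ReentrantLock()
    @threads for i in windows
        # The expensive part runs outside the lock
        result = solve(resistance, i)
        # Only one task at a time may update the shared array
        lock(lk) do
            total .+= result
        end
    end
    return total
end

total = accumulate_with_lock(fake_solve, ones(4, 5), 1:10)
```

This version holds one shared X by Y array rather than one slice per thread, at the cost of taking the lock once per window. Whether that cost matters for realistic map and window sizes is what the benchmarks would need to show.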
Yeah -- definitely will need any benchmarks to consider different map sizes, moving window sizes, and other options within Omniscape. It will be a valuable exercise I think.