The shared tensor mechanism introduced in #58 operates by providing several output tensors to each runner for writing their `RunnerModel` results. When all output tensors have been used, the runner goes back to the first tensor (like a ring buffer) and repeats the whole process. To ensure that an output tensor is not overwritten before any consumer reads from it, we've added a simple synchronization procedure: a producer cannot access a shared tensor before any consumer releases its `Event`.
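For reference, here is a minimal sketch of that ring-buffer protocol as described above. The names (`SharedTensorRing`, `acquire_for_write`, `release`) are hypothetical, and the real implementation may use CUDA/IPC events rather than `threading.Event`; the point is only the fixed write order enforced by `next_slot`.

```python
# Hypothetical sketch of the current protocol: one release event per slot,
# and writes that must proceed strictly in ring order.
import threading
from typing import List

import torch


class SharedTensorRing:
    def __init__(self, tensors: List[torch.Tensor]):
        self.tensors = tensors
        # One release event per slot; initially set so the first pass
        # through the ring does not block.
        self.release_events = [threading.Event() for _ in tensors]
        for event in self.release_events:
            event.set()
        self.next_slot = 0  # writes must follow ring order

    def acquire_for_write(self) -> int:
        """Block until the *next* slot in ring order has been released."""
        slot = self.next_slot
        self.release_events[slot].wait()   # may block even if other slots are free
        self.release_events[slot].clear()  # mark the slot as in use
        self.next_slot = (slot + 1) % len(self.tensors)
        return slot

    def release(self, slot: int) -> None:
        """Called by a consumer once it has finished reading the slot."""
        self.release_events[slot].set()
```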
Unfortunately, the current implementation may lead to cases in which the writer runner waits on one shared tensor while another shared tensor has already been released. We have not implemented a way to randomly access any tensor in the ring buffer; a write to tensor N must always be followed by a write to tensor N+1 (mod `len(buffer)`). Thus, if the consumer reading from tensor N+1 releases it before the consumer on tensor N finishes, the writer blocks on tensor N even though tensor N+1 is immediately available. This head-of-line blocking adds unnecessary wait latency and hurts the overall performance of the job.
A simple solution would be to keep yet another queue (within the local process) of released shared tensors, so that writes do not have to follow a fixed ring order.
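A minimal sketch of that idea, again with hypothetical names: the producer takes whichever slot index was released first from a local `queue.Queue`, instead of blocking on a fixed ring position.

```python
# Hypothetical sketch of the proposed fix: a local pool of released slot
# indices, consumed in release order rather than ring order.
import queue
from typing import List

import torch


class ReleasedSlotPool:
    def __init__(self, tensors: List[torch.Tensor]):
        self.tensors = tensors
        self.released: "queue.Queue[int]" = queue.Queue()
        # All slots start out available.
        for slot in range(len(tensors)):
            self.released.put(slot)

    def acquire_for_write(self) -> int:
        # Blocks only when *no* slot has been released yet, not on a
        # particular slot.
        return self.released.get()

    def release(self, slot: int) -> None:
        # Consumer hands the slot back once it is done reading.
        self.released.put(slot)
```

Since `Queue.get()` blocks only when the pool is empty, an early release of tensor N+1 can be consumed immediately, which removes the head-of-line blocking described above.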