batchRoute: Better handle job cancellation and deletion #1092

Merged 1 commit on Nov 8, 2024
@@ -195,12 +195,9 @@ class TrRoutingBatch {
checkpointTracker.completed();

this.options.progressEmitter.emit('progress', { name: 'BatchRouting', progress: 1.0 });
this.options.progressEmitter.emit('progress', { name: 'StoppingRoutingParallelServers', progress: 0.0 });

const stopStatus = await TrRoutingProcessManager.stopBatch();

this.options.progressEmitter.emit('progress', { name: 'StoppingRoutingParallelServers', progress: 1.0 });
console.log('trRouting multiple stopStatus', stopStatus);
// FIXME Should we return here if the job is cancelled? Or we still
Collaborator

Do you have an issue to track this? It seems important to get an answer to that question.

// generate the results that have been calculated since now?

// Generate the output files
this.options.progressEmitter.emit('progress', { name: 'GeneratingBatchRoutingResults', progress: 0.0 });
@@ -253,6 +250,14 @@ class TrRoutingBatch {
console.error(`Error in batch routing calculation job ${this.options.jobId}: ${error}`);
throw error;
}
} finally {
// Make sure to stop the trRouting processes, even if an error occurred
this.options.progressEmitter.emit('progress', { name: 'StoppingRoutingParallelServers', progress: 0.0 });

const stopStatus = await TrRoutingProcessManager.stopBatch();

this.options.progressEmitter.emit('progress', { name: 'StoppingRoutingParallelServers', progress: 1.0 });
console.log('trRouting multiple stopStatus', stopStatus);
}
};
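
The `finally` block in this hunk can be illustrated in isolation. What follows is a minimal sketch, not the project's code: `runBatchWithCleanup`, `calculate`, `stopServers`, `emitProgress`, and `ProgressUpdate` are hypothetical stand-ins for the batch calculation, `TrRoutingProcessManager.stopBatch()`, and the progress emitter. It assumes only that the stop step must run whether the calculation succeeds or throws.

```typescript
// Hypothetical shape of a progress event, modelled on the payloads in the diff.
type ProgressUpdate = { name: string; progress: number };

// Runs the calculation and always stops the routing servers afterwards.
// An error thrown by `calculate` still propagates to the caller, but only
// after the cleanup in `finally` has completed.
async function runBatchWithCleanup(
    calculate: () => Promise<void>,
    stopServers: () => Promise<string>,
    emitProgress: (update: ProgressUpdate) => void
): Promise<void> {
    try {
        await calculate();
    } finally {
        // Reached on success and on a thrown error alike.
        emitProgress({ name: 'StoppingRoutingParallelServers', progress: 0.0 });
        await stopServers();
        emitProgress({ name: 'StoppingRoutingParallelServers', progress: 1.0 });
    }
}
```

With the stop step in `finally`, a failing or cancelled calculation no longer leaves the parallel servers running, which appears to be the intent of moving the `stopBatch()` call out of the success path.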

12 changes: 9 additions & 3 deletions packages/transition-backend/src/tasks/TransitionWorkerPool.ts
@@ -30,8 +30,14 @@ function newProgressEmitter(task: ExecutableJob<JobDataType>) {
});
eventEmitter.on('checkpoint', (checkpoint: number) => {
console.log('Task received checkpoint ', checkpoint);
task.attributes.internal_data.checkpoint = checkpoint;
task.save();
// Refresh the task before saving the checkpoint
task.refresh()
.then(() => {
// Add checkpoint, then save the task
task.attributes.internal_data.checkpoint = checkpoint;
task.save().catch(() => console.error('Error saving task after checkpoint'));
})
.catch(() => console.error('Error refreshing task before saving checkpoint')); // This will catch deleted jobs
});
return eventEmitter;
}
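
The refresh-then-save sequence in this hunk can also be written with `async`/`await`. This is a sketch under stated assumptions: `CheckpointTask` is a hypothetical interface that models only the members visible in the diff (`refresh`, `save`, `attributes.internal_data.checkpoint`), and `saveCheckpoint` with its boolean return value is introduced here for illustration. It is not the actual `ExecutableJob` API.

```typescript
// Hypothetical minimal view of a job, limited to what the diff uses.
interface CheckpointTask {
    attributes: { internal_data: { checkpoint?: number } };
    refresh(): Promise<void>;
    save(): Promise<void>;
}

// Refreshes the task first so a deleted job is detected before writing.
// Returns true only if the checkpoint was saved.
async function saveCheckpoint(task: CheckpointTask, checkpoint: number): Promise<boolean> {
    try {
        await task.refresh();
    } catch (error) {
        // A rejected refresh covers the deleted-job case: skip the save.
        console.error('Error refreshing task before saving checkpoint');
        return false;
    }
    task.attributes.internal_data.checkpoint = checkpoint;
    try {
        await task.save();
        return true;
    } catch (error) {
        console.error('Error saving task after checkpoint');
        return false;
    }
}
```

Refreshing before the save means the checkpoint is written onto the current state of the job rather than a stale in-memory copy, and a job that no longer exists is never saved back.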
@@ -65,7 +71,7 @@ const getTaskCancelledFct = (task: ExecutableJob<JobDataType>) => {
clearInterval(intervalObj);
}
})
.catch(() => (refreshError = true));
.catch(() => (refreshError = true)); // This will catch deleted jobs
Collaborator

Just checking: you added the same comment as before without changing the code. Is this accurate, and not just a bad copy/paste?

Collaborator Author

It's accurate. It explains which kind of use case would be caught here.

}, 5000);
return () => refreshError || task.status === 'cancelled';
};
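
The cancellation check in this hunk treats a failed refresh the same way as an explicit cancellation. The sketch below separates the poll step from the timer so the logic is visible on its own. `PollableTask` and `createCancellationCheck` are hypothetical names, and only the `'cancelled'` status value comes from the diff; any other status string is an assumption.

```typescript
// Hypothetical minimal view of a job: a status and a way to reload it.
interface PollableTask {
    status: string;
    refresh(): Promise<void>;
}

// Builds a poll step and a predicate. The predicate reports true once the
// task is cancelled or once a refresh has failed (for example, because the
// job was deleted).
function createCancellationCheck(task: PollableTask) {
    let refreshError = false;
    const poll = async (): Promise<void> => {
        try {
            await task.refresh();
        } catch (error) {
            // A failed refresh is remembered and reported as a cancellation.
            refreshError = true;
        }
    };
    const isCancelled = (): boolean => refreshError || task.status === 'cancelled';
    return { poll, isCancelled };
}
```

In the diff the poll runs from `setInterval` every 5000 ms; here the caller would schedule `poll` the same way and pass `isCancelled` to the running job.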