Feature to manually mark analyses to start after top up #3971

Open
1 task
RasmusBurge-CG opened this issue Nov 25, 2024 · 14 comments

@RasmusBurge-CG
Contributor

RasmusBurge-CG commented Nov 25, 2024

As a member of prod bioinfo,
I want to have a manual way to mark an analysis to start when new sequencing data is available,
So that analysis would start automatically when new data is available for cases that have been topped up.

Clarification

It’s very easy to overlook manually starting cases after a top-up, which has the potential to cause significant delays in processing samples (potentially devastating for critically ill patients). To prevent this and avoid unnecessary sample processing, one could mark the analysis to be started once there is new sequencing data. This way, when new data becomes available for that case, the automation will pick up the sample and start it. Additionally, there is potential for integration with LIMS: when a sample is requeued for top-up, it could automatically update the status in StatusDB.

Work impact

Answer the following questions:

  • Is there currently a workaround for this issue? If so, what is it?
    • Yes, we need to communicate with each other in production to inform one another about a top-up. Alternatively, one could check all the samples in the delivery step, retrieve the case from statusdb, and verify if it’s running in TB.
  • How much time would be saved by implementing this feature on a weekly basis?
    • 1 hour
  • How many users are affected by this issue?
    • All prod members
  • Are customers affected by this issue?
    • Yes, it happens from time to time that we forget to manually restart cases that have been topped up.

Acceptance Criteria

  • There is a manual way to mark analyses to start automatically when new data is ready for any of the samples in a case

@diitaz93
Contributor

Second clarification

There are two situations (that I know of) in which a sample will be sent for top-up:

  1. The sample didn't reach the target number of reads (based on the app tag). In this case, the automation sees that the read target has not been reached and simply skips the analysis until the sample has reached the desired number of reads.
  2. The sample got the required number of reads and the analysis starts, but during the analysis we realise that it didn't get the expected coverage (or didn't satisfy other pipeline-specific requirements related to sequencing quality). The sample is then sent for top-up, and the sample/analysis is labelled as analysed/complete (unclear how it works on this level).

The problem described in this issue concerns only the second situation. In the first one, the automation will pick up the sample once the desired number of reads is reached. In the second, the automation will never start the analysis again, as it is labelled as already run.

@henrikstranneheim
Contributor

Will the automation not already pick this up when there is new sequencing data for a sample that is newer than the latest analysis and the action is None?

    def _is_latest_analysis_done_on_all_sequences(self, case: Case) -> bool:
        return case.latest_analyzed < case.latest_sequenced

@beatrizsavinhas
Contributor

beatrizsavinhas commented Nov 27, 2024

Yes, I was thinking exactly of this @henrikstranneheim!
The only exception I can think of is if the initial analysis fails, and the case was not stored and therefore still has status running in status DB - so the automation won't start it. I think this might be the case with samples that received enough reads but the analysis failed on coverage and was left as failed in trailblazer.

@henrikstranneheim
Contributor

But should that case not warrant a manual investigation? It might fail for a number of reasons.

@RasmusBurge-CG
Contributor Author

Hi!

Yes, @beatrizsavinhas, this would fall under case 2, as stated in the second clarification by @diitaz93, right?

@henrikstranneheim, that’s true. If the investigation reveals a need for a top-up, then the Top-up action could be useful. Do you agree?

There might be a better solution to this. If there’s something I’m unaware of, I’m happy to learn!

@henrikstranneheim
Contributor

> Hi!
>
> Yes, @beatrizsavinhas, this would fall under case 2, as stated in the second clarification by @diitaz93, right?
>
> @henrikstranneheim, that’s true. If the investigation reveals a need for a top-up, then the Top-up action could be useful. Do you agree?

Not sure. It would not have any function in cg, and it would complicate the logic that finds cases to start (which is already complicated). I'm unsure whether it is worth it compared to just leaving a comment in TB and letting the automation pick the case up once ready.

@RasmusBurge-CG
Contributor Author

RasmusBurge-CG commented Nov 28, 2024

Okay, I see @henrikstranneheim. It’s just that the automation doesn’t pick it up for pipelines like mip-dna. I do agree that it would affect the start logic and require changes in CG. The benefit, as I see it, would be that a sample could be started during the nighttime, reducing turnaround time. Additionally, in my view, it would lessen the risk of missing a case restart.

@karlnyr
Contributor

karlnyr commented Nov 28, 2024

It would be vital that we restart cases whenever new data is available. @Karl-Svard brought up that if we re-demultiplex we would not want to start things because of the new "latest_sequenced_at" date. We will have to refine the acceptance criteria at a later meeting.

@beatrizsavinhas
Contributor

Suggestion for acceptance criteria:

  • There is a manual way to set an analysis to be started when new sequencing data is available.

Considerations:

  • If the case was already analysed after passing aggregated sequencing QC (all samples received enough reads), no analyses should be started automatically. Example: sample A received enough reads and is in the same pool as sample B that needs top up. The whole pool is re-sequenced since more reads are needed for sample B. No new analyses for sample A should start.
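A minimal sketch of how the consideration above could be enforced, assuming the manual mark is stored per sample. The `Sample` class, the `top_up_requested_at` field, and `ready_to_start_after_top_up` are hypothetical names for illustration, not existing cg code:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Sample:
    name: str
    last_sequenced_at: Optional[datetime]
    # Hypothetical field: set manually when the sample is sent for top-up.
    top_up_requested_at: Optional[datetime] = None


def ready_to_start_after_top_up(samples: list[Sample]) -> bool:
    """Return True only when every sample marked for top-up has newer data."""
    marked = [s for s in samples if s.top_up_requested_at is not None]
    if not marked:
        # Nothing was marked, so re-sequencing alone never starts an analysis.
        return False
    return all(
        s.last_sequenced_at is not None
        and s.last_sequenced_at > s.top_up_requested_at
        for s in marked
    )
```

With this rule, a case containing only sample A (re-sequenced with the pool but never marked) is not started, while a case containing the marked sample B starts once B has data newer than its mark.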

@henrikstranneheim
Contributor

Hmm, tricky one. @RasmusBurge-CG, do you have an example of a case that is not picked up, and do you know why?

@beatrizsavinhas Can't you set the case action to "analyze"?

@beatrizsavinhas
Contributor

I don't think that would work as intended, @henrikstranneheim. If the case is set to analyse, the analysis will just start again when the systemd runs, regardless of whether there is new sequencing data. So we run the risk of analysing the case again with the same data. What we need is something that makes the analysis start only after the sample has been topped up.

But as @karlnyr mentioned, we will still discuss this issue in our next meeting!

@islean
Contributor

islean commented Nov 29, 2024

I think setting the case action to None will result in the behaviour you are after:

  1. ANALYZE will result in it getting picked up for analysis regardless of whether there is new sequencing data or not
  2. RUNNING will result in the case never being picked up
  3. None will result in a check if there is new sequencing data since the latest analysis and if so start it
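The three behaviours above can be sketched as a single decision function. This is an illustration only: `CaseAction` and `should_start` are hypothetical names rather than cg's actual start logic, and treating a case with no analysis yet as startable is an assumption.

```python
from datetime import datetime
from enum import Enum
from typing import Optional


class CaseAction(str, Enum):
    # Hypothetical stand-in for the case actions named in this thread.
    ANALYZE = "analyze"
    RUNNING = "running"


def should_start(
    action: Optional[CaseAction],
    latest_analyzed: Optional[datetime],
    latest_sequenced: Optional[datetime],
) -> bool:
    """Return True if the automation would pick the case up."""
    if action is CaseAction.ANALYZE:
        return True  # started regardless of whether there is new data
    if action is CaseAction.RUNNING:
        return False  # never picked up
    # action is None: start only if there is data newer than the latest analysis.
    if latest_sequenced is None:
        return False
    if latest_analyzed is None:
        return True  # assumption: with no analysis yet, existing data counts as new
    return latest_analyzed < latest_sequenced
```

The comparison in the last line is the same one as in `_is_latest_analysis_done_on_all_sequences` quoted earlier in the thread.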

@karlnyr
Contributor

karlnyr commented Nov 29, 2024

That would restart the case instantly, since there is no analysis object in statusdb (one is only created when we have a completed case) and there is sequencing data. What we want is to queue the analysis to start after the sample has been sequenced again. In other words: there is no analysis object in statusdb and there is sequencing data, but what we need to check is whether new sequencing data has been added to the sample. This could be done in a lot of ways:

  1. Check whether there is already a failed analysis in Trailblazer. If so, we have started the case once but something was up, and we can then check whether the latest sequencing date is after the start of that Trailblazer analysis.
  2. Check whether more than one flow cell exists on the sample.
  3. Start creating analysis objects upon start, but that would require some changes to existing logic, and perhaps some new fields.

👍 SO many choices
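A rough sketch of the first two options described above, assuming the relevant timestamps and flow cell identifiers have already been fetched from Trailblazer and statusdb. Both function names and their parameters are hypothetical and not part of cg or Trailblazer:

```python
from datetime import datetime
from typing import Optional


def sequenced_after_failed_start(
    failed_started_at: Optional[datetime],
    last_sequenced_at: Optional[datetime],
) -> bool:
    """Failed-analysis option: the newest data postdates the failed analysis start."""
    if failed_started_at is None or last_sequenced_at is None:
        return False  # no failed analysis on record, or no sequencing data at all
    return last_sequenced_at > failed_started_at


def has_multiple_flow_cells(flow_cell_ids: list[str]) -> bool:
    """Flow cell option: the sample has data from more than one flow cell."""
    return len(set(flow_cell_ids)) > 1
```

The flow cell check says nothing about when the data arrived relative to the analysis, so on its own it is a weaker signal than the timestamp comparison.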

edit:

Something I thought of is that we could in fact make LIMS set the case for the sample to analyze if it gets enough reads in sequence aggregation, or even do that in the post-processing of demux.

@karlnyr changed the title from "New Case Action Top-up" to "Feature to manually mark analyses to start after top up" on Dec 10, 2024
@karlnyr
Contributor

karlnyr commented Dec 10, 2024

@Clinical-Genomics/sysdev We have tried to clarify this user story and updated the acceptance criteria

6 participants