Run FASTQC on a subset of reads #74

Open
harmonbhasin opened this issue Oct 16, 2024 · 4 comments
Labels
done: Issues that have been addressed in dev branch, but not reflected in master branch
enhancement: New feature or request
priority_1
time&cost: Changes to improve the pipeline's runtime and computational cost

Comments

@harmonbhasin
Collaborator

Given that FASTQC takes up a large chunk of the pipeline's runtime, Mike suggested running it on a subset of reads. Will agrees that this would be good to do; we would just need to calculate the total number of reads per sample separately (plus some other overall metrics), since those are currently being pulled from the FASTQC output.
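For illustration, the separate read count could be computed directly from the FASTQ file without involving FASTQC at all. This is a minimal sketch, not the pipeline's actual implementation: `count_fastq_reads` is a hypothetical helper name, and it assumes standard four-line FASTQ records (no wrapped sequence lines) in either plain or gzip-compressed files.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming exactly four lines per record.

    Hypothetical helper for illustration; handles plain and .gz files.
    """
    # Pick the opener from the file extension (an assumption about naming).
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    # A line count that is not a multiple of four indicates a truncated
    # file or a multi-line FASTQ layout, which this sketch does not handle.
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4
```

Other whole-sample metrics (total bases, mean read length) could be accumulated in the same single pass over the file.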

@harmonbhasin harmonbhasin added enhancement New feature or request priority_2 labels Oct 16, 2024
@mikemc
Member

mikemc commented Oct 21, 2024

I still think just running FastQC on a subset of reads makes sense, but you might also consider using Falco to speed things up (#78).

@willbradshaw
Copy link
Contributor

willbradshaw commented Oct 21, 2024 via email

@mikemc
Member

mikemc commented Oct 21, 2024

My grammar was ambiguous. I was trying to suggest doing both of the following:

  1. just running QC assessment on a subset of reads
  2. using Falco instead of FastQC for QC assessment
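As a rough sketch of the first point, a fixed-size random subset of reads can be drawn in one pass with reservoir sampling, which also yields the total read count as a by-product. This is an assumption-laden illustration rather than how the pipeline subsets reads: the function names are hypothetical, records are assumed to be four lines each, and paired-end data would need the same read indices selected from both files.

```python
import random
from itertools import islice


def read_fastq_records(handle):
    """Yield FASTQ records as 4-tuples of lines (assumes 4-line records)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def reservoir_sample(records, k, seed=0):
    """Return (sample, n_seen): up to k records chosen uniformly at random,
    plus the total number of records seen."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sample = []
    n_seen = 0
    for record in records:
        n_seen += 1
        if len(sample) < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Keep the new record with probability k / n_seen.
            j = rng.randrange(n_seen)
            if j < k:
                sample[j] = record
    return sample, n_seen
```

The sampled records could then be written to a temporary FASTQ file and passed to whichever QC tool is used, while `n_seen` supplies the per-sample read total that would otherwise come from the full QC run.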

@harmonbhasin harmonbhasin added the in-progress Mark this label when the assignee is actively working on this item. label Nov 13, 2024
@harmonbhasin harmonbhasin self-assigned this Nov 13, 2024
@willbradshaw
Contributor

We're now planning to implement this early next quarter by moving FASTQC downstream of the subsetting used at the start of the PROFILE workflow. (Strictly speaking, this is downstream of implementing #129.)

@willbradshaw willbradshaw added the time&cost Changes to improve the pipeline's runtime and computational cost label Dec 17, 2024
@harmonbhasin harmonbhasin removed the in-progress Mark this label when the assignee is actively working on this item. label Jan 6, 2025
@harmonbhasin harmonbhasin added the done Issues that have been addressed in dev branch, but not reflected in master branch label Jan 22, 2025
3 participants