Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: Externalize .LevelReduction and .GetKingdom for global use #85

Merged
merged 2 commits into from
Oct 29, 2024

Conversation

teddyCodex
Copy link
Collaborator

Description

Defines .LevelReduction and .GetKingdom outside of the plotLineageHeatmap function, eliminating the need for functions within a function and making them globally accessible.

devtools::check() - 0 errors

Fixes issue #80

What kind of change(s) are included?

  • Feature (adds or updates new capabilities)
  • Bug fix (fixes an issue).
  • Enhancement (code refactor).
  • Breaking change (these changes would cause existing functionality to not work as expected).

Checklist

Please ensure that all boxes are checked before indicating that this pull request is ready for review.

  • I have read and followed the CONTRIBUTING.md guidelines.
  • I have searched for existing content to ensure this is not a duplicate.
  • I have performed a self-review of these additions (including spelling, grammar, and related).
  • I have added comments to my code to help provide understanding.
  • I have added a test which covers the code changes found within this PR.
  • I have deleted all non-relevant text in this pull request template.
  • Reviewer assignment: Tag a relevant team member to review and approve the changes.

@teddyCodex teddyCodex requested review from the-mayer and jananiravi and removed request for the-mayer October 8, 2024 21:54
@teddyCodex teddyCodex changed the title Refactor: Externalize internal functions for global use Refactor: Externalize .LevelReduction and .GetKingdom for global use Oct 12, 2024
@jananiravi jananiravi added the outreachy for outreachy interns label Oct 20, 2024
@@ -766,15 +775,15 @@ plotLineageHeatmap <- function(prot, domains_of_interest, level = 3, label.size
)
ggplot(
data = all_grouped_reduced,
aes_string(x = "ReducedLin", y = "Query")
aes(x = "ReducedLin", y = "Query")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for this change?

Copy link
Collaborator

@the-mayer the-mayer Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see that aes_string() has been deprecated -- Instead, can you confirm that quoting the params of the aes() call still produces the desired result?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @the-mayer, Its a Deprecation Adjustments:

It replaces aes_string() (deprecated) with aes() by quoting parameters, as suggested by a collaborator. This adjustment aligns the code with current ggplot2 standards while maintaining intended functionality.

@the-mayer
Copy link
Collaborator

Thanks @teddyCodex, just added a quick comment to verify that we aren't breaking the plot code. Note, I've also invited @Joiejoie1 to provide a review as they have also been working on this issue. Thanks both!

@Joiejoie1
Copy link
Collaborator

Thanks @teddyCodex, just added a quick comment to verify that we aren't breaking the plot code. Note, I've also invited @Joiejoie1 to provide a review as they have also been working on this issue. Thanks both!

Thamks @the-mayer I'll do just that

Copy link
Collaborator

@Joiejoie1 Joiejoie1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the review, here's a breakdown of key changes:

Function Modularization:
Functions .LevelReduction and .GetKingdom, initially defined within plotLineageHeatmap, are moved outside and set as independent internal functions. This modularization improves reusability, readability, and reduces nested function complexity within plotLineageHeatmap.

Color Fallback:
A fallback is added for color assignment in plotLineageHeatmap, defaulting to black if there are insufficient colors for the unique lineages. This ensures visual consistency without missing colors in the output.

Deprecation Adjustments:
Replaces aes_string() (deprecated) with aes() by quoting parameters, as suggested by a collaborator. This adjustment aligns the code with current ggplot2 standards while maintaining intended functionality.

These modifications collectively make the code more robust, maintainable, and aligned with best practices. The modularized functions enhances the file's utility and reliability.

@@ -766,15 +775,15 @@ plotLineageHeatmap <- function(prot, domains_of_interest, level = 3, label.size
)
ggplot(
data = all_grouped_reduced,
aes_string(x = "ReducedLin", y = "Query")
aes(x = "ReducedLin", y = "Query")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @the-mayer, Its a Deprecation Adjustments:

It replaces aes_string() (deprecated) with aes() by quoting parameters, as suggested by a collaborator. This adjustment aligns the code with current ggplot2 standards while maintaining intended functionality.

@@ -665,30 +706,6 @@ plotLineageDomainRepeats <- function(query_data, colname) {
#' }
#'
plotLineageHeatmap <- function(prot, domains_of_interest, level = 3, label.size = 8) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fallback is added for color assignment in plotLineageHeatmap, defaulting to black if there are insufficient colors for the unique lineages. This ensures visual consistency without missing colors in the output.

}
}

.GetKingdom <- function(lin) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function Modularization:

Functions .LevelReduction and .GetKingdom, initially defined within plotLineageHeatmap, are moved outside and set as independent internal functions. This modularization improves reusability, readability, and reduces nested function complexity within plotLineageHeatmap.

R/plotting.R Outdated
########################
#'
#'
.LevelReduction <- function(lin, level) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have boundary and Error Handling:

Where guards are added in .LevelReduction to handle cases where the lineage string does not contain enough ">" delimiters, returning the original lineage if the specified level is out of bounds. This change enhances error resilience. An example is thus:

.LevelReduction <- function(lin, level) {
# Check if '>' delimiter is present in the lineage
gt_loc <- str_locate_all(lin, ">")[[1]]

# Calculate the number of levels available in the lineage
available_levels <- length(gt_loc) / 2  # Since `str_locate_all` returns a matrix

# Guard against out-of-bounds level requests
if (level > available_levels || level < 1) {
    # Not enough '>' in lineage or invalid level requested
    return(lin)
} else {
    # Retrieve the location of the requested '>' delimiter
    gt_loc <- gt_loc[level, ][1] %>% as.numeric()
    # Extract lineage up to the specified level
    lin <- substring(lin, first = 0, last = (gt_loc - 1))
    return(lin)
}

}

Explanation:
str_locate_all(lin, ">")[[1]]: Finds all occurrences of > in the lineage string.
Boundary Guard: Checks if the requested level is within bounds:
If level is greater than available_levels (i.e., there aren’t enough > delimiters) or if level is less than 1, it returns the original lineage.
Lineage Extraction: If level is within bounds, the function extracts the lineage up to the specified level.
This boundary check ensures that .LevelReduction gracefully handles cases where the requested level is out of bounds, thereby improving error resilience.

Copy link
Collaborator Author

@teddyCodex teddyCodex Oct 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joiejoie1 Have a look at this implementation. A little more concise but now checks if the specified level exists within the lineage and then calculating the total available levels. it would return the full lineage string if the requested level exceeds the total levels or if the input is invalid (i.e. negative or zero)

.LevelReduction <- function(lin, level) {
    gt_loc <- str_locate_all(lin, ">")[[1]]
    available_levels <- length(gt_loc) / 2  # Since "str_locate_all" returns a matrix
    
    # Guard against out-of-bounds level requests
    if (level > available_levels || level < 1) {
        return(lin)
    } else {
        gt_loc <- gt_loc[level, ][1] %>% as.numeric()
        lin <- substring(lin, first = 0, last = (gt_loc - 1))
        return(lin)
    }
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joiejoie1 Have a look at this implementation. A little more concise but now checks if the specified level exists within the lineage and then calculating the total available levels. it would return the full lineage string if the requested level exceeds the total levels or if the input is invalid (i.e. negative or zero)

.LevelReduction <- function(lin, level) {
    gt_loc <- str_locate_all(lin, ">")[[1]]
    available_levels <- length(gt_loc) / 2  # Since "str_locate_all" returns a matrix
    
    # Guard against out-of-bounds level requests
    if (level > available_levels || level < 1) {
        return(lin)
    } else {
        gt_loc <- gt_loc[level, ][1] %>% as.numeric()
        lin <- substring(lin, first = 0, last = (gt_loc - 1))
        return(lin)
    }
}

Thank you, @teddyCodex, for sharing this! I appreciate the streamlined approach, especially the additional checks to handle cases where the specified level exceeds available levels or is invalid. Using str_locate_all to check the number of levels is a smart way to keep the function concise and error-proof. I’ll go through it and test how it performs across cases with different lineage structures to confirm everything works smoothly.

Let me know if there’s anything specific you’d like me to focus on in the review.

Copy link
Collaborator

@the-mayer the-mayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Joiejoie1 for the careful review and @teddyCodex for the PR. Good changes all around, merging shortly.

@the-mayer the-mayer self-assigned this Oct 29, 2024
@the-mayer the-mayer merged commit ce7ef66 into JRaviLab:main Oct 29, 2024
1 check passed
@jananiravi
Copy link
Member

Good work, @teddyCodex @Joiejoie1! thanks for the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
outreachy for outreachy interns
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor internal functions to be defined outside of plot functions
4 participants