feat(130): Introduces SEQREPO_FD_CACHE_MAXSIZE env var #131

kazmiekr · 2024-01-09T16:06:06Z

Introduces the SEQREPO_FD_CACHE_MAXSIZE env var, which overrides the internal fd_cache_size so that performance can be improved without forcing code changes in any SeqRepo clients.

Example:
We use SeqRepo in an internal library and noticed a 10x decrease in performance with seqrepo 0.6.6 because the internal fd_cache_size is being set to 0 (no caching). Rather than forcing clients of SeqRepo to set this value in code and release new versions, it would be nice to control it via an env var, as is already possible with SEQREPO_LRU_CACHE_MAXSIZE.

Note:
Now that there are two env vars, the old SEQREPO_LRU_CACHE_MAXSIZE feels misnamed, since it's unclear which LRU cache it controls. I didn't want to change any existing code, but maybe it should be renamed to something like SEQREPO_DB_CACHE_MAXSIZE to indicate what is being cached, with SEQREPO_LRU_CACHE_MAXSIZE deprecated. Alternatively, could one env var control all the caches?

Also, is it worth providing a better default than 0 (no cache)? Something smallish, like ~25, would at least give a significant performance boost without risking resource exhaustion.
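To illustrate why even a small cache helps, here is a minimal sketch. It assumes file handles are cached per path with `functools.lru_cache`; the helper names `count_opens` and `compare` are hypothetical and this is not SeqRepo's actual code. It only counts how often `open()` runs for a cache size of 0 versus 25.

```python
import functools
import os
import tempfile


def count_opens(maxsize, paths, rounds=3):
    """Read from each path `rounds` times; return how many times open() ran."""
    opened = []  # every handle ever opened, so all can be closed at the end

    @functools.lru_cache(maxsize=maxsize)
    def get_handle(path):
        # Hypothetical stand-in for a cached file-handle opener.
        fh = open(path, "rb")
        opened.append(fh)
        return fh

    for _ in range(rounds):
        for path in paths:
            fh = get_handle(path)
            fh.seek(0)
            fh.read(1)

    for fh in opened:
        fh.close()
    return len(opened)


def compare(n_files=5, rounds=3):
    """Return {maxsize: open() count} for maxsize 0 and 25."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(n_files):
            path = os.path.join(tmp, f"seq{i}.fa")
            with open(path, "wb") as out:
                out.write(b">seq\nACGT\n")
            paths.append(path)
        return {size: count_opens(size, paths, rounds) for size in (0, 25)}
```

With five files each read three times, a size of 0 reopens a file on every access, while a size of 25 opens each file once and reuses the handle afterwards.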

andreasprlic · 2024-01-09T16:14:23Z

README.md

+SEQREPO_FD_CACHE_MAXSIZE sets the lru_cache size for file handler caching during FASTA sequence retrievals. 
+It defaults to 0 to disable any caching, but can be set to a specific value or "none" to be unlimited. Using 
+a moderate value (>10) will greatly increase performance of sequence retrieval.
+
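The rule in this README text (default 0, a specific integer, or "none" for unlimited) could be parsed roughly as follows. This is a sketch under stated assumptions, not the code from this PR: the function name `read_cache_maxsize`, the case-insensitive handling of "none", and the rejection of negative values are all hypothetical choices.

```python
import os
from typing import Mapping, Optional


def read_cache_maxsize(
    env: Mapping[str, str] = os.environ,
    name: str = "SEQREPO_FD_CACHE_MAXSIZE",
    default: int = 0,
) -> Optional[int]:
    """Return a value usable as functools.lru_cache(maxsize=...).

    0 disables caching, a positive int bounds the cache, and None
    (from the string "none") makes it unbounded.
    """
    raw = env.get(name)
    if raw is None:
        return default  # variable not set: fall back to no caching
    text = raw.strip().lower()
    if text == "none":
        return None  # lru_cache treats maxsize=None as unlimited
    value = int(text)  # raises ValueError for non-numeric input
    if value < 0:
        # Assumption: negative sizes are treated as a configuration error.
        raise ValueError(f"{name} must be >= 0 or 'none', got {raw!r}")
    return value
```

Passing the environment as a mapping keeps the function easy to exercise without touching the real process environment, for example `read_cache_maxsize({"SEQREPO_FD_CACHE_MAXSIZE": "25"})`.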


What is the downside of increasing this value?

kazmiekr · 2024-01-09

I think the only downside is the potential for resource exhaustion on the system, but I'd defer to the discussion in #112 for additional information. Should I add a statement that an overly large value risks exhaustion?
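One way to reason about the exhaustion risk is to compare a proposed cache size against the process's open-file limit. The sketch below assumes each cached entry holds one open file descriptor; the function name `fits_fd_limit` and the `reserve` headroom of 64 descriptors are hypothetical, and the `resource` module is available on Unix only.

```python
import resource  # Unix-only standard-library module
from typing import Optional


def fits_fd_limit(cache_size: Optional[int], reserve: int = 64) -> bool:
    """Return True if `cache_size` cached handles plus `reserve` other
    descriptors fit under the soft RLIMIT_NOFILE limit of this process."""
    if cache_size is None:
        # An unlimited cache can never be guaranteed to fit.
        return False
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True  # no per-process limit configured
    return cache_size + reserve <= soft
```

A small value such as 25 would pass this check on most default configurations, whereas "none" (unlimited) never does.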

andreasprlic · 2024-01-09T16:16:30Z

I think the mentioned setup issue will require a change similar to this. That appears to be a problem we need to fix across all biocommons packages.

andreasprlic

Thanks, LGTM

kazmiekr added 3 commits January 9, 2024 10:24

8b70167 feat(130): Introduce SEQREPO_FD_CACHE_SIZE env var to override the internal fd_cache_size to allow for increased performance with forcing code changes to any SeqRepo clients.

1d4a5be feat(130): Cleans up logic around env var parsing and changes env var name

2609702 feat(130): Type hinting and cleanup

kazmiekr requested a review from andreasprlic January 9, 2024 16:06

kazmiekr requested a review from reece as a code owner January 9, 2024 16:06

kazmiekr linked an issue Jan 9, 2024 that may be closed by this pull request

Make fd_cache_size configureable via env variable #130

Closed

andreasprlic reviewed Jan 9, 2024

5a37f18 Merge branch 'main' into 130-make-fd_cache_size-configureable-via-env-variable

kazmiekr requested a review from a team as a code owner January 22, 2024 15:23

kazmiekr requested a review from andreasprlic January 22, 2024 15:54

andreasprlic approved these changes Jan 22, 2024

andreasprlic merged commit 4d1a349 into main Jan 23, 2024
7 checks passed

andreasprlic deleted the 130-make-fd_cache_size-configureable-via-env-variable branch January 23, 2024 16:34

This was referenced Mar 8, 2024

feature/terra-testing-results-analysis gks-anvil/vrs_anvil_toolkit#40

Closed

bugs/terra-testing gks-anvil/vrs_anvil_toolkit#41

Merged
