Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Post-conditions of adding a run are not satisfied #3273

Open
bluenote10 opened this issue Dec 17, 2024 · 0 comments
Open

Post-conditions of adding a run are not satisfied #3273

bluenote10 opened this issue Dec 17, 2024 · 0 comments
Labels
help wanted Extra attention is needed type / bug Issue type: something isn't working

Comments

@bluenote10
Copy link

bluenote10 commented Dec 17, 2024

🐛 Bug

Adding a run should intuitively have certain post-conditions like

  • repo.run_exists(run.hash) should be True.
  • repo.list_all_runs() should list the added run.

Unfortunately all these post-conditions are currently (temporarily) violated, as the following example demonstrates.

To reproduce

import tempfile

from aim import Repo, Run

with tempfile.TemporaryDirectory() as tmp_dir:
    repo = Repo(tmp_dir, init=True)

    # Pre-condition: No runs are there.
    assert len(repo.list_all_runs()) == 0

    # Add a run
    run = Run(repo=repo)

    # Post-conditions: All these assertions should hold.
    assert repo.run_exists(run.hash)
    assert len(repo.list_all_runs()) > 0

Both assertions are currently failing.

Expected behavior

The assertions should not fail.

Environment

  • Aim Version: 3.25.1
  • Python version: 3.12.7
  • pip version: 24.3.1
  • OS: macOS

Additional context

The reason for this behavior is that these methods internally delegate to _all_run_hashes, which unfortunately is using a ttl_cache, which does not get invalidated properly after the change has been made.

aim/aim/sdk/repo.py

Lines 412 to 413 in 9124e64

@ttl_cache(ttl=0.5)
def _all_run_hashes(self) -> Set[str]:

This is pretty unfortunately, because the behavior actually depends what happens before, i.e., how long ago the last call to such a function was. These "eventually satisfied post-conditions" are a pretty ugly source of WTF moments for users, because it takes quite some digging to understand why it sometimes works, and sometimes doesn't.

To fix the behavior, it might be possible to invalidate the ttl cache explicitly when a modification has been made to the repository. This should result in consistent behavior, and perhaps would also allow to increase the ttl for more aggressive caching. As per the cachetools docs, invalidating the cache explicitly should be possible by using the clear_cache() function that gets attached to the decorated function.

@bluenote10 bluenote10 added help wanted Extra attention is needed type / bug Issue type: something isn't working labels Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed type / bug Issue type: something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant