wandb: ERROR Uploading artifact file failed. Artifact won't be committed. done. #149

Open
rangehow opened this issue Mar 6, 2025 · 3 comments

rangehow commented Mar 6, 2025

I used wandb to log a training run on server A, which is offline, then copied the logs to server B, which has internet access, to upload them with wandb sync. The sync fails with the following error, and the run link returns a 404:

wandb sync .\offline-run-20250306_134443-8sfvkizj
Find logs at: C:\Users\lenovo\Downloads\offline-run-20250306_134443-8sfvkizj\mnt\dolphinfs\hdd_pool\docker\user\hadoop-aipnlp\INS\ruanjunhao04\verl-main\wandb\wandb\debug-cli.lenovo.log
Syncing: https://wandb.ai/a954369920/reinforce%2B%2Bacr-agi/runs/8sfvkizj ... 
wandb: ERROR Error uploading "/home/hadoop-aipnlp/.local/share/wandb/artifacts/staging/tmpltuuv4qs": FileNotFoundError, [Errno 2] No such file or directory: '/home/hadoop-aipnlp/.local/share/wandb/artifacts/staging/tmpltuuv4qs'
wandb: ERROR Uploading artifact file failed. Artifact won't be committed.
wandb: ERROR Error uploading "/home/hadoop-aipnlp/.local/share/wandb/artifacts/staging/tmpulmc3fwf": FileNotFoundError, [Errno 2] No such file or directory: '/home/hadoop-aipnlp/.local/share/wandb/artifacts/staging/tmpulmc3fwf'
wandb: ERROR Uploading artifact file failed. Artifact won't be committed.
done.

@fmamberti-wandb

Hi @rangehow, artifacts that are logged and uploaded are stored in a separate folder from the one holding the wandb runs' data. That separate folder is the staging folder containing the artifact file named in the error:

wandb: ERROR Error uploading "/home/hadoop-aipnlp/.local/share/wandb/artifacts/staging/tmpltuuv4qs": FileNotFoundError, [Errno 2] No such file or directory: '/home/hadoop-aipnlp/.local/share/wandb/artifacts/staging/tmpltuuv4qs'

So you won't be able to move the run to a different device and sync its artifacts from there. If you have access to the original artifact files, I'd recommend copying them to the device that has access to wandb.ai, resuming the run after syncing it, and logging them that way.
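
For anyone landing here, the suggestion above could look roughly like the sketch below. It is not an official recipe: the helper names (`collect_files`, `relog_artifact`), the artifact name and type, and the local folder are all hypothetical, and it assumes the run itself has already been synced so that it can be resumed by its ID. The wandb import is deferred so the file-collection helper can be checked without wandb installed.

```python
from pathlib import Path


def collect_files(artifact_dir):
    """Return every regular file under artifact_dir, sorted for a stable order."""
    return sorted(p for p in Path(artifact_dir).rglob("*") if p.is_file())


def relog_artifact(run_id, project, entity, artifact_dir, name, artifact_type):
    """Resume an already-synced run and log local files as a new artifact.

    Hypothetical helper: the artifact name and type are whatever the original
    training code used; they are not recoverable from this thread.
    """
    import wandb  # imported lazily; needs network access and a logged-in user

    files = collect_files(artifact_dir)
    if not files:
        raise FileNotFoundError(f"no files found under {artifact_dir}")

    # resume="must" makes wandb fail instead of silently creating a new run
    # when the run ID does not exist on the server.
    run = wandb.init(project=project, entity=entity, id=run_id, resume="must")
    artifact = wandb.Artifact(name=name, type=artifact_type)
    for path in files:
        # Keep the folder layout inside the artifact.
        artifact.add_file(str(path), name=path.relative_to(artifact_dir).as_posix())
    run.log_artifact(artifact)
    run.finish()


# Example call, run on the machine with internet access. The run ID comes from
# the sync output above; every other argument is a placeholder to fill in:
# relog_artifact("8sfvkizj", "<project>", "<entity>",
#                "./copied_artifacts", "<artifact-name>", "<artifact-type>")
```

Raising early on an empty folder avoids resuming the run only to log an empty artifact, which is easy to do if the files were copied to a different path than expected.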
