Feature/hpc setup task #47

Open
wants to merge 12 commits into
base: develop
Conversation

rlskoeser
Contributor

attempt to implement #44

@rlskoeser rlskoeser changed the base branch from develop to feature/script-update-model February 4, 2025 23:06
@rlskoeser rlskoeser force-pushed the feature/hpc-setup-task branch from 28e1d3b to 98bcf30 Compare February 5, 2025 16:54
@cmroughan
Collaborator

cmroughan commented Feb 5, 2025

Checking the admin for the results of a new user's train task: the task appears to have completed successfully, and the model was indeed uploaded to eScr. The task report shows that we did hit an error, though. Am I remembering correctly that the setup script currently installs a different branch of htr2hpc? Maybe that's causing the disconnect:

Failed to load _MLModelProxy: No module named 'coremltools.libcoremlpython'
Failed to load _MLCPUComputeDeviceProxy: No module named 'coremltools.libcoremlpython'
Failed to load _MLGPUComputeDeviceProxy: No module named 'coremltools.libcoremlpython'
Failed to load _MLNeuralEngineComputeDeviceProxy: No module named 'coremltools.libcoremlpython'
Failed to load _MLModelProxy: No module named 'coremltools.libcoremlpython'
Failed to load _MLComputePlanProxy: No module named 'coremltools.libcoremlpython'
Failed to load _MLModelProxy: No module named 'coremltools.libcoremlpython'
Failed to load _MLModelAssetProxy: No module named 'coremltools.libcoremlpython'
WARNING:py.warnings:/home/wh4213/.conda/envs/htr2hpc/lib/python3.11/site-packages/PIL/Image.py:2926: RuntimeWarning: divide by zero encountered in divide
  As = 1.0 / w

(See in admin the report for Task 4296.)

Update:

I attempted to replicate this with my own user by running the htr2hpc update task and then a train task, but I did not get the missing coremltools error, probably because coremltools is already installed in my htr2hpc conda env and reinstalling the htr2hpc package did nothing to change that. We will still want to pin down why coremltools is not getting set up by default.
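
One way to narrow this down might be to check, inside the affected user's htr2hpc conda env, whether the `coremltools` package itself is missing or only its compiled `coremltools.libcoremlpython` submodule (the name that appears in the log above). A minimal sketch, assuming it is run with that env's Python; the helper names `module_status` and `report` are made up for illustration and are not part of htr2hpc:

```python
import importlib.util
import sys


def module_status(name: str) -> str:
    """Return 'present' or 'missing' for a dotted module name."""
    try:
        # For a dotted name, find_spec imports the parent package first.
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when the parent package of a dotted name cannot be imported
        # (ModuleNotFoundError is a subclass of ImportError).
        return "missing"
    return "present" if spec is not None else "missing"


def report(names):
    """Map each module name to its status (hypothetical helper)."""
    return {name: module_status(name) for name in names}


if __name__ == "__main__":
    # Show which interpreter is in use, to confirm it is the conda env's Python.
    print("python:", sys.executable)
    targets = ["coremltools", "coremltools.libcoremlpython"]
    for name, status in report(targets).items():
        print(f"{name}: {status}")
```

If `coremltools` shows as present and only `coremltools.libcoremlpython` shows as missing, that would suggest the package was installed without its native extension, rather than the setup script skipping the dependency entirely. Comparing the output between a new user's env and an existing one could help confirm which case we are in.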

Base automatically changed from feature/script-update-model to develop February 6, 2025 14:53
Labels
None yet
Projects
Status: Under Review

2 participants