Add option to set ORT_DISABLE_ALL as optimization #195
If you haven't already, would you mind following the contribution page here to submit a signed CLA? Also, can you please update the README documentation for level to reflect this option? CC: @tanmayv25
@casassg Following up about the CLA and README updates.
Hi @casassg, thank you for this PR. We are looking forward to merging it soon. However, we do need a signed CLA and a simple entry in the documentation about this level of optimization (for greater visibility of this feature). Would you have time in the near future to do these two things?
@casassg Following up about the CLA. Are you able to submit it? If not, we can merge these in a separate PR on our own. Also, reading this a little more closely, I think it would be good to add a comment about this option in the README under Model Config Options.
The Triton project has accepted the CLA for Block Inc. This PR should no longer be blocked on the CLA.
Fantastic, thank you Gerard and Kyle! @casassg, would you be able to add the minor documentation updates so we can get this merged? We can re-approve after.
@dyastremsky I added a small comment (sorry, this got deprioritized in my stack and buried down the line).
Please let me know if it would be better to use a different value instead of 2.
This change allows users to set optimization->graph->level to ORT_DISABLE_ALL by setting it to 2. The reason is that sometimes, especially when using a Triton server to serve multiple models, we prefer that users optimize the model offline instead of online. This lets us disable optimization, greatly speeding up load time into the Triton onnxruntime backend.
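As a rough illustration of the setting described above, a model's config.pbtxt might contain something like the fragment below. This is only a sketch: it assumes the value 2 proposed in this PR is the one that ends up mapping to ORT_DISABLE_ALL, and it omits the rest of the model configuration (name, inputs, outputs, and so on).

```
# config.pbtxt fragment (sketch only).
# Assumption: level 2 selects ORT_DISABLE_ALL, as proposed in this PR.
optimization {
  graph {
    level: 2
  }
}
```

With a setting like this, the backend would skip graph optimization at load time, so the ONNX model is expected to have been optimized offline already.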
This is my first contribution here, so please do let me know if I should change anything or add more tests. Given that the change is only 2 lines, I didn't initially add any.