
Add option to set ORT_DISABLE_ALL as optimization #195

Merged
2 commits merged into triton-inference-server:main on Dec 18, 2023

Conversation

casassg
Contributor

@casassg casassg commented Jun 21, 2023

Please let me know if it would be better to use a different value instead of 2.

This change allows users to set optimization->graph->level to ORT_DISABLE_ALL by setting the level to 2. The motivation: especially when a Triton server is serving many models, we prefer that users optimize each model offline rather than online. Disabling online optimization greatly speeds up model load time in the Triton onnxruntime backend.
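For context, a minimal sketch of how this would look in a model's config.pbtxt (the field path follows the optimization->graph->level setting described above; treat the exact value 2 as the mapping proposed in this PR, not established backend behavior):

```
optimization {
  graph {
    level: 2
  }
}
```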

This is my first contribution here, so please do let me know if I should change anything or add more tests. Since the change is only two lines, I didn't initially add any.

pranavsharma
pranavsharma previously approved these changes Jun 26, 2023
@dyastremsky
Contributor

If you haven't already, would you mind following the contribution page here to submit a signed CLA?

Also, can you please update the README documentation for level to reflect this option?

CC: @tanmayv25

@dyastremsky
Contributor

@casassg Following up about the CLA and README updates.

@oandreeva-nv
Contributor

Hi @casassg, thank you for this PR; we are looking forward to merging it soon. However, we do need a signed CLA and a brief entry in the documentation about this optimization level (for greater visibility of this feature). Would you have time in the near future for these two actions?

@dyastremsky
Contributor

dyastremsky commented Sep 20, 2023

@casassg Following up about the CLA. Are you able to submit it? If not, we can make these changes in a separate PR on our own.

Also, reading this a little more closely, I think it would be good to add a comment about this option in the README under Model Config Options.

@nv-kmcgill53
Contributor

The Triton project has accepted the CLA for Block Inc., so this PR should no longer be blocked on the CLA.

@dyastremsky
Contributor

Fantastic, thank you Gerard and Kyle! @casassg, would you be able to add the minor documentation updates so we can get this merged? We can re-approve after.

@casassg
Contributor Author

casassg commented Dec 18, 2023

@dyastremsky added a small comment (sorry, this got deprioritized in my stack and buried down the line)

@tanmayv25 tanmayv25 merged commit fa05686 into triton-inference-server:main Dec 18, 2023
3 checks passed
@casassg casassg deleted the patch-1 branch January 16, 2024 18:04