Improve GPU frames accuracy #413

Merged
merged 5 commits into from
Dec 2, 2024

Conversation

NicolasHug
Member

@NicolasHug NicolasHug commented Nov 28, 2024

We now call nppiNV12ToRGB_709CSC_8u_P2C3R for GPU frame color conversion, instead of nppiNV12ToRGB_709HDTV_8u_P2C3R, as suggested in #412

This dramatically improves the accuracy of decoded frames: they are now much closer to the expected CPU frames, within atol=2 (this is even stricter than our frame comparison for macOS!).
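To make the atol=2 bound concrete, here is a minimal C++ sketch of an absolute-tolerance check on 8-bit frame data. The helper name `framesWithinAtol` and the flat-vector frame layout are hypothetical and are not torchcodec code. It only models the absolute part of an `assert_close`-style check and ignores any relative tolerance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not part of torchcodec): returns true when every pair
// of corresponding 8-bit samples differs by at most `atol`.
bool framesWithinAtol(const std::vector<std::uint8_t>& cpuFrame,
                      const std::vector<std::uint8_t>& gpuFrame,
                      int atol) {
  // Both frames are assumed to have the same shape, flattened to one buffer.
  assert(cpuFrame.size() == gpuFrame.size());
  for (std::size_t i = 0; i < cpuFrame.size(); ++i) {
    // Widen to int before subtracting so the difference cannot wrap around.
    const int diff = std::abs(static_cast<int>(cpuFrame[i]) -
                              static_cast<int>(gpuFrame[i]));
    if (diff > atol) {
      return false;
    }
  }
  return true;
}
```

With atol=2, a GPU sample of 198 against a CPU sample of 200 passes, while 13 against 10 fails.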

Closes #412

Thank you @fmassa for the original hint in #372 (comment) and @pjs102793 for chasing this up!

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 28, 2024
@NicolasHug
Member Author

NicolasHug commented Nov 28, 2024

EDIT: out of date, just refer to the updated main comment above.
CI is completely broken right now, but I can confirm locally that we can now lower all checks to assert_close(atol=2)! So that's a massive accuracy improvement.

@NicolasHug NicolasHug changed the title Call nppiNV12ToRGB_709CSC_8u_P2C3R Improve GPU frames accuracy Nov 29, 2024
@NicolasHug NicolasHug marked this pull request as ready for review November 29, 2024 11:30
@NicolasHug NicolasHug requested a review from scotts November 29, 2024 11:32
@scotts
Contributor

scotts commented Dec 2, 2024

Wow! Great catch! Thanks @fmassa, @pjs102793 and @NicolasHug!

@NicolasHug NicolasHug merged commit de6a6d4 into pytorch:main Dec 2, 2024
36 of 37 checks passed
@NicolasHug NicolasHug deleted the cuda_colorspace branch December 2, 2024 13:56
NicolasHug added a commit to NicolasHug/torchcodec that referenced this pull request Dec 4, 2024
@@ -225,7 +225,7 @@ void convertAVFrameToDecodedOutputOnCuda(
   auto start = std::chrono::high_resolution_clock::now();
   NppStatus status;
   if (src->colorspace == AVColorSpace::AVCOL_SPC_BT709) {
-    status = nppiNV12ToRGB_709HDTV_8u_P2C3R(
+    status = nppiNV12ToRGB_709CSC_8u_P2C3R(
Contributor

Drive-by: this should probably be gated by AVFrame::color_range.

https://ffmpeg.org/doxygen/trunk/pixfmt_8h.html#a3da0bf691418bc22c4bcbe6583ad589a

MPEG has limited color-range while JPEG has full color range.

We were using full for all frames before -- now we are using limited for all. Colorspace and color-range are orthogonal so probably need nested ifs here.
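
A minimal sketch of the nested selection suggested above, not the change that was merged. The enums are hypothetical stand-ins for FFmpeg's `AVColorSpace` and `AVColorRange`, so the snippet compiles without FFmpeg or NPP. The mapping (full range to the `709HDTV` kernel, limited range to the `709CSC` kernel) follows the description in this comment. Treating any non-full range as limited is an assumption, and `Generic` stands for whatever non-BT.709 path the existing code takes, which the diff does not show.

```cpp
// Hypothetical stand-ins for FFmpeg's AVColorSpace / AVColorRange values.
enum class ColorSpace { BT709, Other };
enum class ColorRange {
  Limited,  // corresponds to AVCOL_RANGE_MPEG
  Full      // corresponds to AVCOL_RANGE_JPEG
};

// Which NV12-to-RGB conversion to call (illustrative labels only).
enum class Nv12Kernel {
  Bt709Hdtv,  // nppiNV12ToRGB_709HDTV_8u_P2C3R (full range, per the comment)
  Bt709Csc,   // nppiNV12ToRGB_709CSC_8u_P2C3R (limited range, per the comment)
  Generic     // assumed fallback for colorspaces other than BT.709
};

// Colorspace and color range are orthogonal, so the range check is nested
// inside the colorspace check.
constexpr Nv12Kernel selectNv12ToRgbKernel(ColorSpace space, ColorRange range) {
  if (space == ColorSpace::BT709) {
    if (range == ColorRange::Full) {
      return Nv12Kernel::Bt709Hdtv;
    }
    // Assumption: anything that is not explicitly full range is limited.
    return Nv12Kernel::Bt709Csc;
  }
  return Nv12Kernel::Generic;
}
```

In the real function the two inputs would come from the frame's `colorspace` and `color_range` fields, and each branch would call the corresponding NPP function directly rather than return a label.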

Development

Successfully merging this pull request may close these issues.

Difference in CPU and CUDA Decode Output Values May Be Reduced with CSC Function
4 participants