feat(win/video): add support for recombined YUV444 encoding #2760

Closed
wants to merge 1 commit into from

ns6089
Contributor

@ns6089 ns6089 commented Jun 26, 2024

Description

This continues #2533. YUV 4:4:4 can be emulated on GPUs that don't support it natively by doubling the YUV 4:2:0 pixel count and running custom recombination shaders on both the encoding and the decoding side, similar to what Microsoft did in MS-RDPEGFX.
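
To see why doubling the 4:2:0 pixel count is enough, it helps to count samples. The sketch below is only an illustration: the helper name `sample_count` is hypothetical, and doubling the height rather than the width is an assumption made for the example.

```python
def sample_count(width, height, subsampling):
    """Total Y + U + V samples per frame for a planar YUV layout."""
    # How many luma samples share one chroma sample in each format.
    chroma_divisor = {"444": 1, "422": 2, "420": 4}[subsampling]
    luma = width * height
    chroma = 2 * (luma // chroma_divisor)  # U and V planes together
    return luma + chroma


w, h = 1920, 1080
source = sample_count(w, h, "444")        # what has to be transported
carrier = sample_count(w, 2 * h, "420")   # 4:2:0 frame with one dimension doubled
print(source, carrier)
```

A 4:4:4 frame carries 3 samples per pixel and a 4:2:0 frame carries 1.5, so a 4:2:0 frame with twice the pixels has exactly as many sample slots as the 4:4:4 source.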

Prototype stage. Requires changes on Moonlight's side: I currently have a custom libplacebo mpv shader implemented for the plvk backend; in the future it should be possible to add Direct3D11 and OpenGL shaders.

https://github.com/ns6089/Sunshine/compare/yuv444..yuv444in420

moonlight-common-c pull request: TBD
moonlight-qt pull request: TBD, testing branch https://github.com/ns6089/moonlight-qt/tree/yuv444in420

What works and what doesn't

  1. First prototype, left half of U_src and V_src planes in Y_out. Good DCT, bad motion compensation.
  2. Second prototype. U_src in Y_out. V_src is spread across U_out and V_out in a pattern that is spatially consistent with Y_out. Good motion compensation, relatively fat DCT on U_out and V_out due to high frequencies.
  3. Third prototype, dropped. It might slightly improve the DCT by running 1/4 of V through an averaging low-pass filter.

To Do

  • decide what to do with resolutions not divisible by 2
  • decide in which part of the protocol dimension doubling will be taking place, e.g. will the client request the doubled dimension or will it be done implicitly
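
For the first item, one possible option (not something this PR has settled on) is to pad each source plane to even dimensions by replicating the last column and row, and have the client crop back to the real size. A minimal sketch, with the hypothetical helper `pad_to_even` operating on planes stored as lists of rows:

```python
def pad_to_even(plane):
    """Return a copy of `plane` with even width and height.

    Pads by repeating the last column and/or the last row, so the
    padding does not introduce a hard edge next to real content.
    """
    rows = [row[:] for row in plane]
    if len(rows[0]) % 2:          # odd width: repeat the last column
        rows = [row + [row[-1]] for row in rows]
    if len(rows) % 2:             # odd height: repeat the last row
        rows.append(rows[-1][:])
    return rows


plane = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
padded = pad_to_even(plane)
print(len(padded[0]), len(padded))
```

Whether padding happens on the host or whether odd resolutions are rejected outright is exactly the open question in the list above; the sketch only shows what the padding variant would look like.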

Screenshot

before
after

Issues Fixed or Closed

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Dependency update (updates to dependencies)
  • Documentation update (changes to documentation)
  • Repository update (changes to repository files, e.g. .github/...)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated the in code docstring/documentation-blocks for new or existing methods/components

Branch Updates

LizardByte requires that branches be up-to-date before merging. This means that after any PR is merged, this branch must be updated before it can be merged. You must also allow edits from maintainers.

  • I want maintainers to keep my branch updated

@mirh

mirh commented Jun 27, 2024

Awesomely crazy.
Could there be anything worth doing with an emulated 4:2:2 stream then? Like, I don't know, slightly lower recombination overhead, or lower bandwidth requirements?

Or perhaps not hitting encoding limits at higher resolutions. Like, is my understanding correct that this pixel doubling would not allow for 1440p on (say) older VCE versions that max out at 4K?

@ns6089
Contributor Author

ns6089 commented Jun 27, 2024

I don't think anyone but Intel supports 4:2:2. About 4K limit, 1440p might still work depending on how exactly said limit is implemented, the overall pixel count stays within 4K range.

[image: amd]

@mirh

mirh commented Jun 27, 2024

> I don't think anyone but Intel supports 4:2:2.

To be honest, I was more thinking of TVs than computers here. It's a mixed bag even there, but still it's not so rare.
But now that you mention PCs, decoding is much lighter on the CPU than encoding. I don't think that would usually be a deal breaker. Or, alternatively, couldn't the client-side recombination just pretend to be 4:4:4 then? Or would whatever empty padding you add ruin the image more than the results you could get with just plain 4:2:0?

> About 4K limit, 1440p might still work depending on how exactly said limit is implemented, the overall pixel count stays within 4K range.

You mean if the limit is actually implemented like 4096x2160 (usual old amd) vs 4096x4096 (usual old nvidia)?
Or can you really call it a day just as long as the supported total pixel count, whatever the "shape", is 7.372.800 (2560x1440x2) or more?
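
The two readings of the limit can be compared with a small check. This is only a sketch: the limits are the figures quoted in this comment (4096x2160 and 4096x4096), not verified hardware specifications, and the helper names are hypothetical. Real encoders may enforce their limits differently.

```python
def fits_dimensions(width, height, max_width, max_height):
    """Limit expressed as a maximum width and a maximum height."""
    return width <= max_width and height <= max_height


def fits_pixel_count(width, height, max_width, max_height):
    """Limit expressed as a total pixel budget, whatever the shape."""
    return width * height <= max_width * max_height


src_w, src_h = 2560, 1440
candidates = {
    "double_width": (2 * src_w, src_h),    # 5120x1440
    "double_height": (src_w, 2 * src_h),   # 2560x2880
}
limits = {"4096x2160": (4096, 2160), "4096x4096": (4096, 4096)}

for name, (w, h) in candidates.items():
    for label, (max_w, max_h) in limits.items():
        print(name, label,
              "dims:", fits_dimensions(w, h, max_w, max_h),
              "pixels:", fits_pixel_count(w, h, max_w, max_h))
```

Under a per-dimension limit only the doubled-height frame fits, and only within 4096x4096; under a pure pixel budget both shapes fit either limit, since 2560x1440x2 = 7,372,800 is below 4096x2160 = 8,847,360.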

@ns6089
Contributor Author

ns6089 commented Jun 27, 2024

You can't encode 4:2:2 on nvidia gpus, implementing a path exclusively for intel will be too expensive.

> Or can you really call it a day just as long as the supported total pixel count, whatever the "shape", is 7.372.800 (2560x1440x2) or more?

I'm already calling it a day 😎
Doubling one dimension minimizes discontinuities in motion estimation, in contrast to tiling. The current half-naive implementation, for example, has a single vertical motion-estimation "seam" in the U and V planes.

    //     Y       U     V
    // +-------+ +---+ +---+
    // |       | |   | |   |
    // |   Y   | |UR | |VR |
    // |       | |   | |   |
    // +---+---+ +---+ +---+
    // |   |   |
    // |UL |VL |
    // |   |   |
    // +---+---+
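
The layout in the diagram can be sketched on the CPU as follows. This is one reading of the diagram, not the actual shader code: planes are plain lists of rows, the function names are hypothetical, and the source width is assumed to be even.

```python
def pack_444_in_420(y, u, v):
    """Pack full-resolution Y, U, V planes (each H rows x W columns)
    into a 4:2:0 frame with doubled height (Y_out is 2H x W)."""
    width = len(y[0])
    assert width % 2 == 0, "sketch assumes an even source width"
    half = width // 2
    # Top half of Y_out: source luma, unchanged.
    y_out = [row[:] for row in y]
    # Bottom half of Y_out: left half of U next to left half of V (UL | VL).
    y_out += [u_row[:half] + v_row[:half] for u_row, v_row in zip(u, v)]
    # Chroma planes of the carrier (H x W/2): right halves (UR, VR).
    u_out = [row[half:] for row in u]
    v_out = [row[half:] for row in v]
    return y_out, u_out, v_out


def unpack_444_from_420(y_out, u_out, v_out):
    """Inverse of pack_444_in_420: rebuild the full-resolution planes."""
    height = len(y_out) // 2
    half = len(y_out[0]) // 2
    y = [row[:] for row in y_out[:height]]
    u = [low[:half] + right for low, right in zip(y_out[height:], u_out)]
    v = [low[half:] + right for low, right in zip(y_out[height:], v_out)]
    return y, u, v


# Tiny 4x2 example so the plane shapes are easy to follow.
y = [[r * 10 + c for c in range(4)] for r in range(2)]
u = [[100 + r * 10 + c for c in range(4)] for r in range(2)]
v = [[200 + r * 10 + c for c in range(4)] for r in range(2)]
y_out, u_out, v_out = pack_444_in_420(y, u, v)
print(len(y_out), len(y_out[0]), len(u_out), len(u_out[0]))
```

In this arrangement the boundary between UL and VL in the lower half of Y_out is a vertical discontinuity in the packed data, and each source chroma plane is split down the middle between Y_out and its carrier chroma plane.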

@mirh

mirh commented Jun 27, 2024

> You can't encode 4:2:2 on nvidia gpus, implementing a path exclusively for intel will be too expensive.

You can't encode 4:4:4 on amd gpus either, and yet this is what this PR is about isn't it?

@ns6089
Contributor Author

ns6089 commented Jun 27, 2024

Personally, I don't see a point in supporting recombination into 4:2:2.
It will still have visible artifacts while having computational overhead close to 4:4:4 and a significant amount of additional development time. And this development time will be multiplied by the number of distinct clients.

@mirh

mirh commented Jun 28, 2024

I mean, sure, of course this is already miraculous.
I was just trying to think outside the box (4:2:2 is still subpar, but even the worst case scenario starts to be bearable instead).

If anything, I guess the improvement isn't that clear-cut, because unlike with a direct cable connection, there are already compression artifacts anyway. So if 4:4:4 couldn't fit in some doubled 4:2:0 4K scenario, just lowering the resolution could also be a possible (if not exactly immediate) alternative?

@ns6089 ns6089 force-pushed the yuv444in420 branch 4 times, most recently from fc48f22 to 3a1115d Compare June 30, 2024 08:50
@ns6089 ns6089 force-pushed the yuv444in420 branch 2 times, most recently from 2cc6a6a to a4ffe24 Compare August 1, 2024 07:52
@ns6089 ns6089 changed the title Support recombined YUV 4:4:4 encoding (Prototype, Windows-only for now) feat(win/video): add support for recombined YUV444 encoding Aug 22, 2024
@ns6089
Contributor Author

ns6089 commented Aug 25, 2024

The code in this pull request is Not a Contribution under LizardByte Individual Contributor License Agreement.
The code in this pull request is shared under GNU GENERAL PUBLIC LICENSE Version 3.

The feature itself is complete on the Sunshine side.
