-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: url_decode function #2957
Conversation
This implementation is just not correct. $ ./jq -na '"è" | @uri | ., url_decode'
"%C3%A8"
"\u00c3\u00a8" $ ./jq -na '"è" | ., (@uri | ., (url_decode | ., (@uri | ., url_decode)))'
"\u00e8"
"%C3%A8"
"\u00c3\u00a8"
"%C3%83%C2%A8"
"\u00c3\u0083\u00c2\u00a8" The correct result should have been an $ jq -na '"è"'
"\u00e8"
|
You were right! Sorry, we didn't catch those cases in the first implementation. Analyzing the issues we commented (especially #798 ), there were a function that matches exactly with your test cases and even with emotes. (This one |
Showed up an failed test at the checks, but as it is only in one check and it is labeled as "disabled", I don't believe it has to do with this new feature. |
You have added those tests to |
Why not just adding |
URL decoding a byte sequence that only contains non-utf-8 bytes triggers an error: $ ./jq -na '"%ff%fa" | url_decode'
jq: error (at <unknown>): null (null) and number (128) cannot be subtracted URL decoding a byte sequence that contains non-utf-8 bytes, but contains a valid utf-8 subsequences, results in a single $ ./jq -na '"%ff%c3%a7%fa" | url_decode'
"\ufffd" The correct behaviour would be that every non-utf-8 byte becomes $ jq -Ra . <<< $'\xff\xfa'
"\ufffd\ufffd"
$ jq -Ra . <<< $'\xff\xc3\xa7\xfa'
"\ufffd\u00e7\ufffd" |
I've just noticed that the implementation is literally just entirely copied from this gist https://gist.github.com/jcracknell/52cf4e0f8c4518a853784638db8a258d Come on now. :/ |
As the code was never merged to the main repo I thought would be a good addition to it, that's why I referenced the gist in my comment, sorry for not being more clear. But thanks for the feedback! I read the comments and I would like to know what is the best approach here:
|
url_decode function
Creates:
The code for this functions were based of Rosetta Code, like suggested by others maintainers.
Motivation
There is some issues related and asking for this implementation, for example: #2261 and even older ones like #798 , closing this issues.