Question about the code #3

Open
YosefMac opened this issue Sep 22, 2016 · 3 comments

Comments

@YosefMac

Hello! :)

I have a question about your code.

In urldecode.c, line 15:

  while(!d) {
    d = 1;
    int i; /* the counter for the string */

    for(i=0;i<strlen(dStr);++i) {
                           .
                           .
                           .

What is the purpose of having a nested loop?
I thought one loop would be enough. Why decode again when you have already decoded once?
Is this meant to satisfy something in the RFC that I'm missing?

Thanks in advance!!!

@abejfehr
Owner

Hey! This is a really good question.

I wrote this years ago so I had to look at my code again to see what I might have been thinking.

The code seems to be decoding the string until there's nothing left to decode, which means it also decodes anything that was doubly encoded. Try decoding the string %2524 for yourself and you'll see that the result, %24, is another percent-encoded string.
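To see where a string like %2524 comes from, here is a minimal sketch of a percent-encoder applied twice. `percent_encode` is a hypothetical helper written for this illustration and is not part of the library; it assumes that only the RFC 3986 unreserved characters are left unencoded.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: percent-encode src into dst (at most dstSize bytes,
   terminator included). Returns 0 on success, -1 if dst is too small. */
int percent_encode(const char *src, char *dst, size_t dstSize) {
	static const char digits[] = "0123456789ABCDEF";
	size_t j = 0;

	if (dstSize == 0) return -1;

	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;

		/* unreserved characters pass through unchanged */
		if (isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
			if (j + 1 >= dstSize) return -1;
			dst[j++] = (char)c;
		} else {
			/* everything else, including '%', becomes %XX */
			if (j + 3 >= dstSize) return -1;
			dst[j++] = '%';
			dst[j++] = digits[c >> 4];
			dst[j++] = digits[c & 0x0F];
		}
	}
	dst[j] = '\0';
	return 0;
}
```

Encoding `$` gives `%24`; encoding that result again turns the `%` into `%25`, giving `%2524`. A decoder that loops until nothing changes undoes both layers, even if only one of them was ever meant to be an encoding.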

What I did here actually violates the spec, so that's too bad.

"Implementations must not percent-encode or decode the same string more than once, as decoding an already decoded string might lead to misinterpreting a percent data octet as the beginning of a percent-encoding, or vice versa in the case of percent-encoding an already percent-encoded string."

I didn't actually know about that clause previously. I'll keep this issue open to represent the problem of continuously decoding strings. Feel free to submit a Pull Request that resolves this issue. Perhaps strings can be optionally decoded recursively?
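One way the optional behaviour could look is sketched below. This is only an illustration under stated assumptions: `decode_pass`, `url_decode_opt` and the `repeat` flag are hypothetical names, not the existing API of this library.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: one in-place decoding pass.
   Returns 1 if at least one %XX escape was decoded, 0 otherwise. */
static int decode_pass(char *s) {
	int i, j, changed = 0;
	char hex[3] = "00";

	for (i = 0, j = 0; s[i]; i++, j++) {
		if (s[i] == '%' &&
		    isxdigit((unsigned char)s[i+1]) &&
		    isxdigit((unsigned char)s[i+2])) {
			hex[0] = s[i+1];
			hex[1] = s[i+2];
			s[j] = (char)strtol(hex, NULL, 16);
			i += 2; /* skip the two hex digits */
			changed = 1;
		} else {
			s[j] = s[i];
		}
	}
	s[j] = '\0';
	return changed;
}

/* Hypothetical API: decode exactly once by default; keep decoding
   until nothing changes only when `repeat` is nonzero. */
int url_decode_opt(char *s, int repeat) {
	int changed = decode_pass(s);

	if (repeat) {
		while (decode_pass(s))
			; /* each successful pass shortens the string, so this ends */
	}
	return changed;
}
```

With `repeat` set to 0, a buffer holding `%2524` ends up as `%24`; with `repeat` set to 1 it ends up as `$`. Making the single pass the default keeps the spec-compliant behaviour for callers who don't ask for anything else.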

I hope that helps!

@YosefMac
Author

Thanks for the clarification!! 😃

@bucanero

bucanero commented Dec 29, 2022

Here's my simplified version, which:

  • doesn't "double decode" strings, following the spec
  • doesn't use malloc (decodes over the same original string)
  • returns 1 if it decoded some values (0 if nothing was decoded)

Note: since it overwrites the original buffer, you'll need to make a copy yourself if you want to keep the original URL.

#include <ctype.h>  /* isxdigit */
#include <stdlib.h> /* strtol */

int urlDecode(char *dStr) {
	int i, j;
	char hex[] = "00"; /* for a hex code */

	for(i=0, j=0; dStr[i]; i++, j++)
	{
		/* copy anything that isn't a complete %XX escape unchanged;
		   the checks short-circuit, so we never read past the terminator */
		if(dStr[i] != '%' ||
		   !isxdigit((unsigned char)dStr[i+1]) ||
		   !isxdigit((unsigned char)dStr[i+2]))
		{
			dStr[j] = dStr[i];
			continue;
		}

		/* combine the next two hex digits into one string */
		hex[0] = dStr[i+1];
		hex[1] = dStr[i+2];

		/* convert it to a byte value */
		dStr[j] = (char)strtol(hex, NULL, 16);
		i += 2; /* move to the end of the hex */
	}
	dStr[j] = 0; /* null terminate the string */

	return (i != j);
}
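For callers who need to keep the original URL, a variant that writes into a separate buffer is one option. The sketch below is not bucanero's code: `url_decode_copy` and `hex_value` are hypothetical names, and it assumes the caller supplies a destination of at least strlen(src) + 1 bytes.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit (caller guarantees isxdigit). */
static int hex_value(unsigned char c) {
	if (c >= '0' && c <= '9') return c - '0';
	return tolower(c) - 'a' + 10;
}

/* Hypothetical variant: decode src into dst, leaving src untouched.
   dst must have room for strlen(src) + 1 bytes; decoding never makes
   the string longer. Returns 1 if anything was decoded, else 0. */
int url_decode_copy(const char *src, char *dst) {
	int decoded = 0;

	while (*src) {
		if (src[0] == '%' &&
		    isxdigit((unsigned char)src[1]) &&
		    isxdigit((unsigned char)src[2])) {
			/* a complete %XX escape becomes a single byte */
			*dst++ = (char)(hex_value((unsigned char)src[1]) * 16 +
			                hex_value((unsigned char)src[2]));
			src += 3;
			decoded = 1;
		} else {
			/* anything else, including a malformed escape, is copied as-is */
			*dst++ = *src++;
		}
	}
	*dst = '\0';
	return decoded;
}
```

Given the input `a%20b%zz%2`, this writes `a b%zz%2` to the destination: the valid escape is decoded once, and the malformed `%zz` and truncated `%2` are passed through unchanged.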
