You parse a full URL by setting the CURLUPART_URL
part in the handle:
CURLU *h = curl_url();
rc = curl_url_set(h, CURLUPART_URL,
"https://example.com:449/foo/bar?name=moo", 0);
If successful, rc
contains CURLUE_OK
and the different URL components are
held in the handle. It means that the URL was valid as far as libcurl
concerns.
The function call's forth argument is a bitmask. Set none, one or more bits in that to alter the parser's behavior:
Makes curl_url_set()
accept a non-supported scheme. If not set, the only
acceptable schemes are for the protocols libcurl knows and have built-in
support for.
Makes the function URL encode the path part if any bytes in it would benefit from that: like spaces or "control characters".
If the passed in string does not use a scheme, assume that the default one was
intended. The default scheme is HTTPS. If this is not set, a URL without a
scheme part is not accepted as valid. Overrides the CURLU_GUESS_SCHEME
option if both are set.
Makes libcurl allow the URL to be set without a scheme and it instead
"guesses" which scheme that was intended based on the hostname. If the
outermost sub-domain name matches DICT, FTP, IMAP, LDAP, POP3 or SMTP then
that scheme is used, otherwise it picks HTTP. Conflicts with the
CURLU_DEFAULT_SCHEME
option which takes precedence if both are set.
Skips authority checks. The RFC allows individual schemes to omit the host
part (normally the only mandatory part of the authority), but libcurl cannot
know whether this is permitted for custom schemes. Specifying the flag permits
empty authority sections, similar to how the file scheme is handled. Really
only usable in combination with CURLU_NON_SUPPORT_SCHEME
.
Makes libcurl skip the normalization of the path. That is the procedure where
curl otherwise removes sequences of dot-slash and dot-dot etc. The same option
used for transfers is called CURLOPT_PATH_AS_IS
.
Makes the URL parser allow space (ASCII 32) where possible. The URL syntax
does normally not allow spaces anywhere, but they should be encoded as %20
or +
. When spaces are allowed, they are still not allowed in the
scheme. When space is used and allowed in a URL, it is stored as-is unless
CURLU_URLENCODE
is also set, which then makes libcurl URL-encode the space
before stored. This affects how the URL is constructed when curl_url_get()
is subsequently used to extract the full URL or individual parts.