Name
hurl.URL -- create a URL object (V2.0)
Synopsis
handle = hurl.URL([url$, flags])
Function
Traditionally, URLs are passed to hURL using the easy:SetOpt_URL() method or its counterparts like #CURLOPT_URL. Starting with hURL 2.0, however, you can also pass URLs via URL objects created by this function. Once hurl.URL() returns, you can initialize the new URL object using methods like url:SetURL() or url:SetPort() and pass it to an easy handle using easy:SetOpt_CURLU(). Using URL objects instead of traditional URLs can be more convenient for complex URLs with many constituents.
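
The deferred initialization described above could look roughly like the following sketch. It uses only methods named on this page; the host name is a placeholder, and the sketch assumes that url:SetURL() accepts the URL string as its first argument.

```hollywood
@REQUIRE "hurl"

; create an empty URL object; no URL has been stored in it yet
u = hurl.URL()

; initialize it later (assumption: the URL string is the first argument)
u:SetURL("https://www.example.com/")

; hand the URL object to an easy handle instead of a plain URL string
e = hurl.Easy()
e:SetOpt_CURLU(u)
```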

Optionally, you can also initialize the URL object by passing a URL in url$. If you don't pass url$, you need to initialize the URL object later using url:SetURL(). It's also possible to pass a combination of the following flags:

#CURLU_NON_SUPPORT_SCHEME
If set, allows you to set a non-supported scheme.

#CURLU_URLENCODE
When set, libcurl URL encodes the part on entry, except for scheme, port and URL. When setting the path component with URL encoding enabled, the slash character is skipped. The query part gets space-to-plus conversion before the URL conversion. This URL encoding is charset unaware and converts the input in a byte-by-byte manner.

#CURLU_DEFAULT_SCHEME
If set, makes libcurl allow the URL to be set without a scheme and then sets the scheme to the default: HTTPS. Overrides the #CURLU_GUESS_SCHEME option if both are set.

#CURLU_GUESS_SCHEME
If set, makes libcurl allow the URL to be set without a scheme; it instead "guesses" which scheme was intended based on the host name. If the outermost sub-domain name matches DICT, FTP, IMAP, LDAP, POP3 or SMTP, that scheme is used; otherwise it picks HTTP. Conflicts with the #CURLU_DEFAULT_SCHEME option, which takes precedence if both are set.

#CURLU_NO_AUTHORITY
If set, skips authority checks. The RFC allows individual schemes to omit the host part (normally the only mandatory part of the authority), but libcurl cannot know whether this is permitted for custom schemes. Specifying the flag permits empty authority sections, similar to how the file scheme is handled.

#CURLU_PATH_AS_IS
When set for #CURLUPART_URL, this makes libcurl skip the normalization of the path, the procedure in which curl otherwise removes sequences of dot-slash, dot-dot and so on. The same option used for transfers is called #CURLOPT_PATH_AS_IS.

#CURLU_ALLOW_SPACE
If set, the URL parser allows space (ASCII 32) where possible. The URL syntax normally does not allow spaces anywhere; they should instead be encoded as %20 or '+'. Even when spaces are allowed, they are still not allowed in the scheme. When a space is used and allowed in a URL, it is stored as-is unless #CURLU_URLENCODE is also set, which makes libcurl URL-encode the space before it is stored. This affects how the URL is constructed when curl_url_get is subsequently used to extract the full URL or individual parts.

#CURLU_DISALLOW_USER
If set, the URL parser will not accept embedded credentials for the #CURLUPART_URL, and will instead return an error for such URLs.

#CURLU_APPENDQUERY
Can only be used with url:SetQuery(). The provided new part is then appended at the end of the existing query instead of replacing it. If the previous part did not end with an ampersand (&), an ampersand is inserted before the newly appended part. When #CURLU_APPENDQUERY is used together with #CURLU_URLENCODE, the first '=' symbol will not be URL encoded.
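
As a rough illustration of the flags above, the following sketch passes a flag to hurl.URL() and combines two flags when appending to the query. It assumes that flags are combined with Hollywood's bitwise Or operator and that url:SetQuery() takes the flags as an optional second argument; neither detail is stated on this page, and the host name and query strings are placeholders.

```hollywood
@REQUIRE "hurl"

; the URL has no scheme, so ask the parser to fall back to the default scheme
u = hurl.URL("www.example.com/search", #CURLU_DEFAULT_SCHEME)

; set the initial query (assumed signature: url:SetQuery(query$[, flags]))
u:SetQuery("q=hollywood")

; append a second parameter instead of replacing the existing query and
; URL encode it on entry (assumption: flags are combined with bitwise Or)
u:SetQuery("lang=en us", #CURLU_APPENDQUERY | #CURLU_URLENCODE)
```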

When used with getter methods like url:GetURL() or url:GetPort(), the flags have a different meaning, and some additional flags are available. Here is a description of the flags that can be used with getter methods:

#CURLU_DEFAULT_PORT
If the handle has no port stored, this option will make curl return the default port for the used scheme.

#CURLU_DEFAULT_SCHEME
If the handle has no scheme stored, this option will make curl return the default scheme instead of an error.

#CURLU_NO_DEFAULT_PORT
Instructs curl to not return a port number if it matches the default port for the scheme.

#CURLU_URLDECODE
Asks curl to URL decode the contents before returning them. It will not attempt to decode the scheme, the port number or the full URL. The query component also gets plus-to-space conversion when this bit is set. Note that this URL decoding is charset unaware and you will get a string back with data that could be intended for a particular encoding. If there are any byte values lower than 32 in the decoded string, the get operation will return an error instead.

#CURLU_URLENCODE
If set, makes curl URL encode the host name part when a full URL is retrieved. If not set (default), libcurl returns the URL with the host name "raw" so that IDN names appear as-is. IDN host names typically use non-ASCII bytes that would otherwise be percent-encoded. Note that even when not asking for URL encoding, the '%' (byte 37) will be URL encoded to make sure the host name remains valid.

#CURLU_PUNYCODE
If set and #CURLU_URLENCODE is not set, and asked to retrieve the host or URL parts, libcurl returns the host name in its punycode version if it contains any non-ASCII octets (and is an IDN name). If libcurl is built without IDN capabilities, using this bit will make curl return an error if the host name contains anything outside the ASCII range.
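
A minimal sketch of how the getter flags might be used follows. It assumes that url:GetPort() and url:GetURL() accept the flags as an optional first argument, which this page implies but does not spell out; the host name is a placeholder.

```hollywood
@REQUIRE "hurl"

; the URL stores no explicit port
u = hurl.URL("https://www.example.com/index.html")

; ask for the scheme's default port instead of an error
; (assumed signature: url:GetPort([flags]))
DebugPrint(u:GetPort(#CURLU_DEFAULT_PORT))

; retrieve the full URL, leaving out the port if it matches the default
; (assumed signature: url:GetURL([flags]))
DebugPrint(u:GetURL(#CURLU_NO_DEFAULT_PORT))
```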

Inputs
url$
optional: URL to initialize object with
flags
optional: flags to use on initialization (see above)
Results
handle
URL object
Example
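e = hurl.Easy()
; create a URL object and hand it to the easy handle instead of a URL string
u = hurl.URL("https://www.paypal.com/")
e:SetOpt_CURLU(u)
; p_WriteData is a callback function that must be defined elsewhere in the script
e:SetOpt_WriteFunction(p_WriteData)
e:SetOpt_FollowLocation(True)
e:Perform()
e:Close()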
The code above shows how to create and use a URL object with hURL's easy interface.
