
LispKit Url

Matthias Zenger edited this page Nov 19, 2024 · 1 revision

Library (lispkit url) defines procedures for creating and decomposing URLs. URLs are represented as strings which conform to the syntax of a generic URI. Each URI consists of five components organized hierarchically in order of decreasing significance from left to right:

url = scheme ":" ["//" authority] path ["?" query] ["#" fragment]

A component is undefined if it has an associated delimiter and the delimiter does not appear in the URI. The scheme and path components are always defined. A component is empty if it has no characters. The scheme component is always non-empty. The authority component consists of several subcomponents:

authority = [userinfo "@"] host [":" port]
userinfo = username [":" password]

Generic URLs

(url? obj)     [procedure]

Returns #t if obj is a string containing a valid URL; #f otherwise.

(make-url proto)     [procedure]
(make-url proto scheme)
(make-url proto scheme auth)
(make-url proto scheme auth path)
(make-url proto scheme auth path query)
(make-url proto scheme auth path query fragment)

Returns a string representing the URL defined by merging the given URL components with URL prototype proto. proto is either #f (no prototype) or a string which is interpreted as a partially defined URL. The URL components provided as arguments of make-url overwrite the respective components of proto. If a URL component such as scheme, auth, path, etc. is set to #f, no changes are made to the respective component of proto. If it is set to #t, the respective component in proto is removed. If a non-boolean value is provided, it replaces the respective value in proto. The result of applying all given URL components to proto is returned as the result of make-url. The result is not guaranteed to be a valid URL. It could, for instance, be used as a prototype for other make-url calls. If it is not possible to return a result that can be parsed back into a URL prototype, #f is returned.

scheme is a string defining the URL scheme. auth defines the URL authority. The following formats are supported for auth:

  • host: A simple string defines the host of the URL without port.
  • (host): A list with one element has the same effect as using the string host alone.
  • (host port): Specifies the host of the URL as a string followed by the port of the URL as a fixnum.
  • (user host port): Defines the username as a string followed by the host of the URL as a string followed by a port number.
  • (user passwd host port): The username followed by a password, followed by a hostname followed by a port number. The first three elements of the list are strings, the last element is a number.

If a list is used to specify the URL authority, again #f and #t can be used to either not modify the respective authority component from proto or to remove it. path and fragment define a path or fragment of a URL as a string. query defines the query component of a URL using two possible formats:

  • query: A simple string defining the full query component of the URL
  • ((name . value) ...): An association list of query items consisting of name/value pairs of strings provides a structured representation of a query. It gets automatically mapped to a query string in which items are represented in the form name=value, separated by &.

If the URL extensibility mechanism via prototype proto is not used and the goal is to define a valid URL, then procedures url and url-copy should be preferred over make-url.

(make-url #f "https" "lisppad.app")
  ⇒  "https://lisppad.app"
(make-url #f "https" "lisppad.app" "/libraries/lispkit")
  ⇒  "https://lisppad.app/libraries/lispkit"
(make-url #f "https" "lisppad.app" "/libraries/lispkit"
          '(("lang" . "en")("c" . "US")))
  ⇒  "https://lisppad.app/libraries/lispkit?lang=en&c=US"
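The prototype mechanism and the boolean markers can be sketched as follows. This is a minimal illustration that assumes the semantics described above; the name base and the port 8080 are made up for the example. No results are shown, since make-url does not guarantee that its result is a valid URL.

```scheme
(import (lispkit base) (lispkit url))

;; A prototype URL (hypothetical example value).
(define base (make-url #f "https" "lisppad.app" "/libraries/lispkit"))

;; #f keeps the scheme and authority of base; the string replaces the path.
(make-url base #f #f "/index.html")

;; #t removes the path of base; scheme and authority are kept.
(make-url base #f #f #t)

;; Authority given as a (host port) list; 8080 is an arbitrary port.
(make-url base #f '("lisppad.app" 8080))

;; Inside an authority list, #f keeps the host of base and only sets the port.
(make-url base #f '(#f 8080))
```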

(url)     [procedure]
(url scheme)
(url scheme auth)
(url scheme auth path)
(url scheme auth path query)
(url scheme auth path query fragment)

Returns a string representing the URL defined by the given URL components. Providing #f for a component means the component does not exist. scheme is a string defining the URL scheme. auth defines the URL authority and supports the following formats: host, (host), (host port), (user host port), and (user passwd host port). path and fragment define a path or fragment of a URL as a string. query defines the query component of a URL either as a query string or as an association list of query items consisting of name/value pairs of strings.

(url scheme ...) is similar to (make-url #f scheme ...), but it guarantees that the result is a valid URL. Invalid combinations of URL components result in procedure url returning #f.
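Because url returns #f for invalid combinations, callers may want to turn that result into an error early. The following is a minimal sketch; checked-url is a hypothetical helper that is not part of the library.

```scheme
(import (lispkit base) (lispkit url))

;; Hypothetical helper: builds a URL with url and signals an error
;; instead of silently propagating #f for invalid component combinations.
(define (checked-url . components)
  (or (apply url components)
      (error "cannot build a valid URL from components" components)))

;; Same components as in the make-url examples, but validated by url.
(checked-url "https" "lisppad.app" "/libraries/lispkit" '(("lang" . "en")))
```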

(url-copy url)     [procedure]
(url-copy url scheme)
(url-copy url scheme auth)
(url-copy url scheme auth path)
(url-copy url scheme auth path query)
(url-copy url scheme auth path query fragment)

Returns a new string representing the URL defined by merging the given URL components with URL prototype url. url is a string which is interpreted as a partially defined URL. The URL components provided as arguments of url-copy overwrite the respective components of url. If a URL component such as scheme, auth, path, etc. is set to #f, no changes are made to the respective component of url. If it is set to #t, the respective component in url is removed. If a non-boolean value is provided, it replaces the respective value in url. The result of applying all given URL components to url is returned as the result of url-copy if it constitutes a valid URL; otherwise #f is returned.

scheme is a string defining the URL scheme. auth defines the URL authority and supports the following formats: host, (host), (host port), (user host port), and (user passwd host port). path and fragment define a path or fragment of a URL as a string. query defines the query component of a URL either as a query string or as an association list of query items consisting of name/value pairs of strings.
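As a sketch of how #f and #t combine in url-copy, the hypothetical helper below drops the query and fragment of a URL while leaving everything else unchanged. It assumes the semantics described above; results are not shown.

```scheme
(import (lispkit base) (lispkit url))

;; Hypothetical helper: returns a copy of u without query and fragment.
;; #f leaves scheme, authority and path untouched; #t removes a component.
(define (strip-query-and-fragment u)
  (url-copy u #f #f #f #t #t))

(strip-query-and-fragment "https://lisppad.app/libraries/lispkit?lang=en#top")

;; Replaces only the scheme; all other components are taken from the original.
(url-copy "http://lisppad.app/libraries/lispkit" "https")
```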

(url-scheme url)     [procedure]

Returns the scheme of the URL string url.

(url-authority url)     [procedure]
(url-authority url url-encoded?)

Returns the authority of the URL string url as a list of four components: username, password, host and port. URL components that do not exist are represented as #f. If url-encoded? is provided and set to true, then the authority components are returned in percent-encoded form.

(url-user url)     [procedure]
(url-user url url-encoded?)

Returns the user name of the URL string url as a string, or #f if there is no user name defined. If url-encoded? is provided and set to true, then the user name is returned in percent-encoded form.

(url-password url)     [procedure]
(url-password url url-encoded?)

Returns the password of the URL string url as a string, or #f if there is no password defined. If url-encoded? is provided and set to true, then the password is returned in percent-encoded form.

(url-host url)     [procedure]
(url-host url url-encoded?)

Returns the host of the URL string url as a string, or #f if there is no host defined. If url-encoded? is provided and set to true, then the host is returned in percent-encoded form.

(url-port url)     [procedure]

Returns the port of the URL string url as a fixnum, or #f if there is no port defined.

(url-path url)     [procedure]
(url-path url url-encoded?)

Returns the path of the URL string url as a string, or #f if there is no path defined. If url-encoded? is provided and set to true, then the path is returned in percent-encoded form.

(url-query url)     [procedure]
(url-query url url-encoded?)

Returns the query of the URL string url as a string, or #f if there is no query defined. If url-encoded? is provided and set to true, then the query is returned in percent-encoded form.

(url-query-items url)     [procedure]

Returns the query of the URL string url as an association list of string pairs, mapping URL query parameters to values. #f is returned if the query cannot be parsed.

(url-fragment url)     [procedure]
(url-fragment url url-encoded?)

Returns the fragment of the URL string url as a string, or #f if there is no fragment defined. If url-encoded? is provided and set to true, then the fragment is returned in percent-encoded form.
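The accessors above can be applied to a single URL string to take it apart. The following sketch reuses the URL from the make-url examples; the comments paraphrase the descriptions above rather than quoting output from a LispKit session.

```scheme
(import (lispkit base) (lispkit url))

(define u "https://lisppad.app/libraries/lispkit?lang=en&c=US")

(url-scheme u)       ; the scheme, "https"
(url-host u)         ; the host, "lisppad.app"
(url-port u)         ; #f, since u has no port
(url-path u)         ; the path, "/libraries/lispkit"
(url-query u)        ; the query as one string
(url-query-items u)  ; the query as an association list of name/value pairs
(url-fragment u)     ; #f, since u has no fragment
(url-authority u)    ; list of username, password, host and port; missing parts are #f
```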

(url-format url)     [procedure]
(url-format url schc)
(url-format url schc usrc)
(url-format url schc usrc passc)
(url-format url schc usrc passc hostc)
(url-format url schc usrc passc hostc portc)
(url-format url schc usrc passc hostc portc pathc)
(url-format url schc usrc passc hostc portc pathc qc)
(url-format url schc usrc passc hostc portc pathc qc fragc)

Formats the given URL string url using the provided component-level configurations schc, usrc, passc, hostc, portc, pathc, qc, and fragc, and returns the result as a string. Each configuration has one of the following forms:

  • #f: The component is omitted in the formatted URL.
  • #t: The component is always included in the formatted URL.
  • "...": The component is omitted if the component of url matches the string.
  • ("..." ...) or ("..." ... . #t): The component is omitted if the list contains a string matching the component of url.
  • ("..." ... . #f): The component is only displayed if the list contains a string matching the component of url.
  • ("..." ... . comp) where comp is a symbol: The list is matched against the component of url specified via comp. If comp is one of the specifiers scheme, user, password, host, port, path, query, or fragment, the component is omitted if the list contains a string matching the specified component. If comp is one of the specifiers scheme?, user?, password?, host?, port?, path?, query?, or fragment?, the component is only displayed if the list contains a string matching the specified component.

(url-format "http://[email protected]:80/index.html?lang=en"
  #t #f #f #t "80" '("/index.html" "/index.htm") #t)
  ⇒  "http://lisppad.app?lang=en"

(url-parse str)     [procedure]
(url-parse str schs)
(url-parse str schs usrs)
(url-parse str schs usrs pass)
(url-parse str schs usrs pass hosts)
(url-parse str schs usrs pass hosts ports)
(url-parse str schs usrs pass hosts ports paths)
(url-parse str schs usrs pass hosts ports paths qs)
(url-parse str schs usrs pass hosts ports paths qs frags)

Parses a string str using the provided component-level parsing settings schs, usrs, pass, hosts, ports, paths, qs, and frags, and returns the result as a string. Each setting has one of the following forms:

  • #f: The component is optional.
  • #t: The component is required.
  • "..." or number: The component is optional and if it is missing, this string or number is used as a default.

(url-parse " http://lisppad.app:80?lang=en ok")
  ⇒  "http://lisppad.app:80?lang=en"
(url-parse " http://lisppad.app/?lang=en ok"
  #t #f #f #t 80 #t #f "target")
  ⇒  "http://lisppad.app:80/?lang=en#target"

File URLs

(file-url? url)     [procedure]
(file-url? url dir?)

If dir? is not provided, file-url? returns #t if url is a string containing a valid file URL; #f otherwise. If dir? is provided and set to true, file-url? returns #t if url is a string containing a valid file URL and the file URL refers to a directory; #f otherwise. If dir? is provided and set to #f, file-url? returns #t if url is a string containing a valid file URL and the file URL refers to a regular file; #f otherwise.

(file-url path)     [procedure]
(file-url path base)
(file-url path base expand?)

Returns a file URL string for the given file path. If file URL base is provided, path is considered to be relative to base. If base is not provided or set to #f and path is a relative path, it is considered relative to the current directory. If expand? is given and set to true, symbolic links are resolved in the resulting file URL.

(file-url-standardize url)     [procedure]
(file-url-standardize url expand?)

Returns a standardized version of file URL url. If expand? is given and set to true, symbolic links are resolved in the resulting file URL.
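The file URL procedures might be combined as in the following sketch. The paths are arbitrary examples and the names dir and f are hypothetical. Results depend on the local file system and are therefore not shown.

```scheme
(import (lispkit base) (lispkit url))

;; File URL for an absolute path (the path is an arbitrary example).
(define dir (file-url "/usr/local/"))

;; "notes.txt" is interpreted relative to the file URL dir.
(define f (file-url "notes.txt" dir))

;; Classification of file URLs.
(file-url? f)        ; is f a valid file URL?
(file-url? dir #t)   ; is dir a valid file URL referring to a directory?
(file-url? f #f)     ; is f a valid file URL referring to a regular file?

;; Standardize f and resolve symbolic links.
(file-url-standardize f #t)
```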

URL encoding

(url-encode str)     [procedure]
(url-encode str allowed-char)
(url-encode str allowed-char force?)

Returns a URL/percent-encoded version of the string str. allowed-char defines the characters that are exempted from the encoding. By default, allowed-char corresponds to all characters that are allowed to be unencoded in URL queries. Argument allowed-char can be one of the following:

  • #f: All characters get encoded.
  • #t: The default characters are allowed to be unencoded.
  • Symbols user, password, host, path, query, fragment: The characters that are allowed in the respective URL components.
  • String: All characters included in the string are allowed to be unencoded.
  • Character set: All characters included in the character set (as defined by library (lispkit char-set)) are allowed to be unencoded.

If argument force? is set to #t, it is guaranteed that url-encode returns a string. If argument force? is set to #f (the default), then url-encode might return #f if encoding fails.

(url-decode str)     [procedure]
(url-decode str force?)

Returns a decoded version of the URL/percent-encoded string str. If argument force? is set to #t, it is guaranteed that url-decode returns a string. If argument force? is set to #f (the default), then url-decode might return #f if decoding fails.
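The different forms of allowed-char and the force? flag can be sketched as follows. The sample string s and the helper round-trips? are hypothetical, and no encoded results are shown because they depend on the exact character sets used by the implementation.

```scheme
(import (lispkit base) (lispkit url))

(define s "a b/c?d=e&f")   ; arbitrary sample string

(url-encode s)             ; default: characters allowed in queries stay unencoded
(url-encode s #f)          ; every character gets encoded
(url-encode s 'path)       ; characters allowed in paths stay unencoded
(url-encode s "/?")        ; only the characters / and ? are exempted

;; Hypothetical round-trip check; force? = #t guarantees string results
;; from both url-encode and url-decode.
(define (round-trips? str)
  (equal? str (url-decode (url-encode str #t #t) #t)))
```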
