Using the net/url package in Go

One of my pet peeves in code reviews is creating and manipulating URLs by hand. It’s a never-ending source of easily avoided bugs that can be caught at code review time instead of in testing or production. It’s tempting to use fmt.Sprintf() or string concatenation, but you are guaranteed trouble later on.

It doesn’t help that the API implemented by the net/url package is a little quirky, so here is a short article with examples of how to use the net/url package in Go.

Parsing a URL from a string

Although it is not strictly URL manipulation or creation, a very common task is breaking a URL held in a string into its component parts.

This is done using the url.Parse() function:

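package main

import (
    "fmt"
    "net/url"
)

func main() {
    u, err := url.Parse("http://example.com/xyz?a=b&c=d")
    if err != nil {
        // Stop here: u is nil when parsing fails
        fmt.Println("parse error:", err)
        return
    }
    fmt.Println("scheme =", u.Scheme)
    fmt.Println("host =", u.Host)
    fmt.Println("path =", u.Path)
    // Query() parses u.RawQuery into a url.Values map
    fmt.Println("query =", u.Query())
}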

This program produces the following output:

scheme = http
host = example.com
path = /xyz
query = map[a:[b] c:[d]]

So far, so good. It’s a bit annoying that the query needs to be unpacked further with another function call but we can live with that.
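
As a rough sketch of that extra unpacking step, the value returned by Query() is a url.Values, which is a map from each key to a slice of values, and its Get() method returns the first value for a key. The helper name queryParam below is made up for illustration and is not part of net/url.

```go
package main

import (
	"fmt"
	"net/url"
)

// queryParam is a hypothetical helper: it parses rawURL and returns the
// first value of the named query parameter, or "" if the key is absent.
func queryParam(rawURL, name string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// Query() returns url.Values (a map[string][]string);
	// Get() returns the first value stored under the key.
	return u.Query().Get(name), nil
}

func main() {
	v, err := queryParam("http://example.com/xyz?a=b&c=d", "c")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(v) // prints "d"
}
```

Note that Query() re-parses RawQuery on every call, so it is usually worth keeping the result in a variable if several parameters are needed.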

(In all the following examples we assume that the net/url package has been imported and that we have parsed a URL from a string into a local variable called u.)

Modifying the path of a URL

The most common bug encountered when manipulating URLs is forgetting to percent-encode (URL-encode) reserved characters when working with path components.

The following is wrong – do not do it even though it might be obvious in this case that no characters need escaping.

// INCORRECT - should use url.PathEscape()
arg1 := "foo"
arg2 := "bar"
u.Path = "/" + arg1 + "/" + arg2  

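The correct way to build or modify a URL path is to use url.PathEscape() to escape reserved characters in each segment. Note that u.Path holds the decoded path and is escaped again when the URL is serialized, so the escaped form goes in u.RawPath alongside the decoded form in u.Path. Assigning escaped text to u.Path itself would encode it twice.

// CORRECT
arg1 := "foo"
arg2 := "bar"
// Decoded form
u.Path = "/" + arg1 + "/" + arg2
// Escaped form, used when the URL is serialized
u.RawPath = "/" + url.PathEscape(arg1) + "/" + url.PathEscape(arg2)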

Interestingly, the sets of reserved and unreserved characters, and their meanings, have changed slightly between revisions of the URL specification (for example, between RFC 2396 and RFC 3986).

Creating query parameters

Another common escaping bug is in creating or modifying query parameters. As with path components, a set of rules governs which characters are reserved or unreserved and what they mean.

In a url.URL, the query is updated by assigning to the RawQuery field. As with path components, it can be tempting to do the following:

// INCORRECT - should use url.Values{}, Set() and Encode()
key := "foo"
value := "bar"
u.RawQuery = key + "=" + value

Instead, create a url.Values object, use the Set() method to store keys and values, and call Encode() to properly escape them:

// CORRECT
key := "foo"
value := "bar"
query := url.Values{}
query.Set(key, value)
u.RawQuery = query.Encode()
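
The benefit only shows up once the value contains reserved characters, so here is a small sketch using a value with spaces, "&" and "=". The helper name withQuery and the example URL are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// withQuery is a hypothetical helper that parses rawURL, replaces its
// query with a single key/value pair, and returns the resulting URL.
func withQuery(rawURL, key, value string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	query := url.Values{}
	query.Set(key, value)
	// Encode() escapes keys and values and joins the pairs with "&".
	u.RawQuery = query.Encode()
	return u.String(), nil
}

func main() {
	s, err := withQuery("http://example.com/search", "q", "fish & chips=tasty")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(s) // prints "http://example.com/search?q=fish+%26+chips%3Dtasty"
}
```

With plain concatenation, the "&" and "=" in the value would have been read back as an extra parameter.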

As well as Set() shown above, there are also methods called Add(), Del() and Get() for manipulating query parameters.
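
A short sketch of those methods follows. It starts from u.Query() rather than an empty url.Values{} so that the parameters already on the URL are kept. The helper name editQuery and the specific keys are made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// editQuery is a hypothetical helper that edits the query of rawURL and
// returns the new URL together with the first value stored under "a".
func editQuery(rawURL string) (string, string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", "", err
	}
	query := u.Query()    // parsed copy of the existing parameters
	query.Add("a", "b2")  // append a second value for "a"
	query.Del("c")        // remove every value for "c"
	query.Set("e", "f g") // replace (or create) "e"
	u.RawQuery = query.Encode()
	return u.String(), query.Get("a"), nil
}

func main() {
	s, first, err := editQuery("http://example.com/xyz?a=b&c=d")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(s)     // prints "http://example.com/xyz?a=b&a=b2&e=f+g"
	fmt.Println(first) // prints "b"
}
```

Encode() sorts the output by key, so the order of parameters in the result may differ from the order in the original URL.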

Developers are not expected to memorise the rules for URL encoding; they have better things to do with their time. Instead, make use of the built-in URL parsing and manipulation libraries of your platform. You’re going to end up with fewer bugs and more time to develop and debug your application.

 

