go-iiif version 2.0

This is a blog post by aaron cope that was published on November 13, 2019. It was tagged golang and iiif.

negative: San Francisco International Airport (SFO), TWA (Trans World Airlines), Boeing B-131, 1962, Collection of SFO Museum, 2011.032.0732

Today, I am happy to announce that go-iiif version 2.0.0 has been released. A quick refresher:

IIIF is an acronym for the International Image Interoperability Framework, a project driven by public institutions and private companies in the cultural heritage sector to produce common standards and interfaces (APIs) for accessing and working with collections material.

go-iiif is software written in the Go programming language that implements the IIIF Image API and that SFO Museum has been using to process the images in its collection. We’ve written about IIIF before in earlier blog posts.

The biggest change in the 2.0 release is that go-iiif no longer requires the libvips image processing library by default. libvips is pretty great, and recently added the ability to generate IIIF-compatible tilesets natively, but it introduces non-trivial build and setup requirements. It is still possible to use go-iiif with libvips but that functionality has been moved into a separate package.

As of version 2.0 go-iiif does all its image processing using native (Go) code, specifically Anthony Simon’s bild for most image-related tasks and Christian Muehlhaeuser’s smartcrop for cropping images dynamically.

The absence of external dependencies means that go-iiif tools can be compiled into standalone applications that can be run even if Go isn’t installed on the same computer. This allows us to build and distribute sophisticated image processing tools to staff, tailored to our workflow, quickly and easily. As I write this all these tools are still run from the command line, so there are plenty of user interface and user experience improvements that remain, but this release lays the foundation for that work going forward.

go-iiif tools can now also be run as AWS Lambda functions. Recently we’ve been using AWS ECS instances to process images using go-iiif. That works fine but due to the limit on the number of concurrent ECS instances that can be running, and the corresponding complexity involved in scheduling processes manually, the ability to use Lambda with its more generous limits is very attractive.

Other notable changes from previous releases are the use of the Go Cloud Bucket and Blob interfaces for reading and writing files to a variety of storage endpoints, and the introduction of go-iiif-uri URI strings to identify images for processing.

The rest of this blog post is very technical so if that’s not of interest you can stop here. It is divided into four sections: Drivers, Buckets, URIs and Tools.

Drivers

photograph: San Francisco Bay Area, two men in grounded biplane, c. 1920, Collection of SFO Museum, 2010.174.235

All image processing, including the native Go code, is done through the use of “drivers”, similar to the way the Go database/sql package works. A driver needs to support the driver.Driver interface which looks like this:

import (
	iiifcache "github.com/go-iiif/go-iiif/cache"
	iiifconfig "github.com/go-iiif/go-iiif/config"
	iiifimage "github.com/go-iiif/go-iiif/image"
	iiifsource "github.com/go-iiif/go-iiif/source"
)

type Driver interface {
	NewImageFromConfigWithSource(*iiifconfig.Config, iiifsource.Source, string) (iiifimage.Image, error)
	NewImageFromConfigWithCache(*iiifconfig.Config, iiifcache.Cache, string) (iiifimage.Image, error)
	NewImageFromConfig(*iiifconfig.Config, string) (iiifimage.Image, error)
}

The idea here is that the bulk of the go-iiif code isn’t aware of how images are being processed, or who is processing them, only that it can reliably pass around things that implement the image.Image interface (the go-iiif image interface, not the Go language image interface).

Drivers are expected to “register” themselves through the driver.RegisterDriver method at runtime. For example:

package native

import (
	iiifdriver "github.com/go-iiif/go-iiif/driver"
)

func init() {

	dr, _ := NewNativeDriver()
	iiifdriver.RegisterDriver("native", dr)
}

And then in your code you might do something like this:

import (
	"context"

	bucket "github.com/aaronland/gocloud-blob-bucket"
	iiifconfig "github.com/go-iiif/go-iiif/config"
	iiifdriver "github.com/go-iiif/go-iiif/driver"
	_ "github.com/go-iiif/go-iiif/native"
)

ctx := context.Background()

// Error handling is omitted for brevity.
config_bucket, _ := bucket.OpenBucket(ctx, "file:///etc/go-iiif")

cfg, _ := iiifconfig.NewConfigFromBucket(ctx, config_bucket, "config.json")

driver, _ := iiifdriver.NewDriverFromConfig(cfg)

That’s really the only change to existing code.

Careful readers may have noticed the calls to bucket.OpenBucket and config.NewConfigFromBucket, as well as the use of the gocloud-blob-bucket package, to load go-iiif configuration files. All of these things are discussed below but in the meantime the only other change is to update the previously default graphics.source property in the configuration file from VIPS (or vips) to native.

For example, this:

    "graphics": {
	"source": { "name": "VIPS" }
    }

Becomes:

    "graphics": {
	"source": { "name": "native" }
    }

The value of the graphics.source property should match the name that the driver uses to register itself with go-iiif.
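To make the register-then-look-up-by-name pattern concrete, here is a minimal, self-contained sketch using only the standard library. It is not the go-iiif implementation: the Driver interface is reduced to a hypothetical Name() method, and NewDriverFromName, nativeDriver and the error messages are made up for illustration. The lookup here is also strictly case-sensitive, which may not match how go-iiif treats names in practice.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Driver is a stand-in for the go-iiif driver.Driver interface. The real
// interface returns go-iiif images; this hypothetical one only reports a name.
type Driver interface {
	Name() string
}

var (
	driversMu sync.RWMutex
	drivers   = make(map[string]Driver)
)

// RegisterDriver records a driver under a name and refuses duplicates.
func RegisterDriver(name string, d Driver) error {
	driversMu.Lock()
	defer driversMu.Unlock()

	if d == nil {
		return fmt.Errorf("driver %q is nil", name)
	}
	if _, exists := drivers[name]; exists {
		return fmt.Errorf("driver %q is already registered", name)
	}
	drivers[name] = d
	return nil
}

// NewDriverFromName looks up a driver using the name that would appear in
// the graphics.source property of a config file.
func NewDriverFromName(name string) (Driver, error) {
	driversMu.RLock()
	defer driversMu.RUnlock()

	d, ok := drivers[name]
	if !ok {
		return nil, fmt.Errorf("unknown driver %q (registered: %v)", name, registeredNames())
	}
	return d, nil
}

// registeredNames returns the sorted driver names. Callers must hold driversMu.
func registeredNames() []string {
	names := make([]string, 0, len(drivers))
	for n := range drivers {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// nativeDriver is a hypothetical driver used only for this example.
type nativeDriver struct{}

func (nativeDriver) Name() string { return "native" }

func init() {
	// Registration happens as a side effect of the package being loaded,
	// which is why the earlier example uses a blank (_) import.
	if err := RegisterDriver("native", nativeDriver{}); err != nil {
		panic(err)
	}
}

func main() {
	if d, err := NewDriverFromName("native"); err == nil {
		fmt.Println("loaded driver:", d.Name())
	}
	// Nothing has registered itself as "VIPS" in this program.
	if _, err := NewDriverFromName("VIPS"); err != nil {
		fmt.Println("error:", err)
	}
}
```

The point of the sketch is the failure mode: if the name in the config file doesn't match anything that has registered itself, the lookup fails, which is what you would expect to see if the driver package was never imported.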

The rest of the code in go-iiif has been updated to expect a driver.Driver object and to invoke the relevant NewImageFrom... methods as needed. It is assumed that the driver package in question will also define its own implementation of the go-iiif image.Image interface. For concrete examples you can take a look at the following:

Buckets

negative: San Francisco International Airport (SFO), custodial staff, 1968, Collection of SFO Museum, 2011.032.1707

Starting with version 2 the go-iiif package uses the Go Cloud Bucket and Blob interfaces for reading and writing all files. For example, instead of doing this:

cfg, _ := config.NewConfigFromFile("/etc/go-iiif/config.json")

You’d do this, now:

config_bucket, _ := bucket.OpenBucket(ctx, "file:///etc/go-iiif")
cfg, _ := config.NewConfigFromBucket(ctx, config_bucket, "config.json")

This allows for configuration files, and others, to be stored and retrieved from any storage source (or bucket) that is supported by the Go Cloud package, notably remote storage services like AWS S3.

The source and caching layers have also been updated accordingly. Support for the older Disk, S3 and Memory sources has been reimplemented using the Go Cloud packages, so there is no need to update any existing go-iiif configuration files. For example, in the following two snippets the Disk and S3 definitions and their corresponding Blob definitions are functionally the same…almost.

These two source definitions are functionally the same:

    "images": {
        "source": { "name": "Disk", "path": "/usr/local/go-iiif/docker/source" },
        "source": { "name": "Blob", "path": "file:///usr/local/go-iiif/docker/source" },
    },

These two cache definitions are the same with the exception of the acl= parameter, described below, in the second definition:

    "derivatives": {
        "cache": { "name": "S3", "path": "sfomuseum-iiif", "prefix": "misc", "region": "us-east-1", "credentials": "session" },
        "cache": { "name": "Blob", "path": "s3:///sfomuseum-iiif?region=us-east-1&credentials=session&prefix=misc&acl=public-read" },
    }

As mentioned above we are using the gocloud-blob-bucket package to open “buckets”. This is a thin wrapper around the core Go Cloud packages that allows us to define credentials as named parameters in URI strings.

Under the hood the Blob cache supports an optional acl={ACL} query parameter in the path property (which is equivalent to a Go Cloud URI definition). This is to account for the inability to assign permissions when writing Go Cloud blob objects. Currently the acl={ACL} parameter is only honoured for s3:// URIs but patches for other sources would be welcome.
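As a rough illustration of what handling an extra query parameter like acl= involves, here is a small sketch that pulls the ACL out of a Blob path and returns the remaining URI. It uses only net/url. The splitACL function is hypothetical and is not how go-iiif or gocloud-blob-bucket actually do it; note also that url.Values.Encode sorts the remaining parameters by key, so the cleaned URI may list them in a different order.

```go
package main

import (
	"fmt"
	"net/url"
)

// splitACL (a hypothetical helper) removes the acl parameter from a bucket
// URI and returns the cleaned URI along with the ACL value, if any.
func splitACL(bucketURI string) (string, string, error) {
	u, err := url.Parse(bucketURI)
	if err != nil {
		return "", "", err
	}

	q := u.Query()
	acl := q.Get("acl") // empty string when the parameter is absent
	q.Del("acl")

	// Encode() sorts parameters by key, so ordering may change.
	u.RawQuery = q.Encode()

	return u.String(), acl, nil
}

func main() {
	// The Blob cache path from the example config above.
	path := "s3:///sfomuseum-iiif?region=us-east-1&credentials=session&prefix=misc&acl=public-read"

	cleaned, acl, err := splitACL(path)
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println("bucket URI:", cleaned)
	fmt.Println("acl:", acl)
}
```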

URIs

booklet: Air Traffic Control, 1974, Collection of SFO Museum, 2002.134.238

As of go-iiif version 2 image URIs no longer use simple strings for paths and filenames but a string-based syntax that encodes instructions for how a path should be interpreted and manipulated.

go-iiif-uri URI strings are defined by a named scheme which indicates how a URI should be processed, a path which is a reference to an image and zero or more query parameters which are the specific instructions for processing the URI.

Like image processing, go-iiif-uri URI processing is handled by various “drivers” that conform to a URI interface:

type URI interface {
	Driver() string
	String() string
	Origin() string
	Target(*url.Values) (string, error)
}

This allows developers to define their own URI processing instructions outside the default schemes which are part of the go-iiif-uri package. Default URI schemes are:

file://

The file:// URI scheme is basically just a path or filename. It has an optional target property which allows the name of the source image to be changed. These filenames are not the final name of the image as processed by go-iiif but the name of the directory structure that files will be written to, using the IIIF instructions-based syntax for URIs.

For example:

file:///path/to/source/image.jpg
file:///path/to/source/image.jpg?target=/path/to/target/image.jpg

idsecret://

The idsecret:// URI scheme is designed to rewrite a source image URI to {UNIQUE_ID} + {SECRET} + {LABEL} style filenames. For example cat.jpg becomes 1234_s33kret_b.jpg and specifically 123/4/1234_s33kret_b.jpg where the unique ID is used to generate a nested directory tree in which the final image lives.

For example:

idsecret:///path/to/source/image.jpg?id=1234&secret=s33kret&secret_o=seekr3t&label=b

The idsecret:// URI scheme was developed for use with go-iiif “instructions” files where a single image produces multiple derivatives that need to share commonalities in their final URIs.
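Here is a minimal sketch of the naming scheme described above, assuming the nested directory tree is built by splitting the ID into chunks of up to three characters (which is consistent with 1234 becoming 123/4). The function names idToTree and derivativeName are made up for this example and are not part of go-iiif-uri; IDs are assumed to be plain ASCII.

```go
package main

import (
	"fmt"
	"path"
)

// idToTree splits an ID into chunks of up to three characters and joins
// them with slashes. This chunk size is an assumption based on the
// examples in this post.
func idToTree(id string) string {
	var parts []string
	for len(id) > 3 {
		parts = append(parts, id[:3])
		id = id[3:]
	}
	if len(id) > 0 {
		parts = append(parts, id)
	}
	return path.Join(parts...)
}

// derivativeName builds a {UNIQUE_ID}_{SECRET}_{LABEL}.{EXT} filename.
func derivativeName(id, secret, label, ext string) string {
	return fmt.Sprintf("%s_%s_%s.%s", id, secret, label, ext)
}

func main() {
	id, secret, label := "1234", "s33kret", "b"

	rel := path.Join(idToTree(id), derivativeName(id, secret, label, "jpg"))
	fmt.Println(rel)
}
```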

rewrite://

The rewrite:// URI scheme is a variant of the file:// URI scheme except that the target query parameter is required and it will be used to redefine the final URI, rather than just its directory tree, of the processed image.

For example:

rewrite:///path/to/source/image.jpg?target=/path/to/target/picture.jpg
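To show what defining your own scheme against the URI interface might involve, here is a self-contained sketch of a made-up prefix:// driver that writes every derivative beneath a fixed prefix. Everything here other than the four interface methods is hypothetical: the PrefixURI type, the NewPrefixURI constructor and the prefix query parameter are not part of go-iiif-uri, and the interface is copied locally so the example compiles on its own.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// URI mirrors the go-iiif-uri interface shown earlier in this section.
type URI interface {
	Driver() string
	String() string
	Origin() string
	Target(*url.Values) (string, error)
}

// PrefixURI is a hypothetical driver for "prefix://" URIs.
type PrefixURI struct {
	origin string // path to the source image
	prefix string // directory that derivatives are written under
}

// NewPrefixURI parses strings like:
// prefix:///path/to/source/image.jpg?prefix=derivatives
func NewPrefixURI(str string) (URI, error) {
	u, err := url.Parse(str)
	if err != nil {
		return nil, err
	}
	if u.Scheme != "prefix" {
		return nil, fmt.Errorf("unexpected scheme %q", u.Scheme)
	}

	prefix := u.Query().Get("prefix")
	if prefix == "" {
		return nil, fmt.Errorf("missing prefix parameter")
	}

	return &PrefixURI{origin: u.Path, prefix: prefix}, nil
}

func (p *PrefixURI) Driver() string { return "prefix" }

func (p *PrefixURI) Origin() string { return p.origin }

func (p *PrefixURI) String() string {
	q := url.Values{}
	q.Set("prefix", p.prefix)
	return fmt.Sprintf("prefix://%s?%s", p.origin, q.Encode())
}

// Target ignores opts in this sketch, so passing nil is fine.
func (p *PrefixURI) Target(opts *url.Values) (string, error) {
	return path.Join(p.prefix, p.origin), nil
}

func main() {
	u, err := NewPrefixURI("prefix:///path/to/source/image.jpg?prefix=derivatives")
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	target, _ := u.Target(nil)
	fmt.Println(u.Driver(), u.Origin(), target)
}
```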

Here’s an abbreviated example taken from the process/parallel.go code that processes a single source image, defined as an idsecret:// URI, into multiple derivatives defined in an “instructions” file.

The idsecret:// URI is output as a string using the instructions file to define the label and other query parameters. That string is then used to create a new rewrite:// URI where source is derived from the original idsecret:// URI and the target is a newly generated URI string.

go func(u iiifuri.URI, label Label, i IIIFInstructions) {

	// in this example we assume that u.String() is:
	// idsecret:///path/to/source/image.jpg?id=1511015579

	var process_uri iiifuri.URI

	switch u.Driver() {
	case "idsecret":

		// here we are defining specific parameters derived from
		// the instructions file to be used when calling u.Target()
		
		str_label := fmt.Sprintf("%s", label)

		opts := &url.Values{}
		opts.Set("label", str_label)
		opts.Set("format", i.Format)

		if str_label == "o" {
			opts.Set("original", "1")
		}

		target_str, _ := u.Target(opts)

		// u.Origin() is /path/to/source/image.jpg
		
		origin := u.Origin()

		// now we create a new "rewrite:///" URI with the original
		// source image and the newly created "target" URI and use
		// that for processing the image in question
		
		rw_str := fmt.Sprintf("%s?target=%s", origin, target_str)
		rw_str = iiifuri.NewRewriteURIString(rw_str)

		rw_uri, _ := iiifuri.NewURI(rw_str)

		process_uri = rw_uri

	default:
		process_uri = u
	}
	
	pr.ProcessURIWithInstructions(process_uri, label, i)
	
}(u, label, i)

Resulting in the following images, and directory structure, being produced:

go-iiif-uri URI strings are still a work in progress. They may still change a bit around the edges but efforts will be made to ensure backwards compatibility going forward.

Tools

lighter: Lockheed Constellation with route map, c. 1950s, Collection of SFO Museum, 2003.065.109

Everything is the same with all the command line tools that are bundled with go-iiif… almost.

The -config flag has been deprecated in favour of the -config-source and -config-name flags but will still be honoured and used to automatically populate the newer flags.
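For the curious, here is one way the older flag could be mapped on to the newer ones, following the same split shown in the Buckets section where /etc/go-iiif/config.json becomes the file:///etc/go-iiif bucket plus the name config.json. This is a sketch under assumptions and not the code in the go-iiif tools package: the function names are made up, and having -config take precedence over the newer flags is a guess.

```go
package main

import (
	"flag"
	"fmt"
	"path/filepath"
)

// splitConfigPath (hypothetical) turns a legacy config file path into a
// file:// bucket URI for its directory and the name of the file itself.
func splitConfigPath(configPath string) (string, string, error) {
	abs, err := filepath.Abs(configPath)
	if err != nil {
		return "", "", err
	}

	dir := filepath.ToSlash(filepath.Dir(abs))
	return "file://" + dir, filepath.Base(abs), nil
}

// resolveConfigFlags (hypothetical) parses the three flags and returns the
// effective config source and name.
func resolveConfigFlags(args []string) (string, string, error) {
	fs := flag.NewFlagSet("iiif-example", flag.ContinueOnError)

	legacy := fs.String("config", "", "deprecated: path to a config file")
	source := fs.String("config-source", "", "bucket URI containing the config file")
	name := fs.String("config-name", "", "name of the config file in the bucket")

	if err := fs.Parse(args); err != nil {
		return "", "", err
	}

	// Assumption: when the deprecated flag is present it populates the
	// newer flags.
	if *legacy != "" {
		return splitConfigPath(*legacy)
	}

	return *source, *name, nil
}

func main() {
	source, name, err := resolveConfigFlags([]string{"-config", "/etc/go-iiif/config.json"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println("config-source:", source)
	fmt.Println("config-name:", name)
}
```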

The bulk of the logic behind each tool, including parsing command line arguments, has been moved into the go-iiif tools namespace which allows different go-iiif image processing packages to share functionality while taking care to load their relevant drivers.

For example, this is what the go-iiif-vips/cmd/iiif-process tool looks like:

package main

import (
	"context"
	_ "github.com/go-iiif/go-iiif-vips"
	"github.com/go-iiif/go-iiif/tools"
)

func main() {
	tool, _ := tools.NewProcessTool()
	tool.Run(context.Background())
}

Here’s an example using the same iiif-process tool from both the go-iiif and go-iiif-vips packages, the latter via a Docker container:

$> go run -mod vendor cmd/iiif-process/main.go \
	-instructions-source file:///usr/local/go-iiif/docker/config \
	-config-source file:///usr/local/go-iiif/docker/config \
	'idsecret:///spanking.jpg?id=1511015579'

{"/spanking.jpg":{"dimensions":{"b":[1461,1536],"d":[3507,3508],"o":[3897,4096]},"palette":[{"name":"#4e3c24","hex":"#4e3c24","reference":"vibrant"},{"name":"#9d8959","hex":"#9d8959","reference":"vibrant"},{"name":"#c7bca6","hex":"#c7bca6","reference":"vibrant"},{"name":"#5a4b36","hex":"#5a4b36","reference":"vibrant"}],"uris":{"b":"file:///151/101/557/9/1511015579/1511015579_mLB9fj6XjNto1HgDsI4fKj8vOg6w0vm3_b.png","d":"file:///151/101/557/9/1511015579/1511015579_mLB9fj6XjNto1HgDsI4fKj8vOg6w0vm3_d.jpg","o":"file:///151/101/557/9/1511015579/1511015579_8Un90mD46ZYXzatvTOrqX0NKzMyI1Fuh_o.jpg"}}}

$> ls -a docker/cache/151/101/557/9/1511015579/
1511015579_8Un90mD46ZYXzatvTOrqX0NKzMyI1Fuh_o.jpg
1511015579_mLB9fj6XjNto1HgDsI4fKj8vOg6w0vm3_b.png
1511015579_mLB9fj6XjNto1HgDsI4fKj8vOg6w0vm3_d.jpg

$> docker run -v /usr/local/go-iiif-vips/docker:/usr/local/go-iiif \
	go-iiif-vips-process /bin/iiif-process \
	-config-source file:///usr/local/go-iiif/config \
	-instructions-source file:///usr/local/go-iiif/config \
	'idsecret:///freedom.jpg?id=1511015579'
	
{"/freedom.jpg":{"dimensions":{"b":[1216,1536],"d":[320,320],"o":[2852,3600]},"palette":[{"name":"#a4241a","hex":"#a4241a","reference":"vibrant"},{"name":"#b7a064","hex":"#b7a064","reference":"vibrant"},{"name":"#694f32","hex":"#694f32","reference":"vibrant"},{"name":"#8b826f","hex":"#8b826f","reference":"vibrant"},{"name":"#c4baae","hex":"#c4baae","reference":"vibrant"}],"uris":{"b":"file:///151/101/557/9/1511015579/1511015579_hYWEkOK2Ui4eKzhegJFt7bpzJUYlgXOt_b.png","d":"file:///151/101/557/9/1511015579/1511015579_hYWEkOK2Ui4eKzhegJFt7bpzJUYlgXOt_d.jpg","o":"file:///151/101/557/9/1511015579/1511015579_uB9fbr7X9ye1GmVnawqxar8PwqS9sJvK_o.jpg"}}}

$> docker run -v /usr/local/go-iiif-vips/docker:/usr/local/go-iiif \
	go-iiif-vips-process \
	ls -a /usr/local/go-iiif/cache/151/101/557/9/1511015579/

1511015579_hYWEkOK2Ui4eKzhegJFt7bpzJUYlgXOt_b.png
1511015579_hYWEkOK2Ui4eKzhegJFt7bpzJUYlgXOt_d.jpg
1511015579_uB9fbr7X9ye1GmVnawqxar8PwqS9sJvK_o.jpg
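If you want to consume the JSON that iiif-process prints, something like the following sketch should be enough to decode it. The struct and function names (Swatch, Report, parseReports) are made up to match the shape of the output shown above and are not types exported by go-iiif; the sample is an abbreviated copy of the first run.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Swatch is one entry in the "palette" list.
type Swatch struct {
	Name      string `json:"name"`
	Hex       string `json:"hex"`
	Reference string `json:"reference"`
}

// Report describes the derivatives produced for a single source image.
// Dimensions and URIs are keyed by label ("b", "d", "o" and so on).
type Report struct {
	Dimensions map[string][2]int `json:"dimensions"`
	Palette    []Swatch          `json:"palette"`
	URIs       map[string]string `json:"uris"`
}

// parseReports decodes the tool output, which is keyed by source image path.
func parseReports(data []byte) (map[string]Report, error) {
	var reports map[string]Report
	if err := json.Unmarshal(data, &reports); err != nil {
		return nil, err
	}
	return reports, nil
}

// sampleReport is an abbreviated copy of the output shown above.
const sampleReport = `{"/spanking.jpg":{"dimensions":{"b":[1461,1536]},"palette":[{"name":"#4e3c24","hex":"#4e3c24","reference":"vibrant"}],"uris":{"b":"file:///151/101/557/9/1511015579/1511015579_mLB9fj6XjNto1HgDsI4fKj8vOg6w0vm3_b.png"}}}`

func main() {
	reports, err := parseReports([]byte(sampleReport))
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	for source, r := range reports {
		for label, dims := range r.Dimensions {
			fmt.Println(source, label, dims[0], dims[1], r.URIs[label])
		}
	}
}
```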

So, that’s version 2 of go-iiif. There’s not a lot of new functionality (unless you count increased portability) but these are important changes that will hopefully make using the code a little easier and a lot more flexible.

In closing, here are some screenshots that were captured while tile-seeding for zoomable images, in version 2, was being developed and tested. In these images you can see some of the mistakes we made along the way as well as where we’re going next!

Note: The image, above, of the so-called “Spanking Cat” is not part of SFO Museum’s collection but it has become the default reference image for the go-iiif project which is why we were using it for testing.

Links