Using IIIF at SFO Museum

This is a blog post by aaron cope that was published on July 18, 2018. It was tagged golang and iiif.

Thumbnail and color palette of a photo of the When Art Rocked exhibition posters in the International Terminal. Photo by SFO Museum.

This is a technical blog post about image processing. The short non-technical summary is that not only were we able to use open source software to simplify our image processing workflow (and reduce costs) but we also contributed our improvements back to the project so that, hopefully, others in the museum sector may benefit from our work. Yay!

The past

Some quick background in order to set the stage for the rest of this blog post:

Between 2012 and 2015 I was Head of Internet Typing and part of the Digital and Emerging Media team at the Cooper Hewitt Smithsonian Design Museum, in New York City. One of my responsibilities was wrangling images for all their different uses by all the different parts of the museum: The collection website itself, the API, the interactive tables in the galleries.
In 2012 the first version of the International Image Interoperability Framework (IIIF) Image API was introduced. IIIF is a community project driven by public institutions and private companies in the cultural heritage sector to produce common standards and interfaces (APIs) for accessing and working with collections material. These include APIs for image processing, producing presentations and search.
Sometime in the summer of 2016 I became aware of IIIF and that fall I wrote an implementation of the IIIF Image API in the Go programming language to help work through some of the questions I had about the standard. That software is called go-iiif.
In the early part of 2017 I used that code to help pre-render zoomable image tiles for all the objects on the Cooper Hewitt collection website. That involved processing 289,208 images, producing 58,977,784 individual tiles (or 518 GB of data). It was a valuable project to help understand the limits involved in using IIIF to process a lot of images. The whole thing took about 45 days, largely because there were readily apparent bottlenecks involving CPU and disk I/O that I wasn’t able to address at the time.
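
As a rough back-of-the-envelope check on those figures, here is a small Python sketch. It assumes that "GB" means 10^9 bytes and that the 45 days were spent processing around the clock; neither assumption is confirmed by the numbers above.

```python
# Figures quoted above for the Cooper Hewitt tiling run.
images = 289_208
tiles = 58_977_784
total_bytes = 518 * 10**9   # assumption: "GB" means 10^9 bytes
days = 45                   # assumption: continuous, round-the-clock processing

# Simple averages derived from the quoted totals.
tiles_per_image = tiles / images
bytes_per_tile = total_bytes / tiles
images_per_minute = images / (days * 24 * 60)

print(f"{tiles_per_image:.0f} tiles per image on average")
print(f"{bytes_per_tile / 1000:.1f} kB per tile on average")
print(f"{images_per_minute:.1f} images per minute")
```

Under those assumptions that works out to a little over 200 tiles per image, a bit under 9 kB per tile, and between four and five images per minute.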

An example of the different image sizes, and color palettes, on the Cooper Hewitt collection website. (Animated GIF by Sha Hwang.)

During the years that I was at the Cooper Hewitt, for any given object on the collection website there would be a number of steps to produce the catalog of derivative images necessary for that object. These included:

Generating the usual set of small, medium, large (and everything in between) thumbnails from the primary image.
Generating a square thumbnail that was cropped around a focal point determined by calculating the most “interesting” part of the primary image.
Generating a black and white halftone version for some of the smaller thumbnails to use as placeholders while the large full-color images were loaded in the browser.
Extracting and indexing the dominant colors from the primary image.

This workflow was cobbled together using a combination of GraphicsMagick and a series of purpose-fit libraries and web services all held together with shell scripts. To say the images were “cobbled together” isn’t meant to speak ill of the process (after all, I wrote most of it). It simply reflects the fact that almost everything done between 2012 and 2015 was done in a hurry and without a lot of polish, itself a reflection of the larger work to re-imagine and re-open the museum in December 2014.

The present

Smart cropped thumbnails and color palettes of installation photos from the Life and Style in the Age of Art Deco exhibition. Photos by SFO Museum.

Fast forward to 2018 and I am once again Head of Internet Typing at SFO Museum. I actually have a suitably-suitable title befitting my role as a civil servant but it rarely seems to help people understand what I do all day. Once again there is the need to process a lot of images, in a large collection, for a variety of purposes.

Knowing that I could always fall back on the tools developed at the Cooper Hewitt I decided to see whether we could integrate all of the Cooper Hewitt workflow into the go-iiif code, allowing us to maintain a single service (iiif-server) rather than multiple services. Could we do just as much, or more even, with less?

The short version is: Yes we can!

Smart cropped thumbnails and color palettes of installation photos from The Nation’s Game: The NFL from the Pro Football Hall of Fame exhibition. Photos by SFO Museum.

Some of the implementation specifics remain a bit rough around the edges but everything works and can be improved over time. Here’s an annotated version of some of the release notes for the various changes we made to go-iiif in the process:

Add support for smart-cropping via (non-standard) -1,-1,W,H regionByPx instruction

Under the hood go-iiif uses the bimg package, which is itself a wrapper around the libvips image processing library. Version 8.5 of libvips introduced “… a new cropping mode called ‘attention’ which searches the image for edges, skin tones and areas of saturated colour, and attempts to position the crop box over the most significant feature.”

Smart cropped thumbnail of an installation photo from the Life and Style in the Age of Art Deco exhibition. Photo by SFO Museum.

This is what go-iiif uses in place of the Cooper Hewitt’s Shannon number for generating thumbnails around a focal point. The results, in both cases, are about what you’d expect from a computer in 2018:

Generally the results are good enough for most purposes until they are sometimes very weird (and occasionally very very wrong) with little or no reasoning that a human can discern.
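
To make the non-standard region instruction concrete, here is a minimal Python sketch of how a server might tell a smart-crop request apart from an ordinary pixel region. It is not how go-iiif itself is written: the function name `parse_region` and the returned dictionary are hypothetical, and the sketch ignores the other region forms the IIIF Image API allows (such as `square` and `pct:`).

```python
def parse_region(region):
    """Interpret an IIIF region string, treating -1,-1,W,H as a smart-crop request.

    Hypothetical helper for illustration only; handles "full" and "x,y,w,h".
    """
    if region == "full":
        return {"mode": "full"}

    # Raises ValueError if there are not exactly four integer parts.
    x, y, w, h = (int(part) for part in region.split(","))

    if w <= 0 or h <= 0:
        raise ValueError("width and height must be positive")

    # The (non-standard) convention described above: -1,-1 means "let the
    # image library choose where to place a W x H crop box".
    if x == -1 and y == -1:
        return {"mode": "smart", "width": w, "height": h}

    if x < 0 or y < 0:
        raise ValueError("x and y must be non-negative, or both -1")

    return {"mode": "pixels", "x": x, "y": y, "width": w, "height": h}


print(parse_region("-1,-1,320,320"))
print(parse_region("10,20,320,320"))
```

The appeal of the -1,-1 convention is that it fits in the same four-number slot as a normal region, so clients that build IIIF URLs do not need a new parameter.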

Add preliminary support for colour extraction as a profile service

Whereas the Cooper Hewitt uses Giv Parvaneh’s RoyGBiv library for extracting colors from an image, go-iiif currently uses Rob Cherry’s vibrant package.

As of this writing go-iiif returns specific “swatches” calculated by vibrant rather than a ranked set of colors. They are: VibrantSwatch, LightVibrantSwatch, DarkVibrantSwatch, MutedSwatch, LightMutedSwatch, DarkMutedSwatch. This has the side-effect of ensuring a broader overall range of colors for a given image but can sometimes still miss obvious matches (to human eyes).

The principal reason for this decision was expediency. As much as I would have enjoyed spending work-hours porting Giv’s code, written in a different programming language, it doesn’t seem like the best use of my time (yet) given everything else we’re trying to do.

Screenshot of color/palette extraction tests

Comparing the color extraction algorithms in the go-iiif and RoyGBiv code bases. Meanwhile, the smart cropping algorithm really likes airplane tailfins. It almost always chooses them as the most interesting part of a plane.

The second reason is that the go-iiif code is designed to support multiple so-called extruders. Extruders are the bits of code that decide which colors are represented in an image. Like “smart” or “interestingness” based image-cropping, these extruders are the biases in the algorithms that increasingly haunt our daily lives, so there is no expectation that any one extruder will suit every need.

The vibrant code returns results that are either satisfactory or good enough to prove the point that there is a working model and framework for extracting colors inside of go-iiif. As time and circumstance permit we will cycle back to improve existing extruders for extracting color palettes and add new ones along the way.

Add a Dockerfile

We don’t process images in real-time so our need for an image-processing service is going to occur in bursts. Support for Docker allows us to run the go-iiif server as an on-demand and scalable Amazon Web Services Elastic Container Service rather than as a dedicated server that we need to operate (and pay for) 24 hours a day.

Add preliminary support for (non-standard) -1 rotation to explicitly disable automatic EXIF Orientation rotation

I don’t know if there has ever been a discussion in the IIIF working groups about this but if you are implementing the IIIF Image API specification using software that automatically rotates JPEG images based on the value of that image’s EXIF Orientation flag (which most JPEG libraries do these days) but which doesn’t update that image’s EXIF Orientation flag accordingly (which most JPEG libraries don’t these days) then you end up in a fun-house mirror-world of wrong if you transform the result of an IIIF transformation a second time.

go-iiif has a flag to stop this from happening now.
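
The double-rotation problem can be illustrated with a toy model in Python. This is only a sketch: the "image" is a small grid of numbers, `naive_transform` is a hypothetical stand-in for a JPEG library that honours the EXIF Orientation flag without resetting it, and no real image library is involved.

```python
def rotate_90_cw(pixels):
    """Rotate a 2D grid of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def naive_transform(image):
    """Mimic a library that auto-rotates but never updates the Orientation tag."""
    pixels = image["pixels"]
    if image["orientation"] == 6:  # EXIF 6: rotate 90 degrees clockwise to display upright
        pixels = rotate_90_cw(pixels)
    # The tag is copied through unchanged, which is the source of the trouble.
    return {"pixels": pixels, "orientation": image["orientation"]}


stored = {"pixels": [[1, 2], [3, 4]], "orientation": 6}

once = naive_transform(stored)   # correctly upright
twice = naive_transform(once)    # rotated again, now upside down

print(once["pixels"])
print(twice["pixels"])
```

The first pass yields the upright image, but because the tag still says "rotate me", the second pass turns it a further 90 degrees. Being able to say "do not autorotate" (the -1 rotation) is one way to stop the second pass from happening.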

Smart cropped installation photos from The Typewriter: An Innovation in Writing, the Maneki Neko: Japan’s Beckoning Cat and the Classic Monsters: The Kirk Hammett Collection exhibitions. Photos by SFO Museum.

So how does this all work in practice? Pretty much as follows, keeping in mind that the samples below should be treated as pseudo-code.

First, let’s assume that we’re running a copy of the go-iiif iiif-server on the same local machine we’re working on, and that it’s listening for requests on port 8080.

$> bin/iiif-server -config config.json
2018/03/07 15:45:07 Serving 127.0.0.1:8080 with pid 12075

We maintain a dictionary of IIIF instructions where each key is a named label and its value is another dictionary of IIIF Image API instructions, sometimes called request parameters, specific to that label. For example:

$GLOBALS["cfg"]["iiif_default_instructions"] = array(
	# -1 means "do not autorotate" (which is go-iiif specific)
	"o" => array("size" => "full", "format" => "", "rotation" => "-1"),
	"b" => array("size" => "!1024,768", "format" => "jpg"),
	"c" => array("size" => "!800,600", "format" => "jpg"),
	"dd" => array("size" => "!800,600", "quality" => "dither", "format" => "jpg"),
	"z" => array("size" => "!640,480", "format" => "jpg"),
	# -1,-1 means "smart crop" (which is go-iiif specific)	
	"sq" => array("size" => "full", "region" => "-1,-1,320,320", "format" => "jpg"),
	"d" => array("size" => "full", "quality" => "dither", "region" => "-1,-1,320,320", "format" => "jpg"),
	"n" => array("size" => "!320,240", "format" => "jpg"),
);

Photo of the When Art Rocked exhibition posters in the International Terminal.

Let’s say we want to take the photo of the When Art Rocked exhibition posters, from the top of this blog post, and generate a square thumbnail that’s been halftoned and “smart” cropped.

We store these instructions using the d key, which states that the IIIF quality for the new image should be dither (which means “halftone”) and the region should be -1,-1,320,320 (which means “a square image 320 pixels to a side, where the center point is chosen using smart cropping”).

$img = "when-art-rocked.jpg";
$sz = "d";

$args = $GLOBALS["cfg"]["iiif_default_instructions"][$sz];

If we combine the instructions for the d label with a set of default rules then we can easily build the URL, as defined by canonical URI syntax, to request the new image from the iiif-server instance, like this:

$defaults = array(
	"region" => "full",
	"size" => "full",		# deprecated in favor of "max" as of Image API 2.1
	"rotation" => "0",
	"quality" => "default",
	"format" => "jpg",
);

$args = array_merge($defaults, $args);

$url = "http://localhost:8080/{$img}/{$args["region"]}/{$args["size"]}/{$args["rotation"]}/{$args["quality"]}.{$args["format"]}";
$img_d = http_get($url);

And here’s the output, along with a full-color square cropped thumbnail:

Full color square thumbnail of the When Art Rocked exhibition posters in the International Terminal

Halftone square thumbnail of the When Art Rocked exhibition posters in the International Terminal

Smart cropped and halftoned thumbnail photos of the When Art Rocked exhibition posters in the International Terminal. Photos by SFO Museum.
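
For readers who don't work in PHP, the URL-building step shown earlier could be sketched in Python roughly as follows. The function name `iiif_url` is hypothetical, and the host and port simply mirror the local iiif-server example above.

```python
from urllib.parse import quote

# Default IIIF Image API parameters, mirroring the $defaults array above.
DEFAULTS = {
    "region": "full",
    "size": "full",
    "rotation": "0",
    "quality": "default",
    "format": "jpg",
}


def iiif_url(base, identifier, instructions):
    """Build {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # Label-specific instructions override the defaults, like array_merge().
    args = {**DEFAULTS, **instructions}
    return "{}/{}/{region}/{size}/{rotation}/{quality}.{format}".format(
        base.rstrip("/"),
        quote(identifier, safe=""),  # identifiers must be URL-encoded
        **args,
    )


# The "d" instructions: halftoned ("dither") and smart cropped to 320x320.
d = {"size": "full", "quality": "dither", "region": "-1,-1,320,320", "format": "jpg"}

url = iiif_url("http://localhost:8080", "when-art-rocked.jpg", d)
print(url)
```

Fetching that URL from a running iiif-server (for example with `urllib.request.urlopen`) would return the bytes of the derivative image.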

To extract the colors for this image we call the IIIF image information endpoint, which contains an additional profile service that includes color palette data:

$url = "http://localhost:8080/{$img}/info.json";

$info = http_get($url);
$colors = $info["service"]["palette"];

And the output will look something like this:

[
      {
        "hex": "#cc1e89",
        "name": "#cc1e89",
        "closest": [
          {
            "hex": "#e3256b",
            "name": "Razzmatazz",
            "reference": "crayola"
          },
          {
            "hex": "#c71585",
            "name": "mediumvioletred",
            "reference": "css4"
          }
        ],
        "reference": "vibrant"
      },
      ... and so on
]
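
As a sketch of what one might do with that output, here is a small Python example that pairs each extracted color with its closest named match in each reference palette. It assumes the palette sits at `info["service"]["palette"]`, as in the lookup above; the function name `summarize_palette` is hypothetical, and the sample data is just the single entry shown above.

```python
import json

# A trimmed-down info.json containing only the palette entry shown above.
info_json = """
{
  "service": {
    "palette": [
      {
        "hex": "#cc1e89",
        "name": "#cc1e89",
        "closest": [
          {"hex": "#e3256b", "name": "Razzmatazz", "reference": "crayola"},
          {"hex": "#c71585", "name": "mediumvioletred", "reference": "css4"}
        ],
        "reference": "vibrant"
      }
    ]
  }
}
"""


def summarize_palette(info):
    """Return a list of (hex, {reference: closest color name}) pairs."""
    summary = []
    for color in info["service"]["palette"]:
        closest = {match["reference"]: match["name"] for match in color["closest"]}
        summary.append((color["hex"], closest))
    return summary


for hex_value, closest in summarize_palette(json.loads(info_json)):
    print(hex_value, closest)
```

A summary like this is the sort of thing that could be indexed for "search by color" features, using whichever reference palette suits the audience.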

There are many different ways to write code to do the same thing and the point of the examples above is not to suggest that you should do it our way, only that whichever way you do it should be that easy.

That is the real benefit of using go-iiif for us. If we need to add a new image size all we need to do is add the instructions to the dictionary above and write a little bit of custom code to loop over all the images in the collection processing just that one size and we’re done.
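
That "little bit of custom code" might look something like the following Python sketch. Everything specific in it is an assumption made for illustration: the `k` label and its `!2048,1536` size, the `label/image` naming of results, and the `fetch` callable, which is passed in so the loop can be tried without a running server.

```python
DEFAULTS = {"region": "full", "size": "full", "rotation": "0",
            "quality": "default", "format": "jpg"}


def derivative_url(base, image, instructions):
    """Build the IIIF Image API URL for one image and one set of instructions."""
    args = {**DEFAULTS, **instructions}
    return "{}/{}/{region}/{size}/{rotation}/{quality}.{format}".format(
        base, image, **args)


def process_size(images, label, instructions, fetch, base="http://localhost:8080"):
    """Request one named size for every image and collect the results."""
    results = {}
    for image in images:
        url = derivative_url(base, image, instructions)
        # Hypothetical naming scheme: results are keyed by "<label>/<image>".
        results[f"{label}/{image}"] = fetch(url)
    return results


# A stand-in for a real HTTP request: it simply echoes the URL it was given.
def fake_fetch(url):
    return url


new_size = {"size": "!2048,1536"}   # hypothetical new "k" size
results = process_size(["a.jpg", "b.jpg"], "k", new_size, fake_fetch)

for key, value in results.items():
    print(key, value)
```

In real use `fetch` could be something like `lambda url: urllib.request.urlopen(url).read()`, with the returned bytes written wherever derivatives are stored.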

Screenshot of different photos sizes catalog

The same is true if we need to reprocess some or all of the images because… well, because everyone has different reasons for needing to reprocess all their images. These things happen so it shouldn’t be a struggle to accommodate them when they do.

IIIF is not for every image processing task but given that most image processing tasks are some combination of basic region, size, rotation, quality and final output format criteria the value of IIIF is in dissolving (hiding) the complexities of all that work behind a simple URL and an HTTP request.

The future

A halftoned installation photo from the A World of Characters: Advertising Icons from the Warren Dotz Collection exhibition. Photo by SFO Museum.

So, that’s the state of go-iiif today. All of our fixes and contributions have been merged back into the original project, which is available on GitHub:

https://github.com/aaronland/go-iiif

In the short term I’d like to address some of the issues around color extraction mentioned above. In the longer term, I would like to add support for a pure Go image processing “engine”. Somewhere in between I’d like to make sure that all the images in the SFO Museum collection are tiled and zoomable.

libvips, the code that does the actual pixel crunching in go-iiif, is a remarkable piece of software, but it is a C library that needs to be compiled and comes with a non-trivial set of dependencies. All of this means that building go-iiif requires a level of technical expertise that is outside the reach of many people.

A pure Go image processing engine would allow us to build pre-compiled binary versions of the go-iiif tools for specific platforms. That means you or your institution could download a copy of the software and start using it right away without having to contemplate phrases like “Just install {WORDS THAT SOUND LIKE GIBBERISH TO YOU}…”

There is also the possibility of using a pure Go version of the iiif-tile-seed program as an Amazon Web Services Lambda function for quickly and cheaply pre-rendering tiled and zoomable images.

Thumbnails and color palettes of installation photos from The Enduring Designs of Josef Frank exhibition. Photos by SFO Museum.

An ideal scenario is one where a museum could upload a set of full-sized images to an AWS S3 bucket, wait for Amazon’s computers to process each image with the iiif-tile-seed function and then find a new set of tiled images to download (along with a reasonable bill for services rendered) in a different S3 bucket.

That’s still not possible today but it should be. One day it will be.