Extracting Subjects from Images in Swift (and gRPC)

This is a blog post by aaron cope that was published on October 31, 2023. It was tagged swift, tools, grpc, golang and roboteyes.

Flight attendant scarf: National Airlines. Polyester. SFO Museum Collection. 2016.138.001

In the Searching Text in Images on the Aviation Collection Website blog post I introduced the swift-text-emboss Swift package, a wrapper library around Apple’s Vision Framework to simplify extracting text from images. In this blog post I’d like to introduce the swift-image-emboss Swift package. Like the swift-text-emboss package, this is also a wrapper around code provided by Apple’s Vision Framework, with the goal of simplifying the code necessary to extract, or “lift”, one or more subjects from an image.

It works like this:

import ImageEmboss

// This is the image you want to extract subjects from.
var ciImage: CIImage

let em = ImageEmboss()
let rsp = em.ProcessImage(image: ciImage, combined: false)

The rsp variable is a Swift Result instance wrapping a list of CIImage instances or any errors encountered during processing. For example, when “embossed” this image of six forks from the Cooper Hewitt Smithsonian National Design Museum yields six individual images where each fork is considered a subject:

To extract all the subjects in an image into a single image you would call the ProcessImage method with a combined: true argument. For example, this image of the items in an Air France amenity bag, when processed in “combined” mode, produces a single image with five distinct subjects:

Note that the underlying “subject lifting” code merges two objects into one and excludes four others entirely. This is a good reminder that, as impressive as the technologies used to perform these tasks are, they remain fallible and the reasons they choose to do, or not do, something often remain opaque. While they herald exciting new possibilities, they are still tools which should only be used with supervision and a good understanding (or at least awareness) of when and where they may not meet the needs of your specific project.

Safety information card: National Airlines, Lockheed L-188 Electra II. Paper, ink. Gift of Thomas G. Dragges, SFO Museum Collection. 2002.127.425

The swift-image-emboss package provides a library that you can include in your own code but does not export any applications of its own. We have released a separate tool, called swift-image-emboss-cli, for extracting subjects from images on the command line. For example, processing this image of an airplane resting atop a walrus:

$> cd swift-image-emboss-cli
$> swift build

$> ./.build/debug/image-emboss --input-file fixtures/sfomuseum-walrus-1511908311.jpg

This is really a thing in the SFO Museum Aviation Collection!

We have also released a gRPC server, called swift-image-emboss-grpc, wrapping the swift-image-emboss functionality so that you can extract subjects from images over the network. For example:

$> cd swift-image-emboss-grpc
$> swift build

$> ./.build/debug/image-emboss-grpc-server
2023-10-26T14:13:25-0700 info org.sfomuseum.text-emboss-grpc-server : [GRPCServer] server started on port 8080

And then using a gRPC client, for example the client provided by the go-image-emboss package, to extract subjects from this image of a tie tack from Civil Air Transport (CAT) airlines:

$> cd go-image-emboss
$> go build -mod vendor -ldflags="-s -w" -o bin/emboss cmd/emboss/main.go

$> ./bin/emboss -embosser-uri 'grpc://localhost:8080' fixtures/cat-pin.jpg 
2023/10/26 14:15:24 fixtures/cat-pin-emboss-001.png

Results in:

A couple of things to be aware of:

First, there is nothing special about the gRPC client we are using in this example. The whole promise of gRPC is that it enjoys broad support across a number of programming languages and, because it uses formal definitions for its input and output, it is very easy to create your own custom gRPC client for any given service.

Second, the protocol definition for the swift-image-emboss-grpc server only accepts and returns raw (image) bytes, so it is left to individual applications (like the go-image-emboss tool above) to determine where those bytes are read from and where they are written to.

Negative: San Francisco International Airport (SFO), runway construction. Negative. Transfer from San Francisco International Airport, SFO Museum Collection. 2011.032.0518

The image-emboss-grpc-server gRPC server implementation was sufficiently similar to the equivalent “text embosser” server described in the last blog post that we created a standalone swift-grpc-server Swift package which both the image and text “embosser” servers now use. It works like this:

import Logging
import GRPCServer
import GRPC

// This is your own code, for example the image embosser or the text embosser.
// This is the package that will be created when you run protoc on your protobuffer
// definition. Consult the swift-grpc docs for details.
var provider: CallHandlerProvider

let logger = Logger(label: "org.sfomuseum.example")

let server_opts = GRPCServerOptions(
	host: "localhost",
	port: 8080,
	logger: logger
)
      
let server = GRPCServer(server_opts)
try await server.Run([provider])

The goal with the GRPCServer package is to reduce the amount of boilerplate code necessary for simple gRPC servers that need to log requests (and in particular record the remote address of clients connecting to the server, which is more complicated than you’d imagine) and, optionally, use TLS to secure connections and provide authentication. It is the minimal amount of scaffolding necessary to expose functionality not available on other platforms as a network-based service.

Timetable: Southwest Airlines. Paper, ink. Gift of Craig Lynes, SFO Museum Collection. 2003.105.097.005

We’ve also abstracted one other piece of code into its own dedicated software package: swift-coreimage-image. SFO Museum often writes tools in Swift that we would like to run in both macOS (AppKit) and iOS (UIKit) contexts. Because each context has its own image type (NSImage and UIImage respectively) and each framework is specific to its platform, this can result in a lot of fiddly code when cross-compiling applications.

This package provides a single method (LoadFromURL) which accepts a URL instance and returns a CoreImage CIImage instance for the image at that location, regardless of platform. That’s all it does. How many people have already written this code? I’d wager a lot of people have but I couldn’t find any of those efforts in a simple package that we could import into our code, so now we have our own implementation which we are sharing with you.

import Foundation
import CoreImageImage

// URL(fileURLWithPath:) returns a non-optional URL, unlike URL(string:).
let im_url = URL(fileURLWithPath: "/path/to/image.jpg")
let im_rsp = CoreImageImage.LoadFromURL(url: im_url)
     
if case .success(let ci_image) = im_rsp {
    // do something with ci_image here
}

Load adjuster: Pan American Airways, Stratocruiser. Leather, wood, plastic, metal, ink. Gift of Jon E. Krupnick, SFO Museum Collection. 2022.124.782

The swift-image-emboss package, and the tooling layered on top of it, were written specifically to improve the quality of the results when deriving dominant colors from images in the aviation collection, by isolating and focusing on the principal subjects in those images rather than the entire frame. That is SFO Museum’s use-case but, as part of our commitment to the idea of “small focused tools” by and for the cultural heritage sector, we have made the effort to package them as discrete, reusable (and rearrange-able) components for use in a variety of circumstances.

The cultural heritage sector needs as many small, focused tools as it can produce. It needs them in the long-term to finally reach the goal of a common infrastructure that can be employed sector-wide. It needs them in the short-term to develop the skill and the practice required to make those tools successful. We need to learn how to scope the purpose of, and our expectations of, any single tool so that we can be generous toward, and learn from, the inevitable missteps and false starts that will occur along the way.

All of these tools are available from SFO Museum’s GitHub account and we welcome any feedback, suggestions or patches that you think would help to improve them: