Blog posts written by aaron cope

Asserting claims: Cryptographically signing the digital artifacts we produce

Why is SFO Museum signing the vector embeddings that we’re producing? To provide an additional guarantee that the data we are publishing is, in fact, the data we published. Vector embeddings are, by their nature, basically impossible to “spot check”. Their size, shape and volume lend themselves to subtle, often imperceptible, corruption whether those changes are introduced by malice or negligence. By providing digital signatures for these embeddings we can provide the means to ensure that the data have not been tampered with after they’ve been downloaded from our website.

This is a blog post by aaron cope. It was published on June 15, 2026 and tagged golang, embeddings, pgp, c2pa and rustlang.

Shared vector embeddings updates

There is a lot of ground to cover in this blog post: The Met publishing their own vector embeddings, SFO Museum publishing 1152-dimension vector embeddings for its images, SFO Museum producing 1152-dimension vector embeddings for NGA and MoMA, and a whole bunch of updates to the tooling used to generate and query vector embeddings targeting local-first and consumer-grade hardware.

This is a blog post by aaron cope. It was published on May 27, 2026 and tagged roboteyes, machine-learning, collection, duckdb, golang, parquet, bleve, embeddings, aws, s3 and s3vectors.

OEmbeddings - What is the least amount of metadata necessary for shared vector embeddings?

This is a blog post describing a proposal for a set of common attributes to include with shared vector embeddings. These common attributes are meant to be the least amount of metadata necessary to provide a simple preview and suitable attribution for the item (an image or text) for which vector embeddings have been produced.

This is a blog post by aaron cope. It was published on April 15, 2026 and tagged roboteyes, machine-learning, collection, oembeddings, embeddings, golang and parquet.

Shared cross-institutional vector embeddings – how we might get there

We are proposing a simple‑is‑best approach to sharing vector embeddings of our collections, a step that moves us closer to realizing the long‑standing ‘holy grail’ of cross‑institutional collections search through vector‑based image similarity.

This is a blog post by aaron cope. It was published on April 06, 2026 and tagged roboteyes, machine-learning, collection, duckdb, golang, parquet and embeddings.

Updates (and additions) to machine-learning tools running on consumer hardware

These are not “silver bullet” tools. Rather, they endeavour to be part of a set of building blocks for creating an infrastructure that preserves and guarantees the cultural heritage sector some agency in our work.

This is a blog post by aaron cope. It was published on February 10, 2026 and tagged swift, roboteyes, machine-learning, collection, duckdb, golang and embeddings.

Similar object images derived using the MobileCLIP computer-vision models

Like a lot of things involving machine-learning, the image similarity results, while not always right, aren’t necessarily wrong either. In the same vein as searching collections by color, this “fuzzy” and imprecise space presents a whole new avenue for browsing collections and making visible objects that would otherwise get lost in the crowd.

This is a blog post by aaron cope. It was published on January 09, 2026 and tagged swift, roboteyes, machine-learning, collection, mobileclip and embeddings.

Map updates, December 2025 - Now with more PMTiles

In addition to 2025, we’ve also added new imagery from 1920, 1936 and 1961 all produced using the Allmaps Editor to georeference existing collections materials. I’ll talk more about some of the tools and workflows we’ve developed to work with Allmaps in a future blog post. All of these new maps have also been added to the interactive map application on display in the Terminal 2 SkyTerrace Observation Deck. As part of those updates we’ve also started serving these historic maps from PMTiles databases rather than folders full of individual tiles on disk.

This is a blog post by aaron cope. It was published on December 15, 2025 and tagged maps, protomaps, ios, swift and vapor.

WallLabel – Experiments with Apple’s open source machine-learning frameworks

On-device models are still someone else’s models, but having the flexibility to choose one model over another, to recognize that they are systems with strengths and weaknesses rather than all-knowing oracles, and the ability to incorporate those choices into how our projects are designed and implemented is a small, but important, step in retaining some degree of control and agency in our work.

This is a blog post by aaron cope. It was published on October 29, 2025 and tagged swift, ios, mlx, machine-learning and roboteyes.

Registrar – Experiments with Apple’s on-device machine-learning frameworks

We are releasing this work in a spirit of generosity and to encourage others to suggest improvements, with the larger goal of providing resources to help the broader cultural heritage sector think about how to use machine learning technologies outside and beyond the promises of the billboards advertising these same technologies in Silicon Valley and the world over.

This is a blog post by aaron cope. It was published on October 16, 2025 and tagged swift, ios, llm, machine-learning and roboteyes.