Blog posts tagged duckdb
Shared vector embeddings updates
There is a lot of ground to cover in this blog post: The Met publishing their own vector embeddings, SFO Museum publishing 1152-dimension vector embeddings for its images, SFO Museum producing 1152-dimension vector embeddings for NGA and MoMA, and a whole bunch of updates to the tooling used to generate and query vector embeddings targeting local-first and consumer-grade hardware.
This is a blog post by aaron cope. It was published on May 27, 2026 and tagged roboteyes, machine-learning, collection, duckdb, golang, parquet, bleve, embeddings, aws, s3 and s3vectors.
Shared cross-institutional vector embeddings – how we might get there
We are proposing a simple‑is‑best approach to sharing vector embeddings of our collections, a step that moves us closer to realizing the long‑standing ‘holy grail’ of cross‑institutional collections search through vector‑based image similarity.
This is a blog post by aaron cope. It was published on April 06, 2026 and tagged roboteyes, machine-learning, collection, duckdb, golang, parquet and embeddings.
Updates (and additions) to machine-learning tools running on consumer hardware
These are not “silver bullet” tools. Rather, they endeavour to be part of a set of building blocks for creating an infrastructure that preserves and guarantees the cultural heritage sector some agency in our work.
This is a blog post by aaron cope. It was published on February 10, 2026 and tagged swift, roboteyes, machine-learning, collection, duckdb, golang and embeddings.
SFO Museum flight data records now available as GeoParquet exports
This is a short blog post to announce the availability of flight data for the more than 8.4 million (and counting) flights that have traveled in and out of SFO since 2006, as GeoParquet files.
This is a blog post by aaron cope. It was published on January 07, 2025 and tagged flightdata, geoparquet, geo and duckdb.



