Most of this week was either off for the May bank-holiday long weekend or up in Edinburgh at this great workshop, with plenty of hacking on the long train journeys in between.
1 Rewilding the Web in Edinburgh
Jon Crowcroft and I went up to Edinburgh for Kate Nave's Rewilding the Web: Diversity & Resilience in Sociotechnical Infrastructure workshop, an interdisciplinary mix of economists, ecologists, philosophers, techies and authors. The notes are in the workshop report.
I came back with a huge reading list, learnt the word "coopetition", and gathered a giant list of follow-ups from our Internet ecology paper. Jon and I did a double act on antibotty networks and code self-modification, and -- unlike the response six months ago at Aarhus -- nobody in the room treated it as sci-fi. The shift of coding agents into the mainstream is happening fast.

2 ocaml-uring refreshes
Thomas Leonard is back hacking on eio and working through the PR backlog (Unix.file_descr conversion to lose Obj.magic, MDX hang detection, a big liburing 2.14 update to get the latest goodies).
Spurred on by all this activity, I spent a chunk of the week filling in some coverage gaps so I can start using uring again in my TESSERA code. My PR #147 adds bindings for shutdown, socket, renameat and symlinkat.
The other PR is smaller but highlights a slightly more obscure API in Linux. PR #142 fixes a bug where the supported-attribute check for statx was inverted. Eio doesn't currently use that code path so we hadn't noticed. statx is a slightly odd syscall: it negotiates which attributes the kernel will fill in via a request-mask / returned-mask handshake, and it's easy to get the boolean direction wrong on the OCaml side. It's also not entirely clear under what conditions the kernel will let the returned mask get out of sync with the request...
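To make the direction of that check concrete, here is a minimal Python sketch of the mask logic. It is not the ocaml-uring code: the helper names are hypothetical and the scenario is made up for illustration. Only the `STATX_*` bit values come from `<linux/stat.h>`.

```python
# Mask bits as defined in <linux/stat.h>. The helpers below are
# hypothetical names for illustration, not the ocaml-uring API.
STATX_SIZE = 0x200
STATX_BASIC_STATS = 0x7FF
STATX_BTIME = 0x800


def attr_filled(returned_mask: int, attr: int) -> bool:
    """True if every bit of `attr` is set in the mask the kernel returned."""
    return (returned_mask & attr) == attr


def attr_filled_inverted(returned_mask: int, attr: int) -> bool:
    """The wrong direction: reports 'filled' exactly when the bits are absent."""
    return (returned_mask & attr) == 0


# Assumed scenario: we ask for basic stats plus birth time, and the
# filesystem cannot supply a birth time, so that bit comes back clear.
requested = STATX_BASIC_STATS | STATX_BTIME
returned = STATX_BASIC_STATS

print(attr_filled(returned, STATX_SIZE))            # size was filled in
print(attr_filled(returned, STATX_BTIME))           # birth time was not
print(attr_filled_inverted(returned, STATX_BTIME))  # inverted check disagrees
```

The point is that the returned mask, not the requested one, says which fields are safe to read. The statx(2) man page also notes that the kernel may return fields that weren't requested, so comparing the two masks for equality would be wrong as well.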
3 GeoTessera 0.9 and a HuggingFace home for the models
On the TESSERA side, I've been getting GeoTessera 0.9 ready to land. The release does two things: it migrates the embeddings host from our Cambridge infrastructure to s3://tessera-embeddings/ on AWS us-west-2 (the AWS Open Data sync that Mark Elvers and I have been doing), and adds support for our forthcoming TESSERA v1.1 model alongside the existing v1.0.
Since we're now on S3, we've dropped the SHA256-based registry in favour of S3's built-in x-amz-checksum-crc64nvme header, which simplifies both the integrity check and the existing download path. The geotessera-registry s3scan tool now auto-discovers every (version, variant, year) under any S3 prefix and shards the listing by longitude. A one-year scan went from ~11 minutes to ~47 seconds, which makes regenerating the manifests cheap enough to do routinely. Cache freshness now also uses ETag / If-None-Match in addition to If-Modified-Since, so clients won't miss updates when local mtime drifts.
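As a rough illustration of the cache-freshness change, here is a minimal sketch of building a conditional request from stored validators. This is not GeoTessera's actual code: `CachedEntry`, `conditional_headers` and `is_fresh` are hypothetical names, and only the HTTP header names and the 304 status come from the HTTP spec.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedEntry:
    """Validators saved next to a cached file (hypothetical structure)."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(entry: CachedEntry) -> Dict[str, str]:
    """Build revalidation headers from whatever validators we stored."""
    headers: Dict[str, str] = {}
    if entry.etag:
        # The ETag is the server's own identifier for the object's content,
        # so it does not depend on the local file's mtime.
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    return headers


def is_fresh(status_code: int) -> bool:
    """304 Not Modified means the cached copy can be reused as-is."""
    return status_code == 304


entry = CachedEntry(etag='"abc123"',
                    last_modified="Mon, 04 May 2026 10:00:00 GMT")
print(conditional_headers(entry))
```

Sending both headers is safe: per RFC 9110, a server that receives If-None-Match ignores If-Modified-Since, so the date only acts as a fallback for servers that don't handle entity tags.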
To prepare for the v1.1 release, I opened up a new geotessera org on Hugging Face and uploaded model cards for TESSERA-V-1.0 and TESSERA-V-1.1. The card format follows the geospatial embeddings model card template that came out of the Clark University embeddings sprint earlier in the spring. Madeline Lisaius did a lot of the work pulling that template together, and it's good to see the community standard land on something concrete that other model authors can reuse!
Most users will continue to pull pregenerated embeddings via the GeoTessera library rather than the raw weights, but having a canonical HF home for the model itself was overdue.
I'll post properly about v1.1 once the release is fully out. The v1.0-v1.1 transition is a no-op for downstream code since you just point at a new manifest and grab new embeddings. Users should just see their performance increase without any effort, as the model backing the embeddings has improved!
4 Fun Links
- Jane Street on strace_tui got me to refresh my own Bonsai code and get it running under oi. Working for me but still polishing for release!
- Started reading through more geocaml code after seeing the Lidar viewer.
- I gotta tidy up my bleeding edge oxcaml-httpz code for others to use in their sites. Something for next week!
- Learnt a lot about how power transformers work in the latest MCJ podcast.
