Anil Madhavapeddy, Professor of Planetary Computing

Our EEG group discussion on 'useful' AI tools / Mar 2025

Srinivasan Keshav organised this week's EEG group discussion on what AI tools we use for our daily work. I was immediately struck by how few tools are actually making us more productive, so I jotted down notes as the discussion went on.

  • Personally, the only tool I've found that's (only just recently) making me more productive is agentic coding, which I wrote about a few days ago. Since then, I've been mildly obsessively forking off ideas I've wanted to try for years (like converting RFCs to OCaml code) and greatly enjoying myself. Patrick Ferris and I have been looking into how to do this more ethically, and the best I ran across was the IBM AI ethics guidance and their Granite models, but not much else. Any pointers to other models that don't violate open-source licensing norms would be gratefully accepted; I'm using Claude 3.7 here, but don't feel great doing so!
  • Srinivasan Keshav described his use of Fathom for note-taking, and (having been on the receiving end) I can confirm it does a very good transcription job.
  • Jon Crowcroft has a local Stable Diffusion image generator to help create local content for presentations and the like, but the setup broke when going from macOS 13 to 15 (Sequoia). Apple seem to have changed something in Metal, so the existing Hugging Face installation (mostly PyTorch on Metal and the TensorFlow MPS backend) was out of date with the system Metal libraries; a quick sanity check for this is sketched just after this list. Package management for these tightly integrated hardware/software inference systems is pretty bad right now (nvidia-container-toolkit is another bag of hacks for containerised applications).
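
Since breakage like this usually comes down to a mismatch between the installed PyTorch build and the system Metal libraries, it helps to check the MPS backend explicitly before blaming the diffusion pipeline itself. Here's a minimal sketch of that check in Python -- the checkpoint name is only an example, not Jon's actual setup:

```python
# Sanity-check the Metal (MPS) backend before assuming a Stable Diffusion
# install still works after a macOS upgrade. Illustrative sketch only.
import torch

if not torch.backends.mps.is_built():
    raise SystemExit("this PyTorch build was compiled without MPS support")
if not torch.backends.mps.is_available():
    raise SystemExit("MPS not available: PyTorch and the system Metal stack disagree")

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example checkpoint only
    torch_dtype=torch.float16,
)
pipe = pipe.to("mps")            # run inference on the Apple GPU via Metal
pipe.enable_attention_slicing()  # keeps memory usage reasonable on laptops

image = pipe("a watercolour sketch for a presentation slide").images[0]
image.save("slide-figure.png")
```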

Then there's a long list of things that people aren't using because they suck. LLM-driven searches are pretty inaccurate, as many people noted; I use Kagi, but only because I love their AI-filtered search results, not because of their assistant! I've turned off Apple Intelligence on all my devices, not because of privacy concerns, but because it's just utter crap -- the summaries are actually incorrect half the time. I find the autocorrect features similarly distracting and wrong most of the time, and normal spellcheckers do a better job in practice.

Where's this going?

Our discussion then turned to developing news of emerging tools and techniques, since the field overall is moving incredibly fast. A few things I've been reading this week:

  • With the pioneers of RL winning the Turing Award this week, some folks investigated whether lightweight open-weight models could reach the performance of heavy frontier models in terms of deductive reasoning. They applied RL to train an LLM for the game of Temporal Clue, and their post describes many neat tricks (including the use of CP-SAT to generate difficult-but-solvable game scenarios). They applied GRPO (as made famous by DeepSeek) to do the RL loop of solving puzzles via model responses, grading groups of responses, and fine-tuning the model using clipped policy gradients derived from these group estimates (there's a rough sketch of that update after this list). Their results were impressive and reached frontier-model performance using Qwen 14B!
  • And for something completely different, another team released their Differentiable Logic Cellular Automata paper, which describes how to go from the Game of Life to full pattern generation using learned recurrent circuits. This one should really be read in its entirety to appreciate how incredible it might become in the future, as it would allow us to generate distributed systems that can build a very complex end-goal pattern by following a set of simple rules. David A Coomes pointed out to me recently that the question of why cells stop growing has only very recently been understood in traditional biology, and yet here we are applying ML to the same question.
  • Mistral OCR came out today and seems to be the state of the art in multi-modally breaking down documents into a consistent linear structure. Their results show that they can break down complex PDFs in multiple languages into seemingly clean HTML with semantic structure (such as tables, equations, figures and so on). I've only just finished running millions of papers through Grobid, so this is next on the queue to try out...
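
To make the GRPO step in the first item above concrete, here is a rough sketch of the core loss: advantages come from comparing each sampled response against its own group's mean reward (no value network), and the policy is updated with a PPO-style clipped objective. The function and tensor shapes are illustrative, not the post's actual training code, and I've left out the KL penalty against a reference model that implementations usually add:

```python
# Sketch of a GRPO-style update: group-relative advantages + clipped surrogate.
import torch

def grpo_loss(new_logprobs: torch.Tensor,  # (G, T) per-token log-probs, current policy
              old_logprobs: torch.Tensor,  # (G, T) per-token log-probs, sampling policy
              rewards: torch.Tensor,       # (G,)   one scalar reward per response
              mask: torch.Tensor,          # (G, T) 1 for real tokens, 0 for padding
              clip_eps: float = 0.2) -> torch.Tensor:
    # Group-relative advantage: how much better each response scored than
    # its siblings sampled from the same puzzle prompt.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # (G,)
    adv = adv.unsqueeze(1)                                     # broadcast over tokens

    # Importance ratio between the updated policy and the one that sampled.
    ratio = torch.exp(new_logprobs - old_logprobs)             # (G, T)

    # PPO-style clipped surrogate, but driven by the group-based advantage.
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    per_token = torch.min(unclipped, clipped) * mask

    # Maximise the surrogate, i.e. minimise its negative mean over real tokens.
    return -(per_token.sum() / mask.sum())
```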

So, I guess the TL;DR of our discussion was that current AI tools are the first generation, but we're heading rather rapidly into new frontiers of discovery, so there's only going to be more of them coming up...

7th Mar 2025 · notes · ai · computerlab · eeg · llms
