A very fun talk at ACM HOPE 2024
about some new work with Cyrus Omar and Patrick Ferris on how we can formally specify
systems to be robust to code generation by AI agents. For instance, if you were
to ask GitHub Copilot to generate code that filters endangered animals out of
a folder of images, it might interpret that as deleting the images, moving them
to another folder (which might be public), or just removing them from the index.
Any of those interpretations is potentially valid, so what do we do? Our idea is to
use F* to specify a rich set of allowable behaviours which can then be
dynamically enforced in less expressive languages, and thus offer layers of
protection against over-eager (or rogue) AI agents.
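To make the idea concrete, here's a minimal sketch (not from the talk, and not the actual F* development) of what dynamic enforcement in a less expressive host language might look like. The action names, the policy, and the `enforce` function are all hypothetical, written in OCaml purely for illustration; in the real system the allowable behaviours would be derived from the F* specification rather than hand-written.

```ocaml
(* Hypothetical sketch: the formal specification is imagined to
   compile down to a runtime monitor like this, embedded in the
   host language that the AI agent's code actually runs in. *)

(* Actions an agent might attempt on an image in the folder. *)
type action =
  | Delete of string            (* remove the file entirely *)
  | Move of string * string     (* move file to another folder *)
  | Unindex of string           (* drop it from the search index *)

(* The policy: "filter endangered animals out of the folder" is
   interpreted narrowly as un-indexing, never as deleting data or
   moving it to a possibly-public location. *)
let allowed : action -> bool = function
  | Unindex _ -> true
  | Delete _ | Move (_, _) -> false

(* Dynamic enforcement: every action the agent proposes passes
   through this check before it touches the filesystem. *)
let enforce (act : action) (run : action -> unit) : unit =
  if allowed act then run act
  else Printf.eprintf "policy violation: action rejected\n"
```

The point of the sketch is only the shape of the protection: the agent is free to generate whatever code it likes, but the monitor sits between that code and the system, admitting only the behaviours the specification allows.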
We'll increasingly need this sort of protection in our systems as natural
language interfaces get adopted for programming, and the traditional 'yes or
no' access control policies we use right now will be insufficient.
Read more in: