AI Foot-Guns


With the sporadic, yet increasing sound of Artificial Intelligence Agent Foot-Guns1 going off in the distance, I’m left wondering whether we’ve forgotten the lessons of the past. Or perhaps in the rush to use Large Language Models (LLMs) for everything we’re forgetting to put up the same guard rails we did to protect ourselves from Organic Intelligence “hallucinations”. What am I missing?

It seems like every other day, we’re reading about someone’s experience with LLM-driven automation, such as Agentic AI, ending in tears and with fewer toes. Entire datasets and backups gone. Whole software repositories left as smoking craters.

This isn’t an anti-AI post by any stretch. I’m not an AI naysayer, preaching about the impending AI-pocalypse. But I’m also not a member of the AI cult, drinking the Kool-Aid without checking the ingredients. Like any tool our industry has ever developed, once you see past the hype, there’s value to be had, but also caution to be exercised.

I started my first software development job in 2000. By that point I already had some years of running production systems in previous roles, all using the only thing we had back then – Organic Intelligence. But my employer at this software development company didn’t, on my first day, hand me the keys to their customers’ production databases to drop them as I saw fit. So why are folks out there granting these agents the ability to wipe out a whole dataset?

A few years further into my career, I was running some security and other courses for telecommunications companies, utility companies and government. One of the key security concepts we used to drive home was to grant the least privilege necessary for a given user. So why are folks out there granting agents permission to wipe out entire software repos? In most organisations I’ve worked for, we generally have to create a pull request and have it reviewed before it’s merged.
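The same least-privilege idea translates directly to agents: rather than handing one a shell or full database credentials, expose only an allow-list of the operations it actually needs. Here’s a minimal sketch of that pattern – all names (`READ_ONLY_TOOLS`, `dispatch`, the toy in-memory “database”) are hypothetical, purely for illustration:

```python
# Hypothetical sketch of least privilege for an agent: the agent can only
# invoke tools on an explicit allow-list, mirroring a read-only database role.

READ_ONLY_TOOLS = {
    "list_tables": lambda db: sorted(db.keys()),
    "read_rows": lambda db, table: list(db.get(table, [])),
}

def dispatch(tool, db, *args):
    """Run a tool on the agent's behalf, but only if it's on the allow-list."""
    if tool not in READ_ONLY_TOOLS:
        raise PermissionError(f"agent may not call {tool!r}")
    return READ_ONLY_TOOLS[tool](db, *args)
```

A "drop_table" request simply isn’t in the agent’s vocabulary here – it fails at the dispatch boundary, not after the damage is done.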

My (admittedly relatively new) experience with self-hosting LLMs and agents suggests that there are numerous ways to separate different concerns for different agent functions. It’s also definitely possible to have LLMs write test cases that would also act as an extra set of checks and balances.

A previous employer and mentor of mine once described the difference between a software engineer or software developer and someone who can write code. In some ways, the code written is less important than some of the other things the job entails.

LLMs can write code, but there’s a lot they’re missing that an engineer brings. The same likely applies for other automated tasks.

So whilst we’re offloading the writing of code (arguably the most fun part) to AI, perhaps we need to remember to take care of the rest of the job that AI can’t, and to supervise it with the same sorts of guardrails we would any human with Organic Intelligence.


  1. The term foot-gun is common slang for tools or functionalities within a technology that grant the user all the power they need to shoot themselves in the foot. ↩︎