Software Defectrums: The End of an Era?

What I mean by "Defectrums"

Throughout this article I use the term Defectrums as an umbrella for every category of software error. That includes logic errors, runtime errors, dependency and supply-chain vulnerabilities, security vulnerabilities, and specification misalignment, where the software does exactly what it was told to do, just not what it was meant to do. If a category matters to you that I have not listed, mentally add it; the argument that follows applies to all of them.

Software is deterministic, people are not

Start with a deceptively simple observation: software is deterministic. Given the same inputs and the same state, code behaves identically every single time it runs. Take a routine that branches on a person's recorded sex: if the value is M it follows the male branch; if it is F it follows the female branch. No exceptions, ever.

But watch what happens the moment someone arrives whose record is neither M nor F. The machine does not improvise and it does not panic. It does exactly what it was told to do, which may be nothing useful at all, or something nonsensical, simply because the developer never imagined that input. The determinism held perfectly. The defect was ours: a failure of foresight, baked into the code and executed faithfully forever.
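To make that concrete, here is a minimal Python sketch. The function names and the titles they return are hypothetical, invented purely for illustration; the only point is that the first routine is perfectly repeatable and still wrong for an input nobody planned for.

```python
def salutation(recorded_sex: str) -> str:
    """Hypothetical routine that branches on a recorded sex code."""
    if recorded_sex == "M":
        return "Mr"
    if recorded_sex == "F":
        return "Ms"
    # The developer never imagined any other value, so the function
    # silently falls through and returns an empty string.
    return ""


def salutation_checked(recorded_sex: str) -> str:
    """Same routine, but the unforeseen case fails loudly instead of silently."""
    known = {"M": "Mr", "F": "Ms"}
    if recorded_sex not in known:
        raise ValueError(f"unhandled sex code: {recorded_sex!r}")
    return known[recorded_sex]
```

Calling `salutation("X")` returns the same empty string on every run: deterministic, reproducible, and useless. `salutation_checked("X")` is no smarter, but it at least turns the gap in foresight into a visible error rather than a quiet one.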

That is the whole problem in miniature, and it matters more than it first appears. If software is deterministic, then in principle every defect in it is knowable, reproducible, and fixable. A flaw is a fixed point in a fixed system. The reason software remains stubbornly broken is not the machine. It is the chain of humans who wrote the code, the compiler, the operating system, and the libraries beneath it. The determinism is in the silicon. The unpredictability enters through us.

Hold on to that idea, because it is the spine of everything that follows: a deterministic system can, in principle, be made flawless. The only question is whether we can remove enough of the human variability to get there.

The making of a programmer

It was not always this messy, because it was not always this widespread or complicated.

In the beginning, computing was a niche industry. Computer engineers were the only people who programmed and operated the mainframes and minicomputers, and the cost of buying and running those machines put them far beyond the reach of any hobbyist. Those machines also had limited storage and processing power, which kept the complexity of any single program in check. And because computer time was both slow and expensive, code was typically desk-checked and reviewed line by line before anyone dared run it. Scarcity enforced a discipline that abundance would later erode.

The microcomputer changed everything. Initially pooh-poohed by mainstream media and the technology establishment as non-serious gadgets, toys that would stay confined to hobbyists while "serious" computing remained the domain of mainframes, these machines quietly put programming into the hands of an ever-growing audience. Ordinary people began writing their own programs, then sharing and selling them, which drew in more buyers, which created more programmers, in a snowball that never stopped rolling.

VisiCalc is the classic example. The 1979 spreadsheet from Software Arts was the "killer app" that drove Apple II sales; many people bought the computer specifically to run it. As John Markoff observed, the machine was effectively being sold as a "VisiCalc accessory." Every computer that entered a home widened the exposure further. Dad's machine became the kids' games console the moment he stepped away, and some of those children started dabbling in code of their own. Programs multiplied to cover every topic imaginable: recipe managers, household budgets, astronomy charts. If you could think of it, someone had probably already written it.

The incumbents had lost. Any hope of keeping the role of the programmer, later rebranded the software engineer, narrow or formally licensed was gone. As the user base grew exponentially, so did the flood of ideas. It was, genuinely, a case of "let a hundred flowers blossom". Microcomputers went on to take over the office, and through networking, the internet, and mobile computing the proliferation has become almost comical. Pick the most niche subject you can imagine, knot tying, say, and you will still find an astonishing number of apps, sites, and videos devoted to it alone. Now multiply that by every hobby, profession, and passing curiosity on earth and you begin to sense the scale. Some of this software is specialised, some visual, some gamified. You get the gist.

The downside is that not all software is created equal. Where scarcity once forced rigour, abundance rewarded speed. Quality is bounded by the programmer's skill, by time-to-market pressure, by whatever testing strategy survived the deadline. Plenty of programs written for an audience of one have taken on a life of their own and ended up serving millions. And software is never self-contained. It runs on an operating system, leans on packages to do its work, and each of those components in turn depends on libraries and tools built by someone else.

XKCD's "Dependency", comic #2347, captures this perfectly. Consider FFmpeg, "a complete, cross-platform solution to record, convert and stream audio and video." Tens of thousands of programs, including browsers, applications, packages, libraries, and operating systems, and billions of devices, including TVs, streaming boxes, games consoles, and phones, rely on it internally. A defect in one such building block does not stay local. It propagates silently into everything stacked on top.

We have watched this happen, twice in recent memory. In 2021, Log4Shell, a single flaw in Log4j, a Java logging library few outside the trade had ever heard of, opened a remote-code-execution hole in millions of servers overnight and set off a global firefight that ran for months. In 2024 the xz-utils backdoor came even closer to catastrophe: an attacker spent years patiently earning the trust of a burned-out volunteer maintainer, then slipped a hidden backdoor into a compression library threaded through the entire Linux ecosystem. It was caught almost by accident, by one engineer who noticed a login was running half a second slower than it should. That is the XKCD cartoon made real, and the near-miss is more frightening than the hit.
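The way a single flaw spreads upward can be sketched as a walk over a dependency graph. The package names below are made up for illustration and the graph is a toy; real ecosystems have many thousands of nodes, but the mechanics are the same.

```python
from collections import deque

# Hypothetical dependency graph: each key depends on the packages listed.
DEPENDS_ON = {
    "video-app": ["media-lib"],
    "browser": ["media-lib", "tls-lib"],
    "media-lib": ["codec-lib"],
    "tls-lib": [],
    "codec-lib": [],
}


def affected_by(defective: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every package that directly or transitively depends on `defective`."""
    # Invert the graph: package -> the packages that depend on it.
    dependents: dict[str, set[str]] = {}
    for package, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, set()).add(package)

    # Breadth-first walk upward from the defective package.
    affected: set[str] = set()
    queue = deque([defective])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected
```

In this toy graph, a defect in `codec-lib` reaches `media-lib`, `video-app`, and `browser`, even though neither application ever names `codec-lib` directly. That indirection is what makes the exposure so easy to miss.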

The human in the loop

So the problem with software is the human. We are intelligent but not omniscient; we cannot foresee every possible outcome and guard against it in advance. And even if we imagine a perfectly written program, the compiler that built it, the operating system that runs it, and the libraries it depends on are never guaranteed to be flawless. Any link in that chain can be the weak one.

We have accepted this so completely that essentially all software ships with a warranty disclaimer like the one below. Imagine the engineers behind an apartment block, a bridge, a tunnel, a train, or an aircraft offering their customers anything remotely like it.

THE SOFTWARE IS PROVIDED AS IS WITHOUT ANY WARRANTY WHATSOEVER, INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF FUNCTIONALITY. YOU RECOGNIZE THAT THE AS-IS CLAUSE OF THIS SOFTWARE LICENSE AGREEMENT IS AN IMPORTANT PART OF THE BASIS OF THIS SOFTWARE LICENSE AGREEMENT, WITHOUT WHICH THE COMPANY WOULD NOT HAVE AGREED TO ENTER THIS SOFTWARE LICENSE AGREEMENT. THE COMPANY AND THIRD PARTIES DISCLAIM ALL WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, REGARDING THE SOFTWARE, INCLUDING ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, LACK OF VIRUSES, TITLE, AND NONINFRINGEMENT. NO REPRESENTATION OR OTHER AFFIRMATION OF FACT REGARDING THE SOFTWARE SHALL BE DEEMED A WARRANTY FOR ANY PURPOSE OR GIVE RISE TO ANY LIABILITY OF THIRD PARTIES WHATSOEVER.

That disclaimer is not just legal boilerplate. It is an industry-wide admission that we expect our products to be flawed. And those flaws are not academic. Criminal groups and individuals exploit them every day to steal data, install remote-control backdoors, hold systems to ransom, and otherwise turn other people's defects into money.

That acceptance is beginning to change. The European Union's Cyber Resilience Act, in force since December 2024 and biting in full from December 2027, drags software toward the same product-liability regime we already take for granted in bridges and aircraft. Manufacturers of connected products and software sold in Europe will have to build in security, manage vulnerabilities over a product's lifetime, and report the ones being actively exploited, or face real penalties. The "as-is" escape hatch that the industry has hidden behind for years is, slowly, being welded shut. The pressure to make software genuinely defect-free is no longer only moral or commercial. It is becoming law.

With a little help from AI

Here is where the story turns. The same determinism that makes defects inevitable also makes them tractable for a machine that can read code faster and more patiently than any human team.

This is not a brand-new dream, either. For decades, formal methods have let us mathematically prove that a piece of software meets its specification: the seL4 microkernel, for instance, carries a machine-checked proof of correctness, and tools like TLA+ have caught design flaws in systems running at planetary scale. The catch was always cost. Proof at that level of rigour is so labour-intensive that it stayed locked inside aerospace, chip design, and a handful of safety-critical kernels. What AI changes is the economics, holding out the prospect of proof-grade scrutiny applied to ordinary, everyday code. Anyone interested in diving deeper should read Charles Fishman's 1996 Fast Company article "They Write the Right Stuff", a portrait of the NASA shuttle software team that achieved near-perfect code through sheer discipline, and a reminder that the price of such perfection has always been the thing AI promises to lower.

AI systems built for code can already review software and surface flaws that the original developers missed. The capability is no longer hypothetical. In 2024, Google's Big Sleep, a collaboration between Project Zero and DeepMind, became the first AI agent to find a previously unknown, exploitable vulnerability in real-world software, a memory-safety bug in SQLite, the database engine embedded in billions of devices. Anthropic's Mythos model was reported to identify vulnerabilities in highly sensitive US government systems within hours during a testing exercise. The capability proved sharp enough that, in June 2026, the US government forced Anthropic to pull both its Fable 5 and Mythos models from general availability over national-security concerns. OpenAI, meanwhile, released GPT‑5.5‑Cyber as part of its Daybreak programme, a model explicitly built to find vulnerabilities, validate them in a controlled environment, and develop and test patches.

These frontier models are strikingly good at the full loop: quickly surfacing security flaws, generating working exploits to prove their reasoning, and then rewriting the code to close the hole. It is not hard to imagine a near future in which buggy code is the exception rather than the rule. The "CI" in a CI/CD pipeline could read like this:

Developer commits code written with the help of AI coding agents → an agentic CI pipeline generates scripts to test the code for Defectrums → AI analyses the Defectrums → AI rewrites the code to resolve them → feedback is returned.
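As a rough sketch of that loop, assuming the analysis and rewrite steps are supplied as plain functions: everything here is hypothetical, including the names `agentic_ci`, `toy_find_defects`, and `toy_rewrite`. The two toy functions stand in for what would be AI agents, and they only recognise one hard-coded pattern.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineResult:
    code: str
    rounds: int
    resolved: bool
    feedback: list[str] = field(default_factory=list)


def agentic_ci(
    code: str,
    find_defects: Callable[[str], list[str]],
    rewrite: Callable[[str, list[str]], str],
    max_rounds: int = 3,
) -> PipelineResult:
    """Test, analyse, rewrite, and report, giving up after max_rounds."""
    feedback: list[str] = []
    for round_number in range(1, max_rounds + 1):
        defects = find_defects(code)
        if not defects:
            feedback.append(f"round {round_number}: clean")
            return PipelineResult(code, round_number, True, feedback)
        feedback.append(f"round {round_number}: found {defects}")
        code = rewrite(code, defects)
    # The final rewrite was never re-checked, so report it as unresolved.
    feedback.append("round budget exhausted: escalate to a human reviewer")
    return PipelineResult(code, max_rounds, False, feedback)


def toy_find_defects(code: str) -> list[str]:
    # Stand-in for the AI analysis step: flags disabled TLS verification.
    return ["TLS verification disabled"] if "verify=False" in code else []


def toy_rewrite(code: str, defects: list[str]) -> str:
    # Stand-in for the AI rewrite step: re-enables TLS verification.
    return code.replace("verify=False", "verify=True")
```

Running `agentic_ci("requests.get(url, verify=False)", toy_find_defects, toy_rewrite)` finds the problem in round one, rewrites the line, and confirms a clean result in round two. The round budget is a deliberate design choice: a loop that cannot converge hands the problem back to a person instead of rewriting forever.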

I can even imagine the pipeline running itself in reverse: an AI agent that discovers a vulnerability in, say, a package your solution depends on, and then initiates and leads a development cycle to fix it, always under human direction.

But there is a catch worth naming. Today, the very models that find and fix flaws also introduce them. Independent testing in early 2026 found that leading AI coding agents reproduced decade-old security mistakes, with the large majority of their pull requests containing at least one vulnerability. And the tool that can patch a flaw is, mechanically, the same tool that can weaponise it. That dual-use reality, defence and offence sharing one engine, is exactly why governments are nervous, and it shapes everything about the future below.

There is a deeper irony hiding in here. We are proposing to police a perfectly deterministic system with a profoundly non-deterministic one. A large language model does not behave identically every time; ask it the same question twice and you may get two different answers. So who verifies the verifier? An AI's fix can be confidently wrong, can introduce a fresh regression, or can quietly paper over a symptom while leaving the real cause untouched. For now the answer has to be a human, or a second model, reviewing the work, which means the loop never fully closes. We may be trading a world of human defects for a world that needs constant, vigilant checking of the machines we hoped would do the checking.

The Defectrum-free future

So, can we ever reach a world where the only updates we receive are the welcome ones, new features and a fresh interface, rather than yet another urgent patch? I believe it is a realistic possibility. I also believe it will take a very long time, for reasons that have little to do with the cleverness of the AI.

The sheer volume of existing software is astronomical, and we will never fully know who is running what, or where. Thousands of packages are no longer maintained, or are kept alive by a handful of volunteers, the unpaid maintainers in that XKCD cartoon, who lack the time, the money, or both to revisit code they wrote years ago. The dependency problem, whether in code or in the wider supply chain, has to be addressed link by link. The first organised attempts already exist: Google's CodeMender and OpenAI's "Patch the Planet" both point AI squarely at finding and fixing flaws in widely used open-source code. They are a start rather than a solution, but they show the direction of travel.

Then there is the question of who pays. Most of this fragile, forgotten code belongs to no one with a commercial reason to repair it, which is precisely why it rotted in the first place. So why would anyone fund the cleanup? The uncomfortable answer is that the large platforms increasingly have to. Their own products are built on the same open-source foundations, so a hole in a common library is a hole in their stack too; defending themselves means defending the commons. Regulation sharpens the incentive further, as liability regimes like the Cyber Resilience Act turn a neglected dependency into a balance-sheet risk rather than someone else's problem. That covers the popular, widely used components. The genuinely orphaned code, used by a few, owned by no one, watched by nobody, is the part the market will always be slowest to reach, and it is where public funding or shared industry consortia may be the only answer.

Then there is the embedded layer. Enormous amounts of software sit inside hardware that is connected to the internet but locked in a cabinet and never touched again, forgotten, unsupported, or simply left alone out of fear of breaking something that currently works. Below even that lies silicon-level fragility, the Spectre- and Meltdown-class flaws that no software rewrite can fully erase.

AI agents are not perfect, which means human oversight is still necessary. This is the same loop I argued earlier never quite closes, and it is also where I expect the answer to be found. As the models improve, I believe single agents will give way to committees of them, working seamlessly together, cross-checking one another through a quorum-based consensus mechanism, with the human holding the casting vote. A panel that has to agree is far harder to fool than any single member of it. The honest caveat is that a committee can still share a blind spot if its members were trained alike, so oversight grows lighter over time without ever quite reaching zero.
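A minimal sketch of such a committee vote, under my own assumptions: verdicts are plain strings, the quorum must be a strict majority, and the human is consulted only when no verdict reaches it. The function name `committee_decision` is hypothetical.

```python
from collections import Counter
from typing import Optional


def committee_decision(
    votes: list[str], quorum: int, human_vote: Optional[str] = None
) -> str:
    """Return the verdict backed by at least `quorum` agents, else the human's vote."""
    if not votes:
        raise ValueError("the committee cast no votes")
    # A strict majority guarantees that at most one verdict can reach quorum.
    if quorum * 2 <= len(votes) or quorum > len(votes):
        raise ValueError("quorum must be a strict majority of the committee")

    verdict, count = Counter(votes).most_common(1)[0]
    if count >= quorum:
        return verdict
    # No verdict reached quorum: the human holds the casting vote.
    if human_vote is None:
        raise RuntimeError("no quorum: a human casting vote is required")
    return human_vote
```

With three agents and a quorum of two, `["approve", "approve", "reject"]` settles on its own, while a three-way split falls to the human. Note what the sketch cannot do: if every agent shares the same blind spot, they will reach quorum on the wrong answer just as smoothly.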

The honest summary of where we stand is the one every engineer reaches for eventually: it's complicated.

Picture the far horizon, though. Suppose these AI Defectrum agent committees become free to use, modify, share, build upon, and learn from everyone's code. Suppose they are woven into every development environment, and suppose all existing software has eventually been refactored or replaced. Does the cybersecurity role then fade away?

In the short term, almost certainly the opposite. The likeliest near-future scenario is a barrage of attacks, as criminal groups get their hands on the same offensive AI capabilities and race to exploit the vast backlog of unpatched, vulnerable software before defenders can reach it. The attacker only needs one open door; the defender has to close all of them. For a while, the asymmetry favours the attacker.

Will cybersecurity ever be written into history?

No. And the reason takes us right back to where we started.

We can make a deterministic system flawless, in principle. We cannot do the same to the human using it. Software may one day be Defectrum-free, but as long as a person sits at the keyboard, there remains a non-deterministic, fallible, persuadable entity in the loop, one that can be tricked, phished, and socially engineered no matter how perfect the code behind the screen.

The era of buggy software may well be ending. The era of the human, gloriously and dangerously unpredictable, is not. And so cybersecurity does not disappear. It simply moves up the stack, from defending the code to defending the one part of the system we were never able to make deterministic: ourselves.
