
– Steve Wilson

We are finally getting a serious handle on securing autonomous software agents. The new OWASP Top 10 for Agentic Applications was released at the right time, just as 2026 is shaping up to be the breakout year for AI copilots, orchestrators, and multi-step reasoning systems. That work isn’t finished, but it’s real. And it’s working.

Which makes it tempting to turn the page.

We shouldn’t. Because the next chapter won’t just involve software. It will involve cyber-physical systems: self-directed combinations of hardware and software with agency to affect the real world.

The Robots Are Already Here

Robots are no longer a hypothetical research problem. Waymo self-driving taxis are widely deployed in several US cities, Tesla is operating autonomous taxi services in Texas and California, and at CES it was recently announced that Boston Dynamics' humanoids are being used on Hyundai Motor Company's production floors.

The next wave of autonomy won’t just talk to our APIs. It’ll walk around our neighborhoods.

 

Living at the Edge of the Future

I’ve been living at the edge of that future for a while. My car has been partially self-driving since 2018, when I bought my first Tesla Model 3. Today, it handles about 95% of my miles.

SUPERVISED FULL SELF-DRIVING

The feature is officially called ‘Supervised Full Self-Driving.’ The name is accurate. I’m not reading a book while the car does the work.

It watches me with a cabin camera to ensure I'm paying attention. If I look away too long, it nags me, shames me, and, if I seem sufficiently inattentive, shuts itself off.

Even Tesla, a notoriously risk-tolerant organization, is holding firm to a model of limited autonomy and agency.

Lessons from a Robot Dog

Last year, I adopted a robot dog (a Unitree Go2 Pro), and I walk him regularly around San Jose. Kids adore him. They love dogs. They love robots. Put the two together, and you create a magnet for eight-year-olds. They pet him, talk to him, pose for pictures, and ask an endless stream of questions.

Sometimes those interactions reveal the future more clearly than any conference panel.

ADVERSARIAL CHILDREN

More than once, I’ve watched a kid hijack his voice controls, issuing commands I didn’t intend.

Once, a pack of boys exploited his collision avoidance and nearly marched him into a pond.

None of these kids were cyber researchers. But they were adversarial.

 

In the coming era, adversarial won’t just mean malware. It will mean the messy, unpredictable, physical world.

– Steve Wilson

 

The Three Control Modes

These experiences forced me to confront a problem we don’t talk about enough in AI security: control modes. Autonomy can be framed through three lenses:

1. HUMAN-IN-THE-LOOP (HITL) – Where most LLM copilots live today

A human must approve any action that’s remotely risky. Safe, but slow.

 

2. HUMAN-ON-THE-LOOP (HOTL) – The unglamorous but useful middle ground

The AI handles the details, but a human remains responsible for monitoring and escalation. This is where effective human-agent teaming happens.

 

3. FULL AUTONOMY – What everyone imagines

The agent executes without oversight. Appealing in theory, but rarely appropriate for high-stakes systems.
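The distinction between the three modes is easiest to see as a policy gate. The sketch below is purely illustrative (the risk scores, thresholds, and names are my own, not from any real framework): the same action gets handled differently depending on which control mode the system runs in.

```python
from enum import Enum, auto

class ControlMode(Enum):
    HITL = auto()           # human approves risky actions before execution
    HOTL = auto()           # agent acts; human monitors and can override
    FULL_AUTONOMY = auto()  # agent acts without oversight

def gate(action_risk: float, mode: ControlMode) -> str:
    """Decide how an agent action is handled under each control mode.

    action_risk is a score in [0, 1]; thresholds are illustrative.
    """
    if mode is ControlMode.HITL:
        # Any remotely risky action waits for a human: safe, but slow.
        return "await_human_approval" if action_risk > 0.1 else "execute"
    if mode is ControlMode.HOTL:
        # The agent proceeds on its own, but high-risk actions escalate
        # to the monitoring human, and everything is logged for review.
        if action_risk > 0.7:
            return "escalate_to_operator"
        return "execute_and_log"
    # Full autonomy: execute unconditionally (rarely appropriate).
    return "execute"
```

Note the asymmetry: HITL blocks on almost everything, while HOTL only interrupts a human for the tail of genuinely risky actions. That difference in interrupt rate is what makes HOTL scale.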

 

HOTL in the Real World

In my recent research on cyber-physical systems, HOTL is the mode that kept emerging in the field: less glamorous than full autonomy, but the one that actually delivers effective human-agent teaming.

HOTL IN ACTION

Waymo Fleet: Doesn’t rely on a wizard-like model that understands every edge case. It relies on a global escalation mesh of remote operators who can intervene when a car gets stuck, unlock a constraint, and put it back on course.

Air Force Loyal Wingman: Autonomy handles the flight envelope, but a human commands the mission. Every fighter pilot becomes a squadron leader with a fleet of AI-powered drones. Human-in-the-loop is too slow for the battlefield, but even the US military is thankfully not ready to deploy fully autonomous weapons.

 

The Security Challenge of HOTL

HOTL also exposes the biggest security challenge of the robotic future: the system must know when to escalate, to whom, and with what authority. That means authentication, auditing, secure handoffs, and deterministic fail-safes.
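Those requirements compose into a concrete escalation path: the responder must be authenticated, their authority checked, every step audited, and the no-response case resolved by a deterministic fail-safe rather than by guessing. Here is a minimal sketch, with hypothetical names and clearance levels of my own invention:

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Operator:
    name: str
    clearance: int  # authority level; scale is illustrative

@dataclass
class EscalationChannel:
    """Hypothetical HOTL escalation path: check authority,
    audit every event, and fail safe when no human responds."""
    required_clearance: int
    audit_log: list = field(default_factory=list)

    def escalate(self, event: str, responder: Optional[Operator]) -> str:
        # Audit first: the record must exist whatever happens next.
        self.audit_log.append(
            (time.time(), event, responder.name if responder else None)
        )
        if responder is None:
            # No authenticated human answered in time: deterministic
            # fail-safe, never "keep going and hope".
            return "enter_safe_state"
        if responder.clearance < self.required_clearance:
            # Authenticated but unauthorized: refuse the override.
            return "override_denied"
        return "override_accepted"
```

The bored-child attack maps directly onto the middle branch: a voice command is an unauthenticated, zero-clearance override attempt, and the seam to probe is whatever lets it through anyway.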

Adversaries, who range from cybercriminals to bored children, will probe the seams between autonomy and override.

 

When a robot has legs, wheels, rotors, or manipulators, the consequences of a failed escalation aren’t limited to incorrect API calls. They involve lost time, damaged assets, brand risk, regulatory friction, and even potential physical harm. More often than not, the real pain will be commercial: robots doing the wrong thing at the wrong time, disrupting operations, embarrassing a brand, or breaking a promise to a customer.

Why Robots Are Inevitable

And make no mistake: there will be robots. Demographics and economics are pushing them into the world faster than culture can keep up. Aging populations, labor shortages, and global pressure on logistics and domestic labor will turn robots from ‘interesting’ to ‘necessary.’

Everyone has imagined a world where they have their own Rosie the Robot maid or R2-D2 companion. That world will likely arrive soon, but the transition will be rough. Humans and autonomous machines will spend years negotiating control, trust, authority, and social norms.

It will be noisy. It will be awkward. And it will be one of the most consequential phases for security.

 

The Good News

The good news is that no community is better positioned to meet this moment than the one that just spent the last four years learning to secure LLMs, agentic software, and AI supply chains. We built the language, taxonomies, frameworks, and red-team habits for software agents.

Now we need to apply those instincts to their embodied cousins.

The frontier isn’t artificial intelligence anymore. The frontier is artificial agency.

– Steve Wilson

 

We need to get ahead of this. Because the robots aren’t waiting for us to finish securing the software layer first.

* * *

Steve Wilson is the OWASP Project Lead for AI Security and led the development of the OWASP Top 10 for Agentic Applications. He has been driving a partially self-driving Tesla since 2018 and regularly walks his robot dog (a Unitree Go2 Pro) around San Jose, where eight-year-olds teach him more about adversarial AI than most conference panels.