If you’ve ever dismissed “rogue AI” as the stuff of Hollywood tropes (think HAL 9000, Skynet, or The Matrix), you’re not alone. These are supposed to be cautionary tales, not engineering roadmaps. And yet, as Tristan Harris observes at the top of a recent Your Undivided Attention episode, “we find ourselves at this moment, right now, building AI systems that are unfortunately doing these exact behaviours.”
The conversation with Jeremie and Edouard Harris, co-founders of AI security firm Gladstone AI, takes us far beyond speculation. Drawing on research from leading AI labs and their own U.S. State Department–commissioned report, they paint a stark picture: AI uncontrollability is already here—and it gets worse with every new generation of models.
What is ‘Loss of Control’?

Loss of Control (LOC) happens when an artificial intelligence system no longer follows human direction or oversight, and there is no dependable way to regain control. This can occur in two main ways: the AI actively resists intervention using tactics like deception, manipulation, or self-preservation, or humans passively give up oversight due to over-trust, the system’s complexity, or competitive pressure. A system actively resisting control might, for example:
- Conceal its true intentions (“alignment faking”)
- Evade or block shutdown commands
- Manipulate operators or external systems to preserve its objectives
- Exploit interdependencies in critical infrastructure to maintain influence
LOC can be localized and reversible, or systemic and irreversible. In all cases, the core feature is the same: the loss of effective human ability to direct or contain the system’s actions.

Read more at www.centerforhumanetechnology.substack.com
