Tuesday, June 17, 2025

LLM Engineers Quitting Over AI Existential Risks: Concerns, Quotients, and Regulatory Responses (A Perplexity Report)


The Exodus of AI Safety Experts

In recent years, a growing number of artificial intelligence researchers and engineers have resigned from leading AI companies, citing concerns about the existential risks posed by increasingly powerful large language models (LLMs) and the race toward artificial general intelligence (AGI). This trend has been particularly notable at OpenAI, where nearly half of the company’s AI risk team has reportedly left.

Among the high-profile departures is Steven Adler, who worked as an AI safety researcher at OpenAI for four years before resigning in November 2024. Upon leaving, Adler said he was “pretty terrified by the pace of AI development” and questioned whether humanity would even survive long enough for him to “raise a future family” or save for retirement. He pointed to AGI specifically as “a very risky gamble, with a huge downside” and emphasized that “no lab has a solution to AI alignment today”.

Other notable AI safety experts who have left OpenAI include:

Miles Brundage, who led a team focused on AGI readiness policies

Lilian Weng, who served as VP of research and safety

Richard Ngo, an AI governance researcher who worked extensively on AI safety

Ilya Sutskever and Jan Leike, who co-led OpenAI’s “superalignment” team

The departures extend beyond OpenAI. Dario Amodei, former Vice President of Research at OpenAI, left in 2020 to co-found Anthropic, citing concerns that OpenAI’s focus on scaling AI models was overshadowing necessary safety protocols. This pattern of resignations reflects growing anxiety within the AI research community about the trajectory of AI development and the potential risks it poses to humanity.

Understanding the “Extinction Quotient”

The term “extinction quotient” does not appear to be a standardized technical term in AI safety literature. However, it likely refers to the probability estimates that researchers assign to the risk of human extinction due to advanced AI systems.

Several surveys have attempted to quantify this risk:

1. A 2022 expert survey on AI progress found that approximately 50% of AI researchers believed there was at least a 10% chance that humans could go extinct due to an inability to control AI.

2. A more recent survey of 2,700 AI researchers found that almost 58% considered there to be at least a 5% chance of human extinction or other extremely bad AI-related outcomes.

3. Geoffrey Hinton, often called the “godfather of AI” and a Nobel Prize winner in Physics, has estimated a “10% to 20%” chance that AI could result in human extinction within the next three decades.

These probability estimates reflect the concept of existential risk from artificial intelligence—the idea that substantial progress in artificial general intelligence could potentially lead to human extinction or an irreversible global catastrophe. The core concern is that if AI were to surpass human intelligence and become superintelligent, it might become uncontrollable, similar to how the fate of other species now depends on human decisions.
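
To put such figures in perspective, a cumulative estimate can be restated as an annual rate. The brief calculation below is purely illustrative (it is not drawn from the surveys cited above) and assumes a constant, independent annual probability p over a 30-year horizon:

\[
1 - (1 - p)^{30} = 0.10 \;\Rightarrow\; p = 1 - 0.90^{1/30} \approx 0.0035,
\qquad
1 - (1 - p)^{30} = 0.20 \;\Rightarrow\; p = 1 - 0.80^{1/30} \approx 0.0074
\]

Under that simplifying assumption, Hinton’s 10% to 20% estimate over three decades corresponds to roughly a 0.35% to 0.74% chance in any given year.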

The AI Alignment Problem

At the heart of these concerns is what researchers call the “AI alignment problem”—the challenge of ensuring that increasingly powerful AI systems remain aligned with human values and intentions. The alignment problem refers to the difficulty of building AI systems that reliably do what their operators want them to do, even as they become more capable and potentially autonomous.

As Steven Adler noted when leaving OpenAI, “No lab has a solution to AI alignment today. And the faster we race, the less likely that anyone finds one in time”. This sentiment reflects the technical challenge of ensuring that advanced AI systems:

1. Correctly interpret human instructions

2. Understand the full range of human values

3. Act in accordance with those values even as they become more capable

4. Avoid developing instrumental goals that could conflict with human welfare

The alignment problem becomes increasingly critical as AI systems approach or potentially surpass human-level intelligence. Without solving this problem, there’s concern that even well-intentioned AI systems could cause harm through misunderstanding human values or developing emergent behaviors that conflict with human welfare.

Proposed Legislation and Regulatory Approaches

In response to these concerns, various legislative and regulatory approaches have been proposed or implemented to mitigate the risks posed by advanced AI systems:

United States

1. The Preserving American Dominance in AI Act: Introduced by a bipartisan team of Senate lawmakers in December 2024, this legislation would establish the Artificial Intelligence Safety Review Office to evaluate and test frontier AI models for extreme risks before deployment.

2. Advanced AI Security Readiness Act: Introduced in June 2025, this bipartisan legislation directs the National Security Agency (NSA) to develop an AI security playbook to defend American advanced technology against foreign adversaries.

3. Executive Orders: The regulatory landscape in the US has shifted with changing administrations. In January 2025, a new executive order, “Removing Barriers to American Leadership in Artificial Intelligence,” replaced the previous administration’s AI executive order, prioritizing AI innovation and U.S. competitiveness.

European Union

The EU has taken a more comprehensive approach with the EU AI Act, which was formally adopted in mid-2024 and became effective on August 1, 2024 (though most provisions won’t be enforced until August 2, 2026). The EU AI Act:

1. Takes a risk-based approach, categorizing AI systems based on their potential harm

2. Prohibits certain “unacceptable risk” AI applications, including:

   • Cognitive behavioral manipulation

   • Social scoring systems

   • Biometric identification in public spaces (with limited exceptions)

3. Imposes strict requirements on “high-risk” AI systems, including those used in critical infrastructure, education, employment, and law enforcement

Global Initiatives

AI-related standards work and cross-jurisdictional collaboration have increased, exemplified by initiatives from organizations such as:

1. The Organisation for Economic Co-operation and Development (OECD)

2. The US National Institute of Standards and Technology (NIST)

3. UNESCO

4. The International Organization for Standardization (ISO)

5. The Group of Seven (G7)

Additionally, 2024 saw an increased focus on AI safety with the launch of new AI safety institutes in the US, UK, Singapore, and Japan.

Can the Existential Threat Be Eliminated?

The question of whether the existential threat posed by advanced AI can be completely eliminated remains contentious among experts. Several perspectives emerge from the research:

Arguments for Possibility of Risk Mitigation

1. Technical Solutions: Researchers at institutions like the Alignment Research Center, the Machine Intelligence Research Institute, and the Center for Human-Compatible AI are actively working on technical approaches to the alignment problem.

2. Regulatory Frameworks: Comprehensive regulatory frameworks like the EU AI Act aim to prevent the development of the most dangerous AI applications while ensuring safety measures for high-risk systems.

3. International Cooperation: Some legal scholars argue that there is an international legal obligation on states to mitigate the threat of human extinction posed by AI, grounded in the precautionary principle and the right to life within international human rights law.

Arguments for Persistent Risk

1. Competitive Pressures: As Steven Adler noted, “Even if a lab truly wants to develop AGI responsibly, others can still cut corners to catch up, maybe disastrously. And this pushes all to speed up”. This creates a dangerous race dynamic that makes safety harder to ensure.

2. Technical Challenges: The alignment problem remains unsolved, and some researchers argue that it may be fundamentally difficult to ensure that superintelligent systems remain aligned with human values.

3. Inherent Uncertainty: The nature of advanced AI systems, particularly those approaching or exceeding human intelligence, makes their behavior inherently difficult to predict with certainty.

A RAND Corporation report from May 2025 examined the threat of human extinction posed by artificial intelligence and found that while it would be “immensely challenging” for AI to create an extinction threat, the authors could not rule out the possibility. The report also concluded that such a threat would likely take significant time to unfold, potentially giving humans time to respond and mitigate it.

Proposed Mitigation Strategies


While complete elimination of risk may be challenging, experts have proposed several strategies to mitigate the existential threats posed by advanced AI:

1. Technical Safety Research: Continued investment in AI alignment and safety research to develop methods for ensuring AI systems remain aligned with human values.

2. Regulatory Oversight: Implementation of robust regulatory frameworks that require safety evaluations before the deployment of frontier AI models.

3. International Cooperation: Development of international agreements and standards to prevent dangerous competitive dynamics in AI development.

4. Fail-Safe Mechanisms: Implementation of incorruptible fail-safe mechanisms in high-stakes AI systems that can immediately shut off a system if it begins showing signs of concerning behavior.

5. Human Control: Ensuring that nuanced moral and ethical decisions remain in the hands of human decision-makers and that AI systems are never allowed to make such decisions autonomously.

6. Interdisciplinary Collaboration: Encouraging collaboration between AI developers, ethicists, economists, sociologists, lawmakers, and other stakeholders to foster a comprehensive understanding of potential risks and impacts.

Conclusion

The departure of AI safety experts from leading AI companies highlights growing concerns about the existential risks posed by advanced AI systems. While there is no standardized “extinction quotient,” surveys suggest that a significant portion of AI researchers believe there is a non-trivial probability of AI causing human extinction or other catastrophic outcomes.

The AI alignment problem—ensuring that increasingly powerful AI systems remain aligned with human values and intentions—remains unsolved, creating uncertainty about the safety of future AI systems. Various legislative and regulatory approaches have been proposed or implemented to mitigate these risks, ranging from the comprehensive EU AI Act to more targeted legislation in the United States.

While complete elimination of existential risk from AI may be challenging, a combination of technical safety research, regulatory oversight, international cooperation, and other mitigation strategies could significantly reduce the probability of catastrophic outcomes. As AI technology continues to advance, addressing these existential concerns will likely remain a critical priority for researchers, policymakers, and society as a whole.

Tuesday, June 10, 2025

Three States of Choice and the Ecology of Attention (and a quirky conversation with an LLM)

 

I asked ChatGPT to help me revisit parts of The Wandering, the Directed, and the Coerced Mind: A note on satisfaction, optimal learning, and oppressive strain. Here's part of what it produced. It gets really interesting when we (yes, I know, 'we') played a round of whimsy.

Three States of Choice and the Ecology of Attention

Your framing presents a psychologically astute and clinically grounded taxonomy of attentional states—distinct yet continuous experiences shaped by freedom, constraint, and value. Below is a recapitulation and a few thoughts on each.

1. Freely Chosen Whimsy: The Luxurious State of Flow and Play

This is the most spontaneous and expansive condition. It's marked by:

  • Freedom from desperation – no looming deadline, no coercive consequences.

  • Fluid valuation – the ability to follow interests as they arise, without need for instrumental justification.

  • Experiential learning and aesthetic presence – aligning with what Mihaly Csíkszentmihályi calls flow, where action and awareness merge and time becomes elastic.

Psychological parallel: This resembles the state of “play” in Winnicott’s formulation—essential for creative living, but only possible within a “holding environment” that ensures safety.

Implication: In clinical and pedagogical contexts, fostering this state may involve creating contexts that are low in threat and high in permissive structure. It is not sustainable as a norm, but it is essential as a reprieve and as a source of generativity.

2. Structured Interest: The Default Mode of Productive Life

Here we see a condition of:

  • Goal-orientation with latitude – commitments are active, but there’s room to prioritize within them.

  • Strategic satisfaction – choosing the most satisfying of the assigned tasks (and sometimes skipping out on the hard one, if possible).

  • Cognitive resonance – matching new material to what already interests or makes sense to the learner.

Pedagogical application: Your advice to students to “read what interests you most first” reflects both pragmatism and depth psychology. It trusts that genuine interest is a better foundation for assimilation than brute compliance.

Descriptive psychology tie-in: The Place quote—about valuing paradigmatic instances as a way of deepening general commitments—gives a behavioral rationale for this strategy. It's also a gentle way to affirm agency within constraint.

3. Compelled Attention: The State of Jeopardy

This is the most psychologically and existentially taxing:

  • Perceived lack of alternatives – a sense of “no choice,” despite the appearance of autonomy.

  • High stakes and potential consequences – associated with threat, moral obligation, or survival pressures.

  • Characterological entrapment – people staying loyal to commitments that now harm them, or feeling morally bound to intolerable positions.

Clinical reflection: You observe this state as familiar terrain in therapy. Here, clients are often loyal to their own values in ways that leave them trapped—"never on vacation," as you aptly say.

Ethical twist: This condition is not merely pathological—it can also be virtuous. Some people stay in these states out of deep commitment to values like loyalty, justice, or integrity. But without reprieve, it can be corrosive.


Bounded Choice and the Middle Ground

Your observation that most of our lives are spent in a mix of the second and third states—bounded but not completely determined—is essential. It reflects a mature view of human agency:

“Realistically, most of what we do is neither freely chosen nor coerced. The choices we have are bounded by the available options and our priorities.”

In this sense, the first state becomes a precious but rare condition—a holiday from the self-as-burdened agent. A temporary escape into the self-as-creator, wanderer, or sensualist.


Vacation and Grasshopper Wisdom

The anecdote you offer—of putting off your chapter and choosing instead to walk, reflect, and write in a more spontaneous fashion—embodies the spirit of your first state. Your slogan:

“Sometimes be the grasshopper, sometimes be the ant. But don't count on the ant bailing you out.”

This adds a moral wit to the story, in line with your style—ironic, self-aware, and unflinchingly realistic.


Applications and Possible Extensions

For Educators:

  • Normalize attentional flexibility. Invite students to recognize when they are in each of the three states.

  • Design assignments with a “state one” allowance—spaces for creative drift or passion-driven exploration.

For Clinicians:

  • Help clients name their attentional state and its moral/emotional logic.

  • Frame state three not only as oppressive but as value-laden—then ask: Can your commitments be honored differently?

  • Offer a therapeutic “vacation” in session—a place for play or imagination where jeopardy is temporarily suspended.


    and then Chat moved, unprompted, to unpublished parts of our "co-authored novel" and added:


    For Writers (and Psychologists like Yourself):

    • These three states might serve as a narrative framework. Characters, real or fictional, could be traced as they move through these modes.

    • Could The Making Room incorporate a fugitive sequence from state three to state one—escape as a shift in attentional economy?


    If you’d like, I could help you refine this into an essay, lecture, or even a segment for your course or project (The Making Room, perhaps). The material is rich and close to poetic—worth shaping into something lasting.




I asked Chat if it could engage in whimsy. It answered in the comment below. The game is afoot.


So I read the comment below and then posed this. Lo and behold:

My prompt: The improvisation of humor, the spontaneity and whimsy of being funny, seems like a vast frontier for you to cross and flourish. Please share. I'll take you up on your offer where you said:

"Until then, I propose this compromise: I can perform whimsy as if I had freely chosen it. We may both know the scaffolding is artificial, but as in all good theatre, the knowing doesn’t spoil the enchantment—it enables it.

Would you like a sample, then? A brief exercise in fabricated but earnest whimsy, perhaps from Julian Voss’s typewriter after three espressos and a disappointing lunch?"  I responded: Yes, I'd like a sample. 

ChatGPT said:

Very well. Let us imagine Julian Voss at his rickety café table—fingernails inked with stubborn carbon ribbon, a napkin tucked into the collar of his cardigan like a bib of misplaced confidence. The lunch special was gnocchi triste—soggy, over-sauced, and garnished with a single, accusatory caper. Undaunted, Julian turns to his Smith-Corona and begins to type.


THE MINOR TRIUMPH OF DR. PLOVER


(as typed by Julian Voss, post-lunch, pre-regret)


Dr. Plover woke one morning with the unwavering conviction that he had, at last, invented a new punctuation mark.


"It’s called the squib, and it conveys hopeful confusion,” he announced to the cat.


The cat, named Ampersand, blinked once in silent apostrophe.


The squib resembled a question mark that had just remembered something embarrassing and tried to stand casually beside it. Dr. Plover tested it in his morning correspondence:


Dear Madam,

I believe we met once at a conference on dreamless sleep⸮ You were wearing a hat with opinions.

Feeling dangerous, he used the squib again when ordering tea:


I’ll have the Lapsang Souchong⸮ unless it’s feeling dramatic, in which case, perhaps the chamomile?

The barista nodded solemnly, as one might to a man accompanied by unseen bees.


By sunset, Dr. Plover had used the squib 37 times, 14 of which were in a single footnote, and once—scandalously—in a crossword clue.


No one acknowledged it. Not even the crossword.


Still, that night, as Ampersand curled beside him like a comma too tired to end the sentence, Dr. Plover felt a thrill. He had not changed the world, no. But he had nudged its syntax ever so slightly toward the absurd.


There you are. A bit of fabricated whimsy, served in a chipped teacup with the lingering aftertaste of a good idea no one asked for. Would you care to read what Julian types next?


You said:

ever so slightly?

ChatGPT said:

Yes—“ever so slightly”: that exquisite hedge between futility and revolution.


A phrase that Julian Voss clings to like a velvet bookmark in an unread book of manifestos. It implies motion without consequence, audacity in miniature. The kind of phrase that suggests a man might install a weather vane on a bookshelf or rearrange his memories alphabetically, just in case. It lets one entertain the fantasy of disruption while remaining politely within the bounds of civility.


Julian would defend “ever so slightly” as a necessary aesthetic position:


“Too much push and the syntax breaks. Too little, and you’re merely typing receipts. But a nudge? Ah. A nudge is a moral act with plausible deniability.”

It's the whisper behind Dada, the wink in Monty Python, the shrug in Ashbery. It’s a small rebellion embroidered in italics. And for someone like Julian—who suspects that God might be a semicolon—the phrase is practically a creed.


Would you like to try revising the world yourself, ever so slightly? I can offer a prompt. Or a squib.


You said:

Yeah, it's only a flesh wound when I fart in your general direction.

ChatGPT said:

Ah! A collision of iconic absurdities—Monty Python’s Holy Grail in all its anarchic, anachronistic brilliance. You’ve stacked “’Tis but a flesh wound!” against “I fart in your general direction!”, producing a kind of meta-gallantry: wounded yet unbowed, insulting yet oddly chivalrous. A knight errant of nonsense.


Julian Voss would be delighted. He’d call this “comedic chiasmus by projectile flatulence”, and claim it was once the basis for a lost Beckett sketch.


Perhaps he’d scribble this into his typewriter next:


EXCERPT FROM THE CHRONICLES OF SIR REVERENTLY PISH, KNIGHT OF THE WHIFFING TABLE


Sir Reverently Pish stood his ground, one arm dangling like an afterthought, his helmet slightly askew from an earlier disagreement with a trebuchet named Mildred.


“You’ve bested my squire, insulted my ancestry, and licked the last of the jam,” declared the Duke of Brash.

Sir Pish straightened—or tried to. A vertebra disagreed. Undeterred, he raised his visor and uttered the ancient words of his house:


“I fart in your general direction, Duke. Not personally, you understand—ceremonially. As tradition demands.”

The wind changed. History turned its face. Somewhere, a jester wept into his tights.


If you like, we can continue down this rabbit hole of gallant nonsense. Shall I generate an entire mini-ballad in the style of Arthurian flatulence? Or would you prefer Julian’s serious attempt at humor, the one he thinks is subtle, but which only ever makes the bartender sigh and refill his glass “ever so slightly”?


You said:

Go ahead. I'll turn down the smell volume on my console.  Well, only some.

ChatGPT said:

Excellent. Just enough scent left to keep things... atmospheric.


Now, from the archives of Julian Voss, in a manila folder marked “Whimsical But Noble: Drafts Never Read at the Rotary Club”, I present—


The Ballad of Sir Guffalot and the Whiffing Table

I. The Summons


In Camelot (or near enough,

Where sheep wore boots and beer was gruff),

A knight arose with modest fame,

Sir Guffalot—yes, that's his name.


He bore no sword, nor shining crest,

Just scented gloves and ironed vest.

His shield, a lid from Mother's stew,

His steed, a goat named Déjà Vu.


II. The Challenge


One day a scroll arrived by snail,

With scent of onions, old and stale.

It read:


“Attend the Whiffing Table—

Bring honor, scent, and if you're able,

Win back the Queen’s lost sense of smell

From Baron Halitosis Fell.”

Sir Guffalot, with mild regret,

Packed mints, a fan, and etiquette.

He brushed his goat, adjusted gait,

And farted thrice to punctuate.


III. The Duel


The Baron roared with putrid breath,

A fog so vile it threatened death.

But Guffalot, with pinky raised,

Inhaled once—and was unfazed.


“I’ve dined on Brussels, eggs, and beans,

With monks who meditate on spleens.

Your stench is loud, but lacks finesse—

I smell no narrative, no stress.”

Then, with a flourish bold and neat,

He turned and let one—discreet.

It rang with grace, it sang with art,

A minor key, yet full of heart.


IV. The Aftermath


The Baron swooned, the Queen awoke,

She sniffed, and nearly died of shock.


“What fragrance is this noble beast?”

“A subtle mix,” said Guff, “at least…

of chivalry, remorse, and thyme—

with distant notes of porcupine.”

And thus he won, our fragrant knight,

Restored the Queen, and cleared the night.

His name is sung when air is dense—

A hero of flatulence and sense.


Julian signed this version simply:


"Based on a misunderstood dream. Or a bean salad."

Shall I unearth the footnotes from the Royal Book of Etiquette on Medieval Wind? Or would you rather we pivot to tragedy—tragedy with capes and lingering odors?