Mira Murati's Thinking Machines Bet: Why 'Interaction Models' Could Be the Next AI Paradigm Shift

For eighteen months, Mira Murati — the former CTO of OpenAI who briefly became its interim CEO during one of the most dramatic corporate upheavals in tech history — has been almost entirely silent. She left OpenAI in late 2024, founded Thinking Machines Lab, and then vanished from the public eye while competitors like Anthropic and xAI dominated headlines with billion-dollar raises and trillion-dollar valuations. But at the Bloomberg Tech conference this week, Murati finally stepped back into the spotlight with something that could fundamentally change how humans interact with AI: a new class of models called interaction models that process and respond to audio, text, and video simultaneously in real time — more like a phone call than a text message.

What Are Interaction Models and Why Do They Matter?

Nearly every AI model you've used works the same way. You type a prompt or speak a command, and you wait. The model processes your input, generates a complete response, and sends it back. During that processing window, the model is effectively deaf and blind: it receives no new information from you until it finishes generating or you manually interrupt it. This turn-based architecture, often called "half duplex," has been a fundamental constraint on human-AI collaboration since the earliest chatbots.

Thinking Machines Lab is trying to break that constraint with what it calls interaction models: AI systems designed from the ground up to process continuous streams of audio, text, and video in 200-millisecond intervals. The technical term is "full duplex" — the model can listen and speak simultaneously, picking up on the texture of human communication that turn-based systems completely miss: mid-thought corrections, hesitations, interruptions, even the pauses that indicate someone is thinking.
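The difference between the two modes can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Thinking Machines' implementation: the function name `simulate`, the sample timeline, and the rule that a half-duplex model simply drops input arriving while it generates are all invented for illustration. Only the 200-millisecond tick comes from the article.

```python
TICK_MS = 200  # interval quoted in the article; all other values are made up


def simulate(user_chunks, response_ticks, full_duplex):
    """Replay user input, one entry per 200 ms tick (None means silence).

    Returns (heard, missed) as lists of (time_ms, chunk). In this toy,
    a half-duplex model drops anything that arrives while it is generating.
    """
    heard, missed = [], []
    busy = 0  # ticks of generation still to run
    for tick, chunk in enumerate(user_chunks):
        generating = busy > 0
        if generating:
            busy -= 1
        if chunk is None:
            continue
        if generating and not full_duplex:
            missed.append((tick * TICK_MS, chunk))
        else:
            heard.append((tick * TICK_MS, chunk))
            if not generating:
                busy = response_ticks  # the reply starts on the next tick
    return heard, missed


timeline = ["book a flight", None, "actually, a train", None, None, "to Lyon"]
half = simulate(timeline, response_ticks=3, full_duplex=False)
full = simulate(timeline, response_ticks=3, full_duplex=True)
print("half duplex missed:", half[1])
print("full duplex missed:", full[1])
```

In the half-duplex run, the mid-thought correction lands while the toy model is still replying and never reaches it; in the full-duplex run, every chunk is heard.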

"We think interactivity should scale alongside intelligence," the company wrote in its research announcement. "The way we work with AI should not be treated as an afterthought."

The research preview model, TML-Interaction-Small, responds in 0.40 seconds — roughly the speed of natural human conversation and significantly faster than comparable models from OpenAI and Google. Thinking Machines claims state-of-the-art combined performance in both intelligence and responsiveness, though independent verification will have to wait for a wider release planned for later this year.

Why Is the Current Turn-Based Approach Holding AI Back?

The problem with turn-based AI interaction goes deeper than mere inconvenience. In its own analysis, Thinking Machines cites a striking admission from a recent frontier model card, part of Anthropic's Claude documentation: "When used in an interactive, synchronous, 'hands-on-keyboard' pattern, the benefits of the model were less clear. When used in this fashion, some users perceived [our model] as too slow and did not realize as much value."

This is a remarkable confession from one of the world's leading AI companies. It reveals that the current interface paradigm is actively undermining the value of even the most capable models. Users can't fully specify their requirements upfront and walk away — good AI outcomes benefit from a collaborative process where the human stays in the loop, clarifying and giving feedback along the way. But today's interfaces push humans out of the loop not because the work doesn't need them, but because the interface itself has no room for them.

Thinking Machines frames this as a "collaboration bottleneck." Drawing on research from communication theory (Clark and Brennan's work on "grounding in communication") and economics (Hayek's insights about decentralized knowledge), the company argues that the single-threaded, turn-based model of AI interaction creates a narrow channel that limits how much of a person's knowledge, intent, and judgment can actually reach the model. Their analogy is visceral: "Picture trying to resolve a crucial disagreement over email rather than in person."

How Does Thinking Machines' Approach Differ from Existing Solutions?

The distinction matters. Many existing "conversational" AI products, a category that includes OpenAI's Advanced Voice Mode and Google's Gemini Live, achieve real-time interaction with the help of external scaffolding. In the classic form of that design, a speech-to-text model, a language model, and a text-to-speech model are stitched together, with an orchestration layer coordinating between them. Each component operates independently, creating latency at every handoff and preventing true simultaneous processing.
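A toy latency budget shows why handoffs add up in a stitched pipeline of this kind. The stage names follow the paragraph above, but every number is an invented placeholder rather than a measurement of any real product, and `Stage` and `cascade_latency_ms` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: int  # placeholder value, not a benchmark


def cascade_latency_ms(stages, handoff_ms):
    """Delay before the first reply when each stage waits for the previous one.

    Total = sum of stage latencies + one handoff cost between adjacent stages.
    """
    if not stages:
        return 0
    return sum(s.latency_ms for s in stages) + handoff_ms * (len(stages) - 1)


pipeline = [
    Stage("speech-to-text", 150),
    Stage("language model", 300),
    Stage("text-to-speech", 120),
]
print(cascade_latency_ms(pipeline, handoff_ms=20))  # 570 + 2 * 20 = 610
```

The point of the sketch is structural: in a strictly sequential cascade the delays are additive, so no single stage can be tuned to make the whole chain feel simultaneous.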

Thinking Machines' interaction models, by contrast, are trained from scratch as a single integrated system. The company uses a "multi-stream, micro-turn design": rather than one long input-output sequence, the model processes many short micro-turns across audio, video, and text streams simultaneously. This architectural choice is what enables the 200-millisecond processing intervals and the ability to truly listen while generating output.
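One way to picture a multi-stream, micro-turn layout is to slice each input stream into 200-millisecond windows and group whatever arrived in the same window into one micro-turn. The sketch below is a guess at the general shape only; the company has not published this code, and the function name `micro_turns`, the dictionary layout, and the sample chunks are all assumptions.

```python
from itertools import zip_longest


def micro_turns(streams, window_ms=200):
    """Group per-window chunks from several streams into micro-turns.

    streams maps a stream name to a list with one chunk per window
    (None where nothing arrived). Shorter streams are padded with None.
    """
    names = list(streams)
    windows = zip_longest(*(streams[name] for name in names))
    for index, chunks in enumerate(windows):
        yield {
            "t_ms": index * window_ms,
            "inputs": {n: c for n, c in zip(names, chunks) if c is not None},
        }


example = {
    "audio": ["a0", "a1", "a2"],
    "video": ["v0", "v1"],
    "text": [None, "hi"],
}
for turn in micro_turns(example):
    print(turn)
```

Each yielded micro-turn would be the unit on which a model could both read new input and emit a small piece of output, which is what distinguishes this layout from a single long prompt followed by a single long reply.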

The company acknowledges that existing specialized models like Moshi, PersonaPlex, and Nemotron VoiceChat have explored similar territory, but argues that these are either too small-scale or too narrowly focused to compete with general-purpose frontier models on intelligence while also maintaining real-time responsiveness. Thinking Machines claims its approach achieves both.

Who Is Mira Murati and Why Does Thinking Machines Matter?

Mira Murati's credibility in the AI industry is hard to overstate. Born in Albania, she joined OpenAI in 2018 and spent roughly six years there, rising to CTO in 2022. During that time she oversaw the development and launch of ChatGPT, GPT-4, and DALL-E. She was the technical leader behind some of the most consequential AI products ever shipped. Her brief, chaotic stint as interim CEO in November 2023, when OpenAI's board fired Sam Altman and the company nearly imploded, made her a figure of both respect and fascination in Silicon Valley.

At Bloomberg, Murati was measured but revealing. She said that during the OpenAI crisis, she felt "clear about her decisions in each moment" — that protecting the mission and the team was the through-line. She said OpenAI "would have imploded" if not for her involvement. But she acknowledged that "clarity of intent is not the same thing as clarity about consequences," saying she would have pushed harder for more information and better transparency. Pressed on whether she still trusts Sam Altman, she sidestepped, steering toward a larger concern about "the concentration of consequential decisions in too few hands" across the industry.

That governance-focused worldview is consistent with Thinking Machines' deliberate approach. The company has spent 18 months building quietly, shipping only one public product — Tinker, a managed API for fine-tuning open-source models that researchers at Princeton, Stanford, Berkeley, and Redwood Research have been using for mathematical theorem proving, chemistry reasoning, and reinforcement learning experiments.

When Bloomberg's Emily Chang pressed her on high-profile researcher departures from Thinking Machines, Murati downplayed the issue, noting that "building a frontier AI lab from scratch compresses years of normal organizational volatility into months." She also acknowledged the nine-figure compensation packages driving AI talent wars but quipped, "When I wake up in the morning, I am not thinking about how to kill the competitor."

What Are the Challenges and Skepticism Surrounding Interaction Models?

Despite the impressive technical claims, there are reasons for healthy skepticism. First, this is a research preview, not a product. Thinking Machines has so far only announced TML-Interaction-Small; the "limited research preview" is still months away, and a wider release isn't planned until later this year. The benchmarks are self-reported, and independent validation will be critical.

Second, the competitive landscape is brutal. In the time Thinking Machines has been heads-down, Anthropic has filed for an IPO at a $965 billion valuation, OpenAI raised $122 billion, and xAI merged with SpaceX and is targeting a $2 trillion public offering. These companies are generating billions in revenue. Thinking Machines, by contrast, has only a fine-tuning API and a research preview to show for 18 months of work. As TechCrunch noted, "In that environment, staying heads down has diminishing returns; at some point, you have to make some noise just to remind the market you exist."

Third, there's the fundamental question of whether interaction quality is actually the bottleneck. Most enterprise AI adoption challenges today aren't about response latency — they're about accuracy, reliability, cost, and integration with existing workflows. Anthropic's $47 billion revenue run rate was built on Claude's coding and enterprise capabilities, not on conversational fluidity. Thinking Machines is betting that interaction quality becomes the next frontier as the intelligence gap between frontier models narrows.

What Does This Mean for the Future of Human-AI Collaboration?

If Thinking Machines delivers on its promise, interaction models could represent a genuine paradigm shift. The move from half-duplex to full-duplex AI interaction isn't just an engineering improvement — it's a conceptual leap that could unlock entirely new use cases: real-time AI tutoring that adapts to student confusion as it happens, AI pair programming partners that understand when you're stuck before you say so, and collaborative AI assistants that work with you the way a talented colleague would, rather than treating you as a ticket-submitter in a queue.

Murati herself framed the stakes in terms that echo her governance concerns: "If humans take their hands off the wheel too soon, the future will look very different, and not better." Interaction models, in her vision, are a way to keep humans in the loop — not by limiting AI's autonomy, but by making the loop itself wider, faster, and more natural.

The question isn't whether real-time, full-duplex AI interaction is technically possible. Thinking Machines' research preview, along with earlier work from companies like Hume AI and Kyutai, suggests it is. The real question is whether it matters enough to build a business around, and whether Mira Murati's quietly assembled team can ship it before the giants catch up.