AI Watch — May 2026: Google I/O Drops Three Gemini Models, OpenAI Spins Up DeployCo, and Anthropic Bets on Gates

A Week Where Platforms Took the Mic

This is issue 3 of the series, and the May 11-22 stretch was dominated by platform announcements. Google opened its full toolbox at I/O: three new Gemini models, including a proactive agent and a video generator. OpenAI launched its enterprise deployment subsidiary, with a London acquisition bringing 150 engineers along for the ride. Anthropic signed two major deals: $200 million with the Gates Foundation over four years, and an extension of its PwC alliance one week after OpenAI's. And Mira Murati shipped her first model at Thinking Machines.

I've sorted through it. No hype, no recap of 25 startups that raised $5M each. Here are the announcements that actually change something for your stack and your tradeoffs this week.

The Major News This Week

OpenAI launches the Deployment Company and acquires Tomoro — May 12: OpenAI opens a subsidiary dedicated to enterprise deployment, capitalized at $4 billion, while acquiring Tomoro — an AI consulting firm based in London that brings 150 full-time Forward Deployed Engineers. The model is lifted straight from Palantir: you don't send slides, you send engineers who sit inside the client for months building the system. Tomoro has the references — Fidelity International, Virgin Atlantic, Tesco, the NBA, and Supercell, where they delivered an in-game support agent for 110 million players in 12 weeks. The signal for you: Accenture, Deloitte, and Capgemini just took a punch. If you're in traditional consulting selling AI integration to the Fortune 500, your competition now comes directly from the model vendor, with margins no consultant can match.

Anthropic and the Gates Foundation: $200M over 4 years — May 14: Anthropic signs its largest philanthropic commitment yet: $200 million in grants, Claude usage credits, and technical support spread over four years, dedicated to global health, education, life sciences, and economic mobility. First focus on the health side: polio, HPV, and eclampsia/preeclampsia in low- and middle-income countries. On the education side, Anthropic and Gates are co-developing tools for math tutoring, college advising, and curriculum design across the US, sub-Saharan Africa, and India. It's the first time a frontier lab puts this much money on the table for non-monetizable use cases. The strategic subtext: Anthropic positions itself as the serious and responsible player against OpenAI on sovereign and public-sector markets — a turf where state trust weighs more than benchmarks.

PwC extends its Anthropic alliance (30,000 staff certified on Claude), May 14: And here comes the twist. One week after the OpenAI x PwC announcement on the finance function, PwC expands its alliance with Anthropic: Claude Code and Cowork rolling out across 300,000 staff, 30,000 Claude-certified, and a joint Center of Excellence. First business unit anchored on Anthropic: the Office of the CFO. The numbers are showing up in the field: insurance underwriting goes from 10 weeks to 10 days, security tasks from hours to minutes, and clients are reporting up to 70% delivery improvements. The take: PwC isn't picking a side, it's benchmarking both vendors internally so it can choose client by client based on use case. That's the pattern every Big Four firm will adopt. If you're building B2B SaaS and you've put all your eggs in one provider's basket, watch what PwC does and copy it.

Google I/O 2026: three new Gemini models in one breath, May 19: Google opens the gates. Gemini 3.5 Flash becomes the default in the Gemini app and in Search AI Mode worldwide, with 4× the generation speed of 3.1 Flash and pricing around one-third of comparable frontier models. Gemini Spark is a proactive agent, available in beta to Google AI Ultra subscribers at $250/month, that runs tasks in the background across Gmail, Calendar, Docs, and Drive without being asked at each step. And Gemini Omni is a cinematic video model that generates, remixes, and edits from text prompts, images, or existing clips. The strategy is unambiguous: Google leans on its Search/Gmail/Android distribution to charge a fourth time for the same users OpenAI and Anthropic already monetize. At $250/month, AI Ultra is now more expensive than ChatGPT Pro and Claude Max combined.

Thinking Machines ships its Interaction Models, May 11: First public release from Mira Murati since she raised $2B at a $12B valuation last year. TML-Interaction-Small is a 276-billion-parameter MoE with 12 billion active, trained from scratch for full-duplex operation: it listens and speaks at the same time, like a human on the phone, with a 0.40-second response time. The difference vs OpenAI or Google voice models: it's not a harness bolted onto a sequential LLM, it's a native real-time architecture. Available in limited preview. If you work on voice assistants, live customer support, or conversational interfaces, this is the first credible demo of something that could replace the OpenAI Realtime plus post-processing combo you've probably duct-taped together.

New Models to Know

Gemini 3.5 Flash — May 19: Google's new workhorse. Generation speed up 4× vs Gemini 3.1 Flash, pricing positioned around one-third of comparable frontier models. It's the natural candidate for any code currently calling Gemini Flash: the migration is worth an afternoon of testing. For workflows with lots of short parallel requests (extraction, classification, routing), it's probably the best price/performance ratio available. Test it by swapping GPT-5.4 Mini or Claude Haiku calls in your pipelines.

Gemini Omni — May 19: Google's cinematic video model. Generation from text prompts, but also remixing from an image or existing clip, and conversational editing (make this a wide shot, swap the sky for a sunset). Direct competitor to the Chinese Seedance models and the latest Veo wave. For product marketing or social content, the real news is iterative editing in natural language: you stop regenerating prompts blindly to iterate.

TML-Interaction-Small — May 11: 276 billion total params, 12 billion active, MoE. The thing to remember: it's natively full-duplex, not a layer above a sequential LLM. The model processes audio continuously in 200ms blocks and responds while you're still speaking. Available in preview at Thinking Machines only, no open API yet. But the architectural bet matters: if real-time interaction becomes native, voice agents jump from awkward lag to natural overnight.
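To make the 200ms figure concrete, here is a minimal sketch of what cutting an audio stream into 200ms blocks looks like on the client side. It only illustrates the framing, not anything about how Thinking Machines' model works internally. The 16 kHz mono sample rate and the names samples_per_block and iter_blocks are my assumptions, not something from the announcement.

```python
from typing import Iterator, Sequence

SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz mono PCM, a common rate for speech
BLOCK_MS = 200           # block duration quoted in the text


def samples_per_block(sample_rate_hz: int = SAMPLE_RATE_HZ,
                      block_ms: int = BLOCK_MS) -> int:
    """Number of samples contained in one block of audio."""
    return sample_rate_hz * block_ms // 1000


def iter_blocks(samples: Sequence[int], block_size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive fixed-size blocks; the last one may be shorter."""
    for start in range(0, len(samples), block_size):
        yield samples[start:start + block_size]


# One second of silence stands in for a live microphone feed.
one_second = [0] * SAMPLE_RATE_HZ
blocks = list(iter_blocks(one_second, samples_per_block()))
print(len(blocks), len(blocks[0]))  # prints: 5 3200
```

Under these assumptions, a full-duplex system gets a fresh block five times per second, which is what lets it start reacting before you finish your sentence.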

AI Tools to Activate This Week

Gemini Spark, May 19: Google's proactive agent, in beta for Trusted Testers and AI Ultra subscribers starting next week. Spark runs continuously across your Google apps (Gmail, Calendar, Docs, Drive) and proposes actions without being explicitly prompted: daily digests, reply drafts, thread clustering, suggested Calendar blocks. If you're already deep in Google Workspace, test it immediately. If you're on Microsoft 365, hold off: Microsoft shipped its Copilot Calendar Agent last week, and a direct competitor to Spark is coming from that side.

OpenAI Deployment Services: Available since May 12. If you're a large organization struggling to move from POC to production on GPT-5.5, this is the new offering to look at. Typical engagement: 6 to 12 months, OpenAI/Tomoro engineers embedded inside your organization, industrial-grade deliverables. Pricing not public, but positioned to take on eight-figure transformation contracts at Accenture. Concretely, you pay OpenAI to do the work you'd have subcontracted to a consulting firm, except OpenAI knows its model better than anyone.

PwC Claude Center of Excellence: Odd to list a consulting firm in tools, but that's exactly what this is. If you're a mid-cap company that is already a PwC client for other work and you want to drive a Claude Code rollout across a hundred engineers without reinventing everything in-house, the Center of Excellence is the channel. They have the stack, the governance templates, and 30,000 certified staff incoming. Benchmark it against your historical integrator before going custom.

The Numbers That Matter

$4 billion — initial capitalization of OpenAI's Deployment Company. First time a model vendor goes head-on at the consulting services layer. For the major firms, that's a new direct competitor on a market they've dominated unopposed for 30 years.

150 Forward Deployed Engineers — the Tomoro team joining OpenAI on day one. Compare that to the few dozen solution engineers Anthropic and Google currently have inside enterprise clients. OpenAI buys, in one shot, a field presence that would have taken 18 months to recruit.

$200 million over 4 years — the Anthropic x Gates Foundation commitment. That's the equivalent of the annual revenue of several so-called leading AI startups. Anthropic can afford this spend because its API margin has exploded. The signal for the sector: we're entering the age of massive philanthropic programs financed by AI revenue.

30,000 PwC staff Claude-certified out of the firm's 300,000 total. 10% of the workforce certified by end of 2026. If you're a competing consulting firm and you haven't started a similar program, you've already lost a full cycle. This isn't R&D anymore, it's deployment.

4× faster — Gemini 3.5 Flash vs Gemini 3.1 Flash, at a cost around one-third of comparable frontier models. For high-volume workflows (extraction, classification, routing), it's probably the best price/performance ratio available today. Remains to be seen how GPT-5.5 and Claude respond.

0.40-second response latency: TML-Interaction-Small at Thinking Machines. That falls inside the 200-600ms range usually cited as the threshold for natural human conversation. For real-time voice use cases, the awkward lag you suffer today on most assistants could disappear within 6 months.

Up to 70% delivery improvement: what PwC is reporting on client engagements with Claude. Take it with a grain of salt, since that's a consulting firm reporting on its own client work. But the concrete examples are there: insurance underwriting from 10 weeks to 10 days, security tasks from hours to minutes. It's the first time a Big Four firm puts productivity metrics this high on the table publicly.

My Take — What I'm Actually Doing With This

Lots of platform announcements this week. Here's how I'm prioritizing decisions after testing or benchmarking the releases I could access:

On Gemini 3.5 Flash in your pipelines → Test immediately if you're running GPT-5.4 Mini, Claude Haiku, or Gemini 3.1 Flash at volume. An afternoon of tests on your existing prompts will tell you if the migration is worth it for your API bill. At B&Inside, this is exactly the kind of tradeoff we revisit on every major economy-model release: our extraction and classification bill is larger than our reasoning-model bill.
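For that afternoon of tests, a minimal sketch of a comparison harness could look like the following. It replays your existing prompts through the current model and the candidate, then reports how often they agree and how long each takes. The two model functions here are hypothetical stubs with made-up logic; in practice each would wrap whichever vendor SDK call you already use.

```python
import time
from statistics import mean
from typing import Callable, Dict, List

# Any function mapping a prompt to a text answer can be benchmarked.
ModelFn = Callable[[str], str]


def compare_models(prompts: List[str], current: ModelFn,
                   candidate: ModelFn) -> Dict[str, float]:
    """Replay prompts through both models; report agreement and mean latency."""
    if not prompts:
        raise ValueError("need at least one prompt")
    agree = 0
    current_s: List[float] = []
    candidate_s: List[float] = []
    for prompt in prompts:
        t0 = time.perf_counter()
        expected = current(prompt)
        t1 = time.perf_counter()
        got = candidate(prompt)
        t2 = time.perf_counter()
        current_s.append(t1 - t0)
        candidate_s.append(t2 - t1)
        # Exact match after normalization suits classification-style outputs.
        if got.strip().lower() == expected.strip().lower():
            agree += 1
    return {
        "agreement": agree / len(prompts),
        "current_mean_s": mean(current_s),
        "candidate_mean_s": mean(candidate_s),
    }


# Hypothetical stubs standing in for real API calls.
def current_model(prompt: str) -> str:
    return "invoice" if "invoice" in prompt.lower() else "other"


def candidate_model(prompt: str) -> str:
    text = prompt.lower()
    return "invoice" if "invoice" in text or "bill" in text else "other"


prompts = [
    "Invoice #42 is overdue",
    "Meeting moved to 3pm",
    "Your bill is attached",
    "Lunch on Friday?",
]
report = compare_models(prompts, current_model, candidate_model)
print(report["agreement"])  # prints: 0.75
```

Exact-match agreement only makes sense for extraction, classification, and routing outputs. For free-form text you would need a different scoring step, and the disagreements are the rows worth reading by hand before deciding.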

On OpenAI's Deployment Company → If you're a large organization struggling to industrialize your AI POCs, ask them for a call. The pitch of having OpenAI engineers embedded is radical, and contracts are rare for now, so there's a short window to exploit while the vendor is still in seduction mode. Personally I'd push Skello to benchmark the offering against its historical integrator.

On the PwC x Anthropic + OpenAI combo → The pattern to copy. No reason to put all your eggs in one vendor's basket when both frontier players are competing over who offers the best enterprise deal. If you build B2B SaaS, architect your stack to switch models in an hour: a homemade router or OpenRouter is 2 days of dev that protects you for the next 18 months.
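As a rough idea of what a homemade router can look like, here is a minimal sketch, not a production design. Each task type maps to an ordered list of backends, and the router falls through to the next one when a call fails. The names ModelRouter, vendor_a, vendor_b, and flaky_vendor are all hypothetical; the backends are stubs where real code would wrap each vendor's SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A backend is any function that maps a prompt to a completion string.
Backend = Callable[[str], str]


@dataclass
class ModelRouter:
    """Routes each task type to an ordered list of backends, with fallback."""
    backends: Dict[str, Backend] = field(default_factory=dict)
    routes: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, name: str, backend: Backend) -> None:
        self.backends[name] = backend

    def set_route(self, task: str, backend_names: List[str]) -> None:
        unknown = [n for n in backend_names if n not in self.backends]
        if unknown:
            raise KeyError(f"unregistered backends: {unknown}")
        self.routes[task] = list(backend_names)

    def complete(self, task: str, prompt: str) -> str:
        errors = []
        # Try backends in priority order; move on when one raises.
        for name in self.routes[task]:
            try:
                return self.backends[name](prompt)
            except Exception as exc:
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all backends failed: " + "; ".join(errors))


# Hypothetical stubs standing in for real vendor calls.
def vendor_a(prompt: str) -> str:
    return f"[vendor_a] {prompt}"


def vendor_b(prompt: str) -> str:
    return f"[vendor_b] {prompt}"


def flaky_vendor(prompt: str) -> str:
    raise TimeoutError("simulated outage")


router = ModelRouter()
router.register("vendor_a", vendor_a)
router.register("vendor_b", vendor_b)
router.register("flaky_vendor", flaky_vendor)
router.set_route("classification", ["flaky_vendor", "vendor_b"])
router.set_route("drafting", ["vendor_a"])

print(router.complete("classification", "Is this invoice overdue?"))
```

The point of the design is that switching vendors becomes a one-line change to set_route rather than a hunt through every call site, which is what makes the "switch in an hour" target realistic.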

On Gemini Spark and proactive agents → Test it if you're already deep in Google Workspace. Otherwise, wait two months: OpenAI Workspace Agents and the new Copilot Calendar Agent will ship comparable features, and the choice will come down to which ecosystem you're already anchored in, not the underlying model.

On Thinking Machines and real-time interaction → Watch, don't act. The model isn't on open API yet. But if you're prototyping a voice assistant or a live support agent, keep an eye on it: the conversational quality gap between native full-duplex and systems duct-taped together via API will become obvious by fall. This is the moment to learn the tech, not migrate.

See You Next Week

The weekly cadence is holding. Same next week: no hype, no buzzwords, just the announcements that move your stack or your decisions.

If one of these stories resonates — multi-model strategy, the vendor-consultant power shift, or latency tradeoffs on voice agents — let's talk. See you next week.