A Week Where the Money Caught Up With the Tech

The weekly cadence holds, and the May 3-8 stretch delivered. Sierra raised nearly a billion at a $15.8B valuation, OpenAI announced its first large-scale collaboration with a Big Four firm (PwC), and EU lawmakers quietly pushed AI Act high-risk obligations out to late 2027. Meanwhile, OpenAI shipped three voice models doing real-time translation across 70 languages, and a California lab published a MoE trained on AMD hardware that humbles models 10× its size.

I've sorted through it. No hype, no recap of 30 startups that each raised $5M. Here's what actually changes something for your stack and your decisions this week.

The Major News This Week

Sierra raises $950M at $15.8B valuation — May 4: Bret Taylor (former Salesforce co-CEO, OpenAI chairman) and Clay Bavor (former Google Labs head) double Sierra's valuation in six months. The round is led by Tiger Global and GV, with Benchmark, Sequoia and Greenoaks following. The truly impressive number isn't the round size — it's the ARR: Sierra went from $100M in November to $150M in February, and claims 40% of the Fortune 50 as customers. Concretely, Sierra builds AI agents for customer service (insurance renewal, mortgage refinancing, after-sales support). It's the first enterprise AI player to reach this traction without building its own foundation model — Sierra orchestrates Claude, GPT and their variants depending on the use case. The market signal: there's room for specialized orchestration layers above LLMs, provided you have deep integration in a specific vertical.

OpenAI x PwC: the first AI-native finance function — May 5: OpenAI and PwC announce a collaboration to build the first enterprise finance function entirely driven by AI agents. The playing field: procurement, payments, treasury, tax, accounting close, planning and reporting. The clever part is that OpenAI prototypes the system in-house first, on its own procurement function, before PwC industrializes it for clients. For B2B CFOs, this is the signal that we're past the POC phase: AI agents are landing in core finance workflows with a Big Four packaging it and putting its name on the line. If you have an "agentify finance" file that's been sitting on your roadmap for six months, this is the week it goes back to the top.

The EU pushes AI Act high-risk obligations to December 2027 — May 7: Provisional agreement between the Commission, Council and European Parliament under the Digital Omnibus. High-risk system obligations (Article 6(2), Annex III — biometrics, hiring, credit scoring) that were due August 2, 2026 are pushed to December 2, 2027. Systems embedded in products already covered by other regulations (machinery, medical devices) wait until August 2, 2028. This isn't a softening of substance — the obligations remain — but 16 extra months of runway for European companies. If you work on AI hiring, scoring, or biometrics, you just got a full product cycle to get into compliance. Use it to do things properly instead of slapping on a patch in late 2027.

The US government tests models before public release — May 5: The Center for AI Standards and Innovation (CAISI), under the Department of Commerce, signs pre-deployment evaluation agreements with Google DeepMind, Microsoft and xAI. CAISI will evaluate models before public release. Previous agreements with OpenAI and Anthropic (dating from 2024) are renegotiated to align with America's AI Action Plan directives. Anthropic, meanwhile, remains blocked at the Pentagon — the company refuses Claude usage for autonomous weapons. The take for you: the myth that US AI regulation kills innovation died this week. The US will get regulation, but structured as an industry partnership rather than the EU's compliance-first approach.

IBM Think 2026: watsonx Orchestrate Gen 2 and the multi-agent control plane — May 5: IBM announces its full stack to orchestrate hundreds, even thousands, of AI agents in the enterprise. The next generation of watsonx Orchestrate becomes a control plane that drives agents built on any platform with unified governance policies. Alongside: IBM Bob (an enterprise agentic dev partner), IBM Concert for IT operations, and a managed MCP server on watsonx.data to expose data as discoverable tools. For companies already on the IBM stack, it's a natural upgrade. For everyone else, it's mostly a signal about where the market is going: the value is no longer in the model, but in the layer that orchestrates heterogeneous agents with governance and observability.

New Models to Know

OpenAI ships three real-time voice models — May 7: GPT-Realtime-2 (first voice model with GPT-5-class reasoning), GPT-Realtime-Translate (simultaneous translation across 70 input languages into 13 output languages, at $0.034 per minute), and GPT-Realtime-Whisper (live streaming transcription, $0.017 per minute). For the first time, OpenAI ships a complete and programmable voice stack via the API. For companies doing multilingual customer support, training, or live content, test it immediately. Real-time translation across 70 languages at $2/hour opens use cases that were economically out of reach 12 months ago.
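The per-minute prices above translate directly into hourly unit economics. A quick back-of-the-envelope sketch, using only the prices quoted in the announcement (the function is illustrative, not part of any official SDK):

```python
# Hourly cost for the new realtime voice models, from the announced
# per-minute prices. Model names are from the announcement; the helper
# function itself is just illustrative arithmetic.

PRICES_PER_MINUTE = {
    "gpt-realtime-translate": 0.034,  # simultaneous translation
    "gpt-realtime-whisper": 0.017,    # live streaming transcription
}

def hourly_cost(model: str, minutes: int = 60) -> float:
    """Cost in USD for `minutes` of audio through the given model."""
    return round(PRICES_PER_MINUTE[model] * minutes, 2)

translate_hr = hourly_cost("gpt-realtime-translate")  # 2.04 $/hour
whisper_hr = hourly_cost("gpt-realtime-whisper")      # 1.02 $/hour
combo_hr = translate_hr + whisper_hr                  # translation + transcript
```

That combo lands at roughly $3/hour for live translation plus a written transcript, which is the number to compare against a human agent's hourly rate.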

Zyphra ZAYA1-8B — May 6: Reasoning MoE, 8.4 billion total parameters with only 760 million active per token. Apache 2.0 license on Hugging Face, meaning it's production-deployable without licensing fees. The real news is the training: 100% on AMD stack, on a cluster of 1024 MI300x GPUs with Pensando Pollara interconnect. ZAYA1 hits scores competitive with DeepSeek-R1, Gemini 2.5 Pro, and Claude Sonnet 4.5 on math reasoning benchmarks. First real proof that the AMD ecosystem is viable as a frontier-model training alternative to NVIDIA. For independents and startups, it's an excellent base model to fine-tune on your domain without burning $50K/month in API costs.
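The "8.4B total / 760M active" split comes from mixture-of-experts routing: each token only passes through a small subset of expert weight matrices. A minimal toy sketch of top-k routing, with made-up sizes — nothing here is Zyphra's actual architecture:

```python
import numpy as np

# Toy mixture-of-experts forward pass: a router scores experts per token,
# and only the top_k experts' weights are used. Sizes are illustrative.
rng = np.random.default_rng(0)
n_experts, top_k, d_model = 8, 2, 16
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route a single token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]               # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                        # softmax over selected experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(d_model))

# Only top_k of n_experts matrices touch each token, so active parameters
# are roughly top_k / n_experts of the expert weights (0.25 in this toy;
# about 9% for ZAYA1's 760M-of-8.4B split).
active_fraction = top_k / n_experts
```

The practical consequence is inference cost: you pay memory for all 8.4B parameters but compute only for the ~9% active per token.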

GPT-5.5 Instant becomes ChatGPT's default — May 5: The rollout to all ChatGPT users (free tier included) is what makes this newsworthy — not a new model per se, but a scale change. GPT-5.5 Instant replaces GPT-5.3 Instant with two big upgrades: 52.5% fewer hallucinations on high-stakes prompts (medical, legal, finance) and a search tool that pulls from past conversations, files and Gmail for personalized answers. For Pro and Plus users, that means ChatGPT understands your work context across sessions without re-explaining what you're doing. Anthropic shipped persistent memory back in February on Claude. OpenAI catches up on the consumer side.

AI Tools to Activate This Week

Perplexity Finance Search in the Agent API — May 6: Perplexity adds a `finance_search` endpoint to its Agent API that returns in a single call: licensed financial datasets, real-time market data, and cited web sources. Concretely, you fetch prices, fundamentals, earnings transcripts, analyst estimates and insider activity without integrating each provider separately. Pricing is simple: $5 per 1000 invocations, on top of model token cost. If you're building any agent that touches finance (lead investor scoring, automated watchlists, competitive monitoring), this is the API that saves you weeks of integration and four data vendor contracts.
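At $5 per 1,000 invocations, the budgeting math for an agent on this endpoint is simple. A rough monthly-cost sketch, where the endpoint price comes from the announcement but the token rate is a placeholder assumption, not Perplexity's published pricing:

```python
# Rough monthly-cost model for an agent using the finance_search endpoint:
# $5 per 1,000 invocations (announced), plus underlying model token cost.
FINANCE_SEARCH_PER_CALL = 5.00 / 1000     # $0.005 per invocation
ASSUMED_TOKEN_PRICE_PER_1M = 3.00         # hypothetical blended $/1M tokens

def monthly_cost(calls_per_day: int, tokens_per_call: int, days: int = 30) -> float:
    """Estimated monthly USD cost: endpoint invocations + model tokens."""
    calls = calls_per_day * days
    search_cost = calls * FINANCE_SEARCH_PER_CALL
    token_cost = calls * tokens_per_call * ASSUMED_TOKEN_PRICE_PER_1M / 1_000_000
    return round(search_cost + token_cost, 2)

# e.g. a watchlist agent making 500 calls/day at ~2,000 tokens per call
estimate = monthly_cost(calls_per_day=500, tokens_per_call=2000)
```

Even at that volume the endpoint fee stays in the low hundreds of dollars per month, which is the comparison point against four separate data vendor contracts.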

IBM Bob, the enterprise agentic dev partner — May 5: GA this week, this is IBM's answer to Claude Code and Cursor for the enterprise market. The difference: Bob ships with native security and cost controls (which you'd wire by hand with Claude Code). For CTOs at large organizations who hesitate to let a dev paste their personal Anthropic token into Claude Code, Bob is the compromise that passes compliance. Less powerful than Claude on raw coding per benchmarks, but with the guardrails procurement and security teams need.

The Numbers That Matter

$15.8 billion — Sierra's valuation after this week's round. As a reminder, Sierra is 3 years old, doesn't build its own LLM, and earns its revenue orchestrating other people's models. That's exactly the "application layer above LLMs" thesis that was contested 18 months ago.

$150 million ARR for Sierra in February 2026, up from $100M in November 2025. 50% quarter-over-quarter ARR growth is a trajectory you only see in SaaS during early product-expansion cycles. It validates that enterprise AI agents are no longer in POC phase — they're in paid production.

52.5% fewer hallucinations — GPT-5.5 Instant's improvement on high-stakes prompts (medicine, law, finance) over GPT-5.3 Instant. Probably the most useful quality jump for businesses: a model that's wrong half as often on topics where errors are expensive.

760 million active parameters — Zyphra ZAYA1-8B's signature. This model beats competitors 10 to 30× larger on reasoning benchmarks. The race is no longer about raw size, it's about efficiency per parameter. And for self-hosting, that changes everything: a model that runs on modest hardware rivals proprietary frontier models.

16 extra months of runway for European companies on AI Act high-risk obligations. Not a softening, just a reprieve. If you have AI in hiring, credit scoring, or biometrics, you just gained a full product cycle. This is the moment to do things right, not to procrastinate.

70 input languages for OpenAI real-time translation at $0.034 per minute. That's roughly $2/hour for simultaneous multilingual translation. The first 24/7 multilingual customer support service at production-grade cost is arriving this year.

My Take — What I'm Actually Doing With This

A lot of signals converge this week. Here are my pragmatic recommendations after testing or benchmarking each release:

On enterprise agent orchestration → Sierra's round and IBM's watsonx Orchestrate Gen 2 announcement are two faces of the same thesis: value is moving down from the model layer to the orchestration layer. If you build B2B SaaS, this is the moment to look at whether your product can become an agent orchestrator for your vertical, instead of staying a productivity tool with a chatbot bolted on. At Skello, that's exactly the transition we kicked off — moving from HR planning to orchestration of HR tasks by agents.

On voice models → Test immediately if you do international customer support. The GPT-Realtime-Translate + GPT-Realtime-Whisper combo costs under $10/hour for simultaneous translation plus transcription. That's below the hourly rate of a human agent in most European markets. For B&Inside, it's exactly the kind of stack that opens export markets without hiring a local team.

On the EU AI Act → Don't use the reprieve to slow down. The obligations are unchanged, and companies that get compliant early will have a commercial edge with their enterprise clients (who already require compliance contractually). If you start today, you have 18 months to map your use cases, tag high-risk systems, and write your conformity declarations. No panic, but no slacking either.

On Zyphra ZAYA1 → If you do fine-tuning or self-hosting for sovereignty or cost reasons, ZAYA1-8B deserves a POC this week. 760M active params on AMD hardware is a stack that runs on European servers at reasonable cost, without depending on NVIDIA or a US provider. For European SaaS and the public sector, it's a serious candidate to weigh against Mistral.

On OpenAI x PwC → The most important signal for B2B CFOs. When a Big Four puts its name on an AI-native transformation of a support function, it goes up to the audit committee in every mid-cap. If you have a finance agentification file that's been sleeping, pull it out. The political timing is in your favor for the next 3 to 6 months.

See You Next Week

The weekly cadence is holding. Same thing next week: no hype, no buzzwords, just the releases that actually change something for your stack or your decisions.

If one of these news items resonates particularly — finance agentification, multi-agent orchestration, or AI Act compliance — let's talk. See you next week.