88th Edition Download

Google’s Gemini model learns to interact with browsers; Sam Altman predicts agentic work weeks and prompt-run startups; and xAI debuts Imagine v0.9—video generation with audio plus voice commands

This Week in AI:

No jargon, no filler: just the biggest AI developments worth knowing right now, so you can skip the buzzwords and get straight to the good stuff. Here are this week’s AI shake-ups:

Google DeepMind just dropped a model that actually “uses” a computer via browser actions, opening possibilities for more seamless AI agents across software. Meanwhile, Sam Altman opened up about what’s next at OpenAI: agentic work, novel discovery, and even zero-person startups built by prompts. And xAI isn’t staying quiet either—its new Imagine v0.9 tool can generate videos (audio included) and is going voice-first, letting you tell it what to create hands-free.

Let’s get into it.

TL;DR:

Google’s Gemini 2.5 Computer Use model is designed to interact with UI elements in a browser: typing, dragging, submitting forms, and so on. It supports about 13 actions and is built for interfaces that lack APIs. It is available in preview via Google AI Studio and Vertex AI, with demos such as “play 2048” and “browse Hacker News.”

Our Take:

Until now, agentic models mostly acted via APIs or backend hooks. Gemini’s computer-use mode brings AI directly into the user interface layer. That opens paths for automation on legacy systems, UI testing, hybrid tools, or agents acting where APIs don’t reach. For product teams, it’s a signal: build with UI affordances in mind; expect more agents to “control” frontends soon.
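
For the builders reading: here is a rough sketch of what "acting in the UI layer" looks like as a loop. It is not Google's API. The `Action` type, the three action kinds, `FakePage`, and `scripted_model` are all made up for illustration, and the scripted function stands in for a real model that would look at a screenshot and pick the next step.

```python
from dataclasses import dataclass, field


# Hypothetical action vocabulary, for illustration only.
# The real model's ~13 actions are not listed in this issue.
@dataclass
class Action:
    kind: str          # "click", "type", or "submit"
    target: str = ""   # name of the UI element to act on
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakePage:
    """Stand-in for a browser page: text fields, a focus, a submit flag."""
    fields: dict = field(default_factory=dict)
    focused: str = ""
    submitted: bool = False

    def apply(self, action: Action) -> None:
        """Execute one UI action against the fake page."""
        if action.kind == "click":
            self.focused = action.target
        elif action.kind == "type":
            current = self.fields.get(self.focused, "")
            self.fields[self.focused] = current + action.text
        elif action.kind == "submit":
            self.submitted = True
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def scripted_model(page: FakePage):
    """Placeholder for the model: inspect page state, return the next action."""
    if not page.fields.get("search"):
        if page.focused != "search":
            return Action("click", target="search")
        return Action("type", text="agents")
    if not page.submitted:
        return Action("submit")
    return None  # nothing left to do


def run_agent(page: FakePage, model, max_steps: int = 10) -> list:
    """Observe, pick an action, apply it, repeat until the model stops."""
    history = []
    for _ in range(max_steps):
        action = model(page)
        if action is None:
            break
        page.apply(action)
        history.append(action.kind)
    return history


if __name__ == "__main__":
    page = FakePage()
    print(run_agent(page, scripted_model))
    print(page.fields, page.submitted)
```

The point of the sketch is the shape: the agent never calls a backend API, it only issues the same kinds of actions a person would. In a real setup the page would be a live browser and the step limit would guard against runaway loops.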

TL;DR:

In his Dev Day interview with Rowan Cheung, Sam Altman said AI is entering an era of “novel discovery” (scientists are already using it for breakthroughs). He predicted work might “look less like work” as agentic systems take over time-based tasks. He’s optimistic about zero-person startups—fully prompt-driven ventures. And he thinks Codex isn’t far from autonomously delivering weeks of work.

Our Take:

Two things stand out: (1) Altman is doubling down on agents that carry out extended stretches of work, not just tools that assist, which means the next frontier is AI that does, not just suggests. (2) The idea of zero-person startups is bold and scary: it assumes models can bootstrap companies autonomously. For builders, the key is aligning models to economic incentives. If AI can genuinely generate value, then who “owns” it (you, prompt engineers, the model provider) becomes the next battleground.

TL;DR:

xAI’s new Imagine v0.9 (built on Grok’s Aurora engine) can create videos with audio, as well as image-plus-motion clips, in about 15–20 seconds. A new voice-first interface mode, reached through “Open App in Voice Mode,” also lets users generate content hands-free.

Our Take:

Video + sound + motion is the next step after still image generation—but doing it fast and well is hard. xAI’s push into voice-first control is an interesting UX experiment: imagine telling your phone “make me a 10-second clip of this idea” and getting it instantly. The challenge will be managing coherence, style consistency, and user expectations. If voice becomes a dominant content interface, tools built for typed prompts may feel clunky fast.

🚀 Thank you for reading The Download

Your trusted source for the latest AI developments to keep you in the loop, but never overwhelmed. 🙂 
