2 min watch · June 13, 2026

Little brains in every product

Five years building on-device first at Detail and Subwave led to Desert Ant Labs, a European AI lab shipping small audio and visual models for the 6 billion devices people already own.

Paul Veugen

For five years we've built on-device first at Detail and Subwave. Going local let us ship the best possible experience and commoditize video creation, from local rendering to captions to enhancement, while keeping your video on your device.

Over the past few years I've looked for the most ambitious founders building the foundations for on-device inference, the pieces that could power Detail and Subwave and millions of other apps. With NP-Hard we backed the stellar team at Mirai, pushing LLM inference to 1,000 tokens per second. And at Detail, we use Argmax's lightning-fast on-device speech models. But there's so much more opportunity for local models and SDKs to help developers build intelligent features with audio, video, and images.

While all the money flows to data centers and cloud inference, we're shipping billions of insanely capable devices a year. We carry more compute in our hands than sits in the entire cloud.

When usage doesn't hurt

Every developer I spoke to at WWDC this week had the same story: a wish list of AI features they'd build if cost weren't a factor, or expensive tokens they spend on existing features they'd happily swap for a local model. The demand is there. The building blocks aren't.

Commoditizing intelligence

That's why we're building Desert Ant Labs, a European on-device AI lab. We're going to ship dozens of small, opinionated audio and visual models and SDKs that drop into any product with a few lines of code. No inference cost. Nothing leaving the device. Running on the 6 billion devices people already own.

The first models already power Detail and Subwave. Before the end of the year, a dozen will be ready for third-party developers to build their own features.

On-device models open new ways to build, mixing free local compute with cloud-based infrastructure. When inference cost drops to zero, product design changes at a fundamental level. You stop making trade-offs between capability and cost on every feature. And with audio, video, and images not leaving your device, you can build intelligent features without giving up privacy or security.

We're going to put little brains in every product.