
Voice dictation should be free and open source.

Why open source speech-to-text makes sense

Matthew Wang · Maintainer, Freestyle · May 31, 2026 · 5 min read

Lately, speech-to-text dictation apps like Wispr Flow have taken the world by storm. The ads are everywhere: in your social media feeds, on YouTube, and in promotions from your favorite influencers.

Hi, I'm Matthew. I'm an engineer and maintainer of Freestyle, the free and open-source voice dictation app.

I'm not the person you'd expect to be writing this. I have been a keyboard maximalist for the longest time. I bought a Keychron keyboard, obsessively decked it out with accessories, and finally broke 110 words per minute last year. Voice dictation struck me as a gimmick at the time.

This year I caved in to the ads and tried it out. I became obsessed. It's changed the way I code, letting me talk directly to my agents and spin up tasks by voice. When writing articles like this one, it lets me dump my thoughts messily and polish later. I became an advocate for this mode of human-computer interaction.

I'm excited to share the preview of Freestyle, and I want to lay out what our community is building and why.

We should not be paying for dictation apps

Before building Freestyle, I was a heavy Wispr Flow user. Credit where it's due: they've shipped a genuinely polished product. The transcription latency is well optimized, the post-processing works great, and the whole thing feels premium. It's a delight to use.

But a couple of things didn't sit right with me. Why are we paying $12 a month for dictation? And what sense does it make for Wispr to raise a Series A at a $2B valuation to build a technology that's been around for decades?

Let's do some napkin math on unit economics.

A typical knowledge worker dictates 10–15 minutes a day. Run that through OpenAI's Whisper at $0.006 per minute, add a cheap language model for post-processing to clean up the text, and you land at roughly $3 per user per month. This is a generous estimate.

You're paying $12. It costs them $3 to serve you. You're paying them an excess $9 a month for a pretty UI.
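To make the napkin math concrete, here is a minimal sketch in Python. The $0.006 per minute Whisper price, the 10–15 minutes a day, and the $12 subscription come from the numbers above. The 30 days of dictation per month and the $0.30 post-processing figure are my own illustrative assumptions, not measured costs.

```python
# Napkin math for the monthly cost of serving one dictation user.
WHISPER_USD_PER_MINUTE = 0.006   # price quoted above
SUBSCRIPTION_USD = 12.00         # monthly subscription price quoted above


def monthly_cost(minutes_per_day, days_per_month=30, post_processing_usd=0.0):
    """Estimated cost in USD to serve one user for one month.

    days_per_month=30 assumes dictation every single day (a generous upper
    bound). post_processing_usd is a flat, hypothetical allowance for the
    language model that cleans up the text.
    """
    transcription = minutes_per_day * days_per_month * WHISPER_USD_PER_MINUTE
    return transcription + post_processing_usd


# Transcription alone, for the low and high end of daily usage.
low = monthly_cost(10)
high = monthly_cost(15)
print(f"Transcription only: ${low:.2f} to ${high:.2f} per month")

# High-end usage plus an assumed $0.30 per month of post-processing.
total = monthly_cost(15, post_processing_usd=0.30)
print(f"With post-processing: ${total:.2f} per month")
print(f"Left over from the subscription: ${SUBSCRIPTION_USD - total:.2f}")
```

Even with the heaviest usage in that range and dictation on every day of the month, transcription stays under $3, which is why I call the $3 figure generous.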

And let's talk about privacy. Every sound you make, every word you say, and every transcription passes through their servers. They can wave a SOC 2 compliance report at you or offer zero-day retention agreements. It doesn't change the fact that this service is a standing risk to your privacy.

We live in a time when state-of-the-art transcription models can run 100% locally, on device. No private thought you speak and no transcript you generate ever has to leave your device. That's what we're building with Freestyle.

Voice dictation is a commodity, and it should be free and open source for the community.

Freestyle's open source community to further voice HCI

Wispr Flow is polished but closed. Plenty of open source alternatives exist, yet none of them feel nearly as polished. This gap is our motivation.

Freestyle will deliver sub-second transcription latency with post-processing strong enough to keep your text clean. We want to match that premium feel while still respecting your privacy and sparing you the subscription. For an open-source project, clearing that bar is an uphill climb. Proving that it is possible is our mission.

Longer term, we want Freestyle to be the frontier open-source harness for voice models: a place to discuss speech-to-text optimization and voice HCI, and a platform where researchers can test the latest voice models openly, across thousands of real users. We firmly believe open source is the best way to push this space forward.

What's coming

First off, I want to shout out those who have been part of our community in its first week of existence: Aditya (maintainer), Isaac, Uday, Sri, Shikhar, and the other maintainers who have helped build this project.

We're launching a preview of Freestyle available on our website and GitHub. Freestyle is a desktop app that works on Mac, Windows, and Linux. We've built support for cloud transcription models, local on-device models, and post-processing to clean your text.

We have a long road ahead of us to build a great product. Freestyle is a technically challenging and ridiculously fun project to work on. We're looking to build a community of contributors, and all skill levels are welcome.

If this all sounds interesting, please consider checking out our project!
