In this guide
  1. Why speech to text on Mac is better than ever
  2. How to enable built-in Mac dictation
  3. Limitations of built-in dictation
  4. Third-party speech to text apps compared
  5. Voice Keyboard Pro deep dive
  6. How to set up Voice Keyboard Pro on Mac
  7. 8 speech to text tips for Mac users
  8. Speech to text for specific workflows
  9. Privacy and offline speech to text
  10. Frequently asked questions

Why Speech to Text on Mac Is Better Than Ever in 2026

Speech to text on Mac in 2026 is a fundamentally different experience from what it was even two or three years ago, and the reason is straightforward: Apple Silicon changed the economics of on-device machine learning.

Before M-series chips, speech recognition on a Mac meant one of two things. Either you used Apple's built-in dictation, which was mediocre but free, or you sent your audio to a cloud server, which was more accurate but introduced latency, privacy concerns, and a dependency on your internet connection. Neither option was good enough to replace typing for most people.

Apple Silicon chips (M1 and later) have enough Neural Engine throughput to run full Whisper-class speech recognition models locally, in real time, with no network connection. This is the same family of models that powers the best cloud-based transcription services, except now the processing happens on your machine. Your audio never leaves your Mac. Latency drops to under a second. And accuracy on everyday speech is now high enough that you can dictate an entire document without stopping to correct errors every other sentence.

The result is that speech to text on Mac has gone from a novelty feature to a legitimate input method that some people use for the majority of their text input throughout the day. This guide walks through every option available, from the free built-in tool to specialized third-party apps, so you can find the approach that actually fits your work.

How to Enable Built-In Mac Dictation

Every Mac running macOS Ventura or later includes a built-in dictation feature, and on Apple Silicon Macs it runs on-device. Here is how to set it up, step by step.

1. Open System Settings. Click the Apple menu in the top-left corner of your screen and select System Settings (called System Preferences on older macOS versions).
2. Navigate to Keyboard. In the left sidebar, scroll down and click Keyboard.
3. Enable Dictation. Scroll to the Dictation section at the bottom of the Keyboard settings. Toggle the switch to On. macOS will ask you to confirm and may download an on-device speech model (about 50-100 MB depending on language).
4. Choose your language and microphone. Select your preferred language from the dropdown. If you use an external microphone, select it under the Microphone source. The default is "Automatic," which usually selects the built-in mic.
5. Set your shortcut. On most Macs the default shortcut is pressing the Globe key (fn) twice; Macs with a microphone key in the function row can use that key instead. Other options in the Shortcut menu include pressing the Control key twice or a Command key twice, or you can define a custom shortcut of your choice.
6. Start dictating. Open any app with a text field (Notes, Mail, Pages, Safari), click where you want text to appear, and press your dictation shortcut. A microphone icon appears near your cursor. Speak naturally. Press the shortcut again or click "Done" to stop.

Built-in dictation supports basic voice commands: "period," "comma," "new line," "new paragraph," "exclamation point," "question mark," "open quote," "close quote," "caps on," "caps off," and "all caps." These work inline while you speak.

On Apple Silicon Macs, dictation in supported languages is processed on-device by default, so your audio is not sent to Apple's servers for ordinary text dictation. On Intel Macs running current macOS versions, dictation audio is sent to Apple for processing and therefore requires an internet connection.

Limitations of Built-In Dictation

Apple's dictation is a capable baseline, but it has real limitations that become obvious within a few days of regular use. Understanding these is important because they are the reasons people seek out third-party alternatives.

No custom vocabulary

You cannot add custom words, acronyms, product names, or technical terms to Apple's dictation model. If you frequently say "Kubernetes," "Terraform," "HIPAA," or your company's product name, you will get incorrect transcriptions and have no way to teach the system. This is a dealbreaker for anyone in a specialized field.

Toggle activation, not hold-to-speak

Built-in dictation uses a toggle model: you press the shortcut to start, speak, then press again to stop. In practice, this leads to forgetting to turn it off, accidentally transcribing conversations with colleagues, picking up ambient noise, and awkward moments where you realize the microphone icon has been listening for the past five minutes. There is no hold-to-speak mode where dictation stops the instant you release the key.

No text cleanup or reformatting

Apple dictation transcribes exactly what you say, including filler words ("um," "uh," "like"), false starts, and repetitions. There is no AI layer that cleans up the transcript into polished prose. What you speak is what you get, and spoken language is almost always rougher than written language.

Inconsistent app compatibility

Dictation works reliably in native macOS apps like Notes, Mail, Pages, and TextEdit. In Electron-based apps, which include Slack, VS Code, Notion, Obsidian, Discord, and many others, activation can be flaky. The cursor may not be in the right position, text may appear in the wrong field, or the dictation shortcut may not activate at all. Browser-based apps like Google Docs, Gmail in Chrome, and Figma are similarly hit-or-miss.

No voice isolation

If someone else is talking nearby, built-in dictation will transcribe their words too. There is no speaker isolation or voice filtering to distinguish your voice from background speech. This makes it impractical in open offices, coffee shops, or any shared space.

No AI-powered actions

Modern speech-to-text tools can do more than transcribe. They can rewrite your dictation to match a specific tone, summarize what you said, translate it, or apply formatting rules. Apple's dictation does none of this — it is strictly a speech-to-text pipe with no intelligence after transcription.

Third-Party Speech to Text Apps for Mac: Comparison

There are now several dedicated speech-to-text apps for Mac, each with different strengths. The following table compares the five most relevant options as of April 2026.

| Feature | Voice Keyboard Pro | Wispr Flow | Dragon | Superwhisper | Apple Dictation |
| --- | --- | --- | --- | --- | --- |
| Price | Free tier + $8/mo | $10/mo | $15/mo (legacy) | $8/mo | Free |
| Hold-to-speak | Yes | Yes | No | Yes | No |
| Works offline | Yes | No | Yes | Yes | Yes (Apple Silicon) |
| System-wide | Yes | Yes | Limited | Yes | Partial |
| AI text cleanup | Yes (Smart Rewrite) | Yes | No | No | No |
| Custom vocabulary | Yes | No | Yes | No | No |
| Profession-aware | Yes | No | No | No | No |
| Voice isolation | Yes | No | No | No | No |
| AI actions (translate, rewrite) | Yes | Yes | No | No | No |
| Native Mac app | Yes | Yes | Yes | Yes | Yes (built-in) |
| Apple Silicon optimized | Yes | N/A (cloud) | No | Yes | Yes |

A few notes on this table. Dragon was the gold standard for speech recognition for over a decade, but Nuance (now owned by Microsoft) discontinued the Mac version in 2018. It may still work if you have an existing license, but there are no updates and no new development. It is not a realistic option for new users in 2026.

Wispr Flow is a well-designed app that processes audio in the cloud using GPT-4o's audio capabilities. It is fast and accurate, but requires an internet connection for every dictation. If you work with sensitive content or need offline capability, this is a significant limitation.

Superwhisper runs Whisper locally on your Mac and offers solid transcription. It is a straightforward transcription tool without the AI cleanup or profession-aware features that Voice Keyboard Pro offers. A good option if you want basic, private, local transcription without extras.

For a broader look at the best dictation apps for Mac, we have a separate detailed comparison.

Voice Keyboard Pro: A Closer Look

Since this guide is published on the Voice Keyboard Pro blog, it is worth being transparent about what Voice Keyboard Pro is and what makes it different. We will try to be factual rather than promotional here.

Hold-to-speak activation

Voice Keyboard Pro uses a hold-to-speak model by default. You hold your chosen hotkey (commonly the right Option key or Fn key), speak, and release when you are done. Transcription appears at your cursor within about 500 milliseconds of releasing the key. There is no mode to toggle, no floating window to dismiss, and no chance of accidentally leaving the microphone on. This single design decision eliminates most of the friction that causes people to abandon dictation tools.
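
The difference between the two activation models can be sketched as a pair of tiny state machines. This is only an illustration of the behavior described above, not Voice Keyboard Pro's or Apple's implementation; the class and function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ToggleDictation:
    """Toggle model: each key press flips the microphone on or off."""
    listening: bool = False

    def key_down(self) -> None:
        self.listening = not self.listening

    def key_up(self) -> None:
        # Releasing the key changes nothing in the toggle model.
        pass


@dataclass
class HoldToSpeakDictation:
    """Hold-to-speak model: the microphone is on only while the key is held."""
    listening: bool = False

    def key_down(self) -> None:
        self.listening = True

    def key_up(self) -> None:
        self.listening = False


def run(model, events):
    """Apply a sequence of 'down'/'up' key events and return the final mic state."""
    for event in events:
        if event == "down":
            model.key_down()
        else:
            model.key_up()
    return model.listening


# The user presses and releases the hotkey once, then walks away.
events = ["down", "up"]
print(run(ToggleDictation(), events))       # True: the mic is still listening
print(run(HoldToSpeakDictation(), events))  # False: the mic is off
```

In the toggle model, forgetting the second press leaves the microphone on. In the hold-to-speak model there is no state to forget, because releasing the key is the stop signal.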

Profession-aware transcription

When you first set up Voice Keyboard Pro, you choose your profession: software developer, doctor, lawyer, writer, student, or general. This selection adjusts the speech model's vocabulary bias and formatting preferences. A software developer gets accurate transcription of terms like "kubectl," "WebSocket," and "OAuth." A doctor gets "auscultation," "methylprednisolone," and "prn." The model already knows these words — the profession setting tells it which vocabulary domain to weight more heavily.
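
One way to picture "weighting a vocabulary domain more heavily" is as a rescoring step over candidate transcriptions. The sketch below is a toy under stated assumptions: the candidate scores, the `boost` value, and the function name are all hypothetical, and it is not a description of how Voice Keyboard Pro actually biases its model.

```python
def pick_transcript(candidates, domain_terms, boost=0.5):
    """Return the candidate text with the best score after domain boosting.

    candidates: list of (text, score) pairs; a higher score means more likely.
    domain_terms: set of lowercase words favored by the chosen profession.
    boost: hypothetical bonus added once per matching word.
    """
    def boosted(item):
        text, score = item
        hits = sum(1 for word in text.lower().split() if word in domain_terms)
        return score + boost * hits

    return max(candidates, key=boosted)[0]


# Two made-up hypotheses for the same audio, with made-up scores.
candidates = [
    ("restart the cube control pod", -1.0),
    ("restart the kubectl pod", -1.2),
]
developer_terms = {"kubectl", "websocket", "oauth"}

print(pick_transcript(candidates, set()))            # restart the cube control pod
print(pick_transcript(candidates, developer_terms))  # restart the kubectl pod
```

With no domain terms, the plain-English hypothesis wins. With the developer vocabulary, the small bonus tips the decision toward the technical term, which is the effect the profession setting is described as having.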

Voice isolation

Voice Keyboard Pro uses on-device audio processing to isolate your voice from background noise and other speakers. In practice, this means you can dictate in a coffee shop, an open-plan office, or a room where someone else is talking, and Voice Keyboard Pro will transcribe only your speech. This is not noise cancellation (which reduces ambient hiss). It works like speaker diarization applied at the input level: the app works out which audio belongs to your voice and discards the rest before transcription.

Smart Rewrite

After transcription, Voice Keyboard Pro can optionally run your text through an AI cleanup step called Smart Rewrite. This removes filler words, fixes grammar, adjusts punctuation, and reformats the text to read like written language rather than spoken language. You can toggle this on and off per-dictation. When Smart Rewrite is on, you can speak more naturally — "um, so basically what I'm thinking is we should probably maybe move the deadline to Friday" becomes "I think we should move the deadline to Friday."
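
To show what the simplest possible cleanup pass looks like, here is a rule-based sketch that only strips filler words. It is deliberately not Smart Rewrite: the filler list and function name are assumptions for illustration, and a regex cannot restructure a sentence the way an LLM-based rewrite can.

```python
import re

# Hypothetical filler list. "like" is left out on purpose: it is also an
# ordinary verb, and a rule this simple cannot tell the two uses apart.
FILLERS = ["um", "uh", "you know", "basically"]

# Match a filler as a whole word, plus an optional trailing comma and spaces.
FILLER_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Remove filler words, collapse extra spaces, and capitalize the result."""
    cleaned = FILLER_PATTERN.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    return cleaned[:1].upper() + cleaned[1:]


spoken = ("um, so basically what I'm thinking is we should probably "
          "maybe move the deadline to Friday")
print(strip_fillers(spoken))
# So what I'm thinking is we should probably maybe move the deadline to Friday
```

The output is cleaner but still reads like speech. Getting from there to "I think we should move the deadline to Friday" requires rewriting, not just deletion, which is why the cleanup step is described as an AI feature.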

AI actions

Beyond transcription and cleanup, Voice Keyboard Pro supports AI-powered actions triggered by voice. You can ask it to translate your dictation into another language, summarize a longer passage, rewrite in a specific tone (formal, casual, concise), or generate a reply to selected text. These actions run locally or through a private API, depending on the complexity.

Offline by default

Core transcription in Voice Keyboard Pro runs entirely on your Mac using Apple Silicon's Neural Engine. No audio leaves your device for speech recognition. The AI actions that require an LLM (Smart Rewrite, translation, etc.) use an API call, but the raw transcription is always local. If you turn off Smart Rewrite, Voice Keyboard Pro works with zero internet connection. For more on this, see our guide to offline voice to text on Mac.

How to Set Up Voice Keyboard Pro on Mac

Setting up Voice Keyboard Pro takes under a minute. Here is the process from download to first dictation.

1. Download Voice Keyboard Pro. Go to voicekeyboardpro.com and download the installer package. It is a standard macOS .pkg file, about 200 MB (the speech model is bundled).
2. Install and open. Run the installer. Voice Keyboard Pro appears as a menu bar icon (a small microphone) in your Mac's top menu bar. It does not open a window; it lives in the menu bar.
3. Grant permissions. On first launch, macOS will ask for Microphone access and Accessibility access (needed to type text into other apps). Grant both in System Settings > Privacy & Security.
4. Choose your profession. Voice Keyboard Pro asks you to select your profession during onboarding. This optimizes the speech model's vocabulary for your field. You can change this later in settings.
5. Set your hotkey. The default hold-to-speak hotkey is the right Option key. You can change this to any modifier key or key combination. Pick a key you do not use for other shortcuts.
6. Try your first dictation. Open any app (Notes, Mail, Slack, your browser), click in a text field, hold your hotkey, and speak a sentence. Release the key. Your text appears at the cursor.
7. Try Smart Rewrite. In Voice Keyboard Pro's menu bar settings, enable Smart Rewrite. Now dictate something more casually: include "um" and "like" and trail off mid-sentence. Release the key. Notice that the output is cleaned up into polished text. Toggle Smart Rewrite off when you want verbatim transcription.

The free tier gives you enough daily dictations to test whether Voice Keyboard Pro fits your workflow before you decide whether to subscribe. There is no credit card required to start.

8 Speech to Text Tips for Mac Users

These tips apply regardless of which speech-to-text tool you use. They are the habits that separate people who try dictation once and abandon it from people who make it a permanent part of their workflow.

1. Speak in complete sentences

Speech recognition models are statistical. They use context from earlier words to predict later words. If you speak in fragments — "the meeting... uh... Tuesday... should be..." — the model has less context and makes more errors. Complete sentences like "The meeting on Tuesday should be rescheduled to Thursday" give the model a full clause of context and produce dramatically better accuracy.
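
A toy example makes the role of context concrete. The sketch below counts word pairs in a tiny made-up corpus and uses the previous word to choose between homophones. Real speech models are far more sophisticated; the corpus and function name here are invented for illustration.

```python
from collections import Counter

# Tiny made-up training text; real models learn from vastly more data.
corpus = (
    "the meeting on tuesday should be rescheduled . "
    "we should meet at two . "
    "send it to the team . "
    "the meeting is too long ."
).split()

# Count how often each word follows each other word.
bigrams = Counter(zip(corpus, corpus[1:]))


def best_homophone(previous_word, options):
    """Pick the option seen most often after previous_word in the corpus."""
    return max(options, key=lambda word: bigrams[(previous_word, word)])


sounds_alike = ["to", "two", "too"]
print(best_homophone("at", sounds_alike))  # two
print(best_homophone("it", sounds_alike))  # to
print(best_homophone("is", sounds_alike))  # too
print(best_homophone("uh", sounds_alike))  # to (no context, so it just guesses)
```

When the previous word is a filler the model has never seen in that position, every option scores zero and the choice is arbitrary. That is the fragment problem in miniature: less context means more guessing.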

2. Do not slow down for the microphone

A common instinct is to speak slowly and over-enunciate when dictating. This actually hurts accuracy because the speech model was trained on natural speech patterns. Speak at your normal conversational pace. If anything, slightly faster is better than slightly slower.

3. Use a consistent microphone

Your Mac's built-in microphone is fine for dictation. AirPods and other Bluetooth headsets work but sometimes introduce latency or audio quality issues. If you dictate frequently, consider a USB desk microphone — not for sound quality per se, but for consistent positioning and rejection of background noise. Once you find a microphone that works well, stick with it.

4. Dictate first, edit later

Do not try to dictate a perfect sentence. Speak your thought, release the hotkey, and move on to the next thought. Edit everything at the end. This is the same "write drunk, edit sober" principle that applies to all writing, and it is even more important with dictation because stopping to correct errors mid-sentence breaks your verbal flow.

5. Learn the punctuation commands

Both Apple's dictation and third-party apps recognize spoken punctuation. "Period," "comma," "question mark," "exclamation point," "open parenthesis," "close parenthesis," "new line," and "new paragraph" are the essential ones. Memorize these and they become second nature within a day.

6. Close the door or use voice isolation

Background speech is the biggest source of dictation errors. If you work in a shared space and your app does not have voice isolation, close your office door or use a headset with a noise-canceling microphone. If your app does have voice isolation (like Voice Keyboard Pro), this is less of a concern, but a quiet environment still produces the best results.

7. Start with email

If you are new to speech to text, start by dictating emails. Emails are short, conversational in tone, and tolerant of imperfect phrasing. Once dictating emails feels natural, expand to Slack messages, then documents, then notes. Trying to dictate a complex technical document on your first day is a recipe for frustration.

8. Keep typing for certain tasks

Speech to text is not a replacement for typing in every situation. Code, spreadsheet formulas, URLs, filenames, and heavily formatted content are still faster to type. Use dictation for prose — emails, messages, documents, notes, comments — and the keyboard for structured, symbolic content. The goal is to use the right input method for each task, not to eliminate the keyboard entirely.

Speech to Text for Specific Workflows

Email

Email is the highest-ROI use case for speech to text. Many professionals send 20 to 50 emails per day, and most of those emails are conversational in tone, which is exactly the kind of content that speech recognition handles well. The workflow is simple: click in the email body, hold your hotkey, speak the entire message, release. With Smart Rewrite enabled, the output reads like a well-composed email even if your spoken version was informal. At typical typing and speaking speeds, an email that takes two minutes to type takes under a minute to dictate. Over a week, this can recover hours.
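
To estimate the savings for your own messages, the arithmetic is simple enough to sketch. The rates below are assumptions taken from the ranges this guide quotes in its FAQ (40 to 80 words per minute typing, 130 to 150 speaking), and the 120-word email length is hypothetical. The calculation ignores time spent correcting errors.

```python
def seconds_to_produce(words: int, words_per_minute: float) -> float:
    """Time in seconds to produce a given number of words at a steady rate."""
    return words / words_per_minute * 60


email_words = 120        # hypothetical email length
typing_wpm = 60          # assumed midpoint of the 40-80 wpm typing range
speaking_wpm = 140       # assumed midpoint of the 130-150 wpm speaking range

typing_seconds = seconds_to_produce(email_words, typing_wpm)
speaking_seconds = seconds_to_produce(email_words, speaking_wpm)

print(round(typing_seconds))                        # 120
print(round(speaking_seconds))                      # 51
print(round(typing_seconds / speaking_seconds, 1))  # 2.3
```

Swap in your own typing speed and typical message length to see where dictation pays off most. Slower typists and longer messages see the largest gap.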

Slack and team chat

Slack messages are short but frequent. The per-message time savings are small (maybe 15 seconds each), but the cumulative savings across 50 to 100 daily messages are significant. More importantly, dictation removes the friction that causes people to delay responding. When sending a Slack message takes 15 seconds instead of 30, you respond immediately rather than adding it to a mental queue. For more on using voice to text in chat apps, see our overview.

Writing and content creation

For blog posts, reports, documentation, and other long-form content, speech to text changes the writing process. Instead of staring at a blank page, you talk through your ideas. The first draft comes out rough — that is expected — but it comes out fast. A 1,500-word blog post draft takes about 10 minutes to speak, versus 45 minutes to an hour to type. The editing phase takes the same amount of time either way, so the net savings come from the initial drafting speed.

Coding

This one is nuanced. You would not dictate raw code — syntax, brackets, and variable names are better typed. But a surprising amount of programming work is prose: commit messages, pull request descriptions, code review comments, documentation, Jira tickets, and inline comments. Dictation is excellent for all of these. Some developers also use speech to text to describe what a function should do, then let their AI coding assistant generate the implementation from that description.

Medical and clinical notes

Doctors and nurses spend a staggering amount of time on clinical documentation. Speech to text with profession-aware vocabulary (like Voice Keyboard Pro's medical mode) handles drug names, anatomical terms, and clinical abbreviations with high accuracy. The workflow fits naturally into clinical practice: examine the patient, then dictate the note immediately afterward while the details are fresh. SOAP notes, discharge summaries, and referral letters all work well with dictation. Privacy is critical here — offline processing means patient information never leaves the device.

Legal work

Legal professionals deal with dense, precise language and specific citation formats. A profession-aware speech model handles terms like "habeas corpus," "voir dire," "amicus curiae," and case citations more reliably than a general-purpose model. The main use cases are case notes, memo drafts, client emails, and deposition summaries. Many lawyers already used Dragon for this; Voice Keyboard Pro and other modern alternatives are the natural successors.

Privacy and Offline Speech to Text

Privacy in speech-to-text tools comes down to one question: does your audio leave your device?

When you use a cloud-based speech-to-text service, your raw audio is sent to a remote server for processing. Even if the service encrypts the transmission and deletes the audio afterward, your spoken words (which may include confidential business information, patient data, legally privileged communications, or personal thoughts) travel over the internet to a third party's infrastructure. For many professionals, this is not acceptable.

Apple Silicon (M1 and later) changed this equation by making on-device Whisper-class models practical. The Neural Engine on these chips runs speech recognition at real-time speed or faster, with accuracy that matches or exceeds older cloud services. This means your audio can be processed entirely on your Mac, with zero network traffic.

Both Apple's built-in dictation and Voice Keyboard Pro process speech on-device by default on Apple Silicon Macs. Voice Keyboard Pro's Smart Rewrite and AI actions do use an API call for the text-processing step, but the audio itself is never transmitted — only the already-transcribed text is sent if you opt into those features. If you disable Smart Rewrite, Voice Keyboard Pro operates completely offline.

For healthcare professionals bound by HIPAA, attorneys with client privilege obligations, and anyone working with trade secrets or classified information, offline speech to text is not a nice-to-have — it is a requirement. Apple Silicon made it possible without compromising accuracy.

Frequently Asked Questions

Is Mac speech to text accurate?
Apple's built-in dictation is reasonably accurate for everyday English but struggles with technical vocabulary, proper nouns, and domain-specific terminology. Third-party apps like Voice Keyboard Pro use larger Whisper-based models that deliver significantly higher accuracy, especially for specialized fields like medicine, law, and software development. With a good microphone and quiet environment, modern speech to text on Mac can reach roughly 95-98% word accuracy on standard English prose.
Does speech to text work offline on Mac?
Yes. Apple's built-in dictation processes speech on-device on Apple Silicon Macs (M1 and later). Voice Keyboard Pro also runs its core speech recognition engine entirely on your Mac with no internet connection required. Some apps like Wispr Flow require an internet connection because they process audio on remote servers. If offline operation matters to you, check whether the app you are considering runs its speech model locally or in the cloud.
How do I fix speech to text not working on Mac?
Start with the basics: go to System Settings > Privacy & Security > Microphone and confirm the app has microphone permission. For Apple's built-in dictation, check System Settings > Keyboard and make sure Dictation is toggled on. If dictation was working but stopped, try toggling it off and on again. Switch from a Bluetooth headset to the built-in microphone to rule out audio input issues. Check that your macOS is up to date. If none of this helps, restart your Mac — this resolves most intermittent dictation problems. For third-party apps, check their settings for hotkey conflicts with other apps.
Can I use speech to text in any app on Mac?
Apple's built-in dictation works in most native macOS apps (Notes, Mail, Pages, Safari) but can be unreliable in Electron-based apps like Slack, VS Code, Notion, and Obsidian, and in some browser text fields. Third-party apps like Voice Keyboard Pro use the Accessibility API to type text directly into any application, which means they work system-wide — in every app that accepts text input, including browsers, Electron apps, and the terminal. If universal app compatibility matters, a third-party app is the better choice.
Is speech to text faster than typing?
For most people, yes. The average typing speed is 40 to 80 words per minute. Comfortable speaking speed is 130 to 150 words per minute. Even accounting for the time spent correcting errors and editing, speech to text typically produces finished text 2 to 3 times faster than typing for conversational content like emails, messages, and first drafts. The advantage is smaller for highly formatted, symbolic, or code-heavy content where the keyboard is still the better tool.
What is the best speech to text app for Mac?
It depends on your priorities. For casual, occasional use, Apple's built-in dictation is free and adequate. For daily professional use, Voice Keyboard Pro offers the best combination of accuracy, speed, offline processing, system-wide compatibility, and profession-aware vocabulary. Wispr Flow is a strong cloud-based alternative with good AI text cleanup. Superwhisper is solid for straightforward local transcription without extras. Dragon is no longer developed for Mac and is not recommended for new users. Our dictation app comparison goes into more detail on each option.
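
If you want to check accuracy claims against your own voice and microphone, a common yardstick is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the transcript into what you actually said, divided by the number of words you said. The sketch below assumes accuracy percentages like the ones in the FAQ above mean 1 minus WER, which this guide does not state explicitly. The function name and sample sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between two texts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from transcript
                curr[j - 1] + 1,     # extra word in transcript
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


said = "please move the meeting on tuesday to thursday at noon"
transcribed = "please move the meeting on tuesday to thursday at new"

wer = word_error_rate(said, transcribed)
print(f"{wer:.0%}")      # 10%
print(f"{1 - wer:.0%}")  # 90%
```

On this scale, 95% accuracy means about one wrong word in every twenty, and 98% means one in fifty. Read a known paragraph aloud, compare the transcript with the original, and you have a like-for-like number for any app you are evaluating.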

Ready to Try Speech to Text on Your Mac?

Start with Apple's built-in dictation — it is already on your Mac and it costs nothing. Follow the setup steps above and use it for a few days. If you run into the limitations described in this guide (and you probably will if you use it regularly), give Voice Keyboard Pro a try.

Voice Keyboard Pro has a free tier that lets you test hold-to-speak dictation, profession-aware accuracy, and Smart Rewrite with your actual workflow. No credit card required. Download it at voicekeyboardpro.com, set it up in 60 seconds, and see whether voice input works for the way you actually use your Mac.

The best speech-to-text setup is the one you actually use. Start simple, find the friction points, and then solve them. For most Mac users in 2026, that path leads to a combination of built-in dictation for the occasional quick note and a dedicated app like Voice Keyboard Pro for everything else.