Short answer: The average person speaks at 130 to 150 WPM but types at only around 40 WPM, making speech roughly 3 to 4 times faster than typing. Even skilled typists at 80 WPM still trail conversational speaking speed, which is why voice dictation accelerates most writing workflows.

Here are two numbers that, when you put them side by side, change how you think about productivity at a desk.

The average adult types at roughly 40 words per minute. The same adult, in normal conversation, speaks at roughly 130 to 150 words per minute. The gap is somewhere between 3x and 4x.

That gap is not a fact about technology. It is a fact about you. Your brain produces language faster than your fingers can type it, and it has done so your entire life. We have just spent so long staring at keyboards that we forgot how slow they are compared to the input device built into our heads.

This article tackles the foundational version of the question: what exactly is the gap, why does it exist, and what happens to your work when you stop ignoring it?

The Numbers

Let us start with the actual figures and what they mean.

Typing speed

The most commonly cited adult average is 40 WPM. That number comes from large samples of keyboard users tested on common-word text. Like any average, it hides a lot of variance — beginners hunt-and-peck at 20 WPM, competent touch typists run at 50-70 WPM, and trained professionals can hit 80-100 WPM. The fastest typists in the world push past 150 WPM, but those are outliers measured in seconds-long sprints, not sustained working speed.

For our comparison, the relevant numbers are 40 WPM for the average adult and about 70 WPM for a fast touch typist.

Speaking speed

The most commonly cited figure for comfortable conversational speech is 130 to 150 WPM. This is the speed at which people naturally talk when they are explaining something to someone they know — not lecturing, not racing, not reading from a script.

Other relevant speech rates:

For a fair comparison, the right number to use is the conversational range: 130-150 WPM. That is the speed you can produce sustainably, with normal punctuation pauses, when explaining a thought.

The ratio

Take the 40 WPM typing average and the 140 WPM midpoint of the conversational range, and the ratio is 3.5x. Even comparing a fast touch typist (70 WPM) with the same conversational speech, the ratio is still 2x. The gap is not subtle. Spoken language production is several times faster than typed language production for almost every person on earth.
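The ratio arithmetic above is easy to check for yourself. Here is a minimal Python sketch that divides the 140 WPM speaking midpoint by a few typing speeds mentioned in this article. The function name and the labels are illustrative only, and you can swap in your own measured speeds.

```python
def speed_ratio(speaking_wpm: float, typing_wpm: float) -> float:
    """Return how many times faster speaking is than typing."""
    if typing_wpm <= 0:
        raise ValueError("typing_wpm must be positive")
    return speaking_wpm / typing_wpm


# Midpoint of the 130-150 WPM conversational range used in this article.
SPEAKING_WPM = 140

# Typing speeds (words per minute) taken from the figures quoted above.
typing_speeds = {
    "average adult": 40,
    "fast typist": 70,
    "skilled typist": 80,
}

ratios = {
    label: round(speed_ratio(SPEAKING_WPM, wpm), 2)
    for label, wpm in typing_speeds.items()
}

for label, ratio in ratios.items():
    print(f"{label}: speaking is {ratio}x faster than typing")
```

Even the skilled typist row comes out well above 1x, which is the point of the comparison: getting faster at typing narrows the gap but does not close it.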

Why the Gap Exists

This asymmetry has a few honest explanations, all of them rooted in how your brain and body work.

Speech is older

Humans have been speaking for at least tens of thousands of years, and possibly far longer. Writing is roughly five thousand years old. Typing is barely 150 years old. Touch typing is younger still.

Speech production is supported by deeply specialized brain regions, motor circuits that coordinate dozens of muscles, and a feedback loop refined over a long evolutionary horizon. Typing is a learned skill grafted onto fine motor systems that did not evolve for it. The hardware difference alone explains some of the speed gap.

Speech is parallel; typing is serial

When you speak a sentence, multiple body systems work in parallel: vocal cords vibrate, tongue and lips shape phonemes, breath modulates volume, and intonation carries meaning. Each phoneme takes a fraction of a second, but several articulatory motions overlap. The throughput is high because the system is parallel.

When you type, the work is serial. Each keystroke is one finger pressing one key. Even at maximum speed, you are producing one character at a time. Your two hands can alternate, but the basic unit is still discrete and sequential. The throughput ceiling is set by how fast individual fingers can move and reset.

Speech is direct from thought; typing has a translation layer

When you speak, your brain produces language fairly directly — concept becomes phoneme becomes sound. There is some planning ahead, but the path from intent to output is short.

When you type, you produce language, then you also have to translate that language into a spatial motor plan — which finger goes to which key. This translation layer is fast for trained typists, but it is never zero. It is an additional cognitive step that does not exist in speech.

Errors are easier to absorb in speech

In conversation, small mistakes correct themselves. You say "the the" or trail off mid-word, and the listener follows along anyway. The throughput stays high.

In typing, errors require explicit correction — backspace, retype. Each error costs keystrokes and breaks rhythm. Even at high speed, the correction tax is real, which means sustained typing speed always sits below your peak burst speed.

You have been speaking longer

Most people have been speaking fluently since age four or five, so by adulthood you have well over a decade of daily practice behind you. Even moderate typists have been typing for less time, and most never sat down to train the skill deliberately. Granted, you did not consciously train speech either, but your brain did, through constant daily repetition.

What the Gap Means for Productivity

If your job involves moving thoughts into text — emails, documents, messages, notes, code comments — the typing-vs-speaking gap directly affects your output. Here is the math, made concrete.

A normal email

A medium-length work email is around 150 words. At 40 WPM, that takes about 3 minutes and 45 seconds of typing time, plus any thinking-while-typing pauses. At 140 WPM spoken, the same email takes a little over a minute.

If you send 20 of those a day, the difference is roughly 50 minutes. Over a year of work days, that adds up to more than 200 hours. Five full work weeks.
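To make the email arithmetic reproducible, here is a rough Python sketch. The 150-word email, the 20 emails per day, and the two speeds come from the text above. The figure of 250 work days per year is an assumption added for illustration, and the calculation ignores thinking pauses and editing time.

```python
def minutes_to_produce(words: int, wpm: float) -> float:
    """Minutes of pure output time for a given word count and speed."""
    return words / wpm


EMAIL_WORDS = 150          # medium-length work email (from the article)
EMAILS_PER_DAY = 20        # from the article
WORK_DAYS_PER_YEAR = 250   # assumption: roughly 50 weeks of 5 days

typed_minutes = minutes_to_produce(EMAIL_WORDS, 40)     # 3.75 minutes
spoken_minutes = minutes_to_produce(EMAIL_WORDS, 140)   # a little over 1 minute

daily_saving_minutes = (typed_minutes - spoken_minutes) * EMAILS_PER_DAY
yearly_saving_hours = daily_saving_minutes * WORK_DAYS_PER_YEAR / 60

print(f"Per email: {typed_minutes:.2f} min typed vs {spoken_minutes:.2f} min spoken")
print(f"Daily saving: {daily_saving_minutes:.0f} minutes")
print(f"Yearly saving: {yearly_saving_hours:.0f} hours")
```

Under these assumptions the daily saving lands in the low 50s of minutes and the yearly saving a bit above 220 hours, consistent with the rounded figures in the text. Change the constants to match your own email volume.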

A long-form document

A 2,000-word document at 40 WPM is 50 minutes of typing. At 140 WPM dictated, it is roughly 14 minutes. The first draft alone, leaving aside editing, takes less than a third of the time.

This is not theoretical. Anyone who has dictated a long email or memo and then transcribed it later knows the asymmetry. Composing aloud and then editing is consistently faster than typing the first draft from scratch.

Messages and Slack

Short messages benefit less from speed because they are dominated by thinking time. But the volume is high — knowledge workers send dozens of short messages a day. Aggregate the savings and you reclaim a chunk of every workday.
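One way to see why short messages benefit less is to add a fixed thinking cost to each message. The Python sketch below does that. The 15-word message length and the 20 seconds of thinking are hypothetical numbers chosen only to illustrate the effect, not measurements.

```python
def message_seconds(words: int, output_wpm: float, thinking_seconds: float) -> float:
    """Total seconds per message: fixed thinking time plus output time."""
    return thinking_seconds + (words / output_wpm) * 60


MESSAGE_WORDS = 15        # hypothetical short chat message
THINKING_SECONDS = 20.0   # hypothetical time spent deciding what to say

typed_total = message_seconds(MESSAGE_WORDS, 40, THINKING_SECONDS)
spoken_total = message_seconds(MESSAGE_WORDS, 140, THINKING_SECONDS)

# Overall speedup is smaller than the raw 3.5x output ratio,
# because the thinking time is the same in both cases.
overall_speedup = typed_total / spoken_total

print(f"Typed: {typed_total:.1f} s, spoken: {spoken_total:.1f} s")
print(f"Overall speedup: {overall_speedup:.2f}x")
```

With these made-up inputs the overall speedup is well under 2x even though the raw output speed differs by 3.5x. The per-message saving is small, which is why the benefit only shows up once you multiply it by dozens of messages a day.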

Notes and journaling

This is where the gap matters most, because the bottleneck is keeping up with thought. When you are taking notes during a meeting or capturing a fleeting idea, the question is not "can I type fast enough to be accurate." It is "can I capture the thought before it disappears." Speech, being closer to the speed of thought, is the natural medium for this kind of capture.

The Counter-Arguments

It would not be honest to write this article without addressing the obvious objections.

"Speech is messier than typing"

Yes — raw transcribed speech is messier than carefully typed prose. You repeat yourself, trail off, restart sentences. Older voice-to-text tools captured all of this verbatim, producing transcripts that read like a deposition.

Modern AI transcription handles this differently. The good tools clean up filler words, normalize sentences, and produce text that reads like written language. You can also dictate cleanly — once you adjust to the medium, your spoken sentences come out closer to written form.

"Typing lets you think while you write"

True for some tasks. Code, math, technical drafting, and other domains where each sentence requires deliberation often benefit from the slower pace of typing. But for tasks where you already know what you want to say — replying to an email, drafting a known kind of document, capturing an idea you have been holding for hours — the pace mismatch costs you.

"Voice does not work in noisy places"

This used to be a hard limit. Modern voice tools handle moderate background noise well. They are still not ideal in a loud cafe or shared office, but they fail less often, and less badly, than they used to.

"I think faster when I type"

Some people genuinely do, and that is fine. The point is not that voice replaces typing. The point is that the productivity gap is large and real, and ignoring it because typing feels familiar is leaving time on the table.

Closing the Gap

There are two ways to close the gap. You can train your typing toward your speech, or you can capture your speech directly. The math heavily favors the second.

Training a typist from 40 to 80 WPM takes months of deliberate practice. Going beyond 80 is genuinely hard. Almost nobody pushes past 120 WPM sustained, and even that is still slower than normal conversation.

Direct speech capture, by contrast, gives you 130-150 WPM today, with no training. The only thing that has changed in the last few years is that the technology actually works: it is accurate, it is fast, and it produces clean text instead of stream-of-consciousness transcripts.

What Modern Voice Dictation Actually Does

The shift in voice dictation between roughly 2022 and 2026 has been larger than most people realize. AI-driven transcription models from companies like OpenAI now hit accuracy above 95% for most speakers, even with accented English and moderate background noise. The output handles punctuation reasonably well, recognizes named entities, and reads like written text rather than a raw transcript.

Voice Keyboard Pro uses this kind of model on the Mac. It runs as a menu bar app — 1.7MB, no big interface to learn. You hold a hotkey, talk, release the key, and the text appears at your cursor. It works in any app: Gmail, Notion, Slack, code editors, terminals, anywhere a keyboard works.

Under the hood it uses the Voice Keyboard Pro Whisper API for fast cloud transcription, with an offline mode using Apple Speech for situations where you do not want audio leaving the device. Audio is not stored on servers. There is a free tier; the Pro plan is $4.99/month or $34.99/year and adds Smart Rewrite (cleans up "uh"s and restructures sentences), Voice Profile (improves accuracy for your voice), Voice Isolation (handles background noise), and custom vocabulary for domain terms. The iPhone version is a system keyboard that brings the same workflow to your phone.

You type at 40 WPM and speak at 140. That is not a problem with typing. It is a feature of how your brain produces language. The only question is whether you are still acting as if the 40 is your ceiling.

The Reframe

The typing-vs-speaking gap is the single most underused productivity lever for anyone who works with text. It is not a small efficiency tweak — it is a 3-4x throughput difference. It has existed for as long as keyboards have existed. We ignored it for a long time because the technology to capture speech reliably did not exist. Now it does.

This does not mean you stop typing. Keyboards remain essential for editing, navigation, precise control, and the kinds of tasks where every keystroke matters. But for the bulk-text portion of your day — the part where you are turning thoughts into prose — the keyboard is no longer the fastest tool you own. Your voice is.

If you have not tried modern voice dictation in the last year, try it. The difference between what it was and what it is now is the difference between an interesting curiosity and a genuine productivity shift. Voice Keyboard Pro has a free tier — dictate the next long message you would have typed, and notice how much faster it goes.

The 3-4x gap has been there your whole working life. The only new thing is that you can finally close it.