Transcription speed and cost

When a meeting ends, Daisy reprocesses the recording end-to-end: it transcribes what was said, works out who said it, and produces a summary. This article gives you a realistic sense of how long that takes, and what it costs if you choose a cloud provider, so you can pick the option that fits how you actually work.

All numbers below come from runs against real meeting audio in May 2026. Your own machine should land in the same ballpark, though an older laptop will take a little longer on the finalize pass.

What "finalize" means

The live transcript you see during a call is a preview. When the call ends, Daisy does a quieter, higher-quality pass on the full recording:

  1. Transcribes the audio once more, against the saved file rather than a stream.
  2. Labels every utterance with a speaker, against your voiceprints.
  3. Merges any duplicates between the Me and Them sides.
  4. Saves the cleaned transcript, summary, and search index.
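To make step 3 concrete, here is a minimal sketch of one way duplicates between the two sides could be detected. It is not Daisy's actual implementation: the Utterance type, the merge_duplicates function, and the 0.8 similarity threshold are all hypothetical, chosen only for illustration. The idea is that when nearly the same words appear on both the Me and Them sides at overlapping times, only one copy is kept.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Utterance:
    # Hypothetical record type; field names are assumptions for this sketch.
    start: float  # seconds from the start of the recording
    end: float
    side: str     # "me" or "them"
    text: str


def overlaps(a: Utterance, b: Utterance) -> bool:
    """True if the two utterances share any span of time."""
    return a.start < b.end and b.start < a.end


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """True if the two texts are near-identical (threshold is an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def merge_duplicates(utterances: list[Utterance]) -> list[Utterance]:
    """Keep the earliest copy of any utterance heard on both sides."""
    kept: list[Utterance] = []
    for u in sorted(utterances, key=lambda u: (u.start, u.side)):
        is_duplicate = any(
            k.side != u.side and overlaps(k, u) and similar(k.text, u.text)
            for k in kept
        )
        if not is_duplicate:
            kept.append(u)
    return kept


transcript = [
    Utterance(1.0, 3.0, "them", "Let's start with the roadmap"),
    Utterance(1.1, 3.1, "me", "lets start with the roadmap"),
    Utterance(4.0, 5.0, "me", "Sounds good"),
]
cleaned = merge_duplicates(transcript)
```

In this example the second utterance overlaps the first in time and is almost word-for-word identical, so only the Them copy and the later "Sounds good" remain.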

This finalize step is the source of the wait people sometimes notice after a call. It runs once, in the background, after the meeting.

Realistic wait for a 10-minute meeting

There are really two choices: keep the work on your laptop, or hand it to a cloud provider like Deepgram.

Option         Wait after the call   Cost   Where the audio went
Local Whisper  ~30–45 seconds        $0     Stayed on your laptop
Deepgram       ~2–10 seconds         ~5¢    Sent once for transcription

A few things worth saying out loud:

  • The fully-local option is genuinely usable. Under a minute after a 10-minute meeting ends, you have a complete, searchable record. No spinning loader, no "your transcript is being prepared."
  • Deepgram is the biggest single speed-up. Parallelization on their side finishes the same 10 minutes in a handful of seconds.
  • You can switch on a per-meeting basis — nothing locks you into one choice for everything you record.

Live transcription keeps up

The live transcript during a call has its own budget: it has to process audio at least as fast as it arrives, or it falls behind.

On a modest laptop, Daisy's local Whisper runs about 15× faster than real time. A five-second chunk of audio transcribes in roughly a third of a second. There's plenty of headroom; the live transcript only falls behind if you've swapped in a much larger model on a much slower CPU, and Daisy will warn you when that happens.
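The arithmetic behind that headroom is simple enough to check yourself. The sketch below is illustrative only: the function names are made up for this example, and the 15× figure is the approximate speed quoted above, not a guarantee for your machine.

```python
def processing_seconds(audio_seconds: float, speedup: float) -> float:
    """Time needed to transcribe audio at a given multiple of real time."""
    return audio_seconds / speedup


def keeps_up(audio_seconds: float, speedup: float) -> bool:
    """Live transcription keeps up if a chunk is processed before the next one arrives."""
    return processing_seconds(audio_seconds, speedup) <= audio_seconds


SPEEDUP = 15.0  # approximate figure for a modest laptop, taken from this article

chunk_time = processing_seconds(5, SPEEDUP)        # about 0.33 seconds
meeting_time = processing_seconds(10 * 60, SPEEDUP)  # 40.0 seconds

print(f"5-second chunk: {chunk_time:.2f} s")
print(f"10-minute recording: {meeting_time:.0f} s")
print(f"Keeps up at 15x: {keeps_up(5, SPEEDUP)}")
print(f"Keeps up at 0.8x: {keeps_up(5, 0.8)}")
```

At the same speed, transcribing a full 10-minute recording takes about 40 seconds, which sits inside the ~30–45 second finalize range for the local option. Anything below 1× means the live transcript would fall behind.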

What Deepgram actually costs

If you choose Deepgram, you pay them directly with your own API key. The published nova-3 rate is roughly $0.0053 per minute of audio.

In real-meeting terms:

  • 30-minute meeting: about 16¢
  • 1-hour meeting: about 32¢
  • 5 one-hour meetings a day, every workday, for a month (about 20 workdays): roughly $32

This is all on your Deepgram account, billed by Deepgram. Daisy never sees the charge and never marks it up. You can cap usage or revoke the key from the Deepgram console at any time.
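If you want to estimate your own bill, the calculation is a single multiplication. This is a rough sketch: the function name is made up for this example, the rate is the approximate nova-3 figure quoted above, and the 20 workdays per month is an assumption. Check Deepgram's current pricing before relying on it.

```python
RATE_PER_MINUTE_USD = 0.0053  # approximate nova-3 rate quoted in this article


def deepgram_cost_usd(minutes: float, rate: float = RATE_PER_MINUTE_USD) -> float:
    """Estimated transcription cost in US dollars for the given audio length."""
    return minutes * rate


half_hour = deepgram_cost_usd(30)
one_hour = deepgram_cost_usd(60)

# Assumption: 5 one-hour meetings a day over 20 workdays in a month.
monthly = deepgram_cost_usd(60) * 5 * 20

print(f"30-minute meeting: about {half_hour * 100:.0f} cents")
print(f"1-hour meeting: about {one_hour * 100:.0f} cents")
print(f"Heavy month: about ${monthly:.2f}")
```

Swap in your own meeting length and frequency to get a figure that matches your calendar.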

Hardware notes

Daisy was designed to run on the kind of laptop you already have.

  • A modern Intel or AMD laptop CPU is enough for the local option to run comfortably faster than real time. No GPU required.
  • A desktop CPU finishes the same work about a third faster, mostly because of clock speed and thermal headroom.
  • An older laptop still handles the live transcript fine; it's the finalize pass that takes a little longer.

Picking an option

  • Privacy is paramount, or you don't want a per-meeting bill. Use the local option. Under a minute to finalize a 10-minute meeting, and no audio leaves your laptop.
  • You want the transcript almost instantly. Use Deepgram. A few seconds to finalize a 10-minute meeting; costs scale linearly with audio length.