The pitch you've heard
'Voice messages get 3x the reply rate of text.'
'Video DMs are the new cold outreach unlock.'
Both claims circulate widely on LinkedIn-coach Twitter. Some operators swear by them. Others have tried them and given up.
The truth: both work, but in narrower contexts than the headlines suggest. Used badly, they actively hurt reply rates and make you look unprofessional.
When voice messages work
Senior executive targets where written outreach is saturated. A VP-level prospect gets 30–50 templated DMs per week. A 45-second voice message is novel; it gets attention.
Specific, prospect-aware content. 'Hey [name], I saw your post about [specific topic] and wanted to react personally.' Generic voice messages are worse than generic text.
Operators with confident voices. If recording a voice message makes you sound nervous or rehearsed, skip it. A shaky delivery signals desperation.
Messages under 60 seconds. 90+ second voice messages don't get listened to.
Reply-rate lift in these contexts: 1.5–2.5x over equivalent text outreach. Real but smaller than the 3x headline claims.
When voice messages don't work
Cold first-touch to mid-level prospects. Reply rates are lower than text in this context. The voice message reads as creepy or desperate to people who don't know you.
Operators with weak voice presence. If your voice sounds shaky, monotone, or scripted, the voice message hurts rather than helps. The medium magnifies whatever's in the delivery.
Long messages. Anything over 90 seconds is a no. Most prospects won't listen past 30–45 seconds.
Asking for a meeting in the voice message. The voice message creates intimacy; an immediate meeting ask burns it. Use voice to start the conversation, not to convert it.
When video DMs work
Video DMs (Loom-style or LinkedIn's native video DM feature) work in a narrower context than voice:
Highly targeted senior prospects (VP+, founder, CRO). The investment signal of personalized video lands well at this level.
Demonstrable specificity. Pull up the prospect's LinkedIn profile or company site on screen, reference specific things in the video. The 'I actually did research' signal is unfakeable in video.
Existing-relationship reactivation. Old contact you haven't talked to in 12+ months. Video reanchors the relationship.
Operators who can shoot well. 720p+ video, decent audio, reasonable lighting. Bad video looks worse than no video.
Reply-rate lift in these contexts: 2–3x over equivalent text outreach.
When video DMs don't work
Volume cold outbound. Recording 50 personalized videos per week is unsustainable for most operators. The math doesn't work even at the higher reply rate.
Operators uncomfortable on camera. Same dynamic as voice but worse. Bad video damages trust quickly.
Generic video templates. 'Hey [Name], I'm [Founder] of [Company]' shot once and re-sent to 50 prospects gets pattern-recognized fast. Either personalize fully or don't send video at all.
Mid-level prospects. Reply rates lower than text. The video investment doesn't match the prospect's perceived importance to you, which feels off.
The math on volume
Voice and video are time-intensive. A typical voice message: 90 seconds to record + 60 seconds to listen + 60 seconds to draft the wrap-around text = 3.5 minutes per outreach.
Compared to AI-personalized text: 30 seconds to review + 15 seconds to send = under a minute.
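Voice/video is roughly 4.7x slower per outreach (210 seconds versus 45), or 3.5x even if you round text up to a full minute. So you'd need a reply-rate lift of 3.5–4.7x just to break even on time. Real-world lifts are 1.5–3x, so voice/video does not pay for itself on time alone. The math works only in the highest-conversion scenarios, where each reply is worth more: senior prospects, deep specificity, established voice presence.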
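The time arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the per-step timings quoted in this section; the names `VOICE_STEPS`, `TEXT_STEPS`, `seconds_per_touch`, and `break_even_lift` are illustrative, and the model assumes time is the only cost and every reply is worth the same.

```python
# Per-step timings in seconds, taken from the figures quoted above.
VOICE_STEPS = {"record": 90, "listen_back": 60, "draft_text": 60}
TEXT_STEPS = {"review": 30, "send": 15}


def seconds_per_touch(steps: dict[str, int]) -> int:
    """Total operator time, in seconds, for one outreach touch."""
    return sum(steps.values())


def break_even_lift(slow_seconds: float, fast_seconds: float) -> float:
    """Reply-rate multiple the slower channel needs in order to match
    the faster channel in replies per hour of operator time."""
    return slow_seconds / fast_seconds


voice = seconds_per_touch(VOICE_STEPS)  # 210 seconds
text = seconds_per_touch(TEXT_STEPS)    # 45 seconds

print(f"voice: {voice}s per touch, text: {text}s per touch")
print(f"break-even lift vs 45s text: {break_even_lift(voice, text):.2f}x")
print(f"break-even lift vs 60s text: {break_even_lift(voice, 60):.2f}x")
```

Against the quoted 45 seconds for text, the break-even lift comes out near 4.7x; rounding text up to a full minute gives 3.5x. Either way it sits above the 1.5–3x lifts reported earlier, which is why the time cost only makes sense where a single reply is worth considerably more.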
How to integrate voice/video into a sequence
Don't make it the first touch. Use it as touch 3 or 4 in a multi-channel sequence, after the prospect has seen your name a couple of times.
Lead with text. Earn enough engagement signal that the prospect knows who you are. Then send the voice/video as the 'I actually want to talk to you' escalation.
Reserve it for high-value prospects only. Top 10–20% of your queue by ICP fit and account size. Volume voice/video is a losing strategy.
Track reply rates separately. If voice/video isn't outperforming your text baseline by at least 1.5x, stop doing it. The time cost isn't justified.
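One way to apply the tracking rule is to keep sent and reply counts per channel and compare each channel's reply rate to the text baseline. The sketch below is an assumption about how you might structure that, not a feature of any particular tool; `ChannelStats`, `lift`, and `keep_channel` are hypothetical names, and the counts are made up for illustration.

```python
from dataclasses import dataclass

MIN_LIFT = 1.5  # threshold taken from the rule above


@dataclass
class ChannelStats:
    sent: int
    replies: int

    @property
    def reply_rate(self) -> float:
        # Guard against division by zero before anything has been sent.
        return self.replies / self.sent if self.sent else 0.0


def lift(channel: ChannelStats, baseline: ChannelStats) -> float:
    """Reply-rate multiple of a channel over the baseline channel."""
    if baseline.reply_rate == 0:
        raise ValueError("baseline has no replies yet; lift is undefined")
    return channel.reply_rate / baseline.reply_rate


def keep_channel(channel: ChannelStats, baseline: ChannelStats,
                 min_lift: float = MIN_LIFT) -> bool:
    """True if the channel clears the minimum lift over the baseline."""
    return lift(channel, baseline) >= min_lift


# Made-up counts, for illustration only.
text = ChannelStats(sent=400, replies=32)  # 8% reply rate
voice = ChannelStats(sent=40, replies=5)   # 12.5% reply rate

print(f"voice lift over text: {lift(voice, text):.2f}x")
print("keep voice:", keep_channel(voice, text))
```

With only a few dozen voice messages sent, one or two replies can swing the lift a lot, so it is worth waiting for a reasonable sample before cutting or scaling the channel.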
FAQ
Should I use LinkedIn's native voice/video features or third-party tools?
Native is fine for ad-hoc use. Loom and Vidyard offer better analytics if you're running this systematically. The platform is less important than the content.
How long should a voice or video message be?
Voice: under 60 seconds. Video: under 90 seconds. Most people stop watching/listening at 30–45 seconds, so front-load the value.
Can I use AI to generate voice or video?
Voice cloning works technically but feels off when the prospect detects it. Video deepfakes are a hard no — they'll burn trust if discovered. Real human voice and video remain the bar.
What about Loom-style asynchronous video for follow-ups (not cold)?
Different game, and yes, it's useful. Recording a 3-minute Loom after the demo that shows how your product solves a specific prospect's problem is high-leverage. This post is about cold outreach; warm follow-up is a different question.