You're cooking dinner and your hands are covered in flour. You need to know how long to roast the chicken at 375°. You could wash your hands, dry them, unlock your phone, type the question... or you could just say "Hey Meggy, how long do I roast a chicken at 375?" and get an answer spoken back to you.
Voice Chat turns Meggy into a hands-free assistant. It listens for your voice, understands what you say, thinks about the answer using the same powerful AI engine behind text conversations, and speaks the response back to you. Same tools, same memory, same intelligence — just no keyboard required.
Voice Chat is built on a four-stage pipeline: wake word detection, speech-to-text transcription, AI processing, and text-to-speech synthesis.
Meggy listens for its wake word — "Hey Meggy" — using a lightweight local detection model. This runs continuously in the background without sending any audio to the cloud. When the wake word is detected, the microphone activates and recording begins.
You can also use push-to-talk mode if you prefer — hold a key to speak, release to send. Both modes are available in settings.
Once you finish speaking, your audio is transcribed into text. Meggy supports multiple speech-to-text (STT) providers:
| Provider | Model | Runs Locally? |
|---|---|---|
| OpenAI | Whisper | Cloud |
| | Cloud Speech-to-Text | Cloud |
| Local | Whisper.cpp | ✅ Yes — fully on-device |
If privacy is your priority, the local Whisper option means your voice never leaves your machine.
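The privacy trade-off can be sketched as a small selection routine. This is an illustrative sketch only: `SttProvider`, `PROVIDERS`, and `pick_stt_provider` are hypothetical names, not part of Meggy's actual configuration, and the registry mirrors only the fully specified rows of the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SttProvider:
    name: str
    model: str
    runs_locally: bool


# Hypothetical registry; not Meggy's real configuration format.
PROVIDERS: List[SttProvider] = [
    SttProvider("OpenAI", "Whisper", runs_locally=False),
    SttProvider("Local", "Whisper.cpp", runs_locally=True),
]


def pick_stt_provider(providers: List[SttProvider], privacy_first: bool) -> SttProvider:
    """Pick the first on-device provider when privacy_first is set,
    otherwise the first provider in the list."""
    if not providers:
        raise LookupError("no STT providers configured")
    if privacy_first:
        for provider in providers:
            if provider.runs_locally:
                return provider
        # Fail loudly rather than silently falling back to a cloud provider.
        raise LookupError("no on-device STT provider configured")
    return providers[0]
```

Raising an error instead of falling back to a cloud provider is a deliberate choice in this sketch: a privacy-first setting should never send audio off the machine without the user noticing.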
The transcribed text is processed through the exact same AI pipeline as typed messages. This means Voice Chat has access to the same tools, memory, and intelligence as text conversations.
You can ask voice questions that trigger tool calls — "What's the weather like?", "Turn off the bedroom lights", "Add milk to my shopping list" — and Meggy will use the appropriate tools to fulfill the request.
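One way to picture "the same pipeline" is a voice turn that simply wraps the text entry point with transcription on the way in and speech on the way out. The sketch below is a toy model under stated assumptions: `handle_message`, `handle_voice_turn`, and the `TOOLS` registry are hypothetical names, and the keyword match stands in for the AI model deciding which tool to call.

```python
from typing import Callable, Dict

# Hypothetical tool registry: tool name -> function that takes the user's text.
TOOLS: Dict[str, Callable[[str], str]] = {
    "shopping_list": lambda text: "Added milk to your shopping list.",
}


def handle_message(text: str) -> str:
    """Stand-in for the shared AI pipeline used by typed messages.
    A real assistant would let the model choose the tool; a keyword
    match is used here only to keep the sketch runnable."""
    if "shopping list" in text.lower():
        return TOOLS["shopping_list"](text)
    return "Sorry, I can't help with that yet."


def handle_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    speak: Callable[[str], None],
) -> str:
    """Run one voice turn: audio -> text -> shared pipeline -> speech."""
    text = transcribe(audio)      # speech-to-text stage
    reply = handle_message(text)  # same entry point as a typed message
    speak(reply)                  # text-to-speech stage
    return reply
```

Because the voice path calls the same `handle_message` function as the text path, any tool available to typed messages is available to spoken ones in this model.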
The AI's response is spoken back to you using natural-sounding text-to-speech (TTS) synthesis:
| Provider | Voices | Quality |
|---|---|---|
| ElevenLabs | Hundreds of natural voices | Premium, highly expressive |
| OpenAI | 6 built-in voices (Alloy, Echo, Fable, Onyx, Nova, Shimmer) | High quality, fast |
| | Cloud TTS with multiple languages | Good quality, wide language support |
You can choose your preferred provider and voice in settings, and adjust the speaking speed and pitch.
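As a rough illustration, these preferences could be modeled as a small settings object. Everything here is an assumption made for the sketch: the `VoiceSettings` class, its field names, the default values, and the allowed speed and pitch ranges are not taken from Meggy. Only the six OpenAI voice names come from the table above.

```python
from dataclasses import dataclass

# Voice names as listed for OpenAI in the table above.
OPENAI_VOICES = {"Alloy", "Echo", "Fable", "Onyx", "Nova", "Shimmer"}


@dataclass
class VoiceSettings:
    provider: str = "OpenAI"
    voice: str = "Alloy"
    speed: float = 1.0  # assumed multiplier; 1.0 means normal speed
    pitch: float = 0.0  # assumed offset; 0.0 means unchanged

    def validate(self) -> None:
        """Raise ValueError when a setting falls outside the assumed limits."""
        if self.provider == "OpenAI" and self.voice not in OPENAI_VOICES:
            raise ValueError(f"unknown OpenAI voice: {self.voice}")
        if not 0.5 <= self.speed <= 2.0:    # assumed range
            raise ValueError("speed must be between 0.5 and 2.0")
        if not -1.0 <= self.pitch <= 1.0:   # assumed range
            raise ValueError("pitch must be between -1.0 and 1.0")
```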
Meggy uses Voice Activity Detection (VAD) to know when you've finished speaking. VAD analyzes the audio stream in real time to detect speech boundaries: it knows when you start talking and when you stop, so it neither cuts you off mid-sentence nor waits awkwardly after you've finished.
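The idea of detecting speech boundaries can be illustrated with a simple energy-based endpointer. This is not Meggy's VAD, which the text does not describe in detail, and production systems typically use trained models. It is a minimal sketch in which the function names, the `threshold`, and the `hangover_frames` parameter are all assumptions. The "hangover" is what keeps a short pause from ending the utterance early.

```python
import math
from typing import List, Optional, Sequence, Tuple


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def find_utterance(
    frames: List[Sequence[float]],
    threshold: float = 0.1,
    hangover_frames: int = 3,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices of the first utterance, end exclusive.

    Speech starts at the first frame whose energy reaches the threshold and
    ends once hangover_frames consecutive quiet frames have been seen.
    Returns None if no frame reaches the threshold.
    """
    start: Optional[int] = None
    last_voiced = 0
    silence_run = 0
    for i, frame in enumerate(frames):
        voiced = rms(frame) >= threshold
        if start is None:
            if voiced:
                start = last_voiced = i
        elif voiced:
            last_voiced = i
            silence_run = 0  # a short pause does not end the utterance
        else:
            silence_run += 1
            if silence_run >= hangover_frames:
                return (start, last_voiced + 1)
    if start is None:
        return None
    return (start, last_voiced + 1)  # stream ended while still in speech
```

A larger `hangover_frames` value tolerates longer pauses at the cost of a slower response after the speaker stops. That is the same trade-off described above between cutting someone off and waiting awkwardly.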
Voice Chat works on all supported platforms.
Voice Chat integrates with all of Meggy's channels — you can start a voice conversation on desktop and continue it via text on WhatsApp, or vice versa. It's all the same conversation, the same memory, the same assistant.