You ask Meggy to debug a tricky race condition, and a heavyweight reasoning model spends thirty seconds thinking through the code. Five minutes later you ask "what's 20% of 85?" — and the same expensive model fires up for a one-line answer. That's wasteful.
Meggy solves this by organizing every AI model into categories — logical tiers that match the right model to each task automatically. A quick calculation goes to a fast, cheap model. A complex debugging session goes to a powerful reasoning model. An image prompt goes to an image generator. You don't have to think about it — but you can control it.
Every model in Meggy's catalog belongs to one or more of these categories:
| Category | Icon | What It Does | Used By Default For |
|---|---|---|---|
| Thinking | 💭 | Powerful reasoning models for complex tasks | Chat conversations, multi-step tool use, engine execution, agent pipelines |
| Fast | ⚡ | Lightweight models optimized for speed | Quick mode, tier classification, message routing, DAG planning |
| Utility | 🔧 | Balanced models for everyday tasks | Summarization, auto-tagging, classification, fallback when Thinking is unavailable |
| Image | 🖼️ | Image generation models | Creating images from text prompts (DALL-E, Flux, Gemini Image) |
| Video | ▶️ | Video generation models | Creating videos from text or image prompts (Sora, Veo, Runway, Kling) |
| TTS | 🎙️ | Text-to-speech synthesis | Voice output, narration, spoken responses in voice chat |
| STT | 🎧 | Speech-to-text transcription | Voice input, audio transcription, wake word processing |
| Embedding | 🗄️ | Vector embedding models | Vault document indexing, semantic search, RAG retrieval |
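One way to picture the table above is as a mapping from each category to an ordered chain of models. The sketch below is purely illustrative — the model identifiers and structure are assumptions, not Meggy's actual configuration format:

```python
# Hypothetical sketch of a category-to-chain mapping. Model names are
# illustrative placeholders, not Meggy's real configuration keys.
MODEL_CHAINS = {
    "thinking":  ["gpt-5.2", "claude-sonnet-4.6", "gemini-3.1-pro"],
    "fast":      ["gpt-5-mini", "claude-haiku-4.5", "gemini-3-flash"],
    "utility":   ["gpt-5-mini"],
    "image":     ["dall-e-3", "flux-2-pro"],
    "video":     ["sora-2", "veo-3.1"],
    "tts":       ["openai-tts"],
    "stt":       ["whisper-large-v3"],
    "embedding": ["openai-embedding-3"],
}

def primary_model(category: str) -> str:
    """Return the first (preferred) model in a category's chain."""
    return MODEL_CHAINS[category][0]
```

The key idea is the ordering: the first entry in each list is the preferred model, and later entries exist only as fallbacks.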
Thinking models are the workhorses. They handle the heavy lifting — multi-step reasoning, tool orchestration, code generation, and deep analysis. When you send a message in standard chat, Meggy routes it to the first model in your Thinking chain.
Thinking models must support tool calling, because Meggy relies on them to invoke tools, read files, search the web, and execute multi-step workflows. Models like GPT-5.2, Claude Sonnet 4.6, and Gemini 3.1 Pro are typical choices.
Fast models are the sprinters. They handle tasks where speed matters more than depth — classifying your message tier, routing it to the right execution path, generating quick one-line answers in Quick mode, and planning DAG task graphs.
Like Thinking models, Fast models must support tool calling. GPT-5 Mini, Claude Haiku 4.5, and Gemini 3 Flash are common picks.
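Conceptually, tier classification and routing reduce to a small lookup: a Fast model labels the message, and the label selects a category. The tier names below are invented for illustration — Meggy's actual tiers and routing rules may differ:

```python
# Hypothetical sketch: map a tier label (as a Fast model might return it)
# to the model category that should handle the message.
# Tier names here are invented, not Meggy's real labels.
def route_tier(tier: str) -> str:
    routes = {
        "quick": "fast",         # one-line answers in Quick mode
        "standard": "thinking",  # normal chat conversation
        "complex": "thinking",   # multi-step tool use
        "image": "image",        # image generation request
        "video": "video",        # video generation request
    }
    return routes.get(tier, "thinking")  # unknown tiers fall back to Thinking
```

Defaulting unknown tiers to Thinking is a deliberately conservative choice: a powerful model can always handle a simple message, but not vice versa.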
Utility models sit between Thinking and Fast. They're capable enough for summarization, classification, and auto-tagging, but cheaper than Thinking models. Meggy uses them as a fallback when a Thinking model is unavailable and for background tasks like conversation summarization.
Image models generate pictures from text prompts. When you ask Meggy to "draw a sunset over a mountain lake", it routes the request to your configured Image model — DALL-E 3, Flux 2 Pro, Gemini Flash Image, or one of the other 16 image models in the catalog.
Video models create short clips from text or image prompts. Meggy supports Sora 2, Veo 3.1, Runway Gen-4 Turbo, and Kling 3.0 — though video generation typically takes longer and costs more than image generation.
TTS models convert text into spoken audio. They power voice chat responses, narration, and any feature where Meggy speaks to you. Providers include OpenAI TTS, ElevenLabs, Deepgram Aura, Cartesia Sonic, and local options like Piper.
STT models do the reverse — converting your spoken words into text. They power voice input, wake word detection, and audio transcription. Whisper variants, Deepgram Nova-3, and AssemblyAI are typical choices.
Embedding models convert text (and sometimes images) into numerical vectors for semantic search. They're invisible to you but essential behind the scenes — powering Vault document indexing, RAG retrieval, and memory search. Models like OpenAI Embedding 3 and Gemini Embedding handle this role.
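The mechanism behind that semantic search is simple to sketch: each document and each query becomes a vector, and similarity is measured by the cosine of the angle between them. The toy two-dimensional vectors below stand in for the hundreds or thousands of dimensions a real embedding model returns:

```python
import math

# Minimal sketch of embedding-based semantic search: rank documents by
# cosine similarity to the query vector. Real embeddings have far more
# dimensions; the math is identical.
def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def best_match(query_vec: list[float], doc_vecs: list[list[float]]) -> int:
    """Return the index of the document most similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return scores.index(max(scores))
```

This is the retrieval step that Vault indexing and RAG build on: documents are embedded once at index time, and each query is embedded and compared at search time.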
For each category, you configure an ordered list of models — a fallback chain. If the first model fails (network error, rate limit, key issue), Meggy automatically tries the next one in the chain.
For example, your Thinking chain might be:

1. GPT-5.2
2. Claude Sonnet 4.6
3. Gemini 3.1 Pro
If GPT-5.2 is rate-limited, Claude picks up seamlessly. If Claude is also down, Gemini handles the request. You never see an error — just a smooth response from whichever model is available.
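The fallback logic behind that smooth handoff can be sketched in a few lines — try each model in order and move on when a call fails. Here `call_model` is a hypothetical stand-in for a real provider API call, not an actual Meggy function:

```python
# Hypothetical sketch of a fallback chain. `call_model` stands in for a
# real provider API call and is passed in so the chain logic stays
# provider-agnostic.
def call_with_fallback(chain, prompt, call_model):
    last_error = None
    for model in chain:
        try:
            return model, call_model(model, prompt)
        except Exception as err:  # rate limit, network error, key issue...
            last_error = err      # remember the failure, try the next model
    raise RuntimeError(f"all models in chain failed: {last_error}")
```

The caller only ever sees a successful response (plus which model produced it) or a single error after the entire chain is exhausted — matching the "you never see an error" behavior described above.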
Meggy's onboarding wizard sets up sensible default chains based on the providers you configure. You can always customize them later.
Not all categories show up in the chat model picker. The Smart Bar lets you select models from the chat-selectable categories — Thinking, Fast, Utility, Image, and Video.
Categories like TTS, STT, and Embedding work behind the scenes and are configured in Settings → Models rather than selected per-conversation.
During first-time setup, the onboarding wizard automatically assigns the best available model to each category based on your configured providers. No manual configuration needed unless you want to fine-tune.