Create voice group
To create a new voice group, click “Voices” on the left navbar.
- Voice group name: The name of the voice group (must be unique within organization)
- Description (optional): Description of the voice group.
- Languages: Here you can select which languages are associated with the language group. For each language, you must enter a DTMF code (a number between 1 and 9, inclusive), select a voice provider, and select a voice from that provider. Note that your voice options are limited to voices that support the selected language. Each language in the group must be unique and have a unique DTMF code.
- The option “Skip first language when reading out language options” allows you to omit the first language in the group from the DTMF menu presented as part of the greeting message (see below). You can also configure speed and pitch (depending on if the voice being used supports it), both of which take decimal numbers.
- Create a voice group as outlined above.
- Navigate to the agent. Your voice group should appear as an option in the Voice Group field on the agent edit screen. Select the voice group and save changes.
- Navigate to the agent’s message. Insert the tag
{{ language.mode }}into the message (and/or the text of any message rules) where you would like the language selection menu to be delivered by the agent. Save changes.
{{ language.mode }} tag, it will instead read off a menu automatically generated from the voice group linked to the agent.
For example, for the voice group with:
Dynamic language switching
In addition to the keypad-based DTMF menu, callers can switch languages at any point in the session by speaking naturally—no digit press required. See Dynamic language switching in the Agents guide for setup instructions and behavior details.Bridge phrases
When Responsive Dialogue is enabled and the system is slow to respond or is making a tool call (e.g., looking up information), the agent can say short phrases so the caller knows to wait. You can customize these per language—select a language in the Phrases editor to set delay and phrase options for that locale.- Go to Voices in the sidenav and open the Phrases tab.
- Select a language from the sidebar (for example, English (Default) or Spanish).
- Press Edit and add your bridge phrase to the appropriate section:
- Bridge phrase delay: Seconds of silence before the agent speaks the first bridge phrase. If processing continues, additional bridge phrases follow at intervals of 2×, 3×, and 4× this base value. Valid range is 0.25–30 seconds. When unset, the service-wide default applies (2.75 seconds).
- Initial bridge phrases: Used the first time a response is taking longer than expected. If no custom config exists, the system uses hardcoded defaults by language (see Default bridge phrases below).
- Additional bridge phrases: Used if the wait continues. If empty, falls back to Initial bridge phrases.
- Tool bridge phrases: Used when a specific tool call is in progress. No hardcoded defaults—if empty, the agent continues using regular timeout bridge phrases.
- Add or edit phrases (one per line), then save.
Default bridge phrases
If you don’t configure custom phrases, the system uses hardcoded defaults by language. English- “Ok, one moment.”
- “Just a second.”
- “Got it, just a moment please.”
- “Sorry this is taking a minute.”
- “Just taking a look here.”
- “un momento…”
- “un minuto…”
- “ok, veamos…”
- “espere…”
- “un segundo…”
- “ok, un momento…”
- “hmmm, veamos…”
- “solo un segundo…”
- “ok, espere…”
- “solo un momento…”
- “hmmm, espere…”
- “ok, solo un segundo…”
- If config exists but a localized language field is empty, that field falls back to the config’s English (default) value.
- If no config exists at all, fallback is hardcoded language defaults (Spanish if available, otherwise English).
Available voices and languages
Google Cloud Text‑to‑Speech
| Voice | Gender | Model | Languages |
|---|---|---|---|
| Aoede | Female | Chirp 3 HD | en‑US, ko‑KR, zh‑CN, es‑US, th‑TH, vi‑VN |
| Charon | Male | Chirp 3 HD | en‑US, ko‑KR, zh‑CN, es‑US, th‑TH, vi‑VN |
| Fenrir | Male | Chirp 3 HD | en‑US, ko‑KR, zh‑CN, es‑US, th‑TH, vi‑VN |
| Kore | Female | Chirp 3 HD | en‑US, ko‑KR, zh‑CN, es‑US, th‑TH, vi‑VN |
| Leda | Female | Chirp 3 HD | en‑US, ko‑KR, zh‑CN, es‑US, th‑TH, vi‑VN |
| Orus | Male | Chirp 3 HD | en‑US, ko‑KR, zh‑CN, es‑US, th‑TH, vi‑VN |
| Puck | Male | Chirp 3 HD | en‑US, ko‑KR, zh‑CN, es‑US, th‑TH, vi‑VN |
| Zephyr | Female | Chirp 3 HD | en‑US, ko‑KR, zh‑CN, es‑US, th‑TH, vi‑VN |
| en‑US‑Neural2‑D | Male | Neural2 | en‑US |
| en‑US‑Neural2‑F | Female | Neural2 | en‑US |
| es‑US‑Neural2‑A | Female | Neural2 | es‑US |
| es‑US‑Neural2‑B | Male | Neural2 | es‑US |
| ko‑KR‑Neural2‑A | Female | Neural2 | ko‑KR |
| vi‑VN‑Neural2‑A | Female | Neural2 | vi‑VN |
| cmn‑TW‑Wavenet‑A | Female | WaveNet | cmn‑TW |
| yue‑HK‑Standard‑C | Female | Standard | yue‑HK |
| en‑US‑Studio‑O | Female | Studio | en‑US |
OpenAI Text‑to‑Speech
| Voice | Gender | Model | Languages |
|---|---|---|---|
| Alloy | Male | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
| Ash | Male | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
| Coral | Female | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
| Echo | Female | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
| Fable | Female | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
| Nova | Female | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
| Onyx | Male | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
| Sage | Male | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
| Shimmer | Female | TTS‑1 | en‑US, ko‑KR, zh‑CN, fa‑IR, es‑US, th‑TH, vi‑VN |
ElevenLabs
| Voice | Gender | Model | Languages |
|---|---|---|---|
| Alice | Female | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Bill | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Brian | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Callum | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Charlie | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Charlotte | Female | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Chris | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Daniel | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Eric | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| George | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Jessica | Female | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Laura | Female | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Liam | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Lily | Female | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Matilda | Female | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| River | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Roger | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Sarah | Female | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
| Will | Male | eleven_flash_v2_5 | en‑US, ko‑KR, zh‑CN, es‑US |
- Log into the Syllable Console
- Click the Voices tab, then select New Voice Group
- Name your new Voice Group.
- Select the default language.
- Pick a voice provider.
- Click the speaker icon to listen to a sample of the voice you want to hear.
- Tweak the Voice Speed and Pitch to your liking. We recommend small increments to maintain a human timbre.
Pronunciations
Text-to-Speech systems often mispronounce proper nouns, brands, acronyms, and medical terms. Pronunciations will allow users to correct how agents pronounce words so that users can ensure agents are properly and clearly communicating with callers.- Note: Pronunciations are organization-wide and affect all agents using voices and languages.

- Click “upload file” or download a sample CSV template, fill it in, and upload.

- text: The original word or phrase as it appears in transcripts or responses
- replacement: The phonetic or re-spelled version the TTS engine should use (For example: “kernel” for “colonel”).
- language: The relevant language (e.g., en-US, English, es-US, Spanish, etc.)

- Download CSV: Download your current pronunciation table from the Pronunciations page. The CSV will have the following columns: text | replacement | language
- Edit CSV: Add new rows for any words or phrases you wish to customize.
-
Example:

- Tip: Use respelling or phonetic spelling. You may need to experiment to find the best pronunciation for your TTS voice.
-
Example:
- Upload CSV: Once you have made your changes, upload the CSV back to the platform. The system will immediately begin using your updated pronunciations.
- Select a TTS provider, voice, and language
- Enter the replacement text or phrase
- Playback the result to ensure it sounds correct
- Adjust your replacement spelling as needed for clarity or naturalness
- Be specific: The text column should match exactly how the word/phrase appears in your outputs.
- Phonetic spelling: Start with simple respellings, but if necessary, use more detailed phonetic hints or hyphenation.
- Language matching: Ensure the language field matches the TTS model you’re targeting.
- If you want to add more pronunciation replacements, download the latest CSV, add your changes, and re-upload.
- Pronunciation customizations are organization-wide and affect all agents using affected voices and languages.
- There is no limit to the number of entries you can add.

