Overview
The ElevenLabs Speech node with the "Voice changer" operation allows users to transform an existing audio file by changing its voice characteristics. This is useful for applications such as creating voiceovers with different voices, anonymizing speakers, or generating creative audio effects by modifying the original voice in an audio clip.
Typical scenarios include:
- Altering a recorded message to sound like a different speaker.
- Enhancing audio content for entertainment or marketing by applying unique voice styles.
- Privacy-focused transformations where the original speaker's identity needs to be masked.
For example, you can input an audio file containing a spoken sentence and select a target voice ID to produce a new audio output where the speech sounds like it was spoken by the chosen voice.
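As an illustration, the example above corresponds roughly to a single HTTP request. The following is a minimal sketch, not the node's actual implementation. The `/speech-to-speech/{voice_id}` path, the `xi-api-key` header, and the `audio`, `model_id`, and `output_format` names are assumptions drawn from the public ElevenLabs API, and `build_voice_changer_request` is a hypothetical helper. It only describes the request and sends nothing.

```python
from urllib.parse import urlencode

# Base endpoint as listed under Dependencies on this page.
BASE_URL = "https://api.elevenlabs.io/v1"


def build_voice_changer_request(voice_id, api_key, output_format=None, model_id=None):
    """Describe a voice-changer request without sending it.

    Path, header, and field names are assumptions based on the public
    ElevenLabs speech-to-speech API, not on the node's source code.
    """
    url = f"{BASE_URL}/speech-to-speech/{voice_id}"

    # Optional query parameters (assumed name: output_format).
    query = {}
    if output_format is not None:
        query["output_format"] = output_format
    if query:
        url += "?" + urlencode(query)

    # Optional multipart form fields (assumed name: model_id).
    form = {}
    if model_id is not None:
        form["model_id"] = model_id

    return {
        "method": "POST",
        "url": url,
        "headers": {"xi-api-key": api_key},  # assumed header name
        "form": form,
        "file_field": "audio",  # assumed multipart field carrying the input audio
    }
```

A caller would pass the returned dictionary to an HTTP client of its choice and attach the input audio under the `file_field` name.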
Properties
Name | Meaning |
---|---|
Binary Input Field | The name of the binary property that contains the audio file to transform (e.g., "data"). This is the input audio that will have its voice changed. |
Voice ID | The identifier of the voice to use for the transformation. You can select from a list of available voices or enter a specific voice ID manually. |
Additional Fields | A collection of optional parameters to customize the request: |
- Binary Name | Change the output binary property's name (default is "data"). |
- File Name | Change the output file name (default is "voice"). |
- Streaming Latency | Optimize streaming latency at some cost to quality. Values range from 0 (no optimization) to 4 (max optimization with text normalizer off). |
- Output Format | Choose the output audio format. Options include various MP3 and PCM formats, and μ-law encoding. |
- Model Name or ID | Identifier of the model used for voice transformation. Select from a list or specify an ID. |
- Stability | A number between 0 and 1 that controls voice stability; higher values produce a more consistent, less variable voice. |
- Similarity Boost | Controls how closely the output voice matches the target voice; a value between 0 and 1. |
- Style | Exaggerates the voice style; a number between 0 and 1. |
- Speaker Boost | Boolean to activate speaker boost feature. |
- Seed | A number between 0 and 4294967295 to make the voice transformation deterministic (same seed + same input = same output). |
- Enable Logging | Boolean to enable or disable logging (affects retention/history features). |
- Remove Background Noise | Boolean to remove background noise from the input audio before processing. |
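The numeric ranges in the table above can be checked before a request is made. The sketch below is one hypothetical way to do that. The function name and the dictionary keys (`stability`, `similarity_boost`, `style`, `seed`, `streaming_latency`) are illustrative assumptions, not the node's internal parameter names.

```python
MAX_SEED = 4294967295  # upper bound for Seed given in the table


def validate_voice_changer_options(options):
    """Return a list of problems found in `options`; an empty list means OK.

    Keys are hypothetical names for the Additional Fields described above.
    """
    problems = []

    # Stability, Similarity Boost, and Style are all numbers in [0, 1].
    for name in ("stability", "similarity_boost", "style"):
        value = options.get(name)
        if value is not None and not (0 <= value <= 1):
            problems.append(f"{name} must be between 0 and 1, got {value}")

    # Seed is an integer in [0, 4294967295].
    seed = options.get("seed")
    if seed is not None and not (isinstance(seed, int) and 0 <= seed <= MAX_SEED):
        problems.append(f"seed must be an integer between 0 and {MAX_SEED}, got {seed}")

    # Streaming Latency is an integer level from 0 to 4.
    latency = options.get("streaming_latency")
    if latency is not None and not (isinstance(latency, int) and 0 <= latency <= 4):
        problems.append(f"streaming_latency must be an integer from 0 to 4, got {latency}")

    return problems
```

Fields that are absent are treated as unset and skipped, which matches their optional status in the table.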
Output
The node outputs the transformed audio data in the specified binary property (default "data"). The audio format corresponds to the selected output format option (e.g., MP3, PCM). The binary data represents the voice-changed audio file ready for further use or download.
Beyond the binary audio content, no additional JSON output fields are documented.
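When the binary output is saved outside the workflow, the file extension should match the selected output format. The sketch below assumes format identifiers shaped like `mp3_44100_128`, `pcm_16000`, or `ulaw_8000` (an assumption based on ElevenLabs naming; the exact option values are not listed on this page). `extension_for_format` and `save_audio` are hypothetical helpers.

```python
from pathlib import Path

# Mapping from the codec prefix of a format identifier to a file extension.
# The prefixes are assumptions; unknown prefixes fall back to ".bin".
_EXTENSIONS = {"mp3": ".mp3", "pcm": ".pcm", "ulaw": ".ulaw"}


def extension_for_format(output_format):
    """Pick a file extension from the codec prefix of `output_format`."""
    codec = output_format.split("_", 1)[0].lower()
    return _EXTENSIONS.get(codec, ".bin")


def save_audio(audio_bytes, directory, file_name="voice", output_format="mp3_44100_128"):
    """Write the transformed audio to disk and return the resulting path.

    `file_name` defaults to "voice", the default File Name noted above;
    the default `output_format` here is only an illustrative assumption.
    """
    target = Path(directory) / (file_name + extension_for_format(output_format))
    target.write_bytes(audio_bytes)
    return target
```

Note that raw PCM and μ-law data carry no header, so a player needs the sample rate from the format identifier to interpret such files.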
Dependencies
- Requires an API key credential for ElevenLabs API authentication.
- Network access to the ElevenLabs API endpoint (https://api.elevenlabs.io/v1).
- Proper configuration of the node with valid voice IDs and model identifiers.
- Optional: Enabling logging requires appropriate permissions and may affect data retention.
Troubleshooting
- Invalid Voice ID: If the voice ID is incorrect or not found, the API will likely return an error. Verify the voice ID by listing available voices.
- Missing Binary Input: Ensure the binary input field name matches the actual binary property containing the audio file.
- Unsupported Audio Format: Input audio must be compatible with the API requirements; otherwise, the transformation may fail.
- API Authentication Errors: Check that the API key credential is correctly configured and has necessary permissions.
- Latency Optimization Quality Tradeoff: Using higher streaming latency optimization values may degrade audio quality; adjust according to needs.
- Logging Disabled: Turning logging off also turns off history features; keep it enabled if you need usage tracking.
- Background Noise Removal: Enabling this may increase processing time; disable if not needed.
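The first two checks in the list above can be sketched as small helpers. Both are hypothetical. `find_voice` assumes a voice listing shaped like `{"voices": [{"voice_id": ..., "name": ...}]}`, which is the shape the ElevenLabs voices endpoint is believed to return. `has_binary_field` assumes the usual n8n item layout, with binary properties stored under a `binary` key.

```python
def find_voice(voices_response, voice_id):
    """Return the voice entry matching `voice_id`, or None if it is absent.

    `voices_response` is an already-parsed JSON listing of voices; its
    shape is an assumption, see the note above.
    """
    for voice in voices_response.get("voices", []):
        if voice.get("voice_id") == voice_id:
            return voice
    return None


def has_binary_field(item, field_name="data"):
    """Check that a workflow item carries the named binary property.

    `field_name` defaults to "data", the example Binary Input Field name.
    """
    binary = item.get("binary") or {}
    return field_name in binary


# Usage with made-up sample data (the IDs and names are placeholders):
sample_voices = {"voices": [{"voice_id": "v1", "name": "Example Voice"}]}
sample_item = {"json": {}, "binary": {"data": {"mimeType": "audio/mpeg"}}}

voice_ok = find_voice(sample_voices, "v1") is not None
input_ok = has_binary_field(sample_item, "data")
```

Running both checks before calling the node narrows a failure down to either the voice ID or the input field name.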