Overview
This node processes an audio file from a given URL by sending it to a remote Automatic Speech Recognition (ASR) service. It downloads the audio, converts it to a standard WAV format if necessary, and then uploads it to the ASR API for transcription. The resulting text transcription is added to the output JSON under a user-defined field name.
Common scenarios where this node is useful include:
- Transcribing recorded meetings or interviews hosted online.
- Extracting spoken content from podcasts or audio lectures available via URLs.
- Automating subtitle generation for videos with accessible audio files.
- Integrating speech-to-text functionality into workflows that handle audio data.
Example: Given a URL to an MP3 podcast episode, the node downloads the audio, converts it to WAV, sends it to the ASR service, and outputs the transcription text in a specified JSON field.
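The flow described above (download, convert, transcribe) can be sketched as a small function that chains three steps. This is only an illustration under assumptions: `PipelineSteps`, `runPipeline`, and the step signatures are hypothetical names, not the node's actual internals, and the steps are injected so the order of operations is easy to see and test.

```typescript
// Hypothetical step signatures; the node's real internals are not shown in this document.
interface PipelineSteps {
  download: (url: string) => Promise<Uint8Array>;
  toWav: (audio: Uint8Array) => Promise<Uint8Array>;
  transcribe: (wav: Uint8Array, language: string) => Promise<string>;
}

// Runs the three stages in order and returns the transcription text.
async function runPipeline(
  url: string,
  language: string,
  steps: PipelineSteps,
): Promise<string> {
  const audio = await steps.download(url); // fetch the raw audio bytes
  const wav = await steps.toWav(audio); // normalize to WAV (may be a no-op)
  return steps.transcribe(wav, language); // upload to the ASR service
}
```

Injecting the steps keeps the sketch independent of any particular HTTP client or ffmpeg invocation.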
Properties
Name | Meaning |
---|---|
Audio File URL | The URL of the audio file to process. Must be a valid link pointing to an audio resource. |
Language | Language code for the ASR engine to use (e.g., "en" for English, "zh" for Chinese). Use "auto" for automatic language detection. |
Output Field Name | The name of the field in the output JSON where the transcription result will be stored. |
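
For readers building a similar node, the three properties above might be declared roughly as follows. This is a sketch, not the node's actual source: the internal names (`audioUrl`, `language`, `outputField`) and the default values are assumptions, and `NodePropertySketch` is a local stand-in for the `INodeProperties` type that a real node would import from `n8n-workflow`.

```typescript
// Local stand-in for the property shape used by n8n nodes; a real node would
// import INodeProperties from "n8n-workflow" instead.
interface NodePropertySketch {
  displayName: string;
  name: string; // internal parameter name (assumed, not taken from the node)
  type: "string";
  default: string; // default values below are assumptions
  required?: boolean;
  description: string;
}

const properties: NodePropertySketch[] = [
  {
    displayName: "Audio File URL",
    name: "audioUrl",
    type: "string",
    default: "",
    required: true,
    description: "The URL of the audio file to process.",
  },
  {
    displayName: "Language",
    name: "language",
    type: "string",
    default: "auto",
    description: 'Language code for the ASR engine, or "auto" for detection.',
  },
  {
    displayName: "Output Field Name",
    name: "outputField",
    type: "string",
    default: "transcription",
    description: "Field in the output JSON that receives the transcription.",
  },
];
```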
Output
The node outputs one item per input item. Each output item's `json` property contains all original input fields plus an additional field named as specified by the "Output Field Name" property. This field holds the transcription text returned by the ASR service.
If the transcription fails or no result is returned, the output field contains a default failure message (in Chinese, meaning "Retrieval failed").
No binary data is output by this node.
Example output JSON structure for one item:
{
  "originalField1": "...",
  "originalField2": "...",
  "transcription": "This is the transcribed text from the audio."
}
(Note: The key "transcription" can be customized.)
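The merging behavior described in this section could look roughly like the sketch below. It is illustrative only: `withTranscription` is a hypothetical helper, `FAILURE_MESSAGE` is a placeholder rather than the node's real Chinese string, and treating an empty or whitespace-only result as a failure is an assumption.

```typescript
type Json = Record<string, unknown>;

// Placeholder only: the node's real failure message is a Chinese string
// that is not reproduced in this document.
const FAILURE_MESSAGE = "<failure message>";

// Copies the input fields and adds the transcription under the chosen key.
// Assumption: an undefined or blank result counts as "no result returned".
function withTranscription(
  input: Json,
  outputField: string,
  text: string | undefined,
): Json {
  const value =
    text !== undefined && text.trim() !== "" ? text : FAILURE_MESSAGE;
  return { ...input, [outputField]: value };
}

const ok = withTranscription({ id: 7 }, "transcription", "Hello world.");
const failed = withTranscription({ id: 8 }, "transcription", undefined);
```

Here `ok` keeps `id` and gains a `transcription` field with the text, while `failed` carries the placeholder failure message in the same field.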
Dependencies
- Requires an external ASR service accessible via HTTP POST with an API URL and API key configured in the node credentials.
- Depends on `ffmpeg` being installed and available in the system PATH to convert audio files to WAV format if they are not already in the expected format.
- Uses the `request-promise-native` library for HTTP requests.
- Uses Node.js built-in modules such as `child_process` for spawning ffmpeg and `stream` for handling audio data streams.
- Temporary files are written to the `/tmp/` directory for debugging purposes.
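The ffmpeg dependency can be illustrated with a minimal sketch that pipes audio through an `ffmpeg` child process. The options used (`-i pipe:0`, `-f wav`, `pipe:1`, `-hide_banner`, `-loglevel error`) are standard ffmpeg options, but whether the node uses these exact arguments is not known from this document. The `looksLikeWav` header check is likewise an assumed way of deciding when conversion is unnecessary; both function names are hypothetical.

```typescript
import { spawn } from "node:child_process";

// True when the buffer starts with a RIFF/WAVE header
// (bytes 0-3 are "RIFF", bytes 8-11 are "WAVE").
function looksLikeWav(audio: Buffer): boolean {
  return (
    audio.length >= 12 &&
    audio.toString("ascii", 0, 4) === "RIFF" &&
    audio.toString("ascii", 8, 12) === "WAVE"
  );
}

// Pipes the audio through ffmpeg and resolves with the WAV bytes.
// Requires ffmpeg on the system PATH; the argument list is an assumption.
function convertToWav(audio: Buffer): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const ffmpeg = spawn("ffmpeg", [
      "-hide_banner", "-loglevel", "error",
      "-i", "pipe:0", // read input from stdin
      "-f", "wav", "pipe:1", // write WAV to stdout
    ]);
    const out: Buffer[] = [];
    const err: Buffer[] = [];
    ffmpeg.stdout.on("data", (chunk: Buffer) => out.push(chunk));
    ffmpeg.stderr.on("data", (chunk: Buffer) => err.push(chunk));
    // Fires when the binary cannot be started (for example, not on PATH).
    ffmpeg.on("error", (e) => reject(new Error(`Failed to spawn ffmpeg: ${e.message}`)));
    ffmpeg.on("close", (code) => {
      if (code === 0) resolve(Buffer.concat(out));
      else reject(new Error(`ffmpeg exited with code ${code}: ${Buffer.concat(err).toString()}`));
    });
    // Ignore write errors if ffmpeg exits before consuming all input.
    ffmpeg.stdin.on("error", () => { /* reported via the close handler */ });
    ffmpeg.stdin.end(audio);
  });
}
```

A caller would typically skip `convertToWav` when `looksLikeWav` returns true and convert otherwise.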
Troubleshooting
- ffmpeg not found or conversion fails: The node spawns `ffmpeg` to convert audio to WAV. If `ffmpeg` is not installed or not in the system PATH, the node throws an error indicating that it failed to spawn ffmpeg.
  Resolution: Install ffmpeg and ensure it is accessible from the command line.
- Audio download issues: If the audio URL is invalid, inaccessible, or times out (the timeout is 30 seconds), the node fails to download the audio.
  Resolution: Verify the URL is correct, publicly accessible, and that the server allows downloading.
- ASR service errors: If the ASR API URL or API key is missing or incorrect in the credentials, the node throws an error. If the ASR response is malformed or does not contain the expected fields, warnings are logged and a failure message is output.
  Resolution: Ensure the ASR service credentials are correctly configured and the service is operational.
- Unexpected ASR response structure: The node expects a JSON response with a `result` array containing objects with a `text` field. If this structure changes, the node logs warnings and returns a failure message.
  Resolution: Confirm the ASR API response format matches expectations or update the node accordingly.
- Small or corrupted audio files: If the downloaded audio buffer is very small (<200 bytes), the node logs a hex dump of it for debugging. Corrupted or unsupported audio formats may cause conversion or transcription failures.
- Continue On Fail: If enabled, the node continues processing subsequent items even if some fail, returning error details per item.
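The expected response shape (a `result` array of objects with a `text` field) can be checked defensively, as in the sketch below. It is an illustration rather than the node's code: `extractTranscription` is a hypothetical helper, the warning wording is invented, and reading only the first entry of `result` is an assumption, since this document does not say how multiple entries are combined.

```typescript
// Returns the transcription text, or undefined when the response does not
// match the expected shape: { result: [{ text: "..." }, ...] }.
function extractTranscription(response: unknown): string | undefined {
  if (typeof response !== "object" || response === null) {
    console.warn("ASR response is not a JSON object");
    return undefined;
  }
  const result = (response as { result?: unknown }).result;
  if (!Array.isArray(result) || result.length === 0) {
    console.warn("ASR response has no non-empty 'result' array");
    return undefined;
  }
  // Assumption: only the first entry is used.
  const first: unknown = result[0];
  const text =
    typeof first === "object" && first !== null
      ? (first as { text?: unknown }).text
      : undefined;
  if (typeof text !== "string") {
    console.warn("First 'result' entry has no string 'text' field");
    return undefined;
  }
  return text;
}
```

A caller can map an `undefined` return value to the failure message, so a malformed response degrades to the documented fallback instead of throwing.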
Links and References
- FFmpeg Official Website — For installing and understanding ffmpeg.
- n8n Documentation — General information about creating and using custom nodes.
- FormData npm package — Used internally for multipart form uploads.
- Request-Promise-Native — HTTP client used for requests.
This summary is based solely on static analysis of the provided source code and property definitions.