Overview
This node integrates with the JigsawStack API to convert audio content into text. It supports transcription from either a direct audio URL or an audio file stored in a file storage system. The node can automatically detect the language of the audio or use a specified language code, and it optionally translates the transcribed text into English or another specified language. Additionally, it can identify and separate different speakers within the audio. For large audio files, it processes the audio in configurable batch sizes. The node also supports asynchronous processing by sending results to a user-provided webhook URL.
Common scenarios:
- Transcribing interviews or meetings recorded as audio files.
- Converting podcasts or lectures into searchable text.
- Automatically generating subtitles or captions for videos.
- Translating foreign language audio into English text.
- Speaker diarization to distinguish multiple speakers in conversations.
Properties
Name | Meaning |
---|---|
Audio Source | Choose the source of the audio: "Audio URL" or "File Store Key". |
Audio URL | The URL of the audio file to be transcribed (required if Audio Source is "Audio URL"). |
File Store Key | The key identifying the audio file stored in the file storage system (required if Audio Source is "File Store Key"). |
Language | Language code for transcription (e.g., "en", "es", "fr"). Leave blank for automatic language detection. |
Translate | Whether to translate the transcribed content into English or the specified language (true/false). |
By Speaker | Whether to identify and separate different speakers in the audio (true/false). |
Webhook URL | URL to send the transcription result asynchronously. If provided, processing happens asynchronously and results are sent to this URL. |
Batch Size | Number controlling how the audio is chunked for processing. Maximum value is 40. Default is 30. |
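As a rough illustration of how these properties fit together, the sketch below maps them onto a request body. The JSON key names (`url`, `file_store_key`, `language`, `translate`, `by_speaker`, `webhook_url`, `batch_size`) and the lower bound of 1 for the batch size are assumptions made for this example; only the default of 30 and the maximum of 40 come from the table above.

```python
from typing import Any, Optional

MAX_BATCH_SIZE = 40      # maximum stated in the Properties table
DEFAULT_BATCH_SIZE = 30  # default stated in the Properties table


def build_transcription_payload(
    audio_source: str,
    audio_url: Optional[str] = None,
    file_store_key: Optional[str] = None,
    language: Optional[str] = None,
    translate: bool = False,
    by_speaker: bool = False,
    webhook_url: Optional[str] = None,
    batch_size: int = DEFAULT_BATCH_SIZE,
) -> dict[str, Any]:
    """Map the node's properties onto a request body (key names are assumed)."""
    # The lower bound of 1 is an assumption; the table only states the maximum.
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH_SIZE}")

    payload: dict[str, Any] = {
        "translate": translate,
        "by_speaker": by_speaker,
        "batch_size": batch_size,
    }

    # Exactly one audio source is used, depending on the "Audio Source" choice.
    if audio_source == "Audio URL":
        if not audio_url:
            raise ValueError("Audio URL is required for the 'Audio URL' source")
        payload["url"] = audio_url
    elif audio_source == "File Store Key":
        if not file_store_key:
            raise ValueError("File Store Key is required for the 'File Store Key' source")
        payload["file_store_key"] = file_store_key
    else:
        raise ValueError(f"unknown audio source: {audio_source!r}")

    # Optional fields are omitted when blank, so language is auto-detected
    # and processing stays synchronous unless a webhook URL is given.
    if language:
        payload["language"] = language
    if webhook_url:
        payload["webhook_url"] = webhook_url
    return payload
```

For example, `build_transcription_payload("Audio URL", audio_url="https://example.com/talk.mp3", by_speaker=True)` yields a body with speaker separation enabled and the default batch size.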
Output
The node outputs JSON data containing the transcription results. This typically includes the transcribed text, language information, and, if enabled, speaker separation details. When a webhook URL is set, processing is asynchronous: the node may return no immediate transcription, and the results are delivered to the webhook instead.
No binary output is indicated for this node; its output is the JSON transcription data.
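A downstream step might reduce the JSON output to the fields it needs. The sketch below is only illustrative: the field names `text`, `language`, and `speakers` (a list of objects with a `speaker` label) are hypothetical, since the exact output schema is not given here, so every lookup tolerates missing keys.

```python
from typing import Any


def summarize_transcription(result: dict[str, Any]) -> dict[str, Any]:
    """Pull commonly needed fields out of a result item, tolerating missing keys.

    All field names here are hypothetical; check them against real node output.
    """
    speakers = result.get("speakers") or []
    labels = {s.get("speaker") for s in speakers if s.get("speaker")}
    return {
        "text": result.get("text", ""),
        "language": result.get("language"),
        "speaker_labels": sorted(labels),
    }
```

Returning an empty text and an empty label list for an empty item keeps the function safe to use on the asynchronous path, where the immediate output may not contain a transcription.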
Dependencies
- Requires an API key credential for authenticating with the JigsawStack API.
- Needs internet access to call the external JigsawStack API endpoint at https://api.jigsawstack.com/v1.
- Optional webhook URL must be publicly accessible to receive asynchronous transcription results.
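To check the API key and network access outside n8n, a request can be assembled by hand. This is a minimal sketch: the base URL comes from the list above, while the `/ai/transcribe` path and the `x-api-key` header name are assumptions that should be confirmed against the JigsawStack API documentation.

```python
import json
import urllib.request

BASE_URL = "https://api.jigsawstack.com/v1"  # from the Dependencies list
TRANSCRIBE_PATH = "/ai/transcribe"           # assumption: confirm in the API docs
API_KEY_HEADER = "x-api-key"                 # assumption: confirm in the API docs


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Assemble (but do not send) an authenticated JSON POST request."""
    if not api_key:
        raise ValueError("an API key is required")
    return urllib.request.Request(
        url=BASE_URL + TRANSCRIBE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(req, timeout=60)` exercises both the credential and the network path. An HTTP error status usually points to the key or the payload, while a timeout or connection error points to connectivity.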
Troubleshooting
- Invalid or missing API key: Ensure that a valid API key credential is configured in n8n for this node.
- Incorrect audio source configuration: Verify that when "Audio URL" is selected, a valid URL is provided; similarly, when "File Store Key" is selected, the correct key is entered.
- Language code issues: Unsupported or incorrect language codes may lead to inaccurate transcription or errors. Use ISO 639-1 codes such as "en", "es", or "fr", or leave the field blank for automatic detection.
- Webhook failures: If using a webhook URL, ensure it is reachable and correctly handles incoming POST requests; otherwise, transcription results will not be received.
- Batch size limits: Setting batch size above 40 may cause errors; keep it at or below 40.
- Network connectivity: Since the node depends on an external API, network issues can cause timeouts or failures.
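When debugging webhook failures, it can help to point the Webhook URL at a small receiver that accepts POST requests and records what arrives. The sketch below uses only the Python standard library and assumes the callback body is a JSON object; the actual payload shape is not specified here. The in-memory `received` list is for illustration only.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Any

received: list[dict[str, Any]] = []  # in-memory store, for illustration only


def decode_webhook_body(raw: bytes) -> dict[str, Any]:
    """Parse a webhook POST body; raise ValueError if it is not a JSON object."""
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("body is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


class TranscriptionWebhook(BaseHTTPRequestHandler):
    """Accepts POSTed JSON and replies 200, or 400 if the body is malformed."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            received.append(decode_webhook_body(self.rfile.read(length)))
            self.send_response(200)
        except ValueError:
            self.send_response(400)
        self.end_headers()
```

To try it locally, run `HTTPServer(("127.0.0.1", 8000), TranscriptionWebhook).serve_forever()` after importing `HTTPServer` from `http.server`. For real callbacks the receiver must be reachable from the public internet, for example behind a tunnel or reverse proxy.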
Links and References
- JigsawStack API Documentation (for detailed API usage and supported languages)
- ISO 639-1 Language Codes (for specifying language codes)
- n8n Webhook Documentation (for setting up webhook endpoints)