JigsawStack

Use JigsawStack API

Overview

This node uses the JigsawStack API to transcribe audio into text. The audio can come from a direct URL or from a file stored in a file storage system. The node detects the spoken language automatically unless a language code is specified, and it can optionally translate the transcript into English or the specified language. It can also identify and separate the different speakers in the audio. Large audio files are processed in chunks whose size is set by the Batch Size property. If a webhook URL is provided, processing is asynchronous and the result is sent to that URL.

Common scenarios:

  • Transcribing interviews or meetings recorded as audio files.
  • Converting podcasts or lectures into searchable text.
  • Automatically generating subtitles or captions for videos.
  • Translating foreign language audio into English text.
  • Distinguishing multiple speakers in a conversation (speaker diarization).

Properties

| Name | Meaning |
| --- | --- |
| Audio Source | Choose the source of the audio: "Audio URL" or "File Store Key". |
| Audio URL | The URL of the audio file to be transcribed (required if Audio Source is "Audio URL"). |
| File Store Key | The key identifying the audio file stored in the file storage system (required if Audio Source is "File Store Key"). |
| Language | Language code for transcription (e.g., "en", "es", "fr"). Leave blank for automatic language detection. |
| Translate | Whether to translate the transcribed content into English or the specified language (true/false). |
| By Speaker | Whether to identify and separate different speakers in the audio (true/false). |
| Webhook URL | URL to send the transcription result to. If provided, processing happens asynchronously and the result is sent to this URL. |
| Batch Size | Number controlling how the audio is chunked for processing. Maximum value is 40. Default is 30. |
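
The rules in the table above can be sketched as a small Python helper that turns the node's properties into a request body. This is an illustration, not the node's actual implementation: the function name, the JSON key names (`url`, `file_store_key`, `language`, `translate`, `by_speaker`, `webhook_url`, `batch_size`), and the lower bound of 1 for the batch size are all assumptions. Only the two audio sources, the default of 30, and the maximum of 40 come from the table.

```python
from typing import Any, Dict, Optional

MAX_BATCH_SIZE = 40      # maximum stated in the properties table
DEFAULT_BATCH_SIZE = 30  # default stated in the properties table


def build_transcription_payload(
    audio_source: str,
    audio_url: Optional[str] = None,
    file_store_key: Optional[str] = None,
    language: str = "",
    translate: bool = False,
    by_speaker: bool = False,
    webhook_url: str = "",
    batch_size: int = DEFAULT_BATCH_SIZE,
) -> Dict[str, Any]:
    """Map the node's properties onto a request body (key names are assumed)."""
    payload: Dict[str, Any] = {}

    # Exactly one audio source is used, chosen by the Audio Source property.
    if audio_source == "Audio URL":
        if not audio_url:
            raise ValueError('Audio URL is required when Audio Source is "Audio URL"')
        payload["url"] = audio_url
    elif audio_source == "File Store Key":
        if not file_store_key:
            raise ValueError('File Store Key is required when Audio Source is "File Store Key"')
        payload["file_store_key"] = file_store_key
    else:
        raise ValueError(f"Unknown audio source: {audio_source!r}")

    # The lower bound of 1 is an assumption; the documented maximum is 40.
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"Batch Size must be between 1 and {MAX_BATCH_SIZE}")

    # A blank language is omitted so that automatic detection applies.
    if language:
        payload["language"] = language
    payload["translate"] = translate
    payload["by_speaker"] = by_speaker
    payload["batch_size"] = batch_size

    # A webhook URL is only sent when asynchronous delivery is wanted.
    if webhook_url:
        payload["webhook_url"] = webhook_url
    return payload
```

Rejecting a missing source or an oversized batch before any request is made catches these configuration mistakes locally, before they reach the API.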

Output

The node outputs JSON data containing the transcription results. This typically includes the transcribed text, language information, and, if By Speaker is enabled, speaker separation details. When a webhook URL is set, the node may not output the transcription immediately; the result is delivered to the webhook instead.
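
As a rough illustration of consuming this output in a later step, the sketch below turns a result object into printable lines. The exact response schema is not given here, so the `text` and `speakers` keys, and the `speaker` and `text` keys inside each segment, are hypothetical names chosen for the example; adjust them to whatever the node actually returns.

```python
from typing import Any, Dict, List


def format_transcript(result: Dict[str, Any]) -> List[str]:
    """Return one line per speaker segment, or the full text as a single line.

    The keys "speakers", "speaker", and "text" are assumed names, not a
    documented schema.
    """
    segments = result.get("speakers") or []
    if segments:
        # Speaker separation was enabled: prefix each segment with its speaker.
        return [
            f'{seg.get("speaker", "unknown")}: {str(seg.get("text", "")).strip()}'
            for seg in segments
        ]

    # No speaker details: fall back to the plain transcript, if any.
    text = str(result.get("text") or "").strip()
    return [text] if text else []
```

An empty list is returned when neither key is present, which is what a downstream step would see if the transcription is being delivered to a webhook rather than returned directly.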

No binary output is indicated for this node; its output is the JSON transcription result.

Dependencies

  • Requires an API key credential for authenticating with the JigsawStack API.
  • Needs internet access to call the external JigsawStack API endpoint at https://api.jigsawstack.com/v1.
  • Optional webhook URL must be publicly accessible to receive asynchronous transcription results.
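
For the asynchronous case, a minimal webhook receiver might look like the sketch below, built only on Python's standard library. It assumes the result arrives as a JSON object in the body of a POST request; the handler name, the in-memory `received` list, and the port are illustrative choices, and nothing here verifies that a request really came from JigsawStack.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, List

# In-memory store for illustration only; a real receiver would persist results.
received: List[Dict[str, Any]] = []


def parse_webhook_body(raw: bytes) -> Dict[str, Any]:
    """Decode a POST body, assuming it is a UTF-8 encoded JSON object."""
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


class TranscriptionWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            received.append(parse_webhook_body(self.rfile.read(length)))
            self.send_response(200)  # acknowledge a well-formed payload
        except ValueError:
            # Covers malformed JSON, bad encoding, and non-object bodies.
            self.send_response(400)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen on all interfaces; the URL must still be publicly reachable."""
    HTTPServer(("", port), TranscriptionWebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. To receive results from the API, the machine running it needs a public address or a tunnel in front of it, in line with the requirement above.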

Troubleshooting

  • Invalid or missing API key: Ensure that a valid API key credential is configured in n8n for this node.
  • Incorrect audio source configuration: Verify that when "Audio URL" is selected, a valid URL is provided; similarly, when "File Store Key" is selected, the correct key is entered.
  • Language code issues: Unsupported or incorrect language codes may lead to inaccurate transcription or errors. Use standard ISO language codes such as "en", "es", or "fr", or leave the field blank for automatic detection.
  • Webhook failures: If using a webhook URL, ensure it is reachable and correctly handles incoming POST requests; otherwise, transcription results will not be received.
  • Batch size limits: Setting batch size above 40 may cause errors; keep it at or below 40.
  • Network connectivity: Since the node depends on an external API, network issues can cause timeouts or failures.