Overview
The node provides integration with Lark AI services, specifically supporting speech recognition from audio files. The "Audio File Speech Recognition Asr" operation allows users to transcribe spoken content from an audio file into text using Lark's AI-powered automatic speech recognition (ASR) engine.
This node is beneficial in scenarios where automated transcription of audio recordings is needed, such as converting meeting recordings, voice notes, or customer service calls into searchable and editable text. It can be used to streamline workflows that require speech-to-text conversion without manual intervention.
Practical examples:
- Transcribing recorded interviews for documentation.
- Converting voice messages into text for further processing.
- Automating subtitle generation for video content.
Properties
Name | Meaning |
---|---|
Authentication | Method of authenticating API requests; options are "Tenant Token" or "OAuth2". |
Config | Configuration parameters for the ASR request: engine_type (the speech recognition engine type to use); file_id (identifier of the audio file to transcribe); format (audio file format, e.g., mp3, wav). |
Speech | Contains the speech data payload: speech (the actual speech content or reference required by the API). |
Custom Body | Allows sending a fully custom JSON body for the request, overriding standard config/speech. |
Options | Additional options: Use Custom Body (Boolean flag to enable sending a custom request body). |
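As a rough illustration of how these properties might combine into a request body, the sketch below nests Config and Speech under `config` and `speech` keys and lets a custom body replace both when Use Custom Body is enabled. The nesting, the `build_request_body` helper, and all sample values are assumptions made for illustration; they are not taken from the node's source or confirmed against the Lark API.

```python
import json
from typing import Any, Dict, Optional


def build_request_body(
    config: Dict[str, str],
    speech: Dict[str, str],
    custom_body: Optional[str] = None,
    use_custom_body: bool = False,
) -> Dict[str, Any]:
    """Return a JSON-serializable body for an ASR request (assumed shape)."""
    if use_custom_body:
        if custom_body is None:
            raise ValueError("Use Custom Body is enabled but no custom body was given")
        # The custom body replaces the standard config/speech fields entirely.
        return json.loads(custom_body)
    # Copy the inputs so later changes by the caller do not alter the body.
    return {"config": dict(config), "speech": dict(speech)}


# Placeholder values for illustration only; real values depend on your account.
body = build_request_body(
    config={"engine_type": "example-engine", "file_id": "example-file-id", "format": "wav"},
    speech={"speech": "<speech content or reference>"},
)
print(json.dumps(body, indent=2))
```

Keeping the custom-body branch separate mirrors the table above: when the flag is set, the Config and Speech inputs are ignored rather than merged.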
Output
The node outputs JSON data containing the transcription results returned by the Lark AI speech recognition API. This typically includes recognized text and possibly metadata about the recognition process (such as confidence scores or timing information).
Binary output is not explicitly documented for this operation. If it were supported, it would carry audio or related media content, but the node primarily returns JSON transcription results.
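A downstream step that only needs the recognized text could read it from the JSON output along the following lines. The field names `code`, `msg`, `data`, and `recognition_text` are assumptions about the response shape, and `extract_transcript` is a hypothetical helper; check an actual output item from the node before relying on them.

```python
from typing import Any, Dict


def extract_transcript(item: Dict[str, Any]) -> str:
    """Pull the recognized text out of one output item (assumed field names)."""
    code = item.get("code", 0)
    if code != 0:
        # A non-zero code is treated here as an API-level failure.
        raise RuntimeError(f"ASR request failed: code={code}, msg={item.get('msg')}")
    data = item.get("data") or {}
    return str(data.get("recognition_text", "")).strip()


# Sample item constructed for illustration, not captured from a real run.
sample = {"code": 0, "msg": "ok", "data": {"recognition_text": " hello world "}}
print(extract_transcript(sample))
```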
Dependencies
- Requires access to Lark Suite Open APIs.
- Needs either a Tenant Token or OAuth2 credentials configured in n8n for authentication.
- Network connectivity to https://open.larksuite.com/open-apis is necessary.
Troubleshooting
- Authentication errors: Ensure that the correct authentication method is selected and valid credentials are provided.
- Invalid file ID or format: Verify that the file_id corresponds to an existing audio file accessible by the API and that the format matches the file type.
- API rate limits or quota exceeded: Check Lark API usage limits and ensure your account has sufficient quota.
- Malformed request body: If using the custom body option, ensure the JSON structure matches the expected schema.
- Empty or incorrect transcription results: Confirm that the audio file contains clear speech and that the correct engine type is selected.
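Some of the problems above can be caught before the request is sent. The sketch below checks that the three config fields are non-empty and that a custom body parses as a JSON object. Treating all three fields as required is an assumption, and `find_input_problems` is a hypothetical helper, not part of the node.

```python
import json
from typing import Dict, List, Optional

# Assumed to be required; the node's documentation does not say so explicitly.
REQUIRED_CONFIG_KEYS = ("engine_type", "file_id", "format")


def find_input_problems(
    config: Dict[str, str],
    custom_body: Optional[str] = None,
    use_custom_body: bool = False,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    if use_custom_body:
        try:
            parsed = json.loads(custom_body or "")
        except json.JSONDecodeError as exc:
            problems.append(f"custom body is not valid JSON: {exc.msg}")
        else:
            if not isinstance(parsed, dict):
                problems.append("custom body must be a JSON object")
        # Config fields are ignored when a custom body is used.
        return problems
    for key in REQUIRED_CONFIG_KEYS:
        if not str(config.get(key, "")).strip():
            problems.append(f"config.{key} is missing or empty")
    return problems


print(find_input_problems({"engine_type": "example-engine", "file_id": "", "format": "wav"}))
```

This only checks the local inputs. It cannot tell whether a file_id actually exists on the Lark side or whether the account has remaining quota.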