Lark AI

Lark AI Management

Actions (5)

Image Recognition Actions
- Basic Image Recognition Ocr
Speech Recognition Actions
- Streaming Speech Recognition Asr
- Audio File Speech Recognition Asr
Text Actions
- Translate With Machine Translation
- Text Language Recognition

Overview

This node integrates with Lark AI services. This page covers its "Audio File Speech Recognition Asr" operation, which transcribes the speech in an audio file into text using Lark's automatic speech recognition (ASR) engine.

The operation is useful wherever audio recordings need to be transcribed automatically, for example turning meeting recordings, voice notes, or customer service calls into searchable, editable text. It lets a workflow perform speech-to-text conversion without manual intervention.

Practical examples:

- Transcribing recorded interviews for documentation.
- Converting voice messages into text for further processing.
- Automating subtitle generation for video content.

Properties

Name	Meaning
Authentication	Method of authenticating API requests; options are "Tenant Token" or "OAuth2".
Config	Configuration parameters for the ASR request:
	- `engine_type`: The speech recognition engine type to use.
	- `file_id`: Identifier of the audio file to transcribe.
	- `format`: Audio file format (e.g., mp3, wav).
Speech	Contains the speech data payload:
	- `speech`: The actual speech content or reference required by the API.
Custom Body	A complete JSON request body that is sent in place of the body built from the Config and Speech properties.
Options	Additional options:
	- `Use Custom Body`: Boolean flag to enable sending a custom request body.
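
To make the relationship between these properties concrete, here is a minimal sketch of how a request body could be assembled from them. The function name `build_request_body` is hypothetical, the top-level `config` and `speech` keys are assumed to mirror the property names above, and the sample values (`16k_auto`, `pcm`, the file ID) are placeholders rather than values confirmed by this page.

```python
import json


def build_request_body(config, speech, custom_body=None, use_custom_body=False):
    """Return a JSON-serializable request body (hypothetical helper)."""
    if use_custom_body:
        if custom_body is None:
            raise ValueError("Use Custom Body is enabled but no custom body was given")
        # The custom body replaces the config/speech structure entirely.
        if isinstance(custom_body, str):
            return json.loads(custom_body)
        return custom_body
    # Assumed layout: one object per property group.
    return {"config": dict(config), "speech": dict(speech)}


body = build_request_body(
    # Placeholder values; check Lark's API reference for accepted ones.
    config={"engine_type": "16k_auto", "file_id": "meeting_2024_0001", "format": "pcm"},
    speech={"speech": "<base64-encoded audio>"},
)
print(json.dumps(body, indent=2))
```

With `use_custom_body=True`, the `config` and `speech` arguments are ignored and only the supplied JSON is returned, which matches the override behavior described for Custom Body.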

Output

The node outputs JSON data containing the transcription results returned by the Lark AI speech recognition API. This typically includes recognized text and possibly metadata about the recognition process (such as confidence scores or timing information).
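
As a rough illustration of consuming that output downstream, the sketch below pulls the recognized text out of a response. It assumes the usual Lark envelope of `code`, `msg`, and `data`, and assumes the transcript sits in `data.recognition_text`; neither field name is confirmed by this page, and `extract_transcript` is a hypothetical helper.

```python
def extract_transcript(response):
    """Return the recognized text from an ASR response (assumed shape)."""
    code = response.get("code")
    if code != 0:
        # Assumption: a non-zero code signals an API-level error.
        raise RuntimeError(f"Lark API error {code}: {response.get('msg', 'unknown error')}")
    # Assumption: the transcript is stored under data.recognition_text.
    return response.get("data", {}).get("recognition_text", "")


# Hand-written sample, not a captured API response.
sample = {"code": 0, "msg": "success", "data": {"recognition_text": "hello world"}}
print(extract_transcript(sample))
```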

No binary output is documented for this operation; the node returns its results as JSON.

Dependencies

- Requires access to Lark Suite Open APIs.
- Needs either a Tenant Token or OAuth2 credentials configured in n8n for authentication.
- Network connectivity to https://open.larksuite.com/open-apis is necessary.
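
For reference, the following sketch shows what an authenticated call against that base URL might look like outside n8n. It only builds the request and does not send it. The path `/speech_to_text/v1/speech/file_recognize` and the `Bearer` authorization scheme are assumptions based on Lark's public API conventions, and `build_asr_request` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "https://open.larksuite.com/open-apis"
# Assumed endpoint path; verify against Lark's API reference.
ASR_PATH = "/speech_to_text/v1/speech/file_recognize"


def build_asr_request(token, body):
    """Build (but do not send) a POST request carrying the access token."""
    return urllib.request.Request(
        BASE_URL + ASR_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # tenant or OAuth2 access token
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


req = build_asr_request("t-xxxx", {"config": {}, "speech": {}})
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`, which is left out here so the sketch runs without credentials or network access.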

Troubleshooting

- Authentication errors: Ensure that the correct authentication method is selected and valid credentials are provided.
- Invalid file ID or format: Verify that the `file_id` corresponds to an existing audio file accessible by the API and that the `format` matches the file type.
- API rate limits or quota exceeded: Check Lark API usage limits and ensure your account has sufficient quota.
- Malformed request body: If using the custom body option, ensure the JSON structure matches the expected schema.
- Empty or incorrect transcription results: Confirm that the audio file contains clear speech and that the correct engine type is selected.
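
Several of these problems can be caught before the request is sent. The sketch below is one possible pre-flight check for a custom body; it assumes the expected schema is a JSON object with a `config` object (holding `engine_type`, `file_id`, `format`) and a `speech` object (holding `speech`), mirroring the Properties section. `check_custom_body` is a hypothetical helper, not part of the node.

```python
import json


def check_custom_body(raw):
    """Return a list of problems found in a custom JSON body (empty if none)."""
    try:
        body = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} at position {exc.pos}"]
    if not isinstance(body, dict):
        return ["top level must be a JSON object"]

    problems = []
    config = body.get("config")
    if not isinstance(config, dict):
        problems.append("missing 'config' object")
    else:
        # Assumed required keys, taken from the Config property.
        for key in ("engine_type", "file_id", "format"):
            if not config.get(key):
                problems.append(f"config.{key} is missing or empty")

    speech = body.get("speech")
    if not isinstance(speech, dict) or not speech.get("speech"):
        problems.append("speech.speech is missing or empty")
    return problems


print(check_custom_body('{"config": {"file_id": "abc"}, "speech": {}}'))
```

An empty list means the body passed these structural checks; it does not guarantee that the API will accept the values themselves.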