Embeddings Google Gemini Extended

Use Google Gemini embeddings with extended features such as configurable output dimensions.

Overview

This node generates text embeddings using Google Gemini's embedding models with extended features such as specifying output dimensions and task types. It is useful for transforming textual data into vector representations that can be used in various AI tasks like document retrieval, semantic similarity, classification, clustering, question answering, fact verification, and code retrieval.

Typical use cases include:

  • Creating vector embeddings of documents or queries to enable efficient search and retrieval.
  • Generating embeddings tailored for specific tasks (e.g., semantic similarity or classification).
  • Customizing the dimensionality of embeddings for downstream applications.
  • Batch processing multiple texts to optimize API usage and handle rate limits.

Properties

Name | Meaning
Model Name | The model used for generating embeddings. Examples: text-embedding-004, embedding-001, gemini-embedding-001. More info at Google Gemini Models.
Output Dimensions | Number of dimensions for the output embeddings. Set to 0 to use the model default. Supported only by certain models like text-embedding-004 and gemini-embedding-001.
Options | Additional options for embedding generation:
- Task Type | The type of task for which embeddings will be used. Options: Retrieval Document, Retrieval Query, Semantic Similarity, Classification, Clustering, Question Answering, Fact Verification, Code Retrieval Query.
- Title | Optional title for the text; applicable only when Task Type is "Retrieval Document".
- Strip New Lines | Whether to remove new line characters from input text before embedding (default: true).
- Batch Size | Maximum number of texts to embed in a single request. Lower this if you encounter rate limits (default: 100).
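
As a rough illustration of how these properties could translate into a request, the sketch below builds a JSON body for the embedContent endpoint. The field names (content, taskType, title, outputDimensionality) follow the public Gemini REST API; the function name build_embed_request and the label-to-enum mapping are hypothetical, and the node's internal mapping may differ.

```python
from typing import Any, Dict, Optional

# Assumption: UI labels from the properties list mapped to the task type
# enum strings accepted by the Gemini REST API.
TASK_TYPES = {
    "Retrieval Document": "RETRIEVAL_DOCUMENT",
    "Retrieval Query": "RETRIEVAL_QUERY",
    "Semantic Similarity": "SEMANTIC_SIMILARITY",
    "Classification": "CLASSIFICATION",
    "Clustering": "CLUSTERING",
    "Question Answering": "QUESTION_ANSWERING",
    "Fact Verification": "FACT_VERIFICATION",
    "Code Retrieval Query": "CODE_RETRIEVAL_QUERY",
}


def build_embed_request(
    text: str,
    model: str = "text-embedding-004",
    output_dimensions: int = 0,
    task_type: Optional[str] = None,
    title: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an embedContent request body (hypothetical helper)."""
    body: Dict[str, Any] = {
        "model": f"models/{model}",
        "content": {"parts": [{"text": text}]},
    }
    # 0 means "use the model default", so the field is omitted entirely.
    if output_dimensions > 0:
        body["outputDimensionality"] = output_dimensions
    if task_type is not None:
        body["taskType"] = TASK_TYPES[task_type]
        # Title only applies to the Retrieval Document task type.
        if title and task_type == "Retrieval Document":
            body["title"] = title
    return body
```

Omitting outputDimensionality when the value is 0 mirrors the "use the model default" behavior described for Output Dimensions.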

Output

The node outputs an array of embeddings corresponding to the input texts. Each embedding is a numeric vector (array of numbers) representing the semantic content of the input text.

Output format example (simplified):

[
  [0.123, 0.456, 0.789, ...],  // Embedding vector for first input text
  [0.234, 0.567, 0.890, ...],  // Embedding vector for second input text
  ...
]

The node does not produce binary data; embeddings are returned as JSON arrays of floats.
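
To show how the output might be consumed downstream, here is a minimal sketch that ranks document vectors against a query vector by cosine similarity. The function names are hypothetical and not part of the node, and the vectors in the usage note are toy values rather than real model output.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], documents: Sequence[Sequence[float]]
) -> List[int]:
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query, doc) for doc in documents]
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
```

For example, rank_by_similarity([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]) returns [1, 2, 0]. All vectors being compared should come from the same model and use the same Output Dimensions setting.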

Dependencies

  • Requires an API key credential for the Google PaLM API (the Google Gemini Generative Language service).
  • The node calls the Google Gemini embedding REST API endpoint: https://generativelanguage.googleapis.com/v1beta/models/{model}:embedContent.
  • Outbound network access to Google APIs must be available.
  • No external libraries are required beyond the bundled @langchain/google-genai.
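
For checking a credential or network path outside the node, a direct call to the endpoint can be sketched with the Python standard library alone. This is an assumption-laden sketch rather than the node's implementation: the helper names are hypothetical, the API key is sent in the x-goog-api-key header, and the response is expected to contain embedding.values as documented for the public REST API.

```python
import json
import urllib.request
from typing import List

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def make_embed_request(api_key: str, model: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for models/{model}:embedContent."""
    url = f"{BASE_URL}/models/{model}:embedContent"
    payload = {
        "model": f"models/{model}",
        "content": {"parts": [{"text": text}]},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


def embed(api_key: str, model: str, text: str, timeout: float = 30.0) -> List[float]:
    """Send the request and return the embedding vector.

    Raises urllib.error.HTTPError on non-2xx responses, for example when
    the key is invalid or the model name is not supported.
    """
    request = make_embed_request(api_key, model, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return data["embedding"]["values"]
```

Separating request construction from sending makes it possible to inspect the URL and body without network access or a valid key.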

Troubleshooting

  • API Errors: If the Google Gemini API returns errors, the node throws an error with status and message. Common causes:
    • Invalid or missing API key credential.
    • Using unsupported model names or parameters.
    • Exceeding rate limits (reduce batch size).
  • Empty or Missing Embeddings: Ensure input texts are non-empty and properly formatted. If Strip New Lines is enabled, verify that removing new line characters does not corrupt the text.
  • Unsupported Output Dimensions: Setting Output Dimensions on a model that does not support it may cause an error, or the setting may be silently ignored.
  • Batch Size Too Large: If requests fail with rate-limit errors, lower the Batch Size option so that each request carries fewer texts.
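
The batching and rate-limit advice above can be sketched as follows. This is an illustration under stated assumptions, not the node's code: RateLimitError, embed_in_batches, and the embed_batch callable are hypothetical, newlines are replaced with spaces as an assumed interpretation of Strip New Lines, and the retry with exponential backoff is a suggested client-side pattern that the node is not known to perform.

```python
import time
from typing import Callable, List, Sequence


class RateLimitError(Exception):
    """Hypothetical error an embedding client raises on a rate-limit response."""


def prepare_text(text: str, strip_new_lines: bool = True) -> str:
    # Assumption: "Strip New Lines" replaces each newline with a space.
    return text.replace("\n", " ") if strip_new_lines else text


def chunk(items: Sequence, batch_size: int) -> List[list]:
    """Split items into consecutive lists of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 100,
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[float]]:
    """Embed texts batch by batch, retrying a batch when it is rate limited."""
    vectors: List[List[float]] = []
    for batch in chunk([prepare_text(t) for t in texts], batch_size):
        for attempt in range(max_retries + 1):
            try:
                vectors.extend(embed_batch(batch))
                break
            except RateLimitError:
                if attempt == max_retries:
                    raise
                # Exponential backoff: base_delay, then 2x, 4x, ...
                sleep(base_delay * 2 ** attempt)
    return vectors
```

Passing the embedding call and the sleep function as parameters keeps the sketch independent of any particular client library and lets it be exercised without network access.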
