OpenSearch

Consume the OpenSearch API

Overview

The node interacts with the OpenSearch API to perform operations on documents and indices. Its Document - Get Many operation retrieves multiple documents from a specified OpenSearch index, based on the query parameters and options you set.

This operation is useful when you want to fetch a batch of documents matching certain criteria from an OpenSearch index. For example, you might use it to:

  • Retrieve all user activity logs stored in an index.
  • Fetch product data filtered by category or price range.
  • Query documents with specific attributes for reporting or analysis.

The node can return either all matching documents or only a limited number of results (see Pagination Limits under Troubleshooting), and it can simplify the output for easier consumption.

Properties

Name: Meaning
Index ID: The identifier of the OpenSearch index containing the documents to retrieve.
Return All: Whether to return all matching documents or only up to a specified limit.
Limit: Maximum number of documents to return when Return All is disabled.
Simplify: Whether to return a simplified version of the response (only the document ID and source fields) instead of the full raw data.
Options: Additional optional parameters to customize the search request:
   Allow No Indices: If false, returns an error when a wildcard expression, index alias, or _all value matches only missing or closed indices. Defaults to true.
   Allow Partial Search Results: If true, partial results are returned even if some shards fail or time out. Defaults to true.
   Batched Reduce Size: Number of shard results reduced at once on the coordinating node. Defaults to 512.
   CCS Minimize Roundtrips: Whether to minimize network round-trips for cross-cluster search requests. Defaults to true.
   Doc Value Fields: Comma-separated list of fields to return as the docvalue representation for each hit.
   Expand Wildcards: Type of index that wildcard expressions can match. Options: all, closed, hidden, none, open. Defaults to "open".
   Explain: Whether to return detailed score computation information for each hit. Defaults to false.
   Ignore Throttled: Whether frozen indices are ignored. Defaults to true.
   Ignore Unavailable: Whether missing or closed indices are excluded from the response. Defaults to false.
   Max Concurrent Shard Requests: Number of shard requests executed concurrently per node. Defaults to 5.
   Pre-Filter Shard Size: Threshold that enforces pre-filtering of shards based on query rewriting. Defaults to 1.
   Query: Elasticsearch Query DSL JSON object defining the search query.
   Request Cache: Whether to cache search results for size=0 requests. Defaults to false.
   Routing: Target primary shard routing value.
   Search Type: How distributed term frequencies are calculated for relevance scoring. Options: DFS Query Then Fetch, Query Then Fetch. Defaults to "query_then_fetch".
   Sequence Number and Primary Term: Whether to return the sequence number and primary term of the last modification for each hit. Defaults to false.
   Sort: Comma-separated list of field:direction pairs used to sort results.
   Source Excludes: Comma-separated list of source fields to exclude from the response.
   Source Includes: Comma-separated list of source fields to include in the response.
   Stats: Tag string for logging and statistics purposes.
   Stored Fields: Whether to retrieve stored document fields instead of the _source. Defaults to false.
   Terminate After: Maximum number of documents to collect per shard. Defaults to 0 (no limit).
   Timeout: Time period to wait for active shards. Defaults to "1m" (one minute).
   Track Scores: Whether to calculate and return document scores even if they are not used for sorting. Defaults to false.
   Track Total Hits: Number of hits to count accurately. Defaults to 10000.
   Version: Whether to return the document version as part of each hit. Defaults to false.
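
To make the options above more concrete, the following Python sketch shows one way they could be turned into a search request. It is an illustration under stated assumptions, not the node's actual implementation: the function name build_search_request is hypothetical, the Query option is assumed to be a JSON string holding a search body, and the remaining options are assumed to map to the snake_case query parameters of the OpenSearch search API (for example sort, _source_includes, track_scores). The index name "products" is also made up.

```python
import json


def build_search_request(index_id, limit, options):
    """Return (path, query_params, body) for a search on one index."""
    remaining = dict(options)  # copy so the caller's dict is not mutated
    body = {}

    raw_query = remaining.pop("query", None)
    if raw_query:
        try:
            # Assumption: the Query option is a JSON string holding a search body.
            body = json.loads(raw_query)
        except json.JSONDecodeError as exc:
            raise ValueError(f"Invalid JSON in 'Query' option: {exc}") from exc

    params = {"size": limit}
    for name, value in remaining.items():
        # Booleans are written as lowercase strings in a URL query string.
        params[name] = str(value).lower() if isinstance(value, bool) else value

    return f"/{index_id}/_search", params, body


path, params, body = build_search_request(
    "products",  # hypothetical index name
    50,
    {
        "query": '{"query": {"range": {"price": {"gte": 10, "lte": 100}}}}',
        "sort": "price:asc",
        "_source_includes": "name,price",
        "track_scores": True,
    },
)
```

Here the Limit value becomes the size parameter and the parsed Query JSON becomes the request body. Malformed JSON is reported before any request would be sent.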

Output

The output is a JSON array in which each element represents one retrieved document. The shape of each element depends on the Simplify option.

If Simplify is enabled, each element contains only:

  • _id: The unique identifier of the document.
  • The document's source fields as stored in the index, flattened to the top level of the element.

If Simplify is disabled, each element is the full raw hit returned by OpenSearch, including metadata such as _index and _score.

The node does not output binary data for this operation.
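
The difference between the raw and simplified output can be sketched in Python as follows. This is a minimal illustration, assuming the standard search response layout in which hits are listed under hits.hits and each hit carries its fields under _source. The function name simplify_hits and the sample data are hypothetical, and the node's own code may differ.

```python
def simplify_hits(response):
    """Flatten each hit to its _id plus its _source fields."""
    return [
        {"_id": hit["_id"], **hit.get("_source", {})}
        for hit in response["hits"]["hits"]
    ]


raw_response = {  # hypothetical, abbreviated search response
    "hits": {
        "hits": [
            {"_index": "products", "_id": "1", "_score": 1.0,
             "_source": {"name": "Lamp", "price": 25}},
            {"_index": "products", "_id": "2", "_score": 0.8,
             "_source": {"name": "Desk", "price": 90}},
        ]
    }
}

simplified = simplify_hits(raw_response)
```

Each simplified element keeps _id and the source fields and drops metadata such as _index and _score.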

Dependencies

  • Requires an active connection to an OpenSearch cluster.
  • Needs an API authentication credential configured in n8n to authorize requests to the OpenSearch API.
  • Uses the OpenSearch REST API endpoints for document retrieval.

Troubleshooting

  • Invalid JSON in 'Query' option: If the Query DSL JSON is malformed, the node will throw an error indicating invalid JSON. Ensure the query is valid JSON and follows Elasticsearch Query DSL syntax.
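  • Index Not Found Errors: If the specified index does not exist or is closed, the request may fail. The "Ignore Unavailable" option controls whether missing or closed indices are skipped, and "Allow No Indices" controls whether a wildcard expression that matches no open index is treated as an error.
  • Pagination Limits: By default, you cannot page through more than 10,000 hits without using the "Sort" option. To retrieve more results, set the "Sort" option so that the results have a defined order to page through.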
  • Shard Failures or Timeouts: Partial results may be returned if "Allow Partial Search Results" is enabled; otherwise, the node may throw errors.
  • Authentication Errors: Ensure the API key or credentials used have sufficient permissions to read from the specified indices.
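
The pagination limit above is tied to sorting because deep paging in OpenSearch is typically done with the search_after parameter, which resumes from the sort values of the last hit on the previous page. The Python sketch below shows that general technique. Whether the node pages this way internally is not stated here, and the names fetch_all, search, and fake_search are hypothetical. The in-memory fake_search stands in for a real cluster so the example runs without a connection.

```python
def fetch_all(search, sort, page_size=1000):
    """Collect every hit by paging with search_after.

    `search` is a hypothetical callable that sends a search body to the
    cluster and returns the parsed JSON response.
    """
    body = {"size": page_size, "sort": sort}
    hits = []
    while True:
        page = search(body)["hits"]["hits"]
        if not page:
            break
        hits.extend(page)
        # Resume after the sort values of the last hit on this page.
        body = {**body, "search_after": page[-1]["sort"]}
    return hits


def fake_search(body):
    """In-memory stand-in for a cluster holding 25 documents sorted by ts."""
    docs = [{"_id": str(i), "_source": {"ts": i}, "sort": [i]} for i in range(25)]
    after = body.get("search_after", [-1])[0]
    page = [d for d in docs if d["sort"][0] > after][: body["size"]]
    return {"hits": {"hits": page}}


all_hits = fetch_all(fake_search, sort=[{"ts": "asc"}], page_size=10)
```

In practice the sort should include a field with unique values as a tiebreaker, so that no hit is skipped or repeated between pages.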
