Overview
The node interacts with the OpenSearch API to perform various operations on documents and indices. Specifically, for the Document - Get Many operation, it retrieves multiple documents from a specified index in OpenSearch based on query parameters and options.
This operation is useful when you want to fetch a batch of documents matching certain criteria from an OpenSearch index. For example, you might use it to:
- Retrieve all user activity logs stored in an index.
- Fetch product data filtered by category or price range.
- Query documents with specific attributes for reporting or analysis.
The node supports returning either all matching documents (with pagination considerations) or a limited number of results, and can simplify the output for easier consumption.
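
As a minimal sketch of the "filter by category or price range" use case above, the following Python snippet builds a Query DSL body and serializes it to JSON that could be pasted into the Query option. The field names `category` and `price` and the helper `build_product_query` are hypothetical, and whether the node expects the outer `query` key is an assumption to check against the option's placeholder.

```python
import json


def build_product_query(category, min_price, max_price):
    """Build a Query DSL body that filters by category and price range.

    The field names "category" and "price" are hypothetical; replace them
    with fields that exist in your own index mapping.
    """
    return {
        "query": {
            "bool": {
                # "filter" clauses match documents without affecting scoring.
                "filter": [
                    {"term": {"category": category}},
                    {"range": {"price": {"gte": min_price, "lte": max_price}}},
                ]
            }
        }
    }


# Serialize to a JSON string suitable for a JSON text field.
query_json = json.dumps(build_product_query("books", 10, 50), indent=2)
print(query_json)
```

Generating the JSON from a dictionary rather than writing it by hand avoids quoting and trailing-comma mistakes.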
Properties
Name | Meaning |
---|---|
Index ID | The identifier of the OpenSearch index containing the documents to retrieve. |
Return All | Whether to return all matching documents or only up to a specified limit. |
Limit | Maximum number of documents to return if not returning all. |
Simplify | Whether to return a simplified version of the response (only document ID and source fields) instead of the full raw data. |
Options | Additional optional parameters to customize the search request: |
Allow No Indices | If false, returns an error if any target indices are missing or closed. Defaults to true. |
Allow Partial Search Results | If true, partial results are returned even if some shards fail or timeout. Defaults to true. |
Batched Reduce Size | Number of shard results reduced at once on the coordinating node. Default 512. |
CCS Minimize Roundtrips | Minimizes network round-trips for cross-cluster search requests. Defaults to true. |
Doc Value Fields | Comma-separated list of fields to return as docvalue representation for each hit. |
Expand Wildcards | Type of indices wildcard expressions can match. Options: all, closed, hidden, none, open. Default "open". |
Explain | Returns detailed score computation info per hit. Defaults to false. |
Ignore Throttled | Whether frozen indices are ignored. Defaults to true. |
Ignore Unavailable | Whether missing or closed indices are excluded from response. Defaults to false. |
Max Concurrent Shard Requests | Number of shard requests executed concurrently per node. Default 5. |
Pre-Filter Shard Size | Threshold to enforce pre-filtering of shards based on query rewriting. Default 1. |
Query | Elasticsearch Query DSL JSON object defining the search query. |
Request Cache | Enables caching of search results for size=0 requests. Defaults to false. |
Routing | Target primary shard routing value. |
Search Type | Method for calculating distributed term frequencies. Options: DFS Query Then Fetch, Query Then Fetch. Default "query_then_fetch". |
Sequence Number and Primary Term | Returns sequence number and primary term of last modification per hit. Defaults to false. |
Sort | Comma-separated list of field:direction pairs to sort results. |
Source Excludes | Comma-separated list of source fields to exclude from response. |
Source Includes | Comma-separated list of source fields to include in response. |
Stats | Tag string for logging and statistics purposes. |
Stored Fields | Whether to retrieve stored document fields instead of the _source. Defaults to false. |
Terminate After | Max number of documents to collect per shard. Default 0 (no limit). |
Timeout | Time period to wait for active shards. Default "1m" (one minute). |
Track Scores | Whether to calculate and return document scores even if not used for sorting. Defaults to false. |
Track Total Hits | Number of hits to count accurately. Default 10000. |
Version | Whether to return document version as part of each hit. Defaults to false. |
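
To illustrate how options like these could translate into a search request, here is a rough sketch that maps a few option labels to OpenSearch search API query parameters and builds a request URL. The mapping table, the helper `build_search_url`, the base URL, and the index name are all assumptions for illustration; the node's internal implementation is not shown in this document and may differ.

```python
from urllib.parse import urlencode

# Assumed mapping from a few option labels to search API query parameters.
# Only a subset of the options in the table is covered here.
OPTION_TO_PARAM = {
    "Allow No Indices": "allow_no_indices",
    "Ignore Unavailable": "ignore_unavailable",
    "Expand Wildcards": "expand_wildcards",
    "Sort": "sort",
    "Source Includes": "_source_includes",
    "Source Excludes": "_source_excludes",
    "Terminate After": "terminate_after",
    "Track Total Hits": "track_total_hits",
}


def build_search_url(base_url, index_id, options, limit=None):
    """Build a /{index}/_search URL from option labels (hypothetical helper)."""
    params = {}
    for label, value in options.items():
        # Query-string booleans are sent as lowercase "true"/"false".
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[OPTION_TO_PARAM[label]] = value
    if limit is not None:
        # The Limit property corresponds to the "size" parameter here.
        params["size"] = limit
    path = f"{base_url.rstrip('/')}/{index_id}/_search"
    return f"{path}?{urlencode(params)}" if params else path


url = build_search_url(
    "https://localhost:9200",  # hypothetical cluster address
    "user-activity",           # hypothetical index ID
    {"Ignore Unavailable": True, "Sort": "timestamp:desc"},
    limit=50,
)
print(url)
```
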
Output
The output is a JSON array where each element represents a retrieved document. Each document object contains:
- _id: The unique identifier of the document.
- Other fields correspond to the document's source fields as stored in the index.
If the Simplify option is enabled, the output includes only the _id and the flattened source fields for easier consumption.
If Simplify is disabled, the output contains the full raw response from OpenSearch for each hit, including metadata like _index, _score, and other details.
The node does not output binary data for this operation.
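
The effect of Simplify can be approximated with the sketch below, which flattens each raw hit into its _id plus its source fields. The helper `simplify_hits` and the sample response are hypothetical; the sketch assumes the standard search response shape (`hits.hits[]` with `_id` and `_source`) and is not the node's actual code.

```python
from typing import Any, Dict, List


def simplify_hits(raw_response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten raw search hits into {_id, ...source fields} objects."""
    simplified = []
    for hit in raw_response.get("hits", {}).get("hits", []):
        # Keep the document ID and merge the source fields alongside it;
        # metadata such as _index and _score is dropped.
        simplified.append({"_id": hit["_id"], **hit.get("_source", {})})
    return simplified


# Hypothetical raw response with two hits.
sample_response = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_index": "user-activity", "_id": "1", "_score": 1.0,
             "_source": {"user": "alice", "action": "login"}},
            {"_index": "user-activity", "_id": "2", "_score": 1.0,
             "_source": {"user": "bob", "action": "logout"}},
        ],
    }
}

simplified_docs = simplify_hits(sample_response)
print(simplified_docs)
```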
Dependencies
- Requires an active connection to an OpenSearch cluster.
- Needs an API authentication credential configured in n8n to authorize requests to the OpenSearch API.
- Uses the OpenSearch REST API endpoints for document retrieval.
Troubleshooting
- Invalid JSON in 'Query' option: If the Query DSL JSON is malformed, the node will throw an error indicating invalid JSON. Ensure the query is valid JSON and follows Elasticsearch Query DSL syntax.
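- Index Not Found Errors: If the specified index ID does not exist or is closed, the request may fail. Check the "Ignore Unavailable" option (defaults to false), which controls whether missing or closed indices are excluded, and the "Allow No Indices" option (defaults to true), which controls whether an error is returned when target indices are missing or closed.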
- Pagination Limits: By default, you cannot page through more than 10,000 hits without using the "Sort" option. To retrieve more results, add sorting to your query.
- Shard Failures or Timeouts: Partial results may be returned if "Allow Partial Search Results" is enabled; otherwise, the node may throw errors.
- Authentication Errors: Ensure the API key or credentials used have sufficient permissions to read from the specified indices.
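
For the invalid JSON case in the list above, one option is to check the Query text before it reaches the node. The following is a minimal sketch; `validate_query` is a hypothetical helper, and it only checks that the text is a JSON object, not that it is valid Query DSL.

```python
import json


def validate_query(raw):
    """Parse a Query string and raise ValueError with the error position."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"Query is not valid JSON (line {err.lineno}, "
            f"column {err.colno}): {err.msg}"
        ) from err
    # A Query DSL body is a JSON object, not an array or a scalar.
    if not isinstance(parsed, dict):
        raise ValueError("Query must be a JSON object")
    return parsed


valid_query = validate_query('{"query": {"match_all": {}}}')
print(valid_query)

try:
    # Single quotes and a missing value make this invalid JSON.
    validate_query("{'query': }")
except ValueError as err:
    print(f"Rejected: {err}")
```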