DOCX Find & Replace

Find and replace text in DOCX documents with support for regex patterns and multiple operations

Overview

This node performs find-and-replace operations on a DOCX file supplied in the binary input data. It supports multiple replacement operations in a single run; each operation matches either plain text, with optional case sensitivity, or a regular expression, with optional global replacement. The node reads the DOCX file, applies every specified replacement to the document XML, and outputs a modified DOCX file.

Common scenarios include:

Automatically updating templated DOCX documents by replacing placeholder text.
Cleaning or modifying DOCX reports or contracts by batch replacing terms or phrases.
Removing sensitive information from DOCX files before sharing.

For example, you could replace all occurrences of "ClientName" with an actual client’s name or remove confidential keywords using regex patterns.

Properties

Name	Meaning
Binary Property	Name of the binary property that contains the DOCX file to process (default: `data`).
Operations	A list of find-and-replace operations to perform on the document. Each operation includes:
	- Find: Text or regex pattern to search for (required).
	- Replace: Text to replace matches with (can be empty to delete).
	- Use Regex: Whether to treat the find text as a regular expression (true/false).
	- Case Sensitive: Whether a plain-text search is case sensitive (applies only when Use Regex is off).
	- Global Replace: For regex, whether to replace every match rather than only the first (true/false).
Options	Additional options for output:
	- Output Binary Property Name: Name for the output binary property containing the modified DOCX file.
	- Output Filename: Custom filename for the output DOCX file (optional).
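
The way one operation's settings could translate into a replacement can be sketched as follows. This is a minimal illustration, not the node's actual source: the names `ReplaceOperation`, `escapeRegExp`, `buildPattern`, and `applyOperation` are hypothetical, and it assumes that a plain-text search replaces every occurrence.

```typescript
// Hypothetical shape of one operation, mirroring the properties listed above.
interface ReplaceOperation {
  find: string;
  replace: string;
  useRegex: boolean;
  caseSensitive: boolean; // consulted only when useRegex is false
  globalReplace: boolean; // consulted only when useRegex is true
}

// Escape regex metacharacters so plain text can be matched through a RegExp.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build the RegExp for one operation.
// Assumption: plain-text searches always replace every occurrence.
function buildPattern(op: ReplaceOperation): RegExp {
  if (op.useRegex) {
    return new RegExp(op.find, op.globalReplace ? "g" : "");
  }
  return new RegExp(escapeRegExp(op.find), op.caseSensitive ? "g" : "gi");
}

// Apply one operation to a string and report how many replacements were made.
function applyOperation(
  text: string,
  op: ReplaceOperation,
): { text: string; count: number } {
  const pattern = buildPattern(op);
  const matches = text.match(pattern);
  // With the "g" flag, match() returns all matches; without it, only the first.
  const count = matches === null ? 0 : pattern.global ? matches.length : 1;
  const replaced = op.useRegex
    ? text.replace(pattern, op.replace) // "$1"-style group references apply
    : text.replace(pattern, () => op.replace); // "$" is treated literally
  return { text: replaced, count };
}
```

For instance, a plain-text operation with Find set to "ClientName" and Case Sensitive off would, under these assumptions, also match "clientname", while a regex operation with Global Replace off would change only the first match.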

Output

The node outputs items with the following structure:

json: Contains the original JSON data plus a docxFindReplace object summarizing the run:
- operationsPerformed: Number of find-and-replace operations executed.
- totalReplacements: Total count of replacements made across all operations.
- originalFileName: Original filename of the input DOCX.
- outputFileName: Filename used for the output DOCX.
- processedAt: ISO timestamp when processing occurred.
- operationResults: Array detailing each operation with the find/replace strings and number of replacements made.
binary: Contains the modified DOCX file under the specified output binary property name. This binary data has:
- data: Base64 encoded DOCX file content.
- mimeType: Always set to application/vnd.openxmlformats-officedocument.wordprocessingml.document.
- fileName: Filename of the output DOCX.
- fileExtension: Always "docx".
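
The structure above can be written down as TypeScript types, which may help when consuming the output in a later Code node. This is a sketch under assumptions: the interface and function names are hypothetical, and the field names inside each `operationResults` entry (`find`, `replace`, `replacements`) are guesses based on the description, not confirmed names.

```typescript
const DOCX_MIME =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

// Assumed shape of one entry in operationResults (field names are guesses).
interface OperationResult {
  find: string;
  replace: string;
  replacements: number;
}

// Shape of the docxFindReplace summary object described above.
interface DocxFindReplaceSummary {
  operationsPerformed: number;
  totalReplacements: number;
  originalFileName: string;
  outputFileName: string;
  processedAt: string; // ISO timestamp
  operationResults: OperationResult[];
}

// Shape of the binary entry described above.
interface DocxBinaryEntry {
  data: string; // Base64 encoded DOCX content
  mimeType: string;
  fileName: string;
  fileExtension: string;
}

// Assemble a summary from per-operation results.
function buildSummary(
  results: OperationResult[],
  originalFileName: string,
  outputFileName: string,
  processedAt: Date = new Date(),
): DocxFindReplaceSummary {
  return {
    operationsPerformed: results.length,
    totalReplacements: results.reduce((sum, r) => sum + r.replacements, 0),
    originalFileName,
    outputFileName,
    processedAt: processedAt.toISOString(),
    operationResults: results,
  };
}

// Assemble the binary entry from content that is already Base64 encoded.
function buildBinaryEntry(base64Data: string, fileName: string): DocxBinaryEntry {
  return { data: base64Data, mimeType: DOCX_MIME, fileName, fileExtension: "docx" };
}
```

A downstream step could, for example, check `totalReplacements === 0` to flag documents in which no placeholder was found.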

Dependencies

Requires the external pizzip package to read and manipulate DOCX files (ZIP archive format).
No additional API keys or external services are needed.
The node expects valid DOCX files in the input binary data.
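
Because a DOCX file is a ZIP archive, a cheap pre-check before the node runs is to look for the ZIP local file header signature (the bytes `50 4B 03 04`, "PK" followed by 3 and 4) at the start of the data. The sketch below is a hypothetical helper, not part of the node, and it only rules out obviously wrong inputs; a ZIP that is not a DOCX would still pass.

```typescript
// Return true when the data starts with the ZIP local file header signature.
// This does not prove the file is a valid DOCX, only that it looks like a ZIP.
function looksLikeZip(bytes: Uint8Array): boolean {
  const signature = [0x50, 0x4b, 0x03, 0x04];
  return bytes.length >= 4 && signature.every((b, i) => bytes[i] === b);
}
```

A PDF renamed to `.docx`, for example, starts with `%PDF` and would fail this check.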

Troubleshooting

Missing pizzip package: If the node throws an error about missing pizzip, install it via `npm install pizzip` in your n8n environment.
No binary data found: Ensure the input item contains binary data with the specified binary property name.
Invalid file type: Input binary must be a DOCX file; other file types will cause errors.
Failed to read DOCX file: The input file may be corrupted or not a valid DOCX.
Invalid regex pattern: If a regex pattern is malformed, the node will throw an error specifying which pattern failed.
Failed to update or generate DOCX: Indicates issues manipulating the document XML or generating the output file, possibly due to file corruption or unsupported content.

To resolve these, verify the input files and regex syntax, and ensure dependencies are installed correctly.
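
Regex syntax can be checked ahead of time by attempting to construct each pattern, since the JavaScript `RegExp` constructor throws a `SyntaxError` on malformed input. The helper below is a hypothetical sketch (the name `validatePattern` and the message format are illustrative, not the node's own error text).

```typescript
// Return null when the pattern compiles, or an error message when it does not.
function validatePattern(find: string): string | null {
  try {
    new RegExp(find);
    return null;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return `Invalid regex pattern "${find}": ${reason}`;
  }
}
```

Running this over every operation with Use Regex enabled before executing the workflow would surface a malformed pattern such as an unclosed `(` without processing any document.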

Links and References

DOCX File Format Overview
RegExp MDN Documentation
PizZip GitHub Repository (PizZip is a fork of JSZip specialized for Office files)