WaterCrawl Nodes for n8n
This is a community node for n8n that integrates with WaterCrawl, a web scraping and crawling API.
Installation
To install this community node, please follow the official n8n community node installation guide.
Features
Crawl Operations
Scrape URL: Extract data from a webpage in a single request. Supported options:
- Wait time after page load
- Custom timeout settings
- HTML content inclusion
- Link extraction
- Main content filtering
- Cookie acceptance handling
- Custom locale settings
Create Crawl Requests: Start larger crawling jobs with advanced options:
- Spider configuration (depth, domain limits, etc.)
- Page rendering settings
- Custom plugins and extensions
Get Crawl Request Details: Retrieve information about a specific crawl job, including:
- Status updates
- Creation timestamps
- Configuration details
List Crawl Requests: View all your crawling jobs with pagination support
Get Crawl Results: Retrieve the data extracted by crawling operations:
- Download full results
- Filter by various criteria
Stop Crawl: Cancel running crawl jobs to save resources
Advanced Features
- Asynchronous Processing: Start jobs and retrieve results later
- Synchronous Mode: Wait for results in a single request
- Detailed Metadata: Access comprehensive information about crawled pages
- Customizable Extraction: Configure exactly what data to retrieve
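The asynchronous and synchronous modes listed above can be related with a short sketch. How this node implements them is not documented here, so the following TypeScript is only an illustration under assumptions: the `CrawlClient` interface, its method names, and the status values are hypothetical and are not WaterCrawl's or the node's actual API. It shows one common way to get synchronous-style behavior from asynchronous calls: start a job, poll its status, then fetch the results.

```typescript
// Hypothetical shapes for illustration only; not WaterCrawl's or the node's actual API.
type CrawlStatus = "running" | "finished" | "failed";

interface CrawlClient {
  // Starts a job and returns its identifier immediately (asynchronous mode).
  createCrawl(url: string): Promise<string>;
  // Reports the current status of a previously started job.
  getStatus(jobId: string): Promise<CrawlStatus>;
  // Returns the extracted results of a finished job.
  getResults(jobId: string): Promise<string[]>;
}

// Emulates a synchronous-style call on top of the asynchronous ones:
// start the job, poll until it leaves the "running" state, then fetch results.
async function crawlAndWait(
  client: CrawlClient,
  url: string,
  pollIntervalMs = 1000,
  maxPolls = 60,
): Promise<string[]> {
  const jobId = await client.createCrawl(url);
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const status = await client.getStatus(jobId);
    if (status === "finished") return client.getResults(jobId);
    if (status === "failed") throw new Error(`Crawl ${jobId} failed`);
    // Wait before asking again so the API is not polled in a tight loop.
    await new Promise<void>((resolve) => setTimeout(resolve, pollIntervalMs));
  }
  throw new Error(`Crawl ${jobId} did not finish after ${maxPolls} polls`);
}
```

Bounding the number of polls keeps a stuck job from blocking the caller indefinitely.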
Development
To set up a local development environment for this node:
- Clone the repository
- Install dependencies:
pnpm install
- Build the code:
pnpm build
Testing
To test the node, follow the official n8n testing guide.
Run tests with:
pnpm test
Commit Guidelines
This project uses Conventional Commits for semantic versioning. When committing changes, use:
pnpm commit
This will guide you through creating a properly formatted commit message.
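For reference, a Conventional Commits header has the form `type(optional scope): description`, with an optional `!` before the colon to mark a breaking change. The TypeScript sketch below parses such a header. It is illustrative only and is not part of this project's tooling: the function name `parseCommitHeader` and the example messages are made up for this illustration, and the pattern is a simplification that ignores the message body and footers.

```typescript
// Parsed parts of a Conventional Commits header line.
interface CommitHeader {
  type: string;
  scope?: string;
  breaking: boolean;
  description: string;
}

// Matches "<type>(<optional scope>)<optional !>: <description>".
const HEADER_PATTERN = /^([a-zA-Z]+)(?:\(([^()]+)\))?(!)?: (\S.*)$/;

// Returns the parsed header, or null if the line does not follow the format.
function parseCommitHeader(header: string): CommitHeader | null {
  const match = HEADER_PATTERN.exec(header);
  if (match === null) return null;
  const [, type, scope, bang, description] = match;
  return { type, scope, breaking: bang === "!", description };
}
```

For example, `feat(crawl): add stop operation` parses as a `feat` commit with scope `crawl`, while `updated stuff` is rejected. Under the convention, `fix` commits correspond to patch releases, `feat` commits to minor releases, and breaking changes to major releases.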