Overview
The CloudBrowser node enables interaction with websites through a cloud-based browser instance. Its Content - Get PDF From Website operation navigates to a specified URL and generates a PDF snapshot of the webpage. This is useful for automating the capture of web pages as PDFs for archiving, reporting, or sharing.
Common scenarios include:
- Automatically generating PDFs of invoices, reports, or articles from web pages.
- Archiving web content snapshots for compliance or record-keeping.
- Creating printable versions of dynamic web pages without manual intervention.
Example: You want to generate a PDF version of a product page on an e-commerce site every day to track changes in pricing or layout. This node can navigate to the URL and produce a PDF file automatically.
Properties
Name | Meaning |
---|---|
URL to Navigate | The URL of the website to open and convert into a PDF. |
Navigation Options | Options controlling how navigation behaves: - Wait Until: When to consider navigation finished (load, domcontentloaded, networkidle0, networkidle2). - Timeout (Ms): Max time to wait for navigation. |
Browser Configuration | Settings for the browser instance: - Browser Type: Chrome, Chromium, or ChromeHeadlessShell. - Headless Mode: Run browser without UI. - Stealth Mode: Enable stealth to avoid detection. - Keep Open (Seconds): How long to keep browser open. - Label: Name for the browser instance. - Save Session: Save session for reuse. - Recover Session: Recover saved session. |
Custom Arguments | Additional command-line arguments to pass to the browser. |
Ignored Default Arguments | Default browser arguments to ignore when launching. |
Proxy Configuration | Proxy server settings: - Host, Port, Username, Password. |
PDF Options | PDF generation options: - Format: Paper size (A0, A1, A2, A3, A4, A5, A6, Legal, Letter, Tabloid). - Landscape: Generate PDF in landscape orientation. - Print Background: Include background graphics. - Scale: Scale factor for rendering (0.1 to 2). - Margin: Margins in millimeters (top, right, bottom, left). - Page Ranges: Specific pages to print (e.g., "1-5,8,11-13"). |
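
The value constraints listed in the table can be checked before a workflow runs. The following is a minimal sketch, not part of the node itself: the `validate_options` helper and the option key names (`format`, `scale`, `waitUntil`, `pageRanges`) are hypothetical and chosen only for illustration, while the allowed values are taken from the table above.

```python
import re
from typing import Dict, List

# Paper sizes listed in the PDF Options row.
ALLOWED_FORMATS = {"A0", "A1", "A2", "A3", "A4", "A5", "A6", "Legal", "Letter", "Tabloid"}
# Wait Until values listed in the Navigation Options row.
ALLOWED_WAIT_UNTIL = {"load", "domcontentloaded", "networkidle0", "networkidle2"}
# Comma-separated pages or ranges, e.g. "1-5,8,11-13".
PAGE_RANGES_RE = re.compile(r"^\d+(-\d+)?(,\d+(-\d+)?)*$")


def validate_options(options: Dict[str, object]) -> List[str]:
    """Return a list of problems found in a hypothetical options dict.

    Only keys that are present are checked, so no defaults are assumed.
    """
    problems = []
    if "format" in options and options["format"] not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {options['format']}")
    if "scale" in options and not 0.1 <= float(options["scale"]) <= 2:
        problems.append(f"scale out of range (0.1 to 2): {options['scale']}")
    if "waitUntil" in options and options["waitUntil"] not in ALLOWED_WAIT_UNTIL:
        problems.append(f"unsupported waitUntil: {options['waitUntil']}")
    if "pageRanges" in options:
        # Ignore spaces so "1-5, 8" is treated like "1-5,8".
        compact = str(options["pageRanges"]).replace(" ", "")
        if compact and not PAGE_RANGES_RE.match(compact):
            problems.append(f"malformed pageRanges: {options['pageRanges']}")
    return problems
```

Calling `validate_options({"format": "A4", "scale": 1, "pageRanges": "1-5,8,11-13"})` returns an empty list, while a scale of `3` or a format of `"B5"` each add one entry to the result.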
Output
The output JSON object includes:
- `url`: The final URL of the loaded page.
- `title`: The page title.
- `pdf`: A base64-encoded string representing the generated PDF file, prefixed with `data:application/pdf;base64,`.
- `pdfBinary`: The raw binary data buffer of the PDF.
- `filename`: Suggested filename for the PDF, e.g., `webpage_<timestamp>.pdf`.
- `fileExtension`: Always `"pdf"`.
- `mimeType`: Always `"application/pdf"`.
This output allows downstream nodes to save the PDF file, send it via email, or upload it to storage.
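
As an illustration of downstream handling, the sketch below decodes the `pdf` field and writes it to disk under the suggested `filename`. It assumes the item is available as a plain Python dict with the fields described above; the helper names `pdf_bytes_from_output` and `save_pdf` are hypothetical, and the item in the demo is a fabricated placeholder rather than real node output.

```python
import base64
import tempfile
from pathlib import Path

# Prefix documented for the `pdf` output field.
DATA_URI_PREFIX = "data:application/pdf;base64,"


def pdf_bytes_from_output(item: dict) -> bytes:
    """Decode the base64 `pdf` field of an output item into raw PDF bytes."""
    pdf_field = item["pdf"]
    if not pdf_field.startswith(DATA_URI_PREFIX):
        raise ValueError("pdf field lacks the expected data URI prefix")
    return base64.b64decode(pdf_field[len(DATA_URI_PREFIX):])


def save_pdf(item: dict, directory: Path) -> Path:
    """Write the decoded PDF into `directory` using the suggested filename."""
    target = directory / item["filename"]
    target.write_bytes(pdf_bytes_from_output(item))
    return target


if __name__ == "__main__":
    # Fabricated stand-in for a node output item (not a valid PDF document).
    placeholder = b"%PDF-1.4 placeholder"
    item = {
        "pdf": DATA_URI_PREFIX + base64.b64encode(placeholder).decode("ascii"),
        "filename": "webpage_example.pdf",
    }
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_pdf(item, Path(tmp))
        print(saved.name, saved.stat().st_size)
```

Checking the prefix before decoding makes a malformed or truncated field fail with a clear error instead of producing a corrupt file.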
Dependencies
- Requires access to the external CloudBrowser API service at https://production.cloudbrowser.ai/api/v1/Browser/Open to open and control browser instances.
- Needs an API token credential for authentication with the CloudBrowser service.
- Uses Puppeteer library internally to connect to the browser WebSocket endpoint and perform navigation and PDF generation.
- No local browser installation is required; all browser operations are performed remotely via the cloud service.
Troubleshooting
- No WebSocket address received from the browser service: Indicates failure to open a browser instance. Check API token validity, network connectivity, and CloudBrowser service status.
- Timeout errors during navigation: If the page takes too long to load, increase the timeout value in Navigation Options.
- PDF generation issues: Ensure the URL is accessible and returns valid HTML content. Some sites may block automated browsers or require authentication.
- Proxy configuration problems: Verify proxy host, port, and credentials if used.
- Session recovery failures: If recovering a saved session fails, try disabling session recovery or saving a new session.
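
For recurring navigation timeouts, one option is to retry with progressively larger timeout values instead of picking a single large one. The sketch below shows the idea in generic form. It assumes a caller-supplied `operation` callable (hypothetical) that accepts a timeout in milliseconds and raises `TimeoutError` when navigation takes too long; how that callable triggers the node or workflow is outside the scope of this example.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def retry_with_longer_timeouts(
    operation: Callable[[int], T],
    timeouts_ms: Iterable[int],
) -> T:
    """Call `operation` with each timeout in turn until one succeeds.

    `operation` is a hypothetical callable that takes a timeout in
    milliseconds and raises TimeoutError if the page does not load in time.
    """
    last_error: Optional[TimeoutError] = None
    for timeout_ms in timeouts_ms:
        try:
            return operation(timeout_ms)
        except TimeoutError as error:
            # Remember the failure and move on to the next, larger timeout.
            last_error = error
    if last_error is None:
        raise ValueError("timeouts_ms must contain at least one value")
    raise last_error
```

A call such as `retry_with_longer_timeouts(run_operation, [30000, 60000, 120000])`, where `run_operation` is your own wrapper, tries each value in order and re-raises the last `TimeoutError` if none succeeds.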
Links and References
- Puppeteer Documentation – For understanding browser automation concepts.
- CloudBrowser Service – Official site for the cloud browser API (general reference).
- PDF Paper Sizes – Explanation of standard paper formats supported.