HTML Splitter Node
Overview
The HTML Splitter Node divides HTML content into segments based on header tags (h1, h2, etc.). It's particularly useful for structuring content hierarchically and maintaining the semantic relationship between different sections of HTML documents.
Usage cost: 1 credit
Configuration
Settings
Documents Selection
Documents to Split (required): Select the input content to process
Supports:
HTML strings
Document objects with HTML content
Arrays of HTML content
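The three supported input shapes can be reduced to one common form before splitting. The sketch below is only an illustration of that idea, not the node's actual implementation: the `Document` class, its `page_content` and `metadata` fields, and the `normalize_inputs` helper are all assumed names.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for the platform's Document object."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def normalize_inputs(value):
    """Return a flat list of (html, metadata) pairs from any supported input."""
    if isinstance(value, str):
        # A bare HTML string carries no metadata of its own.
        return [(value, {})]
    if isinstance(value, Document):
        # Copy the metadata so later steps cannot mutate the original.
        return [(value.page_content, dict(value.metadata))]
    if isinstance(value, (list, tuple)):
        # Arrays may mix strings and Document objects.
        pairs = []
        for item in value:
            pairs.extend(normalize_inputs(item))
        return pairs
    raise TypeError(f"Unsupported input type: {type(value).__name__}")
```

Copying the metadata at this stage is one way to let each split document keep its source document's metadata without sharing a mutable dictionary.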
Header Configuration
Headers to Split On: Define header tags and their metadata keys
Header Tag: HTML tag to split on (e.g., h1, h2)
Metadata Key: Key used to store header content (e.g., Header 1, Header 2)
Return Each Element: Toggle to control output granularity
When enabled: Returns each element with associated headers
When disabled: Groups content between headers
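To make the header configuration concrete, here is a minimal sketch of header-based splitting using Python's standard `html.parser`. It approximates the grouped behavior (Return Each Element disabled) and is not the node's actual implementation: `HeaderSplitter`, `split_html`, and the `content`/`metadata` dictionary keys are assumed names chosen for illustration.

```python
from html.parser import HTMLParser


class HeaderSplitter(HTMLParser):
    """Illustrative only: groups text found between configured header tags."""

    def __init__(self, headers_to_split_on):
        super().__init__()
        # Ordered mapping of header tag -> metadata key, e.g. {"h1": "Header 1"}.
        self.headers = dict(headers_to_split_on)
        self.levels = list(self.headers)  # configuration order defines the hierarchy
        self.active = {}        # metadata key -> text of the current header
        self.in_header = None   # header tag currently being read, if any
        self.header_text = []
        self.body_text = []
        self.documents = []

    def _flush(self):
        # Emit the text collected since the last header as one document.
        content = " ".join(self.body_text).strip()
        if content:
            self.documents.append({"content": content, "metadata": dict(self.active)})
        self.body_text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.headers:
            self._flush()
            self.in_header = tag
            self.header_text = []

    def handle_endtag(self, tag):
        if tag == self.in_header:
            # A new header clears its own level and every deeper level.
            for level in self.levels[self.levels.index(tag):]:
                self.active.pop(self.headers[level], None)
            self.active[self.headers[tag]] = "".join(self.header_text).strip()
            self.in_header = None

    def handle_data(self, data):
        if self.in_header:
            self.header_text.append(data)
        elif data.strip():
            self.body_text.append(data.strip())

    def close(self):
        super().close()
        self._flush()


def split_html(html, headers_to_split_on):
    parser = HeaderSplitter(headers_to_split_on)
    parser.feed(html)
    parser.close()
    return parser.documents


html = "<h1>Guide</h1><p>Intro</p><h2>Setup</h2><p>Install it.</p><h1>FAQ</h1><p>Answers</p>"
docs = split_html(html, [("h1", "Header 1"), ("h2", "Header 2")])
for doc in docs:
    print(doc["metadata"], "->", doc["content"])
```

With this input the sketch yields three documents. The second carries both `Header 1: Guide` and `Header 2: Setup`, while the third carries only `Header 1: FAQ`, because a new h1 resets the h2 level beneath it.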
Output Ports
split_documents (Document[]): Array of split documents
Each document contains the content between headers
Metadata includes hierarchical header information
Maintains original document metadata
Best Practices
Header Selection
Choose appropriate header levels
Maintain logical hierarchy
Use consistent header structure
Metadata Keys
Use descriptive key names
Follow consistent naming convention
Consider hierarchical relationships
Common Issues
Inconsistent HTML structure
Missing header tags
Invalid HTML formatting
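A quick pre-check can catch missing header tags before content reaches the node. The following is a rough sketch built on the standard `html.parser` module; `HeaderCounter` and `missing_headers` are hypothetical helper names, and the check only reports configured tags that never appear in the input.

```python
from html.parser import HTMLParser


class HeaderCounter(HTMLParser):
    """Counts how often each configured header tag opens in the input."""

    def __init__(self, tags):
        super().__init__()
        self.counts = {tag: 0 for tag in tags}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "<H1>" is counted as "h1".
        if tag in self.counts:
            self.counts[tag] += 1


def missing_headers(html, tags):
    """Return the configured header tags that do not occur in the HTML."""
    counter = HeaderCounter(tags)
    counter.feed(html)
    counter.close()
    return [tag for tag, count in counter.counts.items() if count == 0]


print(missing_headers("<h1>Title</h1><p>Body</p>", ["h1", "h2"]))
```

An empty result means every configured header tag occurs at least once. It does not verify that the document is well formed or that header levels are used consistently.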