
Markdown Splitter

Overview

The Markdown Splitter Node divides Markdown content into segments based on header levels. It enables structured splitting of Markdown documents while preserving the hierarchical relationship between sections, making it ideal for processing documentation, articles, and other Markdown-formatted content.

Usage cost: 1 credit

Configuration

Settings

  1. Documents Selection

    • Documents to Split*: Select input content to process

    • Supports:

      • Markdown strings

      • Document objects with Markdown content

      • Arrays of Markdown content

  2. Header Configuration

    • Headers to Split On: Define header syntax and metadata keys

      • Header Syntax: Markdown header symbols (e.g., #, ##)

      • Metadata Key: Key used to store header content (e.g., Header 1, Header 2)

  3. Processing Options

    • Return Each Line: Split content line by line

      • When enabled: Each line becomes a separate document

      • When disabled: Groups content between headers

    • Strip Headers: Remove header syntax from content

      • When enabled: Headers removed from output content

      • When disabled: Headers preserved in content
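
The settings above can be illustrated with a small, self-contained sketch. It is not Waterflai's implementation. The `Document` class, the `split_markdown` function, its parameter names, and its default values are all hypothetical and only mirror the settings described on this page. Fenced code blocks inside the Markdown are not given any special treatment here.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a split document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_markdown(text, headers_to_split_on, return_each_line=False, strip_headers=True):
    """Split Markdown on the configured headers (illustrative sketch only).

    headers_to_split_on: list of (header syntax, metadata key) pairs,
    for example [("#", "Header 1"), ("##", "Header 2")].
    """
    docs = []
    active = {}   # header syntax -> (metadata key, header text)
    buffer = []   # lines collected for the document being built

    def flush():
        content = "\n".join(buffer).strip()
        if content:
            docs.append(Document(content, {key: title for key, title in active.values()}))
        buffer.clear()

    for line in text.splitlines():
        stripped = line.strip()
        # Requiring a space after the symbols keeps "#" from matching "## Title".
        match = next(((s, k) for s, k in headers_to_split_on
                      if stripped.startswith(s + " ")), None)
        if match:
            flush()
            syntax, key = match
            # A new header replaces active headers of the same or a deeper level.
            active = {s: v for s, v in active.items() if len(s) < len(syntax)}
            active[syntax] = (key, stripped[len(syntax):].strip())
            if not strip_headers:
                buffer.append(stripped)
                if return_each_line:
                    flush()
        elif return_each_line:
            if stripped:
                buffer.append(stripped)
                flush()   # every non-empty line becomes its own document
        else:
            buffer.append(line)
    flush()
    return docs


sample = "# Guide\nIntro text\n## Setup\nStep one\nStep two\n"
headers = [("#", "Header 1"), ("##", "Header 2")]
for doc in split_markdown(sample, headers):
    print(doc.metadata, repr(doc.page_content))
```

In this sketch, the default options group the content between headers into one document per section, and each document carries the headers above it as metadata. Passing `return_each_line=True` produces one document per non-empty line, and `strip_headers=False` keeps the header line at the top of each section's content.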

Output Ports

  • split_documents (Document[]): Array of split documents

    • Each document contains sectioned content

    • Metadata includes header information

    • Preserves original document metadata
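
As a rough illustration of what one entry in split_documents might look like, the sketch below merges a source document's metadata with the header metadata of one section. The field names `page_content` and `metadata`, the helper `build_split_document`, and the rule that header keys win on a name collision are assumptions made for this example. The actual Document schema is not specified on this page.

```python
def build_split_document(content, header_metadata, original_metadata):
    """Combine section content with metadata (hypothetical helper)."""
    # Start from the source document's metadata, then add the header keys.
    # Assumption: header keys override original keys with the same name.
    metadata = {**original_metadata, **header_metadata}
    return {"page_content": content, "metadata": metadata}


split_doc = build_split_document(
    content="Step one\nStep two",
    header_metadata={"Header 1": "Guide", "Header 2": "Setup"},
    original_metadata={"source": "guide.md"},
)
print(split_doc)
```

Downstream nodes could then filter or group sections by keys such as `Header 1` while still seeing where the content came from.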

Best Practices

  1. Header Definition

    • Use consistent header levels

    • Start with the highest header level you need

    • Maintain logical hierarchy

    • Use clear metadata keys

  2. Content Processing

    • Consider document structure

    • Plan metadata organization

    • Test with sample content

  3. Options Selection

    • Enable Return Each Line when you need line-level granularity

    • Enable Strip Headers when the output content should not include header lines

    • Consider downstream processing needs

    • Balance granularity vs. context

Common Issues

  • Inconsistent header formatting

  • Missing header levels

  • Special character handling
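
The first two issues can often be caught before a document reaches the node. The sketch below is a hypothetical pre-check, not a Waterflai feature: `check_headers` and its warning messages are invented for this example. It flags header lines with no space after the `#` symbols and header levels that are skipped. It does not understand fenced code blocks, so comment lines starting with `#` inside code may be reported as headers.

```python
import re

# One to six "#" symbols, optional whitespace, then the header text.
HEADER_RE = re.compile(r"^(#{1,6})(\s*)(.*)$")


def check_headers(text):
    """Return (line number, message) warnings about header formatting."""
    warnings = []
    previous_level = 0
    for number, line in enumerate(text.splitlines(), start=1):
        match = HEADER_RE.match(line.strip())
        if not match:
            continue
        hashes, space, title = match.groups()
        if not title:
            continue  # a line of bare "#" symbols is ignored here
        level = len(hashes)
        if not space:
            warnings.append((number, "no space after header symbols"))
        if previous_level and level > previous_level + 1:
            warnings.append(
                (number, f"header level jumps from {previous_level} to {level}")
            )
        previous_level = level
    return warnings


for warning in check_headers("# Title\n##Setup\n#### Deep\n"):
    print(warning)
```

Running a check like this on sample content is one way to follow the "Test with sample content" practice before settling on the header configuration.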


Last updated 3 months ago