GitBook Loader

Overview

The GitBook Loader Node extracts content from GitBook documentation. It can load a single page or recursively load every page of a GitBook site, which makes it well suited to building knowledge bases from existing documentation.

Usage cost: 1 credit for a single page / 10 credits for a whole documentation site

Configuration

Settings

  1. GitBook URL

    • URL to the GitBook documentation

    • A specific page URL, or the site's root URL when Load All Paths is enabled

    • Required field

    • Supports variable interpolation

  2. Load Options

    • Load All Paths: Toggle to recursively load all pages

      • When enabled: URL must be the GitBook root

      • When disabled: Loads only the specified page
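
The difference between the two load modes can be sketched in Python. This is an illustration of the idea only, not the node's actual implementation: `collect_pages` and `get_links` are hypothetical names, the link lookup is injected so the example runs without network access, and "same site" is approximated as "URL starts with the root URL".

```python
from collections import deque
from typing import Callable, Iterable


def collect_pages(url: str, load_all_paths: bool,
                  get_links: Callable[[str], Iterable[str]]) -> list[str]:
    """Return the page URLs that would be loaded for the given settings."""
    if not load_all_paths:
        return [url]  # single-page mode: only the specified page

    # Recursive mode: breadth-first walk starting at the root URL.
    seen = {url}
    queue = deque([url])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in get_links(page):
            # Stay inside the site rooted at `url` and skip visited pages.
            if link.startswith(url) and link not in seen:
                seen.add(link)
                queue.append(link)
    return order


# A fake three-page site standing in for real HTTP requests.
site = {
    "https://docs.example.com/": ["https://docs.example.com/intro",
                                  "https://docs.example.com/api"],
    "https://docs.example.com/intro": ["https://docs.example.com/api",
                                       "https://other.example.org/"],
    "https://docs.example.com/api": ["https://docs.example.com/"],
}

single = collect_pages("https://docs.example.com/intro", False,
                       lambda p: site.get(p, []))
all_pages = collect_pages("https://docs.example.com/", True,
                          lambda p: site.get(p, []))
```

Here `single` holds just the one requested page, while `all_pages` holds every page reachable from the root, each visited once and with the external link ignored.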

Output Ports

  1. documents (List[Document]): List of Document objects containing:

    • Page content

    • Metadata (URL, title, etc.)

  2. documents_content (List[string]):

    • List of extracted text content

    • Content only, without metadata
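
The relationship between the two output ports can be shown with a small Python sketch. The `Document` class below is a hypothetical stand-in, and the field names `page_content`, `metadata`, `source`, and `title` are assumptions borrowed from common loader conventions; the node's real field names may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical stand-in for the node's Document type.
    page_content: str
    metadata: dict = field(default_factory=dict)


# What the `documents` port might carry for a two-page site.
documents = [
    Document("Install the CLI.",
             {"source": "https://docs.example.com/install", "title": "Install"}),
    Document("Run your first flow.",
             {"source": "https://docs.example.com/quickstart", "title": "Quickstart"}),
]

# `documents_content` is the same pages with the metadata dropped.
documents_content = [doc.page_content for doc in documents]
```

Use `documents` when downstream steps need to cite or filter by URL or title, and `documents_content` when only the raw text matters.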

Best Practices

  1. URL Configuration

    • Use root URL when loading all paths

    • Ensure URLs are publicly accessible

    • Verify URL format before execution

  2. Content Loading

    • Use single page loading for specific content

    • Consider load time for large documentation sites
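
One way to verify URL format before execution is a small pre-flight check like the Python sketch below. `check_gitbook_url` is a hypothetical helper, not part of the node. It only checks the scheme and host and drops any `#fragment`; it does not try to decide whether a URL is the site root, since that depends on how the site is hosted.

```python
from urllib.parse import urlsplit, urlunsplit


def check_gitbook_url(url: str) -> str:
    """Return a normalized URL, or raise ValueError if it is malformed."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"URL must start with http:// or https://: {url!r}")
    if not parts.hostname:
        raise ValueError(f"URL has no host name: {url!r}")
    # Drop the fragment (#section): it does not identify a separate page.
    # An empty path is normalized to "/".
    return urlunsplit((parts.scheme, parts.netloc, parts.path or "/",
                       parts.query, ""))


root = check_gitbook_url("https://docs.example.com")
page = check_gitbook_url("  https://docs.example.com/guide#intro ")
```

A URL without a scheme, such as `docs.example.com`, raises `ValueError` instead of failing later during loading.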

Common Issues

  • Rate limiting from GitBook servers

  • Memory limitations with large documentation sites

  • Slow loading times for recursive fetching

  • Malformed URLs causing loading failures
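
Rate limiting is commonly handled by retrying with exponential backoff. The Python sketch below shows that general technique under stated assumptions: `RateLimited` and `load_with_backoff` are hypothetical names, the caller supplies the actual loading function, and nothing here reflects how the node itself retries (if it does).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error the caller's load function raises when throttled."""


def load_with_backoff(load: Callable[[], T], retries: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `load`, waiting base_delay * 2**attempt between failed attempts."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    attempt = 0
    while True:
        try:
            return load()
        except RateLimited:
            if attempt == retries:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            attempt += 1


# Demo: a loader that is throttled twice, then succeeds.
calls = {"n": 0}
delays = []


def flaky_load() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited()
    return "ok"


# Recording delays instead of sleeping keeps the demo instant.
result = load_with_backoff(flaky_load, sleep=delays.append)
```

For memory pressure and slow recursive loads on large sites, loading individual pages instead of the whole site is the simpler mitigation.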
