Crawler documentation

Compact product docs for using SourceOfTruth.io crawler workflows, credits, exports, billing gates, and source-data preparation.

Core docs
Crawler workflow

Search + Web Crawler

The active product is the crawler workbench: search saved source content, scrape one page, crawl a bounded website, then export clean outputs.

  • Search saved source content
  • Scrape a single web page
  • Crawl a bounded website
  • Estimate usage before launch
  • Export Markdown, JSON, or CSV
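As an illustration of what "bounded" can mean in the list above, here is a minimal sketch of a breadth-first crawl that stays on the starting host and stops at a page cap and a depth cap. It is not the SourceOfTruth.io implementation: `bounded_crawl`, `fetch_links`, `max_pages`, `max_depth`, and the `SITE` sample data are hypothetical names chosen for this example, and link fetching is injected so the sketch runs without network access.

```python
from collections import deque
from typing import Callable, Iterable
from urllib.parse import urldefrag, urljoin, urlparse


def bounded_crawl(
    start_url: str,
    fetch_links: Callable[[str], Iterable[str]],
    max_pages: int = 50,
    max_depth: int = 2,
) -> list[str]:
    """Breadth-first crawl limited to the start host, a page cap, and a depth cap."""
    host = urlparse(start_url).netloc
    start = urldefrag(start_url).url
    seen = {start}
    queue = deque([(start, 0)])
    visited: list[str] = []

    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # page is recorded, but its links are not followed
        for href in fetch_links(url):
            # Resolve relative links and drop #fragments so pages dedupe cleanly.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# Hypothetical in-memory link graph standing in for real HTTP fetches.
SITE = {
    "https://example.com/": ["/a", "/b", "https://other.example/x"],
    "https://example.com/a": ["/c", "/#top"],
    "https://example.com/b": [],
    "https://example.com/c": ["/d"],
}
```

Calling `bounded_crawl("https://example.com/", lambda u: SITE.get(u, []), max_pages=3)` visits the start page, `/a`, and `/b`, then stops at the page cap. The off-host link is never queued, and `/#top` deduplicates against the start page.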
Output workflow

Clean export package

Exports should stay readable, portable, and easy to move into another tool or a retrieval-augmented generation (RAG) workflow after a crawl completes.

  • Markdown for human review
  • JSON for structured automation
  • CSV for spreadsheet workflows
  • History and job records for logged-in users
  • Cloud storage remains optional/prewired
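To make the three export formats concrete, the sketch below renders the same crawl records as Markdown, JSON, and CSV using only the Python standard library. The record shape (`url`, `title`, `text`) and the function names are assumptions made for this example. These docs do not specify the actual export schema.

```python
import csv
import io
import json

# Assumed record fields; the real export schema is not documented here.
FIELDS = ["url", "title", "text"]


def to_markdown(pages: list[dict]) -> str:
    """One section per page, meant for human review."""
    sections = [
        f"# {p['title']}\n\nSource: {p['url']}\n\n{p['text']}\n" for p in pages
    ]
    return "\n".join(sections)


def to_json(pages: list[dict]) -> str:
    """Structured output for automation; non-ASCII text is kept as-is."""
    return json.dumps(pages, indent=2, ensure_ascii=False)


def to_csv(pages: list[dict]) -> str:
    """Flat rows for spreadsheets; keys outside FIELDS are ignored."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(pages)
    return buf.getvalue()
```

Producing all three formats from one list of records means the Markdown, JSON, and CSV views cannot drift apart, which matters when the same crawl feeds both a reviewer and an automated pipeline.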
Launch guardrails
ETL/ELT remains coming soon

Do not sell the ETL/ELT Pipeline as active until a future production thread completes implementation, testing, connector work, governance, and launch readiness.

Billing is controlled by launch gates

Checkout and billing portal access remain controlled by live-billing gates. Existing access should stay unchanged when gates are closed.

Crawler credits

Metered units used by search, scrape, crawl, render, extraction, and export workflows.

Estimate before run

The crawler workbench should show a usage estimate before it launches a bounded crawl.
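A pre-run estimate can be sketched as an upper bound computed from the crawl limits. Everything specific below is an assumption for illustration: the credit rates in `ASSUMED_RATES`, the `CrawlPlan` fields, and the function names are hypothetical, since these docs do not publish the actual metering rates or formula.

```python
from dataclasses import dataclass

# Hypothetical per-unit credit rates; the real rates are not given in these docs.
ASSUMED_RATES = {"crawl_page": 1, "render_page": 2, "export": 1}


@dataclass(frozen=True)
class CrawlPlan:
    max_pages: int                                  # page cap for the bounded crawl
    render_js: bool = False                         # whether pages are rendered
    export_formats: tuple[str, ...] = ("markdown",)


def estimate_credits(plan: CrawlPlan, rates: dict[str, int] = ASSUMED_RATES) -> int:
    """Upper-bound estimate: assumes the crawl reaches its page cap."""
    per_page = rates["crawl_page"]
    if plan.render_js:
        per_page += rates["render_page"]
    return plan.max_pages * per_page + len(plan.export_formats) * rates["export"]


def can_launch(plan: CrawlPlan, balance: int) -> bool:
    """Gate the launch on the estimate fitting within the available credits."""
    return estimate_credits(plan) <= balance
```

Under these assumed rates, a plan with a 25-page cap, rendering on, and two export formats estimates to 25 × 3 + 2 = 77 credits. Estimating against the page cap rather than an expected page count keeps the figure conservative: the actual run can cost less than the estimate but not more.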

Clean exports

Markdown, JSON, and CSV output packages designed for review, automation, and downstream AI workflows.

ETL/ELT Pipeline

Coming soon. The production ETL/ELT product remains outside the active public product menu until the full roadmap is complete.