Local free
Run Scout on your machine via the CLI, HTTP, Docker, or as a skill. You bring the compute, browser, storage, and optional keys.
Evidence-grade acquisition
Scout is a local-first acquisition workbench for AI workflows. It does more than fetch markdown: it preserves source evidence, blocked pages, citations, validation, and typed records your agents, search indexes, and intelligence pipelines can trust.
Scout beta demo
Watch the beta-safe flow: Scout starts from a public category URL, preserves the evidence trail, produces product records, and exports them to downstream tools.
No hard-site bypass guarantee.
What Scout produces
Every fetched, rendered, saved, blocked, or API-provided source gets a stable identity, URL, provider, status, timestamp, and hash.
Records carry citations back to source evidence. Missing citations become validation warnings instead of hidden risk.
Outputs are files you can inspect, version, export, hand to agents, load into search, or replay without re-crawling.
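As a rough illustration of the evidence and citation ideas above, the following Python sketch builds a source entry with a content hash and flags records whose citations are missing or do not resolve. The field names (source_id, content_hash, citations) and both helper functions are hypothetical and chosen only for illustration. Scout's actual schema and validation rules may differ.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceEvidence:
    # Hypothetical field names; Scout's real schema may differ.
    source_id: str
    url: str
    provider: str
    status: str
    fetched_at: str
    content_hash: str


def make_evidence(url: str, provider: str, status: str, body: bytes) -> SourceEvidence:
    """Build an evidence entry whose identity is derived from URL and content."""
    content_hash = hashlib.sha256(body).hexdigest()
    # The same URL and the same bytes always yield the same ID.
    source_id = hashlib.sha256(f"{url}\n{content_hash}".encode()).hexdigest()[:16]
    fetched_at = datetime.now(timezone.utc).isoformat()
    return SourceEvidence(source_id, url, provider, status, fetched_at, content_hash)


def citation_warnings(records: list[dict], known_source_ids: set[str]) -> list[str]:
    """Return one warning per record with missing or unresolved citations."""
    warnings = []
    for index, record in enumerate(records):
        cited = record.get("citations") or []
        if not cited:
            warnings.append(f"record {index}: no citations")
        elif any(c not in known_source_ids for c in cited):
            warnings.append(f"record {index}: citation to unknown source")
    return warnings
```

In this sketch a record without citations is still kept. It only produces a warning, which mirrors the idea that gaps should be visible rather than silently dropped.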
Why not just another crawler?
Firecrawl-style APIs are excellent when you need hosted markdown from a URL. Scout is for the next step: turning acquisition into records that can be audited, exported, and reused.
Use Scout when the question is not only "can I fetch this page?" but "can I prove where this field came from, explain what was blocked, and ship the result into a downstream workflow?"
Acquisition ladder
Default local acquisition for normal pages, sitemaps, crawls, and screenshots.
Launch a controlled browser and capture screenshots, DOM, links, and source evidence.
When your browser can see the page, capture authorized session evidence locally.
Replay saved HTML, DOM, markdown, and artifacts without refetching the website.
Artifact contract
run/
  manifest.json
  records.json
  records.jsonl
  source_pages.json
  blocked_pages.json
  validation.json
  extraction_report.md
  raw/
  screenshots/
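One way to use this contract is to check a run folder before handing it to a downstream tool. The sketch below is illustrative only: the file names come from the listing above, but the helper missing_artifacts is hypothetical, and it assumes raw/ and screenshots/ sit directly inside the run folder.

```python
from pathlib import Path

# File and directory names taken from the artifact listing above.
CONTRACT_FILES = [
    "manifest.json",
    "records.json",
    "records.jsonl",
    "source_pages.json",
    "blocked_pages.json",
    "validation.json",
    "extraction_report.md",
]
# Assumption: both directories are direct children of the run folder.
CONTRACT_DIRS = ["raw", "screenshots"]


def missing_artifacts(run_dir: str) -> list[str]:
    """List contract entries that are absent from the given run folder."""
    root = Path(run_dir)
    missing = [name for name in CONTRACT_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in CONTRACT_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means every expected entry is present. It says nothing about whether the contents are valid.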
Typed outputs
Exports beyond Algolia
Prepare object IDs, names, prices, categories, URLs, images, variants, citations, and completeness warnings for search indexes.
Export product and intelligence records to CSV or Google Sheets workflows for review, enrichment, and lightweight operations.
Keep local SQLite/JSONL datasets for repeatable research, diffing, replay, and agent consumption.
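For the local dataset path, a JSONL file can be loaded into SQLite with the Python standard library alone. This is a minimal sketch, not Scout's own exporter: the function name, the records table, and the single payload column are assumptions made for illustration.

```python
import json
import sqlite3
from pathlib import Path


def load_jsonl_into_sqlite(jsonl_path: str, db_path: str) -> int:
    """Store each JSONL line as a JSON text row; return the number of rows loaded."""
    rows = []
    with Path(jsonl_path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json.loads(line)  # raises ValueError if the line is not valid JSON
            rows.append((line,))

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            # Hypothetical table layout: one JSON document per row.
            conn.execute(
                "CREATE TABLE IF NOT EXISTS records "
                "(id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
            )
            conn.executemany("INSERT INTO records (payload) VALUES (?)", rows)
    finally:
        conn.close()
    return len(rows)
```

Keeping the raw JSON text in one column avoids guessing at a record schema. Typed columns can be added later once the fields are known.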
Distribution model
Free for beta users. Run on your machine, choose a workdir, keep artifacts locally, and bring your own optional keys.
Convenience API for teams that want managed workers and API keys. Hosted usage must be metered.
Launch status
Crawl4AI currently resolves lxml 5.4.0, and the dependency audit reports a known vulnerability fixed in lxml 6.1.0. Scout will not claim public production readiness until the dependency audit is clean or a formal exception is approved.
Hosted beta access is capped, approved-tester only, and metered. Pricing is not approved yet; hosted access means limited credits, not unlimited hosted scraping or lifetime API usage.
Scout is safest to test locally: your machine, your workdir, your artifacts, your optional keys. No clean security claim is made for hosted production until the remaining launch gates close.
What people use it for
Extract listing records, preserve source evidence, and prepare exports for Algolia, JSONL, CSV, SQLite, or spreadsheet adapters.
Collect company, investor, careers, news, docs, and website-quality evidence into repeatable artifact folders.
Give local agents a controlled acquisition layer with provenance instead of raw, untraceable fetches.
Pricing direction
Install Scout locally and run with your own compute, browser, storage, and optional provider keys.
Invite-only hosted credits with usage caps and short artifact retention while compute, browser, LLM, storage, security, and support costs are measured.
Prepaid credits are the leading paid model. Monthly plans come later only if real usage patterns justify them.
Quickstart
Local Scout is the cleanest beta path because you own the machine, browser, artifacts, and keys. Hosted Scout comes after tenancy, quotas, and payment are production-ready.
git clone https://github.com/arijitchowdhury80/scout.git
cd scout
git checkout codex/scout-platform-foundation
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
playwright install chromium
scout serve
Private beta
The first beta should test whether Scout makes acquisition more inspectable, reusable, and trustworthy, not whether it can promise infinite hosted crawling.
Local Scout stays free. Hosted beta creates a metered convenience API key after Stripe confirms payment.