
Scrape.do

99.4 percent success across 20 domains and 100 percent across 550 Amazon stress trials at a blended $0.87/1K.

Published 2026-06-05 · 2519-word independent review · ScrapingTest Research

────────────────────────────────────────────────────────────────────────────────────────────────────

Verdict

Grade: A. Highest sustained success rate in our cohort and the only provider unaffected by Amazon's June 2026 anti-bot updates. Dragged down a notch by opaque auto-upgrade pricing and a thin AI/MCP story.

Best for

  • Production Amazon scraping at scale. 100 percent across 550 stress trials at $0.11/1K via /plugin/amazon/pdp.
  • Hard-target success-rate (SR) work (LinkedIn, G2, Idealista) where ScrapingAnt, Zyte, and ScraperAPI all fail. Scrape.do reaches all three at over 99 percent SR.
  • Cost-sensitive teams with mixed-domain workloads who will instrument scrape.do-request-cost for honest per-domain cost tracking.
  • Async overnight batch jobs that need to run alongside live traffic. Separate 30-percent-of-plan async pool.
  • Engineering teams that prefer a single HTTP endpoint over compute-unit accounting or actor configuration.

Avoid if

  • You need AI extraction or schema-by-prompt. No /extract endpoint, no MCP server, no LangChain integration (use Firecrawl, ScrapegraphAI, or Apify).
  • You need predictable headline pricing. The per-domain auto-upgrade matrix means the $0.11/1K headline understates the measured blended rate by about 8x, and protected domains individually bill 10x to 30x the headline.
  • You need an official Python or Node SDK. Only the scrapy-scrapedo wrapper and raw HTTP examples exist.
  • Your workflow centers on a marketplace of pre-built scrapers. Scrape.do ships an Amazon plugin plus a seven-sub-API Google suite, which is closer to SerpApi than to Apify's 31k Actors or Bright Data's 250+ scrapers for niche regional coverage.

What we found in the lab

Scrape.do finished our 20-domain benchmark at 99.39 percent mean success rate, the highest sustained result in our cohort at this volume, with a blended cost of $0.87 per 1,000 successful requests (auto-upgrade matrix factored in) and a mean latency of 4.19 seconds. Of the 20 domains tested, 11 came back at 100 percent success and another 7 landed at 99.5 percent or better. Two domains dropped below 99.5 percent: booking.com at 98.7 percent and x.com at 90.8 percent (the latter the same domain that locks out most providers in our cohort). That makes one weak spot (x.com) and one transient blip (booking.com), and no domain returned empty content across the audit.

What bites you on Scrape.do is not reliability. It is cost transparency. The published headline price is $0.11 per 1K credits at the Hobby plan and 1 credit per call for datacenter scraping, which implies a $0.11/1K blended rate. Our request-cost header probe across all 20 domains showed nine of them silently auto-upgrade. Capterra, google, idealista, indeed, reddit, tripadvisor, and trustpilot all bill 10 credits per call ($1.10/1K). G2.com bills 25 credits ($2.75/1K). LinkedIn bills 30 credits ($3.30/1K). Protected domains therefore cost 10x to 30x the headline, and the real-world blended rate across the full 20-domain set is roughly 8x the headline. The audit findings doc (AUDIT-FINDINGS.md section 5) calls this out explicitly and recommends every harness read scrape.do-request-cost on every response.
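
The header check recommended by the audit can be sketched in a few lines of Python. This is a minimal sketch, not the ScrapingTest harness: the helper names and sample values are hypothetical, the $0.00011 per-credit figure is the Hobby rate quoted in this review, and it assumes the scrape.do-request-cost header carries a plain integer credit count.

```python
# Assumed Hobby-plan rate from this review: $0.11 per 1,000 credits.
USD_PER_CREDIT = 0.00011


def credits_from_headers(headers):
    """Return the credits billed for one response, or None if the header is absent.

    Assumes the header value is a plain integer (an assumption of this sketch,
    not something verified against the live API).
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("scrape.do-request-cost")
    return int(raw) if raw is not None else None


def cost_per_1k(credits_per_call, usd_per_credit=USD_PER_CREDIT):
    """Convert a per-call credit charge into dollars per 1,000 calls."""
    return round(credits_per_call * 1000 * usd_per_credit, 2)


def blended_cost_per_1k(credits_seen, usd_per_credit=USD_PER_CREDIT):
    """Average dollars per 1,000 calls over a list of observed credit charges."""
    if not credits_seen:
        raise ValueError("no billed responses recorded")
    mean_credits = sum(credits_seen) / len(credits_seen)
    return round(mean_credits * 1000 * usd_per_credit, 2)


# The per-call charges quoted above map onto the per-1K prices in the text.
for credits in (1, 10, 25, 30):
    print(credits, "credits ->", cost_per_1k(credits), "USD per 1K")
```

Feeding every response's header value into blended_cost_per_1k gives a per-workload figure that can be compared with the headline rate instead of trusting the multiplier card.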

Per-domain breakdown

Domain | SR | Notes
amazon.com | 100.0% (n=trials) | Perfect SR at $0.11/1K. Average 3.4s, p90 4.8s. Uses the /plugin/amazon/pdp endpoint, which billed at the documented 1 credit. The standout result of the audit. See the Amazon stress deep dive below.
google.com | 100.0% | Auto-upgrades to 10 credits ($1.10/1K) regardless of params requested. 1.9s average latency. The /plugin/google/search structured endpoint is the recommended path here and is what gets you to SR=1.0.
linkedin.com | 99.8% | Cleanest LinkedIn result in our cohort, but at a price: 30 credits per call, or $3.30/1K. Zyte's account-policy 451 and ScrapingAnt's 423 wall mean Scrape.do is one of only a handful of providers that returns content here at all.
g2.com | 99.7% | 25 credits per call ($2.75/1K), DataDome target. ScrapingBee and Zyte cannot reach G2. Scrape.do can. Slow latency at 10.2s average and 18.3s p90, because DataDome's challenge dance is in-band and not retried. ScrapingDog and ScrapingAnt both GENUINE-FAIL here.
idealista.com | 100.0% | 10 credits/call ($1.10/1K). Sub-second latency (992ms average). The geo-routing to a Spain-clean residential IP is fast. ScrapingAnt and Zyte struggle here. Scrape.do handles it transparently.
x.com | 90.8% | The one weak spot. $0.55/1K (5cr, datacenter+render auto-applied), 10.2s average, 15.7s p90. ScraperAPI ToS-denies x.com entirely. Scrape.do reaches it, but with the lowest SR of the 20 domains.
bestbuy.com | 99.9% | 10 credits/call ($1.10/1K). The Bright Data 'premium domain' that costs $2.50 elsewhere. ZenRows GENUINE-FAILs on bestbuy PDPs with a geo-fence interstitial. Scrape.do delivers.
instagram.com | 99.9% | 1 credit/call ($0.11/1K), 1.3s average latency. The cheapest Instagram in our cohort. ScraperAPI ToS-blocks, Firecrawl ToS-blocks, ZenRows ToS-blocks. Scrape.do delivers at the headline rate.

Amazon 550-trial stress test

The 2026-06-05 Amazon stress test (docs/amazon-stress/REPORT.md) ran 500 trials against the canonical product page /dp/B07FZ8S74R, 25 trials against search /s?k=laptop, and 25 trials against the bestseller page /Best-Sellers/zgbs/electronics. That's 550 trials in one provider session. Scrape.do returned 550/550 = 100.0 percent across all three endpoints, with zero retries fired, zero transport errors, zero HTTP failures, and zero verifier mismatches. Latency stayed tight at p50 3.31s, p90 4.78s, p99 9.17s on product, and effectively identical on search (3.5s) and bestseller (3.4s).

Quoting the report directly: 'Scrape.do held perfectly across all 3 endpoints. The only provider with zero failures of any kind across 550 trials. No retries needed. Headline numbers identical to the audit; no Amazon-defense impact whatsoever.'

Competitive context matters here. In the same stress test, Zyte caught Amazon's new Akamai bm-ver interstitial on 6/25 search trials (76 percent SR). ScrapingDog hit it 4 times and exhausted its concurrency ceiling for another 50 trials on product (90 percent SR product, 76 percent search). Decodo's bestseller endpoint returned 0/15 with status_code=613. ScrapingBee had 4 transport-level failures. Scrape.do alone shrugged off Amazon's June 2026 anti-bot updates with no measurable impact.

Total spend for those 550 Amazon trials: $0.06 (550 x $0.00011).
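
The stress-test arithmetic is easy to re-check. The sketch below uses only the trial counts and the $0.00011 per-credit rate quoted above. The helper name is hypothetical, and the 19-of-25 figure is simply 25 search trials minus the 6 interstitials reported for Zyte.

```python
def success_rate(successes, trials):
    """Success rate as a percentage, rounded to one decimal place."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return round(100 * successes / trials, 1)


# Trial counts reported for the Scrape.do session.
trials = {"product": 500, "search": 25, "bestseller": 25}
total = sum(trials.values())

# All 550 trials succeeded, each billed at 1 credit.
print(total, success_rate(total, total))
print("spend USD:", round(total * 0.00011, 2))

# 6 interstitials in 25 search trials leaves 19 successes.
print("search SR with 6 failures:", success_rate(25 - 6, 25))
```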

Pricing deep dive

Scrape.do's pricing card reads as one of the simplest in the category. A free-forever 1,000-credit tier, then four paid plans from $29/mo (Hobby) to $699/mo (Advanced), plus Custom/Enterprise. Per the live pricing page (scrape.do/pricing/, fetched 2026-06-08): Free $0, Hobby $29/mo for 250K credits and 10 concurrency, Pro $99/mo for 1.25M credits and 50 concurrency, Business $249/mo for 3.5M credits and 100 concurrency, Advanced $699/mo for 10M credits and 200 concurrency, Custom for unlimited. The headline blended rate runs from $0.11/1K at Hobby to $0.06/1K at Advanced.

The catch is the credit multiplier table (1cr basic, 5cr render, 10cr super, 25cr super+render) plus an opaque per-domain auto-upgrade matrix that bills 9 of our 20 benchmark domains at 10 to 30 credits each regardless of which params you requested. Our header-probe audit found the real blended rate across the 20 benchmark domains is $0.87/1K, about 8x the headline. The scrape.do-request-cost response header is the only source of truth. Any cost estimate based on the multiplier card alone will be wrong by an order of magnitude on protected domains.

Plans

Plan | Price | Volume | Concurrency | What unlocks
Free | $0 | 1,000 successful credits / mo | 5 | All features included (datacenter + residential + JS render). No card required. Free-forever (not a one-time trial).
Hobby | $29/mo | 250,000 credits ($0.11 per 1K) | 10 | Email support. Geo-targeting for datacenter restricted to Pro+ per docs (Hobby gets US-default routing).
Pro | $99/mo | 1,250,000 credits ($0.08 per 1K) | 50 | Priority email support. Datacenter geo-targeting unlocks at this tier.
Business | $249/mo | 3,500,000 credits ($0.07 per 1K) | 100 | Marked 'Most Popular' on the pricing page. Premium residential & mobile proxies. Account manager + dedicated support. Per /documentation/proxies/super, super=true 'requires a minimum of Business Plan and above', but our Hobby-tier audit calls succeeded on auto-upgrade domains anyway, an apparent runtime/docs divergence.
Advanced | $699/mo | 10,000,000 credits ($0.06 per 1K) | 200 | Custom WAF bypass strategies. Custom SLA. Dedicated Slack channel.
Custom / Enterprise | Contact sales | Unlimited credits, unlimited concurrency | Custom | Custom firewall bypass tuned to target dynamics.

Cost multipliers

Base multiplier table per /documentation/api-response/request-costs:

  • Normal request (datacenter) = 1 credit
  • Datacenter + headless browser (render=true) = 5 credits
  • Residential & mobile (super=true) = 10 credits
  • Residential & mobile + headless browser (super=true&render=true) = 25 credits

On top of that, a per-domain special-pricing table auto-applies regardless of params: Amazon.* = 1cr (uses plugin), Google.* = 10cr (residential by default), LinkedIn = 30cr, Shopee = 100cr, realestate.com.au = 25cr, chewy.com = 50cr, aircanada.com = 75cr, carsales.com.au = 25-200cr, g2.com = 25cr, leboncoin.fr = 10cr, capterra.com = 10cr, akakce.com = 10cr, naver.com = 25cr, therealreal.com = 25cr, mscdirect.com = 25cr, cineworld.co.uk = 20cr, mouser.com = 10cr, hermes.com = 10cr, fastpeoplesearch.com = 25cr, jd.com = 75cr, idealista.* = 10cr, Shein.* = 25cr, sainsburys.co.uk = 200cr, leroymerlin.fr = 25cr.

The audit-measured table (AUDIT-FINDINGS.md section 5) added indeed, reddit, tripadvisor, and trustpilot to the 10cr auto-upgrade list. None of those four appear in the docs special-pricing table, which means the published list is incomplete relative to live runtime behavior (capterra is correctly published at 10cr; the four newly discovered ones are not).
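
As a rough illustration, the two-layer pricing described above (base multipliers, then per-domain overrides) can be modeled as a lookup. The function and table names are hypothetical, only a handful of the exact-hostname entries from the published list are included, and wildcard entries such as Amazon.* are left out. Because runtime billing diverged from the published list in our audit, treat the result as an estimate to compare against the scrape.do-request-cost header, not as the billed amount.

```python
# Base multipliers from the documented request-costs table,
# keyed by (super proxy enabled, render enabled).
BASE_CREDITS = {
    (False, False): 1,   # datacenter
    (False, True): 5,    # datacenter + headless browser
    (True, False): 10,   # residential & mobile
    (True, True): 25,    # residential & mobile + headless browser
}

# A subset of the per-domain special pricing quoted in this review.
DOMAIN_CREDITS = {
    "linkedin.com": 30,
    "g2.com": 25,
    "capterra.com": 10,
    "chewy.com": 50,
    "sainsburys.co.uk": 200,
}


def expected_credits(hostname, super_proxy=False, render=False):
    """Estimate credits per call: a domain override wins over the base table."""
    host = hostname.lower()
    if host.startswith("www."):
        host = host[4:]
    if host in DOMAIN_CREDITS:
        # Special pricing applies regardless of the params requested.
        return DOMAIN_CREDITS[host]
    return BASE_CREDITS[(super_proxy, render)]


print(expected_credits("www.linkedin.com", render=True))
print(expected_credits("example.com", super_proxy=True, render=True))
```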

Hidden costs (not on the pricing card)

Effective cost per workload

Amazon product monitoring, 100K req/mo on /plugin/amazon/pdp
Real cost: $11/mo (100,000 x 1cr x $0.00011 = $11). Fits inside Hobby at $29/mo with 150K credits to spare. Plugin concurrency=1 per token means sustained throughput tops out at about 1 req/sec per token. For higher throughput, use parallel tokens or the async queue.
Why: Amazon is one of three domains that actually costs 1 credit per the docs and our request-cost header audit. The 550-trial stress at $0.06 total cost (= $0.11/1K) confirms the headline rate is real on this workload.
Google SERP scraping, 100K queries/mo via /plugin/google/search
Real cost: $110/mo at the Hobby per-credit rate (100,000 x 10cr x $0.00011 = $110). The 1M credits don't fit Hobby's 250K allowance. They do fit inside Pro at $99/mo (1.25M credits), or Business at $249/mo with 2.5M leftover credits.
Why: Google.* auto-upgrades to 10 credits per the docs (residential by default for success rate). Headline math ($0.11/1K) would suggest $11/mo, but the real cost is 10x because google.com is on the auto-upgrade list.
LinkedIn profile enrichment, 50K profiles/mo
Real cost: $165/mo (50,000 x 30cr x $0.00011 = $165). Requires Pro at $99/mo + overage or Business at $249/mo.
Why: LinkedIn is the highest auto-upgrade rate at 30 credits per call per the special-pricing table. The trade-off is real. Scrape.do is one of the few providers in our cohort that delivers LinkedIn at 99.8 percent SR. Zyte 451s, ScrapingAnt 423s. At $3.30/1K LinkedIn is roughly competitive with Bright Data's $1.50 to $2.50/1K but with substantially higher SR.
Mixed-domain monitoring, 200K req/mo across an average of our 20-domain set (Amazon, eBay, BestBuy, GitHub, etc.)
Real cost: $174/mo at the audit-measured blended rate of $0.87/1K (200,000 x $0.00087 = $174). Needs Pro at $99/mo plus overage, or fits inside Business at $249/mo.
Why: Headline $0.11/1K at Hobby would predict $22/mo for this mix. The audit-measured 8x blended difference comes from 9 of the 20 domains being on the auto-upgrade list. Use the per-domain expected_cost_per_1k_usd lookup in config/default-tiers.json to budget honestly.
Async overnight catalog refresh, 1M product pages/mo on /plugin/amazon/pdp
Real cost: $110/mo (1M x 1cr x $0.00011 = $110). Async pool = 30 percent of plan concurrency, so at Business ($249/mo) you'd get 100 main + 30 async concurrency = 130 effective. Comfortable fit.
Why: Async API runs on a separate background pool that does not consume main concurrency, so overnight batches don't compete with daytime traffic. The 30 percent additive concurrency is one of the better async-batch architectures in the cohort.
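
The workload estimates above all follow the same formula: requests x credits per call x price per credit. A minimal sketch follows, assuming the Hobby rate of $0.00011 per credit used throughout this section and the plan allowances from the Plans table. The function names are hypothetical, and the plan check compares credit totals only, ignoring overage pricing and concurrency limits.

```python
USD_PER_CREDIT = 0.00011  # Hobby-plan rate used in the workload estimates

# Monthly credit allowances from the Plans table, smallest first.
PLAN_CREDITS = [
    ("Hobby", 250_000),
    ("Pro", 1_250_000),
    ("Business", 3_500_000),
    ("Advanced", 10_000_000),
]


def monthly_credits(requests, credits_per_call):
    """Total credits a workload consumes in a month."""
    return requests * credits_per_call


def monthly_cost_usd(requests, credits_per_call):
    """Dollar cost of a workload at the assumed per-credit rate."""
    return round(monthly_credits(requests, credits_per_call) * USD_PER_CREDIT, 2)


def smallest_plan(credits):
    """Smallest plan whose allowance covers the credits, else 'Custom'."""
    for name, allowance in PLAN_CREDITS:
        if credits <= allowance:
            return name
    return "Custom"


# Amazon monitoring: 100K requests at 1 credit each.
print(monthly_cost_usd(100_000, 1), smallest_plan(monthly_credits(100_000, 1)))
# LinkedIn enrichment: 50K profiles at 30 credits each.
print(monthly_cost_usd(50_000, 30), smallest_plan(monthly_credits(50_000, 30)))
```

The Amazon workload lands in Hobby, while the LinkedIn workload needs 1.5M credits, which is more than Pro's 1.25M allowance.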

Features deep dive

Core features

Single-endpoint REST API (api.scrape.do)

One GET endpoint at https://api.scrape.do/ accepting 40+ parameters (token, url, super, render, geoCode, sessionId, customHeaders, playWithBrowser, waitSelector, and others). No SDK required. Works with any HTTP client. Pay-per-success billing: 2xx, 400, 404, and 410 are billable. Everything else is free.

Our take: We wired it up in 5 minutes from the audit harness. The pay-per-success rule held in practice. Failed trials never appeared on the bill in our 25,000+ call audit run. Cleanest API surface in the cohort after raw proxy products.
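
For orientation, a request URL and the billing rule can be sketched with the standard library alone. The endpoint, the token and url parameters, and the billable status codes come from the description above. The helper names and the example target are hypothetical, and the sketch only builds the URL; it does not send a request.

```python
from urllib.parse import urlencode

API_BASE = "https://api.scrape.do/"

# Non-2xx status codes that the review lists as billable.
BILLABLE_CLIENT_ERRORS = {400, 404, 410}


def build_request_url(token, target_url, **params):
    """Build a GET URL for the single endpoint, URL-encoding the target."""
    query = {"token": token, "url": target_url}
    query.update(params)  # e.g. render="true", geoCode="us"
    return API_BASE + "?" + urlencode(query)


def is_billable(status_code):
    """Pay-per-success rule: 2xx plus 400, 404, and 410 consume credits."""
    return 200 <= status_code < 300 or status_code in BILLABLE_CLIENT_ERRORS


print(build_request_url("YOUR_TOKEN", "https://example.com/a?b=1", render="true"))
print(is_billable(200), is_billable(404), is_billable(502))
```

Passing the resulting URL to any HTTP client is enough, which is the point of the single-endpoint design.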

Super proxy (super=true) with residential + mobile rotation

95M+ residential and mobile IPs rotated automatically when super=true is passed. Geo-targetable by country (geoCode=us) or continent (regionalGeoCode=eu). Docs claim it requires Business plan ($249) and above per /documentation/proxies/super, but our Hobby-tier audit calls succeeded on residential auto-upgrade targets. That is an apparent doc-vs-runtime divergence worth flagging.

Our take: Per the request-cost header audit, super gets auto-applied for you on protected domains. Google, idealista, indeed, reddit, tripadvisor, trustpilot, and capterra all bill 10cr regardless of params. You don't need to set super=true. The system decides. Useful, but cost prediction becomes non-deterministic (see pricing deep dive).

Headless browser rendering (render=true)

Chromium-based headless browser activation. Parameters: waitUntil (load, domcontentloaded, networkidle0), customWait (ms), waitSelector (CSS), device (desktop/mobile/tablet), width/height (default 1920x1080), blockResources (default true blocks CSS/images/fonts for speed), playWithBrowser (chain of Click/Wait/Execute actions).

Our take: 5 credits ($0.55/1K) at the documented basic+render rate. Our audit hit a documented quirk: render=true sometimes returns 502 on Amazon. The workaround is super=true, which handles anti-bot for most tier-2 domains transparently. We logged this in config/default-tiers.json quirks.

Async API with separate concurrency pool

Base URL q.scrape.do with /api/v1/jobs (POST to create, GET to poll). Authentication via X-Token header. Runs on a separate background pool with capacity equal to 30 percent of plan concurrency, additive to your main API limit. Supports the structured plugins (Amazon, Google search, Google Trends) for bulk processing.

Our take: The 30-percent additive pool is useful. Overnight batch jobs don't starve daytime live traffic. We didn't burn-in test the async path, so we cannot confirm sustained-load behavior, but the architecture is the cleanest async story in our cohort other than Apify Actors.
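
The additive-pool arithmetic can be written out per plan. This sketch uses the plan concurrency figures from the Plans table and the 30 percent figure above. The function names are hypothetical, and rounding down to a whole number is an assumption, since the docs as quoted here don't say how fractional slots are handled.

```python
# Main-API concurrency per plan, from the Plans table.
PLAN_CONCURRENCY = {"Hobby": 10, "Pro": 50, "Business": 100, "Advanced": 200}

ASYNC_POOL_PERCENT = 30  # async pool size as a share of plan concurrency


def async_concurrency(plan):
    """Async pool size; integer division rounds down (an assumption)."""
    return PLAN_CONCURRENCY[plan] * ASYNC_POOL_PERCENT // 100


def effective_concurrency(plan):
    """Main concurrency plus the additive async pool."""
    return PLAN_CONCURRENCY[plan] + async_concurrency(plan)


for plan in PLAN_CONCURRENCY:
    print(plan, async_concurrency(plan), effective_concurrency(plan))
```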

Structured-data plugins

/plugin/amazon/pdp, /plugin/amazon/offer-listing, /plugin/amazon/search, /plugin/amazon/ (raw), plus a seven-sub-API Google suite: /plugin/google/search (with ai-mode variant), /maps, /shopping, /flights, /hotels, /news, /trends. Each returns parsed JSON instead of HTML. ZIP-code and language parameters across 21 Amazon marketplaces. Concurrency limit on Amazon plugin endpoints is 1 per token.

Our take: The Amazon plugin is the standout feature: 1 credit per call ($0.11/1K) and 100 percent SR in the 550-trial stress test. The concurrency-1 limit on the Amazon plugin per token is real. For sustained throughput you would need to either provision multiple tokens or fall back to raw /plugin/amazon/ HTML. The Google suite at seven sub-APIs (Search, Maps, Shopping, Flights, Hotels, News, Trends) is closer to SerpApi's coverage than the marketplace breadth criticism would suggest, though it's still narrower than Apify's 31k Actors for niche regional sites.

Proxy mode (proxy.scrape.do:8080)

Alternative to API mode. Configure http://YOUR_TOKEN:<parameters>@proxy.scrape.do:8080 as a standard HTTP/HTTPS proxy in Scrapy, Selenium, Puppeteer, Playwright, or any client. This removes the need to URL-encode the target URL. The same parameter set is passed via the password field of the proxy URL.

Our take: The cleanest integration path for existing Scrapy/Playwright stacks. Drop it in as a proxy and you are done. Documented at /documentation/proxy-mode and /documentation/libraries. Lower friction than ScrapingBee or Zyte's equivalent.
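
A proxy-mode configuration might look like the sketch below. The host and port come from the section heading, and the token-as-username, parameters-as-password layout follows the description above. The exact password syntax (here a key=value string joined with &) is an assumption, and the helper names are hypothetical; check /documentation/proxy-mode before relying on it.

```python
from urllib.parse import urlencode

PROXY_HOST = "proxy.scrape.do"
PROXY_PORT = 8080


def build_proxy_url(token, **params):
    """Build the proxy URL: token as username, parameters as password.

    Encoding the parameters as key=value pairs joined with '&' is an
    assumption of this sketch, not a confirmed format.
    """
    password = urlencode(params)
    return f"http://{token}:{password}@{PROXY_HOST}:{PROXY_PORT}"


def proxies_for(token, **params):
    """Proxy mapping in the shape most Python HTTP clients accept."""
    url = build_proxy_url(token, **params)
    return {"http": url, "https": url}


print(build_proxy_url("YOUR_TOKEN", super="true", geoCode="us"))
```

The resulting mapping can be handed to urllib.request.ProxyHandler or to a client's proxies setting without touching the target URLs.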

Proxy pool

Scrape.do publishes 110M+ proxies in 150 countries on the homepage, broken into a 90,000+ datacenter rotating pool (cheap, blockable by advanced anti-bot) and a 95,000,000+ residential and mobile rotating pool reached via super=true. ASN-aware routing per /documentation/proxies/super. Geo-targeting at country level (geoCode=us) and continent level (regionalGeoCode=eu) is supported. The docs note that datacenter geo-targeting requires Pro plan or above (Hobby gets US-only datacenter rotation).

Postal-code / ZIP-level rotation is published at /documentation/proxies/postal-code as a general feature requiring super=true and geoCode (about 21 supported countries including US, UK, DE, FR, IT, ES, CA, AU, JP, IN, BR, MX, TR, NL, SG, AE, SA, PL, SE, BE, EG). Both postalcode and zipcode parameter names work interchangeably against any target URL, not just the Amazon plugin.

Sticky-session support via sessionId=<integer> for IP persistence across calls. SessionId must be a numeric value the client controls. Rotation is automatic per-request unless sessionId is set. Default routing on residential super=true with no geoCode is US.

Pool composition is opaque. The docs don't break down country-by-country residential IP counts, and unlike Bright Data (which publishes 400M+ residential, 1.3M datacenter, 1.3M ISP, 7M mobile per its docs) Scrape.do markets a single 110M+ aggregate figure. Our audit showed reliable geo-fenced content delivery (idealista.com from Spain, bestbuy.com from US), implying the geo routing works in practice, but we have no independent verification of the 95M residential headline figure. There is no published mobile-specific opt-in (mobile and residential share the same super=true flag) and no published city-level or ASN-level rotation outside the documented country and postal/ZIP granularity.
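
The session and postal-code rules above translate into two small parameter builders. This is a sketch under the constraints stated in this section (sessionId is an integer, postal-code targeting requires super=true and geoCode). The function names are hypothetical, and the sketch does not check the postal code or country against the supported list.

```python
def sticky_session_params(session_id, geo_code=None):
    """Parameters that pin requests to one IP via a client-chosen integer."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(session_id, bool) or not isinstance(session_id, int):
        raise TypeError("sessionId must be an integer")
    params = {"sessionId": str(session_id)}
    if geo_code:
        params["geoCode"] = geo_code.lower()
    return params


def postal_code_params(geo_code, postal_code):
    """Parameters for postal-code targeting, which needs super and geoCode."""
    if not geo_code or not postal_code:
        raise ValueError("geoCode and postal code are both required")
    return {
        "super": "true",
        "geoCode": geo_code.lower(),
        "postalcode": str(postal_code),  # 'zipcode' is described as equivalent
    }


print(sticky_session_params(1234, geo_code="US"))
print(postal_code_params("us", 10001))
```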

Structured endpoints

SDKs and integrations

AI capabilities

Scrape.do does not pitch itself as an AI-first scraping product. There is no first-party LLM extraction endpoint, no published MCP server, no schema-by-prompt feature, and no in-platform AI agent. Two things it ships are useful for LLM pipelines. First, the output=markdown response transformation strips HTML to LLM-friendly markdown for token efficiency. Second, structured-JSON plugins (Amazon /plugin/amazon/pdp plus a seven-sub-API Google suite covering Search, Maps, Shopping, Flights, Hotels, News, and Trends) return parsed JSON instead of HTML, so you don't need an LLM (or your own parser) for the most common scrape targets. Past those, AI features are just marketing positioning. The docs talk about LLM use cases for the markdown output, but entity extraction, schema mapping, and semantic search are left to your own model.

Feature inventory

output=markdown response · GA

Setting output=markdown converts text/html responses to markdown before returning, which cuts token count for downstream LLM ingestion. Same credit cost as raw HTML. Documented at /documentation/api-response/response-output#output-format.

Pricing: No additional cost. Same credits as raw response (1 credit datacenter, 5 with render, 10 super, 25 super+render).

Amazon Scraper API plugin (/plugin/amazon/pdp, /search, /offer-listing) · GA

Pre-parsed JSON responses for Amazon product pages, search, and offer listings across 21 marketplaces. ZIP-code and language parameters supported. Documented at /documentation/amazon-scraper-api/. Removes the need for an LLM extractor on the most common e-commerce target.

Pricing: 1 credit per successful call (~$0.00011 at Hobby), but limited to concurrency 1 per token on the plugin endpoints.

Google Scraper API plugin (Search, Maps, Shopping, Flights, Hotels, News, Trends + ai-mode) · GA

Structured JSON across seven Google sub-APIs (Search, Maps, Shopping, Flights, Hotels, News, Trends) plus an AI Overviews / ai-mode endpoint under /plugin/google/search/ai-mode that returns the LLM-generated overview Google itself ships. Documented under /documentation/google-scraper-api/.

Pricing: Auto-billed at 10 credits per call (~$1.10/1K at Hobby) because google.com auto-upgrades to residential per /documentation/api-response/request-costs.

Zapier and n8n integrations with AI-extraction recipes · GA

Official integrations documented at /documentation/integrations/zapier and /documentation/integrations/n8n. Both docs recommend output=markdown so downstream AI nodes (Zapier AI Formatter, n8n OpenAI/Claude nodes) can extract structured fields from the response cheaply.

Pricing: Free integration. Scrape.do API costs only. You supply the LLM token.

MCP server · Experimental / undocumented

No first-party MCP server is documented at /documentation/ or referenced on the blog. Competitors (ScraperAPI, ScrapingBee, Bright Data, Apify) all ship MCP servers. Scrape.do is conspicuously absent here as of June 2026.

Pricing: N/A. Marketed but undocumented.

AI extraction / schema-by-prompt · Experimental / undocumented

No /extract endpoint exists. Schema-by-prompt is not offered. Marketed but undocumented. The AI/LLM landing page references AI use cases, but the actual feature is just raw HTML plus markdown formatting.

Pricing: N/A. Feature does not exist as a first-party product.

Our assessment

Scrape.do's AI story is the opposite of Firecrawl's. Firecrawl wraps its entire product around LLM ingestion (markdown-first, /extract endpoint, MCP server). Scrape.do is a proxy-and-anti-bot company that happens to support markdown output. For developers building agentic stacks, this is a real gap. You will still need a separate model call to extract fields from the HTML or markdown Scrape.do returns, whereas a Firecrawl /extract or a ScraperAPI structured endpoint can hand you a JSON object directly. On the other hand, Scrape.do isn't billing you per AI token consumed inside their stack, and the structured plugins it does ship (Amazon plus the seven Google sub-APIs) are dramatically cheaper and more reliable than asking an LLM to extract the same fields. For 2026: Scrape.do is the cheap and reliable HTML/markdown layer underneath your AI pipeline, not the AI pipeline itself. If your workflow is Amazon product monitoring or SERP scraping, the structured plugins are excellent and the missing MCP/extract endpoint doesn't matter. If your workflow is 'point a model at arbitrary URLs and get JSON back,' look at Firecrawl, ScrapegraphAI, or Apify Actors instead.

Where it excels

Where it falls short