Decodo
95.9 percent across 20 domains and a perfect score on Amazon product pages, but the default proxy_pool silently doubles your bill.
Published 2026-06-05 · 2674-word independent review · ScrapingTest Research
Verdict
Grade: B+. Strong audit numbers and a stress-test win on Amazon product, but the premium-default billing trap and the bestseller block keep it out of the A tier.
Best for
- Amazon product-page monitoring at scale where /dp/ and /s?k= are the only endpoints (100% SR across 525 stress trials)
- Mixed e-commerce plus search workloads where the buyer can pick proxy_pool per domain instead of accepting the premium default
- Teams that need both Web Scraping API and the underlying 125M+ residential pool from one vendor and one invoice
- LLM/RAG pipelines that benefit from the built-in markdown output and the open-source MCP server
- Reddit, Google, Bing, and LinkedIn workloads, which all clear at the $0.50/1K standard tier in our A/B test
Avoid if
- You scrape Amazon bestseller pages (/Best-Sellers/zgbs/*). 0 of 15 trials succeeded; every one timed out at 80 to 140s with internal status_code=613
- You need a single flat per-request price and dislike per-domain tier hunting. Blended cost is $0.86/1K, not the advertised $0.50
- You need G2, Zillow, BestBuy or other DataDome/Akamai-heavy sites at high concurrency (60 to 79% SR even at premium plus JS)
- You want a pre-built parser library wider than the 10 supported target families (Bright Data, Apify, and ScrapingDog all cover more domains)
What we found in the lab
Decodo finished our five-pass audit at 95.9 percent across 20 domains, placing seventh overall and landing in the 93 to 96 percent tier alongside Oxylabs. All 20 domains were operational (no GENUINE FAIL). The average was dragged down by 60 percent on g2.com (DataDome wall under concurrent load), 79 percent on bestbuy.com, and 82 percent on youtube.com. Fifteen of 20 domains hit a clean 100 percent SR, including the four sites our previous Amazon stress test confirmed at near-perfect (amazon, bing, github, reddit).
Latency is competitive on most targets, with a sub-5s average on 11 of 20 domains. Decodo also owns the slowest average in our entire fleet, 38.9s on zillow.com under T2 headless rendering, and posts 24 to 25s averages on bestbuy, booking, and trustpilot.
Cost is where Decodo gets interesting. The published $0.50/1K standard tier only carries 6 of our 20 benchmark domains (amazon, bing, github, google, linkedin, reddit). The remaining 14 require Standard plus JS ($0.75), Premium ($1.00), or Premium plus JS ($1.50). A 400-call A/B test (4 configs x 20 domains x 5 trials) established the actual blended cost on a mixed workload at $0.86/1K. That sits within a few percent of Oxylabs Web Unblocker ($0.85) and is roughly 1.7x the headline rate.
The harness had to learn two operational quirks. First, Decodo returns a wrapped JSON envelope, so the outer 200 must be checked alongside the inner status_code (613, 503, etc.) before treating a response as a success. Second, POSTing to /v2/scrape without a proxy_pool parameter defaults to premium, not standard. The default-to-premium behavior is documented (the parameters page lists proxy_pool=premium as the default) but absent from the pricing card. A developer reading only the pricing page will believe they are paying $0.50/1K while the dashboard bills $1.00/1K.
Pair that with 100 percent SR on amazon, walmart, idealista, tripadvisor, indeed, reddit, github, linkedin, instagram, and capterra and Decodo is a useful API, provided the buyer manages tier selection per-domain.
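
The envelope check described above can be sketched as a small helper. This is a minimal illustration, not Decodo's documented schema: the function names are ours, and the layout of the body (an inner status_code either at the top level or on the first item of a results list, with 200 meaning success) is an assumption to verify against real responses.

```python
from typing import Optional


def inner_status(body: dict) -> Optional[int]:
    """Return the inner status code from a parsed response body, if present.

    Assumption: the code sits at the top level or on the first item of a
    "results" list. Adjust to the layout your responses actually use.
    """
    if "status_code" in body:
        return body["status_code"]
    results = body.get("results") or []
    if results and isinstance(results[0], dict):
        return results[0].get("status_code")
    return None


def is_real_success(http_status: int, body: dict) -> bool:
    """True only when the outer HTTP status and the inner code are both 200."""
    return http_status == 200 and inner_status(body) == 200
```

A client that retries or routes on `is_real_success` instead of the raw HTTP status will not count a 613 envelope as a scraped page.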
Per-domain breakdown
| Domain | SR @ price per 1K | Notes |
|---|---|---|
| amazon.com | 100% @ $0.50/1K | T1 standard, 4.2s avg, 6.4s p90. Audit perfect. Amazon stress hit 500/500 on /dp/ at 3.1s p50 and 5.3s p90. |
| g2.com | 60% @ $1.50/1K | T4 premium plus JS. The DataDome shield collapses under concurrent load (60% yield), while sequential single requests come back clean, as the quirks note in the run log also records. |
| youtube.com | 82% @ $1.50/1K | T4 premium plus JS with the built-in YouTube parser. The metadata template works but channel pages have a measurable miss rate. |
| bestbuy.com | 79% @ $1.50/1K | Premium plus JS. 24.4s avg, 36.6s p90. Slow and inconsistent. Bot-wall interstitials make up the 21pp miss. |
| zillow.com | 100% @ $1.00/1K | Premium. SR is perfect but the 38.9s avg and 58.4s p90 make this Decodo's slowest endpoint in the entire benchmark. The full-render Zillow tier carries a serious latency tax. |
| google.com | 99% @ $0.50/1K | Standard pool, dedicated Google template. 2.9s avg. One of the best Google scrapers in the audit on cost-per-success. |
| linkedin.com | 100% @ $0.50/1K | Standard pool. 3.0s avg. Surprisingly clean at the cheapest tier when most competitors need premium or get blocked entirely (Zyte returns 451 on the domain). |
| walmart.com | 100% @ $1.00/1K | Premium pool plus dedicated Walmart parser. 6.9s avg. Structured-data flag set, so the parser returns JSON instead of raw HTML. |
| trustpilot.com | 100% @ $0.75/1K | Standard plus JS. SR is perfect but 24.5s avg and 36.7s p90 indicate Trustpilot's anti-bot pages force a longer render path. |
| x.com | 100% @ $0.75/1K | Standard plus JS. 18.3s avg and 27.4s p90. The JS render needed to clear X's anti-bot is heavy but reliable. |
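
Because failed scrapes are billed, the price per 1K requests understates the price per 1K usable pages on low-SR domains. The sketch below divides the billed rate by the success rate for a few rows of the table. It assumes every failed request is billed at the full tier rate and that no retries are involved; the function name is ours.

```python
# (success rate, billed price per 1K requests) taken from rows of the table.
AUDIT_ROWS = {
    "amazon.com": (1.00, 0.50),
    "google.com": (0.99, 0.50),
    "youtube.com": (0.82, 1.50),
    "bestbuy.com": (0.79, 1.50),
    "g2.com": (0.60, 1.50),
}


def cost_per_1k_successes(success_rate: float, price_per_1k: float) -> float:
    """Price of 1,000 successful pages when failed requests are billed too."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return price_per_1k / success_rate


if __name__ == "__main__":
    for domain, (sr, price) in AUDIT_ROWS.items():
        rate = cost_per_1k_successes(sr, price)
        print(f"{domain}: ${rate:.2f} per 1K successes")
```

Under that assumption g2.com works out to about $2.50 per 1K successful pages, against a billed rate of $1.50/1K.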
Amazon 550-trial stress test
Decodo placed second behind Scrape.do in our 2026-06-05 Amazon stress test, and the result is sharply split. On the canonical product page /dp/B07FZ8S74R, Decodo went a perfect 500/500 at 3.1s p50, 5.3s p90, and 9.8s p99, with only 2 retries fired across the entire 500-trial run. The search endpoint /s?k=laptop was again perfect, 25/25 at 3.0s. That matters because both Zyte and ScrapingDog were knocked to 76 percent on the same URLs by an Akamai bm-ver interstitial that Decodo's IP pool evaded entirely.
The hard failure was /Best-Sellers/zgbs/electronics. Every single one of the first 15 trials ran for 80 to 140 seconds and then returned Decodo's wrapped JSON envelope with status: failed, status_code: 613, and the message "We were not able to scrape the target". We stopped the run early to avoid burning credits; the smoke test before the main run had shown the same 0 percent pattern. This is endpoint-specific, not load-related, and not fixable by switching JS render on or off in our testing.
The recommended config flag is endpoint_conditional: true on Decodo amazon, with the note that /dp/ and /s?k= work at T1_bare and /Best-Sellers/* does not pass at any tier we tested. Total spend for the 515 Decodo trials in the stress run was $0.26 at the recorded $0.0005/req standard rate. One credit-rate caveat applies: if the API defaulted to the premium pool on the bestseller calls, the dashboard bill for those calls was double the recorded rate.
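
One way to act on the endpoint-conditional finding is to classify Amazon URLs before sending them. The sketch below is our own illustration of that routing, not the harness's implementation; the function names and the path heuristics are assumptions based only on the three URL shapes tested above.

```python
from urllib.parse import parse_qs, urlparse


def amazon_endpoint_kind(url: str) -> str:
    """Classify an Amazon URL as product, search, bestsellers, or other."""
    parsed = urlparse(url)
    path = parsed.path
    # Bestseller pages such as /Best-Sellers/zgbs/electronics failed 0/15.
    if "/zgbs/" in path or path.lower().startswith("/best-sellers"):
        return "bestsellers"
    # Product pages such as /dp/B07FZ8S74R.
    if "/dp/" in path:
        return "product"
    # Search pages such as /s?k=laptop.
    if path.rstrip("/") == "/s" and "k" in parse_qs(parsed.query):
        return "search"
    return "other"


def should_send_to_decodo(url: str) -> bool:
    """Only route the endpoint kinds that passed in the stress test."""
    return amazon_endpoint_kind(url) in {"product", "search"}
```

Anything classified as bestsellers or other can then be sent to a different provider or skipped, instead of burning 80 to 140 seconds per billed failure.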
Pricing deep dive
Decodo's Web Scraping API uses a hybrid commitment plus per-1K model with three paid self-serve plans (Starter, Professional, Business), a free monthly plan, and a custom Enterprise tier. Headline rates run from $0.50/1K (Free and Starter standard pool) down to $0.14/1K (Business standard pool, the largest committed plan at $99/mo). All paid plans carry a 14-day money-back window.
One structural detail does not show up on the pricing card: the API has four billable proxy modes (Standard at $0.50/1K base, Standard plus JS render at $0.75, Premium at $1.00, Premium plus JS render at $1.50), and the API parameter proxy_pool defaults to premium when you POST to /v2/scrape without it.
A fetch of the live pricing page confirmed the plan names Free, Starter $19, Professional $49, Business $99, and Enterprise Custom. This differs from the labels in our pre-shipped raw-docs mirror, which lists Starter/Advanced/Pro. The live pricing page is canonical.
Rate limits scale with tier: 10 req/s on Free and Starter, 25 req/s on Professional, 50 req/s on Business, custom on Enterprise. The Business plan's $0.14/1K effective rate only applies if you can spend the full 707K-request allocation in the standard pool, and our A/B test showed only 6 of 20 popular domains will run there.
Plans
| Plan | Price | Volume | Concurrency | What unlocks |
|---|---|---|---|---|
| Free | $0 | 2K standard requests monthly (or 1K Standard+JS / 1K Premium / 667 Premium+JS) | 10 req/s | All features, all four proxy tiers, all target templates. Monthly recurring allowance of 2K standard requests, no credit card required, no time limit. |
| Starter | $19/mo | 38K standard (25K Standard+JS / 19K Premium / 12K Premium+JS) | 10 req/s | $0.50/1K headline. Same per-1K rate as Free. 14-day money-back. |
| Professional | $49/mo | 163K standard (75K Standard+JS / 54K Premium / 39K Premium+JS) | 25 req/s | Effective $0.30/1K standard. First plan with a real concurrency bump (2.5x). The most-marketed paid tier. |
| Business | $99/mo | 707K standard (165K Standard+JS / 116K Premium / 82K Premium+JS) | 50 req/s | Effective $0.14/1K standard, but only useful if your domains actually run at standard pool. Real blended cost lands closer to $0.30 to $0.50. |
| Enterprise | Custom | Custom volume | Custom | Negotiated pricing, dedicated success engineer, SLA. Required for sustained >50 req/s or sub-$0.14/1K. |
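
The effective per-1K rates follow directly from the table: monthly price divided by the monthly request allowance for the tier you actually use. The sketch below encodes the three paid plans with the volumes listed above; the tier keys and function name are our own labels.

```python
# plan -> (monthly price in USD, {tier: monthly request volume}),
# copied from the plans table.
PLANS = {
    "Starter": (19, {"standard": 38_000, "standard_js": 25_000,
                     "premium": 19_000, "premium_js": 12_000}),
    "Professional": (49, {"standard": 163_000, "standard_js": 75_000,
                          "premium": 54_000, "premium_js": 39_000}),
    "Business": (99, {"standard": 707_000, "standard_js": 165_000,
                      "premium": 116_000, "premium_js": 82_000}),
}


def effective_rate_per_1k(plan: str, tier: str) -> float:
    """Monthly price divided by the tier's allowance, scaled to 1K requests."""
    price, volumes = PLANS[plan]
    return price / volumes[tier] * 1000


if __name__ == "__main__":
    for plan, (_, volumes) in PLANS.items():
        for tier in volumes:
            rate = effective_rate_per_1k(plan, tier)
            print(f"{plan:12s} {tier:12s} ${rate:.2f}/1K")
```

On these numbers the Business plan is about $0.14/1K in the standard pool but about $0.85/1K in premium and about $1.21/1K in premium plus JS, which is why the headline rate only holds for standard-pool domains.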
Cost multipliers
Hidden costs (not on the pricing card)
- proxy_pool defaults to premium when the parameter is omitted from a /v2/scrape POST. The pricing page lists $0.50/1K prominently but the API parameter docs at help.decodo.com/docs/web-scraping-api-parameters confirm Default value: premium. A developer who never sets the param is silently billed at $1.00/1K (2x).
- Target Templates (Amazon, Google, Walmart, Reddit, YouTube, Bing, TikTok, Target) force the Premium pool. The templates page states: Keep in mind that all Target Templates use Premium Proxy Pool. Any structured-data call costs at minimum $1.00/1K, never $0.50.
- JS rendering is billed at plus 50 percent on standard pool ($0.75 vs $0.50) and plus 50 percent on premium ($1.50 vs $1.00). Headless rendering is required to bypass anti-bot on 8 of our 20 benchmark domains, so it is not optional in practice.
- Concurrency rate-cap on Starter is 10 req/s, the same as the Free plan. To unlock 25 req/s you must jump to Professional ($49). This is a real workload constraint, not a marketing throttle. A single-thread sequential script will not exceed Starter's allowance, but anything multithreaded hits the cap immediately.
- Wrapped JSON envelope with inner status_code. Failed scrapes consume request credits even when the outer HTTP status is 200 and the inner status_code is 613. Make sure your client inspects the response body. Do not key off HTTP 200 alone.
- /Best-Sellers/zgbs/* and similar Amazon section URLs return status_code=613 at every tier we tested. You will be charged for these attempts (Decodo's docs do not exempt 613 from billing the way ScrapingDog exempts 429).
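
For the concurrency cap in the list above, a multithreaded client can pace itself below the plan limit. The class below is a minimal client-side sketch, assuming the cap is enforced as requests started per second; how Decodo actually measures the limit is not stated in the material we reviewed, and the class name is ours.

```python
import threading
import time


class Pacer:
    """Spaces request starts evenly so at most `rate_per_s` begin per second."""

    def __init__(self, rate_per_s: float, clock=time.monotonic):
        self.interval = 1.0 / rate_per_s
        self.clock = clock
        self.next_slot = None
        self.lock = threading.Lock()

    def reserve(self) -> float:
        """Reserve the next send slot; return seconds to wait before sending."""
        with self.lock:
            now = self.clock()
            if self.next_slot is None:
                slot = now
            else:
                slot = max(now, self.next_slot)
            self.next_slot = slot + self.interval
            return slot - now


if __name__ == "__main__":
    pacer = Pacer(rate_per_s=10)  # Free and Starter cap from the plans table
    for _ in range(3):
        time.sleep(pacer.reserve())
        # send one request here
```

Each worker thread calls `reserve()` and sleeps for the returned delay before sending, so the shared lock keeps the combined rate at or below the configured value.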
Effective cost per workload
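
A blended rate for a given workload is the request-weighted average of the four tier prices. The sketch below shows the arithmetic with a hypothetical request mix; the mix is illustrative only and is not the one behind our $0.86/1K measurement, and the tier keys and function name are our own labels.

```python
# Published per-1K prices for the four billable proxy modes.
TIER_PRICE_PER_1K = {
    "standard": 0.50,
    "standard_js": 0.75,
    "premium": 1.00,
    "premium_js": 1.50,
}


def blended_rate_per_1k(request_mix: dict) -> float:
    """Request-weighted average price per 1K for a {tier: request_count} mix."""
    total = sum(request_mix.values())
    if total <= 0:
        raise ValueError("request_mix must contain at least one request")
    cost = sum(TIER_PRICE_PER_1K[tier] * n for tier, n in request_mix.items())
    return cost / total


if __name__ == "__main__":
    # Hypothetical workload: 60% standard, 10% standard+JS, 20% premium,
    # 10% premium+JS.
    mix = {"standard": 600, "standard_js": 100, "premium": 200, "premium_js": 100}
    print(f"${blended_rate_per_1k(mix):.3f}/1K")
```

The same function makes the default trap concrete: a workload sent entirely without proxy_pool is `{"premium": n}` and prices at $1.00/1K, twice the `{"standard": n}` result.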
Features deep dive
Core features
The primary unified endpoint is scraper-api.decodo.com/v2/scrape. POST a JSON body containing url (or query for templates), proxy_pool (standard or premium), headless (html for JS render, png for screenshot, null for raw), geo, locale, domain, device_type, parse (true to invoke a dedicated parser), session_id for sticky IP, http_method, payload for POST forwarding, markdown for LLM-ready output, and xhr for fetch/XHR capture.
Our take: Used this throughout the audit and the 525-trial Amazon stress run. Works as documented. The wrapped-envelope response (HTTP 200 outer plus inner status_code) is the only friction. Every client must read the inner code to detect 613 "we couldn't scrape" failures that look like 200 OKs.
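
A request body for this endpoint can be built so that proxy_pool is never left to the default. The helper below is a sketch using only the parameter names listed above; the helper itself, its defaults, and the exact value types (for example markdown as a JSON boolean) are assumptions to check against the parameters page. Authentication and the HTTP call are deliberately left out.

```python
import json
from typing import Optional

API_URL = "https://scraper-api.decodo.com/v2/scrape"


def build_payload(url: str, proxy_pool: str = "standard",
                  js_render: bool = False, markdown: bool = False,
                  session_id: Optional[str] = None) -> dict:
    """Build a /v2/scrape body that always states proxy_pool explicitly."""
    if proxy_pool not in ("standard", "premium"):
        raise ValueError("proxy_pool must be 'standard' or 'premium'")
    payload = {"url": url, "proxy_pool": proxy_pool}
    if js_render:
        payload["headless"] = "html"  # JS render mode named in the docs
    if markdown:
        payload["markdown"] = True
    if session_id:
        payload["session_id"] = session_id  # sticky IP across requests
    return payload


if __name__ == "__main__":
    body = build_payload("https://www.amazon.com/dp/B07FZ8S74R")
    print(json.dumps(body))
```

Defaulting the helper to standard inverts the API's own default, so a premium request has to be asked for by name.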
POST to /v2/task with up to N URLs and receive a task_id. Results come back via webhook callback or polling GET /v2/task/{id}/results. Documented at help.decodo.com/docs/web-scraping-api-asynchronous-requests.
Our take: Did not stress-test in our audit (the harness uses sync only for fair comparison). Documented mode is standard for the category; useful for greater than 1K-URL batches without holding open connections.
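
For the polling variant, the loop itself is generic. The sketch below takes a caller-supplied fetch function that returns None until results exist, so it makes no claim about the shape of the /v2/task responses; the function name, the interval, and the timeout are our own choices, and we did not run it against the live API.

```python
import time
from typing import Callable, Optional


def poll_until_ready(fetch: Callable[[], Optional[dict]],
                     interval_s: float = 2.0,
                     timeout_s: float = 300.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> dict:
    """Call `fetch` until it returns a result or the timeout elapses.

    `fetch` is a hypothetical wrapper around GET /v2/task/{id}/results that
    returns None while the task is still running.
    """
    deadline = clock() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError("task results not ready before timeout")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without waiting in real time; a webhook callback avoids the loop entirely where that mode is available.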
8 target families: Amazon (Product, Search, Pricing, Sellers, Bestsellers, any URL), Google (Search, Ads, AI Mode, Lens, Travel Hotels, any URL), Bing (Search, any URL), Walmart (Product, Search, any URL), Reddit (Post, Subreddit, User), YouTube (Metadata, Channel, Search, Subtitles), TikTok (Post, Shop Product/Search/URL), and Target (Product, Search, any URL). Plus AI-engine targets: ChatGPT (any prompt) and Perplexity (any prompt). 10 target families total.
Our take: Tested Amazon Product, Google Search, Walmart, Reddit, Bing, and YouTube parsers in the audit. All worked. Structured output is clean and well-documented. Coverage is narrower than Bright Data's 250+ pre-built scrapers or Apify's 31,000+ Actors, but the 6 of 20 audit domains they hit (amazon, bing, google, reddit, walmart, youtube, confirmed by hasStructuredData=true in latest-data.json) are the ones most workloads care about. All templates force Premium pool. See hidden costs.
A single endpoint can return HTML (default), parsed JSON (with parse=true), screenshot PNG (with headless=png), the XHR/fetch waterfall (with xhr=true), or Markdown (with markdown=true). Documented at help.decodo.com/docs/web-scraping-api-requesting-multiple-response-formats.
Our take: Markdown mode is the standout. It produces clean LLM-ready output that cuts token cost on downstream RAG. The XHR capture is useful for SPA scraping but underused in our audit since most benchmark domains render server-side.
When headless=html, you can pass a browser_actions array of sequenced click, scroll, input, and wait_for_selector primitives to interact with the page before extraction. Documented at help.decodo.com/docs/web-scraping-api-browser-actions.
Our take: Not used in our default audit benchmark (which scrapes raw landing pages), but it is the right primitive set for login-walled flows and infinite-scroll pages. Closest competitor is Scrapfly's CDP access.
Pass session_id (any string) to re-use the same IP across multiple requests for up to 10 minutes. Useful for cart-flow scraping or multi-step session-cookie workflows.
Our take: Standard primitive in the category. The 10-minute window is on the shorter end (Bright Data offers longer sticky). Fine for most multi-page flows.
Proxy pool
Decodo's underlying proxy network is the 125M+ IP pool inherited from the Smartproxy rebrand, primarily residential. The marketing claim of a 99.86 percent success rate on residential proxies is the underlying network number, not the Web Scraping API number. The Web Scraping API exposes four tiers:
- Standard pool: 8 country options (US, UK, DE, FR, CA, IT, ES, NL), primarily datacenter plus ISP, $0.50/1K base.
- Standard plus JS: the same pool with headless Chromium rendering on top, $0.75/1K.
- Premium pool: 195+ country/state-level geos including 50 US states, full residential plus mobile rotation, ISO/IEC 27001:2022 certified infrastructure, $1.00/1K base.
- Premium plus JS: premium pool with headless rendering, $1.50/1K, and the only tier that consistently defeats DataDome and PerimeterX-class shields in our test.
Rotation is per-request by default (a fresh IP every call) unless you pass session_id, which pins the IP for up to 10 minutes. The harness defaults are documented in config/default-tiers.json under quirks: T1_bare = 1 credit, T2_headless = 2 credits, 1cr = $0.0005. Note that this is the residential proxy credit rate, not the Web Scraping API rate. For the Web Scraping API, the rates are the four tier prices listed above.
The infrastructure carries a stated throughput ceiling of 200 req/s system-wide (well above any plan's per-key concurrency cap, which tops out at 50 req/s on Business), and the company markets unlimited concurrent sessions on Enterprise. Geo-routing precision is the strongest dimension. 195+ countries with state-level US targeting is roughly tied with Bright Data and ahead of every API-only competitor in our 17-provider set.
Structured endpoints
- /v2/scrape with target=amazon_bestsellers (note: this is the endpoint that fails 0/15 in our stress test; the template exists but blocks)
- /v2/scrape with target=amazon_pricing
- /v2/scrape with target=amazon_product (100% SR in audit plus 500/500 in stress)
- /v2/scrape with target=amazon_search (100% SR in audit plus 25/25 in stress)
- /v2/scrape with target=amazon_sellers
- /v2/scrape with target=amazon (any URL)
- /v2/scrape with target=bing_search
- /v2/scrape with target=bing (any URL)
- /v2/scrape with target=google_search (99% SR in audit)
- /v2/scrape with target=google_ads
- /v2/scrape with target=google_ai_mode
- /v2/scrape with target=google_lens
- /v2/scrape with target=google_travel_hotels
- /v2/scrape with target=google (any URL)
- /v2/scrape with target=walmart_product (100% SR plus structured JSON)
- /v2/scrape with target=walmart_search
- /v2/scrape with target=walmart (any URL)
- /v2/scrape with target=reddit_post (100% SR)
- /v2/scrape with target=reddit_subreddit
- /v2/scrape with target=reddit_user
- /v2/scrape with target=youtube_metadata
- /v2/scrape with target=youtube_channel (82% SR in audit)
- /v2/scrape with target=youtube_search
- /v2/scrape with target=youtube_subtitles
- /v2/scrape with target=tiktok_post
- /v2/scrape with target=tiktok_shop_product
- /v2/scrape with target=tiktok_shop_search
- /v2/scrape with target=tiktok_shop_url
- /v2/scrape with target=target_product
- /v2/scrape with target=target_search
- /v2/scrape with target=target (any URL)
- /v2/scrape with target=chatgpt (any prompt; scrapes ChatGPT responses)
- /v2/scrape with target=perplexity (any prompt; scrapes Perplexity responses)
SDKs and integrations
- Python SDK and code samples (github.com/decodo)
- Node.js SDK and code samples
- PHP code samples
- Open-source MCP server at github.com/Decodo/mcp-web-scraper (documented at help.decodo.com/docs/mcp)
- n8n native integration (help.decodo.com/docs/n8n)
- LangChain integration (help.decodo.com/docs/langchain)
- Chrome extension (help.decodo.com/docs/decodo-chrome-extension)
- Firefox extension (help.decodo.com/docs/decodo-firefox-extension)
- Library compatibility: Puppeteer, Playwright, Selenium, Crawlee, Beautiful Soup, Cheerio, Scrapy
- Public API for endpoint, subscription, and traffic management (api-reference/public-api-key-authentication)
AI capabilities
Decodo ships four distinct AI-flavored capabilities: a standalone AI Parser product (natural-language to JSON extraction, GA and free for all Decodo users), a Markdown response mode optimized for LLM token efficiency, an open-source MCP server that exposes the Web Scraping API to MCP-compatible clients, and dedicated target templates for two AI products (ChatGPT and Perplexity).
The AI Parser is delivered as a no-code dashboard rather than an inline /v2/scrape parameter. You paste a URL, describe the data in plain English, and get structured JSON back, with reusable parsing instructions saved for later.
Decodo's AI story comes down to integration plus output plus a free natural-language extractor. What is missing is a single programmatic POST to /v2/scrape that takes a natural-language schema in-line. If you want prompt-driven extraction inside your scraping API call (the Scrapfly Extract API pattern), you still have to chain the AI Parser dashboard or use Target Templates. The pieces they do ship are real, documented, and useful.
Feature inventory
Standalone product at decodo.com/scraping/ai. Paste any public URL, describe the data needed in plain English (for example, "extract all product titles and prices"), and receive structured JSON. The dashboard saves the parsing instructions for reuse across similar pages. Per the product page: "Turn any HTML into structured data with zero coding knowledge needed." And: "Forget complex rules or XPath queries. Just write a natural-language prompt and our AI will instantly return the structured data you're looking for."
Pricing: Free. Per the AI Parser FAQ, it is completely free for all Decodo users. No per-call markup.
Setting markdown=true on a /v2/scrape POST returns the page as cleaned Markdown instead of HTML. Documented at help.decodo.com/docs/web-scraping-api-markdown-response (also reachable via the scraper-api-markdown-response slug). The marketing line: "Markdown is optimized for LLM consumption."
Pricing: Free. No per-request markup beyond the base scrape cost.
Model Context Protocol server at github.com/Decodo/mcp-web-scraper. Exposes the Web Scraping API as MCP tools so Claude Desktop, Cursor, and other MCP clients can call Decodo's scrape and template endpoints directly. Documented at help.decodo.com/docs/mcp.
Pricing: Free, open source. Underlying API calls billed at standard Web Scraping API rates.
Dedicated target=chatgpt template that scrapes ChatGPT response pages by prompt. Documented at help.decodo.com/docs/web-scraping-api-chatgpt.
Pricing: Forces Premium pool per template policy. $1.00/1K minimum, $1.50/1K with JS render.
Dedicated target=perplexity template that scrapes Perplexity response pages by prompt. Documented at help.decodo.com/docs/web-scraping-api-perplexity.
Pricing: Forces Premium pool per template policy. $1.00/1K minimum.
Dedicated target=google_ai_mode scraping endpoint for Google's AI Overview / AI Mode results. Documented at help.decodo.com/docs/web-scraping-api-google-ai-mode.
Pricing: Premium pool, $1.00/1K minimum.
First-party LangChain integration guide and library glue at help.decodo.com/docs/langchain. Lets a LangChain agent invoke Decodo as a tool.
Pricing: Free integration. Underlying scrapes billed at API rates.
First-party n8n node for the Web Scraping API. Documented at help.decodo.com/docs/n8n.
Pricing: Free integration. Underlying scrapes billed at API rates.
Our assessment
Decodo's AI offering is strong among the API-only, no-marketplace providers we have audited. They ship the natural-language to JSON extractor (AI Parser) as a free GA product, useful primitives (LLM-ready Markdown output, open-source MCP server, native LangChain and n8n nodes), and scraping templates for the actual AI products people want to monitor, like ChatGPT and Perplexity. Compare against Scrapfly, which charges per-credit for LLM extraction; ScrapegraphAI, whose entire product is LLM extraction at the cost of a 0/20 dedicated-parser count; or Bright Data, which markets AI-driven scraping but ships a fairly conventional unblocker underneath.
Decodo's positioning is: we are the scraping layer your AI stack calls, plus we ship a free no-code parser on top. Markdown mode alone is a cost-saver in any RAG pipeline (we measured token reduction in our own internal usage; the public claim of "less tokens when feeding results into LLM models" matches).
The remaining gap is real but narrower. If you want a single POST to /v2/scrape that takes a natural-language schema in-line and returns JSON in one call, Decodo doesn't expose that today. You either use the AI Parser dashboard (and save the parsing template for reuse) or use a Target Template (constrained to the 10 supported families).
Where it excels
- Amazon product page monitoring. 100 percent SR across both audit (30/30) and stress (500/500) with 3.1s p50, the cleanest Amazon performance in our 5-provider stress test alongside Scrape.do
- Mixed search-engine workloads. Google 99 percent, Bing 100 percent, Reddit 100 percent, LinkedIn 100 percent, all at the cheapest $0.50/1K standard tier
- Geo-targeted scrapes requiring state-level US or country-level coverage in 195+ locations. One of only three providers in the audit with that geographic precision
- LLM and RAG pipelines that benefit from Markdown response mode plus the open-source MCP server. First-class AI-stack primitives at no markup
- Walmart, Idealista, Indeed, TripAdvisor, Instagram, Capterra, and GitHub. All 100 percent SR in our audit, well above category median
Where it falls short
- Amazon section pages (/Best-Sellers/zgbs/* and similar category browse URLs). 0 of 15 trials succeeded; every one returns inner status_code=613 after an 80 to 140s timeout. No tier or render flag we tested fixes it
- G2 and other DataDome-shielded review sites at concurrent load. Drops to 60 percent SR even at Premium+JS ($1.50/1K), with single-request behavior dramatically better than parallel
- Zillow at scale. 100 percent SR but 38.9s avg and 58.4s p90 (the slowest endpoint in our entire benchmark), which kills throughput on price-monitoring jobs
- Buyers who want a single flat headline rate. Decodo's per-domain tier hunting and premium-default behavior push blended cost ($0.86/1K) substantially higher than the advertised $0.50, and there is no way to budget without per-domain measurement