Scraping Cloudflare-protected sites isn't what it used to be. Modern anti-bot systems deploy JavaScript challenges, Turnstile CAPTCHAs, and sophisticated fingerprinting that stops most scrapers cold. If you're building data pipelines or running web scraping operations at scale, you need solutions that actually work—not outdated tutorials or brittle workarounds that fail on the second request.
This guide walks through five proven methods to bypass Cloudflare protection, testing each against real challenge pages to show you what works, what doesn't, and what trade-offs you're making.
When you're building production systems, you don't have time to babysit browsers or debug TLS handshakes. A dedicated web scraping API handles Cloudflare challenges automatically—proxies, headers, browser rendering, and CAPTCHA solving all work behind a single endpoint.
Scrape.do is purpose-built for this. It maintains infrastructure specifically designed to bypass modern WAFs, delivers 99.98% success rates, and scales from testing to millions of requests without code changes.
Maintained by scraping experts: The reverse engineering team constantly updates detection evasion as Cloudflare evolves. When new fingerprinting techniques roll out, the API adapts automatically.
Direct developer support: Problems get escalated to engineers who understand the technical details, not support queues reading from scripts.
Production-ready from day one: Built-in retry logic, geo-targeting, session management, and output formats (HTML/JSON/screenshots) mean you're not patching together five different tools.
It's a paid service beyond the free tier. You get 1000 successful requests monthly at no cost, but high-volume operations require a subscription. The other limitation: you can't route through your own proxy infrastructure since the API manages rotation internally.
No installations, no browser configurations, no dependency hell. Just HTTP requests.
Get your API token: Sign up at the dashboard and copy your authentication token. The free tier activates immediately.
Install basic libraries:
For Python:
```bash
pip install requests beautifulsoup4
```
For Node.js:
```bash
npm install axios cheerio
```
Send your request:
```javascript
const axios = require("axios");
const cheerio = require("cheerio");

const token = "your-token-here";
// The target URL must be URL-encoded before it is passed as a query parameter.
const targetUrl = encodeURIComponent("https://scrapingtest.com/cloudflare-challenge");

const config = {
  method: "GET",
  // render=true asks the API to load the page in a headless browser.
  url: `https://api.scrape.do/?token=${token}&url=${targetUrl}&render=true`,
  headers: {},
};

axios(config)
  .then(function (response) {
    console.log("Status Code:", response.status);
    // Parse the returned HTML and print the first <h2> and <h3>.
    const $ = cheerio.load(response.data);
    const h2 = $("h2").first().text().trim();
    const h3 = $("h3").first().text().trim();
    console.log(h2 || null);
    console.log(h3 || null);
  })
  .catch(function (error) {
    console.log(error);
  });
```
Output:
Status Code: 200
✅ Challenge Passed
Your request was verified and allowed through.
The render=true parameter enables JavaScript rendering through a headless browser, handling dynamic challenges automatically.
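The same request can be made from Python. The sketch below is a minimal, standard-library version of the Node.js call above. It assumes the same endpoint and the `token`, `url`, and `render` query parameters shown in that example; the helper names `build_api_url` and `fetch_rendered_html` are illustrative and not part of any SDK.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://api.scrape.do/"  # endpoint taken from the Node.js example


def build_api_url(token: str, target_url: str, render: bool = True) -> str:
    """Build the request URL; urlencode() percent-encodes the target URL."""
    params = {"token": token, "url": target_url}
    if render:
        params["render"] = "true"  # ask for headless-browser rendering
    return API_ENDPOINT + "?" + urlencode(params)


def fetch_rendered_html(token: str, target_url: str, timeout: float = 60.0) -> str:
    """Send the GET request and return the response body as text."""
    with urlopen(build_api_url(token, target_url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_rendered_html("your-token-here", "https://scrapingtest.com/cloudflare-challenge")` should return HTML that can be passed to BeautifulSoup in the same way the Node.js example passes it to cheerio.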
If you're running high-volume operations or need infrastructure that just works while you focus on building products, check out how ScraperAPI handles Cloudflare at scale with zero configuration headaches. Modern scraping infrastructure should adapt to anti-bot changes automatically, not break your pipelines every few weeks.
Undetected-Chromedriver patches Selenium to avoid bot detection signals. It runs a real Chrome browser on your machine, bypassing JavaScript challenges and fingerprinting checks that stop standard headless setups.
You get full browser automation—scrolling, clicking, JavaScript injection—making it great for quick prototypes or small-scale projects where you need granular control.
The downside: it's slow (spinning up browsers takes seconds), resource-intensive, and fails against Turnstile CAPTCHAs entirely.
Active community: GitHub issues and discussions have solved most common problems. If you hit an error, someone likely documented the fix.
Complete browser control: Headers, user agents, navigation timing, element interaction—everything's configurable.
Built-in stealth: Automatically masks WebDriver detection flags and automation signals.
Turnstile CAPTCHAs break it: If the site uses Cloudflare's interactive challenge, the script hangs indefinitely with no workaround.
Performance bottleneck: Launching full browser instances eats RAM and CPU. Not viable for parallel sessions without serious infrastructure.
Requires Python 3.7 or later. Install dependencies:
```bash
pip install selenium undetected-chromedriver
```
Make sure Google Chrome is installed and up to date; the script auto-detects your local browser.
Bypass example:
```python
import undetected_chromedriver as uc
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = 'https://scrapingtest.com/cloudflare-challenge'

options = uc.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--disable-blink-features=AutomationControlled')

driver = uc.Chrome(options=options)
try:
    driver.get(url)
    # Wait up to 30 seconds for the post-challenge page to render its headings.
    wait = WebDriverWait(driver, 30)
    h2_text = wait.until(EC.presence_of_element_located((By.TAG_NAME, 'h2'))).text.strip()
    h3_text = wait.until(EC.presence_of_element_located((By.TAG_NAME, 'h3'))).text.strip()
    # Selenium does not expose the HTTP status code, so this line is a fixed label.
    print('Status Code: 200 (browser)')
    print('First <h2>:', h2_text)
    print('First <h3>:', h3_text)
finally:
    # Always close the browser, even if a wait times out.
    driver.quit()
```
Missing distutils: Python 3.12 removed distutils from the standard library. Fix: pip install setuptools
ChromeDriver version mismatch: Update Chrome to match the driver version
WinError 6: Harmless garbage collection warning on Windows, ignore it
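Some of these problems can be caught before a browser is launched. The following is a minimal sketch of a preflight check using only the standard library; `preflight_issues` is a hypothetical helper name, and the checks cover only the Python version, the distutils module, and whether the package is importable.

```python
import importlib.util
import sys
from typing import List


def preflight_issues() -> List[str]:
    """Return human-readable warnings for common setup problems."""
    issues = []
    if sys.version_info < (3, 7):
        issues.append("Python 3.7 or later is required.")
    # distutils left the standard library in Python 3.12; setuptools provides a shim.
    if importlib.util.find_spec("distutils") is None:
        issues.append("distutils not found: run 'pip install setuptools'.")
    if importlib.util.find_spec("undetected_chromedriver") is None:
        issues.append("undetected-chromedriver is not installed.")
    return issues
```

Printing the returned list at startup gives a quick hint about which fix applies; an empty list means none of these particular problems were detected.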
Rebrowser-Puppeteer is a drop-in replacement for Puppeteer with evasion patches applied at the library level, deeper than typical headless stealth plugins. It mimics real user behavior more convincingly and can interact with Turnstile CAPTCHAs programmatically, something most tools can't do.
If you need local control with Turnstile support and don't mind manual setup, this is one of the few open-source options that works.
Turnstile compatibility: Can click through Cloudflare's interactive CAPTCHA, unlike most libraries.
Simple installation: No browser patching or binary downloads—just npm install and run.
Realistic behavior: Mimics human interactions well enough to pass behavioral detection.
No active maintenance: The library is discontinued. If Chrome updates break compatibility, you're on your own.
Niche ecosystem: Smaller community means fewer tutorials and less troubleshooting help.
Resource-heavy: Not designed for scale or parallel sessions.
Install Node.js, then:
```bash
npm init -y
npm install rebrowser-puppeteer puppeteer-extra puppeteer-extra-plugin-stealth
```
Bypass with Turnstile clicking:
```javascript
const vanillaPuppeteer = require("rebrowser-puppeteer");
const { addExtra } = require("puppeteer-extra");
const createPuppeteerStealth = require("puppeteer-extra-plugin-stealth");

// Wrap rebrowser-puppeteer with puppeteer-extra so plugins can be attached.
const puppeteer = addExtra(vanillaPuppeteer);
const puppeteerStealth = createPuppeteerStealth();
// Keep the browser's real user agent instead of the stealth plugin's override.
puppeteerStealth.enabledEvasions.delete("user-agent-override");
puppeteer.use(puppeteerStealth);

async function scrape() {
  let browser = null;
  try {
    browser = await puppeteer.launch({
      ignoreDefaultArgs: ["--enable-automation"],
      headless: false,
      args: ["--disable-features=AutomationControlled"],
      defaultViewport: null,
    });
    const [page] = await browser.pages();
    const url = "https://scrapingtest.com/cloudflare-turnstile";
    let response = await page.goto(url, { waitUntil: "domcontentloaded" });

    let h2 = null, h3 = null;
    while (true) {
      // The headings only exist once the challenge has been passed.
      h2 = await page.$eval("h2", (el) => el.textContent.trim()).catch(() => null);
      h3 = await page.$eval("h3", (el) => el.textContent.trim()).catch(() => null);
      if (h2 || h3) break;

      // Locate the Turnstile widget through its hidden response input.
      let cfInput = await page.$('[name="cf-turnstile-response"]');
      if (cfInput) {
        const parentItem = await cfInput.evaluateHandle((element) => element.parentElement);
        const coordinates = await parentItem.boundingBox();
        if (coordinates) {
          // Click near the left edge of the widget, where the checkbox sits.
          await page.mouse.click(coordinates.x + 25, coordinates.y + coordinates.height / 2);
        }
      }
      await new Promise((resolve) => setTimeout(resolve, 1500));
    }

    console.log("Status Code:", response.status());
    console.log("First <h2>:", h2);
    console.log("First <h3>:", h3);
  } catch (e) {
    console.error(e);
  } finally {
    if (browser) await browser.close();
  }
}

scrape();
```
Output:
Status Code: 403
First
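One caveat about the loop in this example: it runs `while (true)`, so the script never exits if the headings never appear. A bounded variant is sketched below in Python; the same structure carries over to the Puppeteer loop. The helper name `poll_until` and the 60-second default are assumptions for illustration, while the 1.5-second interval mirrors the 1500 ms delay in the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(check: Callable[[], Optional[T]], timeout: float = 60.0,
               interval: float = 1.5) -> Optional[T]:
    """Call check() repeatedly until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None  # give up instead of looping forever
        time.sleep(interval)
```

Here `check` would wrap the "read the headings, otherwise click the widget" step, and a `None` return tells the caller the challenge was not passed in time.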
CF-Clearance-Scraper runs as a local HTTP server on your machine. You send POST requests to localhost, and it launches a real browser behind the scenes to solve challenges and return clean HTML.
It handles both regular Cloudflare challenges and Turnstile CAPTCHAs, working with any language that can send HTTP requests.
High success rate: Real browser interaction passes both challenge types reliably.
Language-agnostic: Works from Python, Node.js, or any HTTP client.
Flexible output: Returns rendered HTML or screenshots depending on your needs.
Complex setup: Requires cloning from GitHub, installing Node dependencies, and running a local server.
Resource drain: Uses Chromium headlessly, consuming significant CPU and RAM.
Doesn't scale easily: Local-only by default; distributing requires Docker or manual server replication.
Install Node.js, then:
```bash
git clone https://github.com/zfcsoftware/cf-clearance-scraper
cd cf-clearance-scraper
npm install
npm run start
```
The server listens on http://localhost:3000. Send POST requests to /cf-clearance-scraper with your target URL to get bypassed content.
Python example:
```python
import requests
from bs4 import BeautifulSoup

url = "https://scrapingtest.com/cloudflare-challenge"

# Ask the local server to load the page in its browser and return the result.
response = requests.post(
    "http://localhost:3000/cf-clearance-scraper",
    json={"url": url}
)

soup = BeautifulSoup(response.text, 'html.parser')
h2 = soup.find('h2').get_text(strip=True) if soup.find('h2') else None
h3 = soup.find('h3').get_text(strip=True) if soup.find('h3') else None

print(f"Status Code: {response.status_code}")
print(f"First <h2>: {h2}")
print(f"First <h3>: {h3}")
```
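Because the local server launches a real browser for each request, calls can be slow or fail outright if the server is not running. The sketch below adds a timeout and basic error handling using only the standard library. It assumes the same endpoint and `{"url": ...}` payload as the example above (the fields the server actually requires may differ by version), and the function names are illustrative.

```python
import json
from typing import Optional
from urllib.error import URLError
from urllib.request import Request, urlopen

ENDPOINT = "http://localhost:3000/cf-clearance-scraper"  # from the setup step


def build_request(target_url: str, endpoint: str = ENDPOINT) -> Request:
    """Build the JSON POST request for the local server."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return Request(endpoint, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def fetch_via_local_server(target_url: str, timeout: float = 120.0) -> Optional[str]:
    """Return the response body, or None if the server is unreachable or too slow."""
    try:
        with urlopen(build_request(target_url), timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (URLError, TimeoutError) as exc:
        print(f"Local server request failed: {exc}")
        return None
```

A generous timeout is a deliberate choice here, since solving a challenge in a browser can take much longer than a plain HTTP request.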
Camoufox modifies Firefox at the C++ level to manipulate fingerprints before JavaScript can detect them. Unlike tools that patch behavior through JavaScript injection, Camoufox intercepts fingerprinting APIs directly in the browser engine.
The result is a browser that is very hard to distinguish from a real user's, even for sophisticated anti-bot systems. It includes realistic mouse movements, typing delays, and automatic ad blocking.
Undetectable fingerprinting: Modifies navigator properties, WebGL, screen dimensions, and fonts at the engine level with no JavaScript traces.
Human behavior simulation: Built-in realistic mouse movements and interaction timing.
Open source and free: Full source access with active development.
Resource-intensive: Runs full Firefox instances. Not suitable for serverless or high-volume parallel scraping.
More complex setup: Requires downloading custom browser binaries and managing dependencies.
Smaller community: Less documentation and fewer tutorials compared to mainstream tools.
Install with Python 3.7+:
```bash
pip install -U "camoufox[geoip]" browserforge
```
Download the custom Firefox build:
```bash
camoufox fetch
```
Bypass with stealth fingerprinting:
```python
from camoufox.sync_api import Camoufox
from browserforge.fingerprints import Screen
import time


def scrape_cloudflare_with_camoufox(url):
    with Camoufox(
        headless=True,
        os=["windows"],
        screen=Screen(max_width=1920, max_height=1080),
    ) as browser:
        page = browser.new_page()
        response = page.goto(url)
        page.wait_for_load_state('networkidle')

        # Poll for up to 15 rounds, clicking the Turnstile widget if it appears.
        for _ in range(15):
            time.sleep(1)
            for frame in page.frames:
                if frame.url.startswith("https://challenges.cloudflare.com"):
                    frame_element = frame.frame_element()
                    bbox = frame_element.bounding_box()
                    if bbox:
                        # Click near the left edge of the widget, vertically centered.
                        click_x = bbox["x"] + bbox["width"] / 9
                        click_y = bbox["y"] + bbox["height"] / 2
                        page.mouse.click(x=click_x, y=click_y)
                        time.sleep(3)
                    break
            # An <h2> on the page means the challenge has been passed.
            if page.query_selector("h2"):
                break

        page.wait_for_load_state("networkidle")
        h2_el = page.query_selector("h2")
        h3_el = page.query_selector("h3")
        h2_text = h2_el.inner_text().strip() if h2_el else None
        h3_text = h3_el.inner_text().strip() if h3_el else None

        print(f"Status Code: {response.status}")
        print(f"First <h2>: {h2_text}")
        print(f"First <h3>: {h3_text}")


scrape_cloudflare_with_camoufox("https://scrapingtest.com/cloudflare-challenge")
scrape_cloudflare_with_camoufox("https://scrapingtest.com/cloudflare-turnstile")
```
Output:
Status Code: 200
First
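The click position in the script is plain arithmetic on the iframe's bounding box: one ninth of the width in from the left edge and halfway down. Presumably this targets the checkbox, which sits on the left side of the widget rather than in its center. The standalone sketch below isolates that calculation; `checkbox_click_point` and its `x_divisor` parameter are hypothetical names, and the box uses the same `x`, `y`, `width`, `height` keys as the script.

```python
from typing import Mapping, Tuple


def checkbox_click_point(bbox: Mapping[str, float],
                         x_divisor: float = 9) -> Tuple[float, float]:
    """Return a click point near the left edge of the box, vertically centered."""
    click_x = bbox["x"] + bbox["width"] / x_divisor
    click_y = bbox["y"] + bbox["height"] / 2
    return click_x, click_y
```

For a 270 by 60 widget whose top-left corner is at (100, 200), this returns `(130.0, 230.0)`. If the widget layout changes, adjusting `x_divisor` is the only tuning needed.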
You can bypass Cloudflare without paying—as long as you're willing to manage browsers, burn local resources, and watch your success rate drop after a few hundred requests. Then you'll need proxies. And then you'll wonder if you're really saving money.
When scraping becomes infrastructure instead of experimentation, tools like ScraperAPI remove the friction entirely: headless rendering, proxy rotation, CAPTCHA handling, and developer support are included. You get 1000 successful requests free every month, and after that you're paying for reliability, not piecing together workarounds.
This guide showed you five working methods. Choose based on your scale, budget, and tolerance for maintenance. For production systems where downtime costs money, managed infrastructure usually wins. For learning or small projects, local tools work fine—until they don't.