How to fix 403 Forbidden errors when calling APIs using Python requests?

Introduction

Web scraping can feel like navigating a minefield when servers block your requests with 403 Forbidden errors. These errors often occur because websites detect non-browser traffic (like scripts) through mechanisms like TLS fingerprinting, header validation, or IP blocking. While tools like Selenium mimic browsers, they’re resource-heavy. In this guide, I’ll share multiple proven techniques to bypass 403 errors using Python, including a hidden gem: curl_cffi.


The Problem: 403 Forbidden Hell

While trying to scrape some data from a website, my Python script using the popular requests library kept hitting a brick wall:

import requests
response = requests.get(url, headers=perfect_headers)  # Always returns 403!

Despite:

  • Perfectly replicated headers (via MITMproxy)
  • Matching cookies
  • Correct user-agent
  • Proper TLS configuration

The server kept rejecting my requests with 403 Forbidden errors. Why?


Why 403 Errors Happen

  • Missing/Invalid Headers: Servers check for browser-like headers (e.g., sec-ch-ua, user-agent).
  • TLS/JA3 Fingerprinting: Servers detect non-browser TLS handshakes.
  • IP Rate Limiting: Too many requests from the same IP.
  • Path/Protocol Validation: Unusual URLs or HTTP versions may trigger suspicion.

In my case, the culprit was TLS fingerprinting.

Modern websites don’t just check headers — they analyze your TLS handshake fingerprint (JA3). Libraries like requests and urllib have distinct fingerprints that scream "BOT!" to servers.
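You can see this for yourself by comparing what a fingerprint-reporting service says about plain requests versus an impersonated client (using curl_cffi, introduced below). This is only a minimal sketch: the echo endpoint and the ja3_hash field it returns are assumptions, so substitute whichever JA3-reporting service you trust.

# Compare the TLS fingerprint a server sees from plain `requests` vs. curl_cffi.
# NOTE: the endpoint URL and JSON field names are assumptions, not guaranteed.
import requests as plain_requests
from curl_cffi import requests as cffi_requests

FP_URL = "https://tls.browserleaks.com/json"  # assumed JA3-echo endpoint

plain_fp = plain_requests.get(FP_URL, timeout=10).json()
chrome_fp = cffi_requests.get(FP_URL, impersonate="chrome110", timeout=10).json()

print("requests JA3: ", plain_fp.get("ja3_hash"))
print("curl_cffi JA3:", chrome_fp.get("ja3_hash"))  # should match a real Chrome hash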


The Solution

Use curl_cffi to impersonate browser TLS fingerprints. The curl_cffi library combines cURL's power with browser-like TLS handshakes, so your requests pass JA3-based detection.

1. Installation

pip install curl_cffi

2. The Magic Code

from curl_cffi import requests

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...",
    "accept": "*/*",
    "referer": "https://example.com",
}
response = requests.get("https://example.com/", headers=headers,
                        impersonate="chrome110")  # Mimics Chrome 110's TLS fingerprint

Impersonation Targets:

Some available options for the impersonate argument (see the curl_cffi docs for the full list):

  • impersonate="chrome110"
  • impersonate="chrome120"
  • impersonate="safari15_5"

Key Differentiators

  • impersonate parameter specifying Chrome 110
  • No SSL verification needed
  • Automatic handling of HTTP/2 and brotli encoding

Why This Works

  • Spoofs Chrome’s TLS fingerprint, making the request appear browser-like.
  • Avoids the need for Selenium or headless browsers.

Tips (combined in a short sketch after this list):

  • Add random delays between requests
  • Rotate user-agent strings
  • Use proxy rotation
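
The first two tips can be combined into a small loop around curl_cffi; proxy rotation is covered in the proxies section further down. This is only a sketch: the URLs, delay range, and user-agent strings are placeholders to adapt.

import random
import time
from curl_cffi import requests

# Placeholder targets and user-agents -- replace with your own.
urls = ["https://example.com/page1", "https://example.com/page2"]
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36...",
]

for url in urls:
    headers = {"user-agent": random.choice(user_agents), "accept": "*/*"}
    response = requests.get(url, headers=headers, impersonate="chrome110")
    print(url, response.status_code)
    time.sleep(random.uniform(2, 6))  # Random delay between requests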

Other Solutions to Try

1. Refine Headers to Match Browser Requests

Capture headers from a real browser (using Chrome DevTools or mitmproxy) and include all critical headers, such as sec-ch-ua, sec-fetch-*, referer, and origin.

Example:

headers = {"sec-ch-ua": '"Google Chrome";v="131", "Chromium";v="131", "Not-A Brand";v="24"',"sec-ch-ua-mobile": "?0", "sec-ch-ua-platform": "Windows","sec-fetch-site": "same-origin","sec-fetch-mode": "cors","referer": "https://example.com/","priority": "u=1, i"}

Tip: Simplify headers if they conflict (e.g., use accept: */* instead of complex values).

2. Use Sessions and Rotate User-Agents

Persist cookies and rotate headers with requests.Session:

import requests
from fake_useragent import UserAgent

session = requests.Session()
ua = UserAgent()
headers = {"user-agent": ua.chrome, "accept-language": "en-US,en;q=0.9"}
session.headers.update(headers)
response = session.get("https://example.com/")

3. Spoof HTTP/2 with httpx

Some sites require HTTP/2 support. Use httpx for HTTP/2 compatibility:

# Install: pip install "httpx[http2]"  (the http2 extra pulls in the h2 dependency)
import httpx

with httpx.Client(http2=True, headers=headers) as client:
    response = client.get("https://example.com/")

4. Bypass Path Validation

Modify the URL to trick path-based filters:

url = "https://example.com// # Add trailing slashes 
# OR
url = "https://example.com/?cache=1" # Add dummy params

5. Route Through Proxies

Rotate IPs to avoid blocks:

proxies = {"http": "http://user:pass@proxy_ip:port", "https": "http://user:pass@proxy_ip:port"}
response = requests.get(url, headers=headers, proxies=proxies)

Free Proxies: Use services like FreeProxyList, but expect instability.
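
To rotate IPs rather than pin a single proxy, pick a different entry from a pool on each request. A minimal sketch, assuming you already have a list of working proxy URLs (the addresses below are placeholders):

import random
import requests

# Placeholder proxy pool -- replace with proxies you actually control or pay for.
proxy_pool = [
    "http://user:pass@proxy1_ip:port",
    "http://user:pass@proxy2_ip:port",
    "http://user:pass@proxy3_ip:port",
]

def get_with_rotating_proxy(url, headers):
    proxy = random.choice(proxy_pool)
    proxies = {"http": proxy, "https": proxy}
    return requests.get(url, headers=headers, proxies=proxies, timeout=15)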


6. Disable SSL Verification (Last Resort)

If certificate verification itself is failing (for example, behind an intercepting proxy with a self-signed certificate), you can disable it as a last resort. Note that this does not change your TLS fingerprint and weakens security:

response = requests.get(url, headers=headers, verify=False) # Use with caution!
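
With verify=False, urllib3 emits an InsecureRequestWarning on every request. If you have deliberately accepted the risk, you can silence it:

import urllib3

# Only silence this once you have knowingly accepted unverified HTTPS.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)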

Conclusion

Bypassing 403 errors requires mimicking browsers at multiple levels: headers, TLS fingerprints, and request patterns. While curl_cffi is a game-changer, combining it with header refinement, HTTP/2, and proxies ensures robust scraping. Always respect robots.txt and avoid overloading servers.

Got your own 403 horror story? Share your experiences in the comments!


⚠️ Disclaimer: This article is for educational purposes only. Always obtain proper authorization before scraping any website.
