Set up NinjaProxy in Scrapy with downloader middleware, environment-based credentials, rotating gateway username controls, and per-spider proxy overrides.

Scrapy already gives you concurrency controls, retries, and item pipelines. The missing piece for production scraping is usually the proxy layer. If every request leaves through the same IP, the crawl eventually slows down under bans, rate limits, or geo mismatches.
This setup keeps the NinjaProxy endpoint model aligned with the public docs: copy the exact host and port from the portal, keep credentials in environment variables, and only add rotation controls where they belong.
You can set request.meta["proxy"] one request at a time, but middleware is the cleaner default for most projects.
Use middleware when you want proxy behavior applied consistently across the crawl instead of repeating the same setup inside every spider.
Before touching Scrapy, copy three values from NinjaProxy: your username, your apiKey, and the HTTP endpoint (the exact host and port shown in the portal).
Do not invent hostnames or ports. Public docs use placeholders because the real endpoint is account-specific.
The minimal middleware only has to build one authenticated proxy URL and attach it to the outgoing request.
import os


def require_env(name: str) -> str:
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


class NinjaProxyMiddleware:
    def process_request(self, request, spider):
        username = require_env("NINJAPROXY_USERNAME")
        api_key = require_env("NINJAPROXY_API_KEY")
        http_endpoint = require_env("NINJAPROXY_HTTP_ENDPOINT")
        request.meta["proxy"] = f"http://{username}:{api_key}@{http_endpoint}"

That matches the example file in this repo at ipn-190-ninjasproxy-examples/python/scrapy_middleware.py, so the blog post and the sample stay consistent.
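If a username or API key ever contains characters such as "@", ":" or "/", pasting it straight into the URL can change how the URL is parsed. The sketch below percent-encodes both credentials first. The helper name build_proxy_url and the endpoint in the usage line are illustrative assumptions, not part of the NinjaProxy docs or the repo sample. Scrapy's built-in HttpProxyMiddleware should decode the credentials again when it builds the Proxy-Authorization header, but confirm that against your Scrapy version.

```python
from urllib.parse import quote


def build_proxy_url(username: str, api_key: str, http_endpoint: str) -> str:
    """Return an authenticated HTTP proxy URL with percent-encoded credentials.

    Hypothetical helper for illustration; http_endpoint is expected in
    "host:port" form, copied from the portal.
    """
    # safe="" makes quote() encode ":", "@" and "/" as well, since those
    # would otherwise be read as URL delimiters.
    user = quote(username, safe="")
    secret = quote(api_key, safe="")
    return f"http://{user}:{secret}@{http_endpoint}"
```

Inside the middleware you would then assign request.meta["proxy"] = build_proxy_url(username, api_key, http_endpoint). Credentials made only of letters, digits, and hyphens pass through unchanged.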
Scrapy only runs downloader middleware after you register it.
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.NinjaProxyMiddleware": 543,
}

If your project already uses retry, user-agent, or ban-detection middleware, keep those entries and add NinjaProxy alongside them. The important part is that the dotted import path matches the file where you defined the middleware class.
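A quick way to check the dotted path before a crawl is to resolve it by hand. The following is a minimal standard-library sketch. resolve_dotted_path is a hypothetical helper written for this post, not a Scrapy API, and it only approximates how a settings entry gets loaded.

```python
from importlib import import_module


def resolve_dotted_path(path: str):
    """Import the module part of a dotted path and return the named attribute.

    Raises ImportError if either the module or the attribute is missing.
    """
    module_path, _, attr = path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected 'package.module.Name', got {path!r}")
    # import_module raises ModuleNotFoundError (a subclass of ImportError)
    # when the module part of the path is wrong.
    module = import_module(module_path)
    try:
        return getattr(module, attr)
    except AttributeError as exc:
        raise ImportError(f"{module_path} has no attribute {attr!r}") from exc
```

Run it from the project root so the package is importable, for example resolve_dotted_path("myproject.middlewares.NinjaProxyMiddleware"). An ImportError here points at the same typo Scrapy would trip over.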
For assigned/static endpoints, the basic URL is enough. For rotating gateways, keep the same endpoint and append controls to the username only.
def build_rotating_username(base_username: str, session_id: str) -> str:
    return (
        f"{base_username}"
        f"--session-{session_id}"
        f"--duration-90"
        f"--provider-res"
        f"--geo-country-us"
    )

Then use that routed username when building request.meta["proxy"]. Reuse the same session_id when you want a sticky IP for a short flow. Change or remove it when you want a fresh route.
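To make the sticky-versus-fresh distinction concrete, here is a sketch that wires the routed username into a proxy URL. build_rotating_proxy_url and the random uuid-based session id are assumptions added for illustration. The username tokens are copied from the helper above, and the endpoint must still come from your portal.

```python
import uuid
from typing import Optional


def build_rotating_username(base_username: str, session_id: str) -> str:
    # Same control tokens as the helper shown in the article.
    return (
        f"{base_username}"
        f"--session-{session_id}"
        "--duration-90"
        "--provider-res"
        "--geo-country-us"
    )


def build_rotating_proxy_url(
    base_username: str,
    api_key: str,
    http_endpoint: str,
    session_id: Optional[str] = None,
) -> str:
    """Hypothetical helper: pass a session_id to stay sticky, omit it for a new one."""
    if session_id is None:
        # Assumption: any short unique token works as a session id.
        session_id = uuid.uuid4().hex[:8]
    routed = build_rotating_username(base_username, session_id)
    return f"http://{routed}:{api_key}@{http_endpoint}"
```

Passing the same session_id for every request in a login or checkout flow keeps those requests on one routed username, while omitting it produces a new session token on each call.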
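Middleware does not lock you into one proxy profile forever. A spider can still override the default route for a sensitive request, as long as the middleware leaves alone any request that already carries a proxy value in request.meta.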
def start_requests(self):
    for url in self.start_urls:
        yield scrapy.Request(
            url,
            meta={
                "proxy": "http://<USERNAME>--session-serp-us-1:<API_KEY>@<ROTATING_HTTP_ENDPOINT>"
            },
        )

That pattern is useful when one spider needs geo-targeting, a sticky checkout session, or a separate route for login pages.
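One caveat: downloader middleware runs after the spider yields the request, so a middleware that assigns request.meta["proxy"] unconditionally replaces whatever the spider set. A minimal sketch of a variant that respects spider overrides follows. The class name OverridableProxyMiddleware is hypothetical and this is not the repo sample.

```python
import os


class OverridableProxyMiddleware:
    """Hypothetical variant: apply the default proxy only when none is set."""

    def process_request(self, request, spider):
        # A spider-level override wins: leave an existing proxy untouched.
        if request.meta.get("proxy"):
            return None

        # os.environ[...] raises KeyError when a variable is missing, so a
        # misconfigured environment fails loudly instead of silently
        # skipping the proxy.
        username = os.environ["NINJAPROXY_USERNAME"]
        api_key = os.environ["NINJAPROXY_API_KEY"]
        http_endpoint = os.environ["NINJAPROXY_HTTP_ENDPOINT"]
        request.meta["proxy"] = f"http://{username}:{api_key}@{http_endpoint}"
        return None  # let Scrapy continue processing the request
```

Returning None tells Scrapy to keep passing the request down the middleware chain, which is the behavior you want in both branches.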
After wiring the middleware, validate with a target that echoes your IP or headers before starting a full crawl.
Export NINJAPROXY_USERNAME, NINJAPROXY_API_KEY, and NINJAPROXY_HTTP_ENDPOINT, then send a single request to https://ip.ninjasproxy.com/ or https://httpbin.org/ip. This catches bad credentials and malformed endpoints early, before Scrapy fans the mistake across hundreds of requests.
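The same check can be run outside Scrapy. The sketch below uses only the standard library and assumes the target is https://httpbin.org/ip, which returns a JSON body with an "origin" field. The function names proxy_url_from_env and fetch_exit_ip are made up for this example.

```python
import json
import os
import urllib.request

REQUIRED = (
    "NINJAPROXY_USERNAME",
    "NINJAPROXY_API_KEY",
    "NINJAPROXY_HTTP_ENDPOINT",
)


def proxy_url_from_env(env=os.environ) -> str:
    """Build the proxy URL, naming every missing variable in one error."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    username, api_key, endpoint = (env[name] for name in REQUIRED)
    return f"http://{username}:{api_key}@{endpoint}"


def fetch_exit_ip(proxy_url: str, target: str = "https://httpbin.org/ip",
                  timeout: float = 15.0) -> str:
    """Request the echo endpoint through the proxy and return the reported IP."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    with opener.open(target, timeout=timeout) as response:
        return json.load(response)["origin"]
```

Calling fetch_exit_ip(proxy_url_from_env()) should report the proxy's exit address rather than your own. An HTTP 407 response usually points at rejected credentials, while a connection error usually points at the endpoint value.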
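If the exit IP never changes, you likely reused the same --session-... token or never switched from a static endpoint to a rotating gateway.
If the proxy is never applied, the DOWNLOADER_MIDDLEWARES import path is likely wrong or the settings module was not loaded for that crawl.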