Hacker News

Bender, yesterday at 11:27 PM

Are they in this space? [1] One could map the ranges into a web daemon and rate-limit them, or just run 'ip route add blackhole ${cidr}' for each CIDR block.

[1] - https://ip-ranges.amazonaws.com/ip-ranges.json
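A minimal sketch of the blackhole approach, assuming the JSON at [1] keeps its IPv4 entries in a top-level `prefixes` list with an `ip_prefix` key per entry. The function names `fetch_ranges` and `blackhole_commands` are made up for illustration. It only builds the command strings and does not run them:

```python
import ipaddress
import json
import urllib.request

RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def fetch_ranges(url=RANGES_URL):
    """Download and parse the published range list (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def blackhole_commands(data):
    """Return one 'ip route add blackhole' command per unique IPv4 CIDR block.

    Assumes each entry in data["prefixes"] has an "ip_prefix" key.
    """
    # The same block can be listed under several services, so deduplicate.
    cidrs = {entry["ip_prefix"] for entry in data.get("prefixes", [])}
    # Parsing validates each CIDR; sorting keeps the output stable between runs.
    networks = sorted(ipaddress.ip_network(cidr) for cidr in cidrs)
    return [f"ip route add blackhole {net}" for net in networks]
```

Printing the result of `blackhole_commands(fetch_ranges())` gives a list you can review before running anything as root. IPv6 entries (the `ipv6_prefixes` list) are left out to keep the sketch short.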


Replies

phdelightful, today at 11:55 AM

I didn't check thoroughly, but the first one I happened to grep out was not on that list:

"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214 Safari/537.36"

"x-forwarded-for":"44.210.204.255" "x-real-ip":"44.210.204.255"

This is a bit outside my area of expertise, so I don't know how reliable these x-forwarded-for and x-real-ip headers are.
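Rather than grepping, a lookup like this could check an address against the parsed list. It is a sketch that assumes the JSON has already been loaded into a dict with a `prefixes` list of `ip_prefix` entries; `matching_prefixes` is a hypothetical helper name. It says nothing about whether the header value itself can be trusted, since such headers are only as reliable as the proxy that set them:

```python
import ipaddress


def matching_prefixes(ip, data):
    """Return every IPv4 entry in data["prefixes"] whose CIDR block contains ip."""
    addr = ipaddress.ip_address(ip)
    return [
        entry
        for entry in data.get("prefixes", [])
        # Membership is False when the address falls outside the block.
        if addr in ipaddress.ip_network(entry["ip_prefix"])
    ]
```

An empty result for the address taken from the log line would mean it is not covered by any listed IPv4 block.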

rnhmjoj, today at 6:28 AM

I just do this for the IP ranges of Amazon, OpenAI, Huawei and other companies that run these insane crawlers: it's 100% effective and it doesn't annoy real users with a captcha or some PoW thing. There's simply no reason for them to reach my homeserver other than to scrape the hell out of it.

Symbiote, today at 7:26 AM

That's all of Amazon AWS, not just Amazon's AI system.

lofaszvanitt, today at 7:05 AM

That list is a tad too long. Why don't they enforce a rule on these big corps to publicly state which range does what?
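For what it's worth, the entries in that file do carry a `service` label (and a `region`), so the list can at least be summarized. A rough sketch, assuming the parsed JSON has a `prefixes` list and using a made-up helper name `prefixes_per_service`. The labels are AWS product names, so this does not reveal which ranges a particular crawler uses:

```python
from collections import Counter


def prefixes_per_service(data):
    """Count IPv4 entries per "service" label in data["prefixes"]."""
    counts = Counter(
        # Entries without a label are grouped under a placeholder key.
        entry.get("service", "UNKNOWN")
        for entry in data.get("prefixes", [])
    )
    return dict(counts)
```

Filtering on one label before generating block rules would be a narrower option than taking the whole list.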
