Hacker News

Phelinofist, today at 1:54 PM (2 replies)

I self-host Gitea. The instance is crawled by AI crawlers (I checked the IPs). They never clone the repos; they just browse the web UI and take the code directly from there.


Replies

Phelinofist, today at 4:14 PM

For reference, this is how I do it in my Caddyfile:

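   # Reusable snippet: drop requests from known AI crawlers
   (block_ai) {
       # Named matcher: case-insensitive match on the User-Agent header
       @ai_bots {
           header_regexp User-Agent (?i)(anthropic-ai|ClaudeBot|Claude-Web|Claude-SearchBot|GPTBot|ChatGPT-User|Google-Extended|CCBot|PerplexityBot|ImagesiftBot)
       }

       # Close the connection without sending any response
       abort @ai_bots
   }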
Then, in a specific app's site block, include it via:

   import block_ai
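
For anyone who wants to see the import in context, here is a minimal sketch of a full site block. The hostname `git.example.com` and the upstream `localhost:3000` (Gitea's default HTTP port) are placeholders I'm assuming for illustration, not my actual setup. The `(block_ai)` snippet has to be defined at the top level of the same Caddyfile.

```caddyfile
# Hypothetical site block; hostname and upstream address are placeholders.
git.example.com {
    # Pull in the snippet so matching crawlers are aborted first
    import block_ai

    # Everything else is proxied to the Gitea instance
    reverse_proxy localhost:3000
}
```

In Caddy's default directive order `abort` is handled before `reverse_proxy`, so matching requests should never reach the backend. Keep in mind this only filters on the User-Agent header, so a crawler that sends a different one gets through.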
Zambyte, today at 2:21 PM

i run a cgit server on an r720 in my apartment with my code on it and that puppy screams whenever sam wants his code

blocking openai ips did wonders for the ambient noise levels in my apartment. they're not the only ones obviously, but they're the only ones i had to block to stay sane
