logoalt Hacker News

marginalia_nuyesterday at 6:02 PM1 replyview on HN

I doubt they're actually interested in the git repos.

From the shape of the traffic it just looks like a poorly implemented web crawler. By default, a crawler that does not take measures to actively avoid git hosts will get stuck there and spend days trying to exhaust the links of even a single repo.


Replies

kjuulhyesterday at 11:47 PM

For me it was specifically crawlers from the large companies, they we're at least announcing themselves as such. They did have different patterns, bytedance was relatively behaved, but some of the less known ones, did have weird patterns of looking at comparisons.

I do think they care about repos, and not just the code, but also how it evolves over time. I can see some use, if marginal in those traits. But if they really wanted that, I'd rather they clone my repos, I'd be totally fine with that. But i guess they'd have to deal with state, and they likely don't want to deal with that. Rather just increase my energy bill ;)