Hacker News

neoromantique · yesterday at 9:19 PM

Ask HN: How does one archive websites like this without being a d-ck?

I want to save this for offline use, but I think a recursive wget is a bit poor manners. Is there an established way to approach this, or should I get it from an archive somehow?


Replies

ssl-3 · yesterday at 11:54 PM

In the old-web days, I just used wget with slow pacing (and by "pacing" I mean: I don't need it to be done today or even this week, so if it takes a rather long time then that's fine. Slow helped keep me from mucking up the latency on my dial-up connection, too.)

I don't think that's being a dick for old-web sites that still exist today. Most of the information is text, the photos tend to be small, it's all generally static (i.e., lightweight to serve), and the implicit intent is for people to use it.

But it's pretty slow-moving, so getting it from archive.org would probably suffice if being zero-impact is the goal.

(Or, you know: Just email the dude that runs it like it's 1998 again, say hi, and ask. In this particular instance, it's still being maintained.)
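A rough sketch of what that slow pacing could look like with current GNU wget options; the URL, wait time, and rate cap here are placeholders, not anything from the thread:

    wget --recursive --no-parent --level=inf \
         --wait=5 --random-wait --limit-rate=50k \
         --page-requisites --convert-links --adjust-extension \
         https://example.com/

--wait plus --random-wait spaces out requests, --limit-rate keeps the bandwidth impact small, and --convert-links/--adjust-extension make the local copy browsable offline.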

OJFord · yesterday at 10:23 PM

A single user's one-off recursive wget seems fine? Browsers can also save pages, IIRC, individual pages at the very least (and if they're saved to the same place, the links will work).

No doubt it's already in many archive sites though; you could just fetch from them instead of the original?
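If going the archive route, one option (assuming the Wayback Machine is the archive in question) is to check its availability API for the closest snapshot before touching the live site; the URL below is just an example:

    curl 'https://archive.org/wayback/available?url=example.com'

The JSON response lists the nearest archived snapshot (if any) under "archived_snapshots", which you can then fetch instead of the original server.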
