toomuchtodo • yesterday at 9:07 PM • 4 replies

Not opposed. Wikimedia tech folks are very accessible in my experience; ask them to make a GET or POST to https://web.archive.org/save whenever a link is added via the wiki editing mechanism. Easy peasy. Example CLI tools are https://github.com/palewire/savepagenow and https://github.com/akamhy/waybackpy

The shortcut is to consume the Wikimedia changelog firehose and make these HTTP requests yourself, doing a CDX lookup first to see whether a recent snapshot already exists before issuing a capture request (to be polite to the capture worker queue).
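
For anyone who wants to try the "CDX lookup first, capture only if stale" part, here is a rough Python sketch. It is not how Wikimedia or the Internet Archive actually do it. Assumptions on my part: the public CDX endpoint at https://web.archive.org/cdx/search/cdx with `output=json`, `fl=timestamp` and `limit=-1` to get the latest capture; the capture request written as a plain GET to https://web.archive.org/save/ followed by the target URL; and a 30-day cutoff for "recent", which is an arbitrary number. The function names are made up for illustration, and reading the changelog firehose itself is left out.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"
SAVE_ENDPOINT = "https://web.archive.org/save/"
MAX_AGE = timedelta(days=30)  # assumption: arbitrary definition of "recent"


def cdx_lookup_url(target: str) -> str:
    """Build a CDX query asking only for the newest capture timestamp."""
    query = urllib.parse.urlencode(
        {"url": target, "output": "json", "fl": "timestamp", "limit": "-1"}
    )
    return f"{CDX_ENDPOINT}?{query}"


def latest_snapshot(cdx_rows):
    """Parse CDX JSON rows (header row first) into a UTC datetime, or None."""
    if len(cdx_rows) < 2:
        return None  # empty result or header only: never captured
    stamp = cdx_rows[-1][0]  # e.g. "20240520000000"
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def needs_capture(cdx_rows, now, max_age=MAX_AGE) -> bool:
    """True when there is no snapshot or the newest one is older than max_age."""
    latest = latest_snapshot(cdx_rows)
    return latest is None or now - latest > max_age


def fetch_json(url):
    """Default network fetch; treats an empty body as an empty result."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.loads(resp.read() or b"[]")


def request_capture(save_url):
    """Default capture request: a plain GET to the save endpoint."""
    urllib.request.urlopen(save_url, timeout=120).close()


def archive_if_stale(target, now=None, fetch=fetch_json, save=request_capture):
    """Look up the target in CDX and request a capture only when needed."""
    now = now or datetime.now(timezone.utc)
    if not needs_capture(fetch(cdx_lookup_url(target)), now):
        return False  # a recent snapshot exists, so skip the capture queue
    save(SAVE_ENDPOINT + target)
    return True
```

Calling `archive_if_stale("https://example.com/")` for each newly added link would be the loop body. `fetch` and `save` are injectable so the decision logic can be checked without touching the network, and a real run would also want rate limiting and error handling.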

Replies

Gander5739 • yesterday at 9:36 PM

This already happens. Every link added to Wikipedia is automatically archived on the Wayback Machine.

jsheard • yesterday at 9:11 PM

I didn't know you can just ask IA to grab a page before their crawler gets to it. In that case yeah it would make sense for Wikipedia to ping them automatically.

ferngodfather • yesterday at 9:19 PM

Why wouldn't Wikipedia just capture and host this themselves? Surely it makes more sense to DIY than to rely on a third party.

RupertSalt • yesterday at 9:11 PM

Spammers and pirates just got super excited at that plan!
