
irc.wikimedia.org

From Wikitech
Operators: Handle irc.wikimedia.org with caution. An interruption of this service exposes projects to vandalism that will largely go unreverted, or that will cost significant volunteer effort to fix retroactively.

irc.wikimedia.org is an IRC service for broadcasting recent changes events from public Wikimedia Foundation wikis. It is currently powered by Ircstream.

Service

The service is currently hosted on irc[12]003.wikimedia.org. Events are broadcast to both servers, but the irc.wikimedia.org CNAME points to only one of them and needs to be switched manually if the active one goes down.

From MediaWiki's introduction of RC2UDP in 2005 until 2024, the service ran on IRCD (T376014). Since October 2024 it has used Ircstream.

MediaWiki is configured via $wgRCFeeds to broadcast UDP messages to Ircstream, formatted by IRCColourfulRCFeedFormatter.
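As a rough illustration of the receiving side of this UDP feed, the following Python sketch reads one datagram and splits it into a channel name and a message. The tab-separated `#channel<TAB>text` payload layout is an assumption made for this example, not something stated above. The function names and the sample payload are hypothetical. It is not Ircstream's actual code.

```python
import socket


def split_datagram(data: bytes) -> tuple[str, str]:
    """Split one feed datagram into (channel, text).

    Assumption: the payload is UTF-8 and looks like "#channel\ttext".
    """
    channel, text = data.decode("utf-8").split("\t", maxsplit=1)
    # Drop line terminators so the text can be relayed as a single IRC line.
    return channel.strip(), text.replace("\r", "").replace("\n", "")


def receive_one(sock: socket.socket) -> tuple[str, str]:
    """Block until one datagram arrives on `sock`, then return it parsed."""
    data, _addr = sock.recvfrom(65535)
    return split_datagram(data)


def loopback_demo() -> tuple[str, str]:
    """Send a made-up datagram to ourselves over loopback UDP and parse it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(
                b"#test.wikipedia\tsample change line\n",
                receiver.getsockname(),
            )
        return receive_one(receiver)
```

Because UDP is connectionless, the sender gets no confirmation that a datagram arrived. This is one reason events are sent to both servers rather than relying on a single listener.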

Privacy

Every client that connects to irc.wikimedia.org gets its own sandboxed IRC channel stream: no other connected user is visible, and sending messages to other users is not supported. Because the backend is not a full implementation of an IRC server (see https://github.com/paravoid/ircstream for more information), there is no need for a cloak: the client's IP address is not shared with or visible to anyone else.

Why do I see only my nick and rc-pmtpa in various channels?

In the past, the backend service for irc.wikimedia.org was a fully working IRC server implementation, so a new client could see all the other connected bots and clients after joining a channel. With the advent of Ircstream, every client gets a private sandbox upon connecting, so the only other user it can see is rc-pmtpa (see the Privacy section above).

Use

Format

Messages from irc.wikimedia.org.

Refer to IRCColourfulRCFeedFormatter for the implementation of this format.

Descriptions and example parsers:
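A minimal parser sketch in Python, for illustration only. It assumes that, once mIRC colour and bold codes are stripped, a message reads roughly as `[[title]] flags url * user * (diff) comment`, with the URL and the size difference left empty for log entries. That layout, the `Change` type, and the `parse_rc_line` name are assumptions of this sketch. Check them against IRCColourfulRCFeedFormatter before relying on them.

```python
import re
from typing import NamedTuple, Optional

# mIRC formatting: \x03 plus up to two colour digits, or a bold/reset/etc. byte.
_FORMATTING = re.compile(r"\x03\d{0,2}|[\x02\x0f\x16\x1d\x1f]")

# Assumed layout after stripping: [[title]] flags url * user * (diff) comment
_LINE = re.compile(
    r"^\[\[(?P<title>.*?)\]\] (?P<flags>\S*) (?P<url>\S*) \* (?P<user>.*?) \* "
    r"(?:\((?P<diff>[+-]?\d+)\))? ?(?P<comment>.*)$"
)


class Change(NamedTuple):
    title: str
    flags: str
    url: str
    user: str
    diff: Optional[int]  # None when the line carries no size difference
    comment: str


def parse_rc_line(raw: str) -> Optional[Change]:
    """Parse one feed message; return None if it does not fit the assumed layout."""
    text = _FORMATTING.sub("", raw).rstrip("\r\n")
    match = _LINE.match(text)
    if match is None:
        return None
    diff = match.group("diff")
    return Change(
        title=match.group("title"),
        flags=match.group("flags"),
        url=match.group("url"),
        user=match.group("user"),
        diff=int(diff) if diff is not None else None,
        comment=match.group("comment"),
    )
```

Stripping the formatting codes first keeps the pattern readable. The trade-off is that a title or edit summary containing ` * ` or `]] ` can confuse the match, so a production consumer may prefer to split on the colour codes themselves.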

Avoid new use

We now have a vastly superior EventStreams service providing machine-readable push notifications over HTTP in JSON format at https://stream.wikimedia.org/v2/stream/recentchange, but until the key consumers of the IRC recent changes feed have migrated, this old service remains vital.
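For new consumers, a minimal Python sketch of reading that stream could look like the following. It assumes the endpoint speaks the standard server-sent events format, in which `data:` lines are terminated by a blank line, and that each event's data is a single JSON object. The helper names and the User-Agent value passed in are placeholders for this example.

```python
import json
import urllib.request
from typing import Iterable, Iterator

STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"


def iter_sse_events(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each server-sent event in `lines`."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            # A single space after the colon is part of the framing, not the data.
            data_parts.append(line[5:].removeprefix(" "))
        elif line == "" and data_parts:
            payload = "\n".join(data_parts)
            data_parts = []
            try:
                event = json.loads(payload)
            except json.JSONDecodeError:
                continue  # skip payloads that are not valid JSON
            yield event
        # Other fields (event:, id:) and ":" comment lines are ignored here.


def follow_recent_changes(user_agent: str) -> Iterator[dict]:
    """Open the stream over HTTP and yield events until the connection ends."""
    request = urllib.request.Request(
        STREAM_URL,
        headers={"Accept": "text/event-stream", "User-Agent": user_agent},
    )
    with urllib.request.urlopen(request) as response:
        yield from iter_sse_events(response)
```

A consumer would iterate over `follow_recent_changes("example-tool/0.1 (contact address here)")` and read fields from each event with `dict.get`. This sketch does not reconnect when the connection drops, which a long-running client would need to handle.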

Its practical uses range widely, from relatively low-impact services such as:

  • pretty monitors (e.g. wikistream[1])
  • edit bots that react to edits as they happen (e.g. bots that sign unsigned talk page contributions by anons)

... to much higher impact services:

  • Countervandalism auto-revert bots (e.g. ClueBot finds and reverts over 1,000 edits every day on en.wikipedia.org alone)
  • Countervandalism Network patrol feeds (e.g. CVNBot, which monitors our hundreds of smaller wikis that don't have active patrol teams or watchers)

When irc.wikimedia.org is down:

  • ClueBot will be interrupted, and any edits made during this time will never be looked at by it. This is comparable to having the AbuseFilter extension down.
  • Patrollers (using RTRC, or Huggle, or STiKI) will not see the edits made during this time. This is comparable to having the recentchanges SQL table temporarily ignore all write actions and thus pretty much hide activity from reviewers.

It is most unfortunate that we rely on a bot to undo vandalism that should probably be prevented before saving (e.g. through some kind of hook in AbuseFilter to consult ClueBot, which could perhaps one day be run natively on the cluster). It is equally unfortunate that most of our patrolling efforts indirectly rely on irc.wikimedia.org. That is nonetheless the reality, and it is important that we know the service's impact and the kind of sensitive treatment it deserves.

References

  1. wikistream (GitHub / node-wikichanges)