
hCaptcha

From Wikitech


Please use caution when referencing concepts or code from the private https://docs.hcaptcha.com/enterprise documentation, or private code used in the WMF proxy to hCaptcha.

hCaptcha is a proprietary device-fingerprinting and CAPTCHA platform-as-a-service with applications for bot detection. The Wikimedia Foundation uses this service on eight production Wikipedias (enwiki, frwiki, ptwiki, zhwiki, jawiki, fawiki, idwiki, trwiki) for edits from users without the skipcaptcha right (non-autoconfirmed users), as well as for account creations made via Special:CreateAccount.

Wikimedia uses hCaptcha Enterprise's First-Party feature, which routes all client traffic through a Wikimedia-controlled proxy.

Design

The basic idea is the same as the standard hCaptcha flow. However, to minimize the impact on user privacy, user requests do not go directly to hCaptcha servers; they are sent through our proxy, which strips IP addresses, cookies, and other identifying information.

Client Side

We load hCaptcha JS through a reverse proxy on first form interaction with Special:CreateAccount.

FIXME: Also loaded on edits now, should be updated by PSI.

Server side

MediaWiki

FIXME: Describe how it interacts with common MediaWiki user flows. Should be updated by PSI (or copied from the design doc).
FIXME: Explain fallback mechanism in MW. Should be updated by PSI.

hcaptcha-proxy

Figure: hCaptcha WMF design
  • Forwards requests to hCaptcha
  • Strips any identifying information by unsetting various headers
  • Hashes the client IP address (see the <%= @nginx_ipblinding_conf %> blocks in the nginx configuration, below)
FIXME: Out of date. Should be updated by Traffic to include anycast setup info.
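
The two privacy steps listed above (unsetting identifying headers and hashing the client IP) can be sketched in Python. This is an illustration only: the production proxy is nginx, and the header names in IDENTIFYING_HEADERS, the X-Blinded-IP output header, the secret key, and the choice of HMAC-SHA256 are all assumptions made for this example, not the deployed configuration.

```python
import hashlib
import hmac

# Hypothetical set of identifying request headers (lowercase).
# The headers actually unset by the production proxy are not listed on this page.
IDENTIFYING_HEADERS = {"x-forwarded-for", "x-real-ip", "x-client-ip", "forwarded"}


def blind_ip(client_ip: str, secret: bytes) -> str:
    """Return a keyed hash of the client IP (assumed scheme: HMAC-SHA256)."""
    return hmac.new(secret, client_ip.encode("utf-8"), hashlib.sha256).hexdigest()


def sanitize_headers(headers: dict[str, str], client_ip: str, secret: bytes) -> dict[str, str]:
    """Drop identifying headers and attach the blinded IP instead of the raw one."""
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }
    # "X-Blinded-IP" is a made-up header name used only in this sketch.
    forwarded["X-Blinded-IP"] = blind_ip(client_ip, secret)
    return forwarded
```

With a keyed hash, the same client maps to the same opaque value for as long as the secret is unchanged, so the upstream can still correlate requests without learning the address itself.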

CDN → Load Balancer → reverse proxy → hCaptcha upstream ("the internet")

  • proxoid.discovery.wmnet (active/active service)
  • The reverse proxy is a basic nginx installation. Currently (Sept 2025), the proxy is installed on the url_downloader hosts.

Configuration

On https://dashboard.hcaptcha.com, we have a sitekey defined for production wikis.

This sitekey is defined in:

mediawiki-config

To enable/disable this functionality, toggle wmgEnableHCaptcha in wmf-config/InitialiseSettings.php. Here is an example patch to do this: I80886c

Puppet

Cookies

The proxy removes all cookies except hmt_id, per Idb012f and I87190f.
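
As a rough illustration of this rule, the Python sketch below filters a Cookie request header against an allowlist. It is not the proxy's actual nginx configuration; the function name filter_cookie_header and the simple "name=value; name=value" parsing are assumptions made for the example.

```python
# Allowlist of cookie names that may be forwarded upstream.
# hmt_id is the only cookie named on this page.
ALLOWED_COOKIES = {"hmt_id"}


def filter_cookie_header(cookie_header: str) -> str:
    """Return a Cookie header value that keeps only allowlisted cookies."""
    kept = []
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        name, sep, _value = pair.partition("=")
        # Skip empty or malformed fragments and any cookie not on the allowlist.
        if sep and name in ALLOWED_COOKIES:
            kept.append(pair)
    return "; ".join(kept)
```

For example, filter_cookie_header("session=abc; hmt_id=xyz; theme=dark") returns "hmt_id=xyz", and a header with no allowlisted cookie yields an empty string.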

Monitoring

Runbook

Depooling an hcaptcha-proxy instance / an instance is down

FIXME: Explain automatic fallback due to the anycast setup, should be updated by Traffic.

Completely disable hCaptcha

In case of emergency, disable this functionality by toggling wmgEnableHCaptcha in wmf-config/InitialiseSettings.php. Here is an example patch to do this: I80886c

Proxy smoke test

FIXME: Smoke test requests should be documented by ServiceOps or Traffic.

Contact information

The hCaptcha integration with on-wiki workflows is managed by the Product Safety and Integrity team as part of the WE4.2 anti-abuse signals hCaptcha project.

The proxy infrastructure is managed by Traffic SRE and ServiceOps SRE.