Nova Resource:Wikistats/Documentation
Wikistats
Description
A project that collects and displays statistics for MediaWiki wikis.
Purpose
Collecting statistics about MediaWiki installs on the internet.
Anticipated traffic level
10-100 hits per day
Anticipated time span
indefinite
Project status
currently running
Contact address
dzahn@wikimedia.org
Willing to take contributors or not
willing
Subject area narrow or broad
narrow (statistics)
Where to find the code
Wikistats consists of two parts: the Puppet manifests (in the operations/puppet git repo) and the Debian package (in the operations/debs/wikistats repo).
The Puppet part is divided into ./puppet/modules/role/manifests/wikistats/instance.pp, the role class which is applied to a node/instance, and the module in ./puppet/modules/wikistats/.
Manifests
- role::wikistats - configures the host name and SSL certs depending on labs vs. prod and uses the main classes below; this is all that needs to be included on an instance or node
- wikistats - the init.pp of the module; sets up the user/group, installs the package (if we're using labsdebprepo), and uses the other classes
- wikistats::cronjob - defines a cron job to update a table
- wikistats::db - installs mariadb and php-mysql
- wikistats::updates - installs php-cli, creates the log dir, and holds the definitions and configuration for the update cron jobs
- wikistats::web - does the Apache setup
(Currently the manifests do not install the package automatically yet; that is done manually.)
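To apply this on a Cloud VPS instance, assign role::wikistats to the instance (for example via the Puppet configuration panel in Horizon) and then trigger a Puppet run. A minimal sketch, assuming a standard agent setup:
sudo puppet agent --test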
How to build the Debian package
- git clone https://gerrit.wikimedia.org/r/operations/debs/wikistats
- cd wikistats
- "debuild" (signed) or "debuild -us -uc" (unsigned)
- cd ..
- optional: check which files would be installed by this: dpkg-deb -c wikistats_*_all.deb
- install: dpkg -i wikistats_*_all.deb
^ This is outdated: the code is still in this repo, but it is no longer shipped as an actual .deb package. Just git pull the files, or let the deploy-wikistats command do that for you.
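A sketch of the manual pull, assuming the code checkout lives at a path like /srv/wikistats (hypothetical; check where the code actually lives on the instance):
cd /srv/wikistats
git pull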
How to deploy the latest code
- /usr/local/bin/wikistats# ./deploy-wikistats deploy
Optionally use "backup" to make a backup of the current code before deploying, or "restore" to restore from the last backup.
- /usr/local/bin/wikistats# ./deploy-wikistats backup
- /usr/local/bin/wikistats# ./deploy-wikistats restore
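Putting the subcommands together, a cautious deploy could look like this (the same commands as above, just in sequence):
./deploy-wikistats backup
./deploy-wikistats deploy
./deploy-wikistats restore   # only if the new code misbehaves and you need to roll back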
How to fix DB grants if they break after deploy
grep db_pass /etc/wikistats/config.php
mysql -u root -p wikistats
mysql> grant all privileges on wikistats.* to 'wikistatsuser'@'localhost' identified by '<password from config.php>';
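To confirm the grant took effect, you can try connecting as the application user (user name taken from the grant above) and run a trivial query:
mysql -u wikistatsuser -p wikistats -e 'select 1;'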
How to add a new wiki
A common maintenance task is adding newly created wikis to the statistics tables in SQL. This means running an INSERT statement on the DB shell, followed by running the update.php script to fetch data for the first time. The query needed varies slightly depending on what type of wiki it is: each project (Wikipedia, Wiktionary, etc.) has its own table in the database.
First step: ssh to the current instance in the project "wikistats", which you can see in Horizon. As of 12 September we have wikistats-bookworm.wikistats.eqiad1.wikimedia.cloud. In Horizon, under Proxies, you can determine which instance is the backend for https://wikistats.wmcloud.org.
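For example, with the instance above (assuming your ssh config for Cloud VPS is already set up):
ssh wikistats-bookworm.wikistats.eqiad1.wikimedia.cloud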
Once connected, get a mysql shell with:
mysql -u root wikistats
MariaDB [wikistats]>
Wikipedia
An example for adding a Wikipedia in a new language that no project has used before.
MariaDB [wikistats]> insert into wikipedias (prefix, lang, loclang, method) values ("fat", "Fante", "Mfantse", 8);
The minimal values that you need to provide are:
- prefix (the language code or subdomain, like the "en" in en.wikipedia.org)
- lang (the name of the language in English)
- loclang (the name of the language in the language itself)
- method (this has historic reasons and nowadays should always be set to 8, which means fetching data from the API instead of the old scraping methods)
In this example no other project exists in the language "Fante".
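To check that the row was created as intended, you can select it back using the same columns as in the INSERT:
MariaDB [wikistats]> select prefix, lang, loclang, method from wikipedias where prefix="fat";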
If the local language name has non-ASCII characters, you have to convert them to HTML entities with something like https://onlinetools.com/utf8/convert-utf8-to-html-entities and store the HTML version in the database, as the tables are not using utf8.
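Since php-cli is installed on the instance (see wikistats::updates above), a local alternative to the online tool is PHP's mb_encode_numericentity; a sketch, using "Võro" as a stand-in name with a non-ASCII character:
php -r 'echo mb_encode_numericentity($argv[1], [0x80, 0x10FFFF, 0, 0xFFFFFF], "UTF-8"), "\n";' 'Võro'
This prints V&#245;ro, the entity-encoded form to store in the database.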
Each wiki creation ticket should link to an URL on meta where the new language was requested. (example https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Ghanaian_Pidgin). This is the source of truth and where to get the correct language name strings and where you can confirm the new prefix is a valid ISO-639 language code (example, linked from there: https://iso639-3.sil.org/code/gpe).
Wiktionary
In this example a new Wiktionary has been created, but the language is not entirely new as a project: a Wikipedia in the same language already exists, so we can use an "INSERT ... SELECT" query to take all the values from the wikipedias table row with the same prefix.
MariaDB [wikistats]> insert into wiktionaries (prefix, lang, loclang, method) select prefix,lang,loclang,method from wikipedias where prefix="ckb";
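As with the Wikipedia example, you can select the new row back to verify:
MariaDB [wikistats]> select prefix, lang, loclang, method from wiktionaries where prefix="ckb";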
Mediawikis
This is for non-WMF wikis, the general "all other MediaWikis" table. Here you need to provide the full URL to the wiki's api.php, together with the same method 8.
MariaDB [wikistats]> insert into mediawikis (method,statsurl) values (8,'https://www.qiuwenbaike.cn/api.php');
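The numeric id of the new row is what the manual update below addresses, so it can be useful to look it up right away:
MariaDB [wikistats]> select id, statsurl from mediawikis where statsurl='https://www.qiuwenbaike.cn/api.php';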
Manually updating data for a wiki
After adding a new wiki you can either just wait for the timers to update the table or run an update manually. Example:
/usr/lib/wikistats/update.php wp prefix fat
Here "wp" means "from the wikipedias table" and "prefix fat", so this is for "fat.wikipedia.org". You can find the short names of all the other tables inside update.php in a large switch statement.
You can also update wikis by ID. Example, a specific wiki from the mediawikis table:
/usr/lib/wikistats/update.php mw id 20247
To update an entire table you can also manually start the corresponding systemd services. Find them with:
systemctl list-units | grep wikistats
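and then start the one you need, e.g. (the unit name here is hypothetical, use a name from the listing):
systemctl start wikistats-update.service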