WMDE/Wikidata/Dispatching


Overview

  • Changes on Wikidata are buffered in the wb_changes table.
  • Dispatch state of wikis is stored in the wb_changes_dispatch table.
  • Multiple dispatchChanges.php scripts run in parallel, creating ChangeNotificationJob jobs on the client wikis.
  • When a job runs on a client wiki, ChangeHandler handles the change; this includes, for example, cache purges, refreshing links and injecting recentchanges records (see the sketch below).
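
A rough sketch of one dispatch pass, to make these bullet points concrete (simplified pseudocode, not the actual Wikibase code; apart from dispatchChanges.php, ChangeNotificationJob and ChangeHandler, every name below is hypothetical):

<?php
// Simplified, hypothetical sketch of one pass of a dispatchChanges.php process.
// Only dispatchChanges.php, ChangeNotificationJob and ChangeHandler are real names;
// all functions below are made up for illustration.
function dispatchOnePass() {
    // pick a client wiki whose state in wb_changes_dispatch is stale
    $clientWiki = pickStaleClientAtRandom();                 // hypothetical
    if ( !acquireRedisLock( $clientWiki ) ) {                // hypothetical
        return; // another dispatcher got there first
    }
    // read the buffered rows from wb_changes that this client is subscribed to
    $changes = selectPendingChangesFor( $clientWiki );       // hypothetical
    if ( $changes ) {
        // queue a ChangeNotificationJob on the client; its ChangeHandler will
        // purge caches, refresh links and inject recentchanges rows
        queueChangeNotificationJob( $clientWiki, $changes ); // hypothetical
    }
    updateDispatchState( $clientWiki );  // bumps chd_seen / chd_touched
    releaseRedisLock( $clientWiki );
}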

Full docs: https://doc.wikimedia.org/Wikibase/master/php/md_docs_topics_change-propagation.html

The process runs on wikidatawiki and testwikidatawiki. Does it run on beta?

Dispatch lag is linked to max lag: if dispatch lag increases, max lag will increase and the edit rate on Wikidata can drop dramatically. For example: https://grafana.wikimedia.org/dashboard/db/wikidata-edits?orgId=1&from=1541090549052&to=1541109255645

Occasional stuck locks

While watching the dispatching graphs on Grafana you will occasionally notice that one or two wikis get stuck. This is annoying and the root cause is unknown. However, it is not something to worry about too much, as the locks have a fairly short TTL.

Control

The starting of dispatchChanges.php is controlled by a cronjob in operations-puppet, which launches a new process every couple of minutes. Fine-grained control of the script can be done from within IS.php (InitialiseSettings.php) in mediawiki-config.
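
For orientation, here is a sketch of what the relevant stanza in IS.php might look like (the values below are made up; check the real InitialiseSettings.php in mediawiki-config for the current ones, and note that the value is assumed here to be in seconds):

// wmf-config/InitialiseSettings.php (sketch only; real values differ)
'wmgWikibaseDispatchMaxTime' => [
    'default' => 0,               // made-up value: no dispatching on ordinary wikis
    'wikidatawiki' => 60 * 9,     // made-up value: each process runs for ~9 minutes
    'testwikidatawiki' => 60 * 3, // made-up value
],

Setting a wiki's value to 0 is also how dispatching gets stopped (see "Stop dispatching" below).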

Number of dispatching threads

Control

The number of dispatching threads can be controlled with a combination of the timing of the cronjob in puppet and the "wmgWikibaseDispatchMaxTime" setting in IS.php.

For example, in https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/471088/3/wmf-config/InitialiseSettings.php the number of threads dispatching Wikidata's changes was increased from 2 to 3 by making each process run for a longer period of time.
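
As a rule of thumb (with hypothetical numbers): if the cronjob starts a new dispatchChanges.php process every 3 minutes and wmgWikibaseDispatchMaxTime lets each process run for about 9 minutes, then roughly 9 / 3 = 3 processes are dispatching at any given time. Lengthening the run time, or starting processes more often, therefore increases the number of concurrent dispatchers.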

Value choice

https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch-script can be a good tool for determining whether we need more dispatching threads or not.

If the dispatch process consistently shows 0 "no client passes" (i.e. the dispatchers never find themselves with no client waiting for changes), then we probably need more dispatchers.

Stop dispatching

wmgWikibaseDispatchMaxTime can be set to 0; after a few minutes dispatching will stop.

If you really need to stop dispatching immediately, change this setting and also kill the running processes on the mwmaint server.

Monitoring

Dispatching state on the repo

Job queue for ChangeNotificationJobs

TODO dashboards for monitoring the job queue?

How it actually works

Data Storage

Every time someone makes an edit on Wikidata, a new row is added to the wb_changes table in wikidatawiki. Here's an example:

MariaDB [wikidatawiki_p]> select * from wb_changes limit 1\G
*************************** 1. row ***************************
         change_id: 1014161077
       change_type: wikibase-item~update
       change_time: 20190924171504
  change_object_id: Q68474115
change_revision_id: 1019310059
    change_user_id: 142191
       change_info: {"compactDiff":"{\"arrayFormatVersion\":1,\"labelChanges\":[],\"descriptionChanges\":[\"el\",\"eo\",\"en\",\"zh\",\"sr-ec\",\"wuu\",\"vi\",\"sr-el\",\"it\",\"zh-hk\",\"ar\",\"pt-br\",\"tg-cyrl\",\"cs\",\"et\",\"gl\",\"id\",\"es\",\"en-gb\",\"ru\",\"he\",\"nl\",\"pt\",\"zh-tw\",\"nb\",\"tr\",\"zh-cn\",\"tl\",\"th\",\"ro\",\"ca\",\"pl\",\"fr\",\"bg\",\"ast\",\"zh-sg\",\"bn\",\"de\",\"zh-my\",\"ko\",\"da\",\"fi\",\"zh-mo\",\"hu\",\"ja\",\"en-ca\",\"ka\",\"nn\",\"zh-hans\",\"sr\",\"sq\",\"nan\",\"oc\",\"sv\",\"zh-hant\",\"sk\",\"uk\",\"yue\"],\"statementChanges\":[],\"siteLinkChanges\":[],\"otherChanges\":false}","metadata":{"page_id":68145928,"parent_id":1019293753,"comment":"\/* wbeditentity-update:0| *\/ Bot: - Add descriptions:(58 langs).","rev_id":1019310059,"user_text":"Mr.Ibrahembot","central_user_id":15992302,"bot":1}}

The change_info column is a compact serialization of the change and is used later to dispatch it. This table is trimmed by a cronjob on mwmaint1002 so that it contains nothing older than three days; otherwise it would grow without bound.

Wikidata (and other repos, like Commons) keeps track of the client wikis that subscribe to its entities. This is stored in the wb_changes_subscription table:

MariaDB [wikidatawiki_p]> select * from wb_changes_subscription where cs_entity_id like 'Q%' limit 5;
+-----------+--------------+------------------+
| cs_row_id | cs_entity_id | cs_subscriber_id |
+-----------+--------------+------------------+
| 100946988 | Q1           | afwiki           |
|  57021716 | Q1           | alswiki          |
| 116682143 | Q1           | amwiki           |
|  57060845 | Q1           | anwiki           |
|  57107362 | Q1           | arcwiki          |
+-----------+--------------+------------------+

Client wikis themselves keep track of exactly which parts of the repo's (e.g. Wikidata's) entities they are using, and on which pages, in a table called wbc_entity_usage (note that this table exists on all client wikis, not only on Wikidata).

This is an example from the Afrikaans Wikipedia:

MariaDB [afwiki_p]> select * from wbc_entity_usage where eu_entity_id = 'Q1';
+-----------+--------------+-----------+------------+
| eu_row_id | eu_entity_id | eu_aspect | eu_page_id |
+-----------+--------------+-----------+------------+
|    398481 | Q1           | C         |      39420 |
|    929039 | Q1           | L.af      |      70835 |
|    132666 | Q1           | O         |      39420 |
|    115881 | Q1           | S         |      39420 |
|    398482 | Q1           | T         |      39420 |
|    929040 | Q1           | T         |      70835 |
+-----------+--------------+-----------+------------+

"C" means "statements" (a.k.a. "claims"), "L.af" means "Label in Afrikaans language" , "O" means "Other" (currently means aliases), "S" means "Sitelinks" (to show them in sidebar), "T" means title. Note that "Q1" is being used in two different pages in different aspects. An item can be used in millions of pages in a client wiki.

The workflow using an example

Note: Change dispatching is only responsible for triggering a refresh (and injecting rows into RecentChanges). Fetching the actual data happens elsewhere in the code (see WikiPageUpdater in change-propagation.wiki).

The actual dispatching happens via several parallel cronjobs that each run a maintenance script for some time and then die (new ones respawn). The maintenance scripts coordinate among themselves, and with earlier, now-finished runs, using a Redis lock manager (the same code and infrastructure as the file backend lock manager) and a table called wb_changes_dispatch:

MariaDB [wikidatawiki_p]> select * from wb_changes_dispatch limit 5;
+-------------+-------------+------------+----------------+----------+--------------+
| chd_site    | chd_db      | chd_seen   | chd_touched    | chd_lock | chd_disabled |
+-------------+-------------+------------+----------------+----------+--------------+
| abwiki      | abwiki      | 1015936815 | 20190927141553 | NULL     |            0 |
| acewiki     | acewiki     | 1015936788 | 20190927141551 | NULL     |            0 |
| adywiki     | adywiki     | 1015936745 | 20190927141546 | NULL     |            0 |
| afwiki      | afwiki      | 1015936737 | 20190927141545 | NULL     |            0 |
| afwikibooks | afwikibooks | 1015936815 | 20190927141553 | NULL     |            0 |
+-------------+-------------+------------+----------------+----------+--------------+

Note: The chd_lock column has been replaced by the Redis lock manager, to avoid scripts holding long connections to the master database.

1. Each dispatching maintenance script picks, at random (to avoid race conditions with other scripts), a client wiki that hasn't been touched for a while and locks it in Redis so other scripts don't pick it up. Let's assume it picked afwiki. It then queries the wb_changes table to find all changes that happened after afwiki's chd_touched timestamp value:

MariaDB [wikidatawiki_p]> select change_object_id, change_info from wb_changes join wb_changes_dispatch where wb_changes.change_time > wb_changes_dispatch.chd_touched and chd_site = 'afwiki' limit 5\G
*************************** 1. row ***************************
change_object_id: Q3065585
     change_info: {"compactDiff":"{\"arrayFormatVersion\":1,\"labelChanges\":[],\"descriptionChanges\":[],\"statementChanges\":[\"P7335\"],\"siteLinkChanges\":[],\"otherChanges\":false}","metadata":{"page_id":2930153,"parent_id":1021184825,"comment":"\/* wbsetreference-add:2| *\/ [[Property:P7335]]: 93048, Adding sources to Mixer ID (P7335).","rev_id":1021184886,"user_text":"Premeditated","central_user_id":54410566,"bot":0}}
*************************** 2. row ***************************
change_object_id: Q31827561
     change_info: {"compactDiff":"{\"arrayFormatVersion\":1,\"labelChanges\":[],\"descriptionChanges\":[\"de\"],\"statementChanges\":[],\"siteLinkChanges\":[],\"otherChanges\":false}","metadata":{"page_id":33311769,"parent_id":898389167,"comment":"\/* wbsetdescription-set:1|de *\/ Insel auf den Philippinen, [[:toollabs:quickstatements\/#\/batch\/19155|batch #19155]]","rev_id":1021184887,"user_text":"Hog\u00fc-456","central_user_id":35946278,"bot":0}}
*************************** 3. row ***************************
change_object_id: Q66116999
     change_info: {"compactDiff":"{\"arrayFormatVersion\":1,\"labelChanges\":[\"ast\"],\"descriptionChanges\":[],\"statementChanges\":[],\"siteLinkChanges\":[],\"otherChanges\":false}","metadata":{"page_id":65740278,"parent_id":993826385,"comment":"\/* wbsetlabel-add:1|ast *\/ Hibbertia vestita var. thymifolia","rev_id":1021184888,"user_text":"XabatuBot","central_user_id":53605708,"bot":1}}
*************************** 4. row ***************************
change_object_id: Q34659082
     change_info: {"compactDiff":"{\"arrayFormatVersion\":1,\"labelChanges\":[],\"descriptionChanges\":[\"hy\",\"hyw\"],\"statementChanges\":[],\"siteLinkChanges\":[],\"otherChanges\":false}","metadata":{"page_id":36092135,"parent_id":972155915,"comment":"\/* wbeditentity-update:0| *\/ hy:description:2002 \u0569\u057e\u0561\u056f\u0561\u0576\u056b \u0574\u0561\u0575\u056b\u057d\u056b\u0576 \u0570\u0580\u0561\u057f\u0561\u0580\u0561\u056f\u057e\u0561\u056e \u0563\u056b\u057f\u0561\u056f\u0561\u0576 \u0570\u0578\u0564\u057e\u0561\u056e, hyw:description:2002 \u0569\u0578\u0582\u0561\u056f\u0561\u0576\u056b \u0544\u0561\u0575\u056b\u057d\u056b\u0576 \u0570\u0580\u0561\u057f\u0561\u0580\u0561\u056f\u0578\u0582\u0561\u056e \u0563\u056b\u057f\u0561\u056f\u0561\u0576 \u0575\u0585\u0564\u0578\u0582\u0561\u056e","rev_id":1021184889,"user_text":"\u0531\u0577\u0562\u0578\u057f\u054f\u0546\u0542","central_user_id":45853834,"bot":1}}
*************************** 5. row ***************************
change_object_id: Q60369602
     change_info: {"compactDiff":"{\"arrayFormatVersion\":1,\"labelChanges\":[],\"descriptionChanges\":[],\"statementChanges\":[\"P356\",\"P304\",\"P478\",\"P819\",\"P1433\"],\"siteLinkChanges\":[],\"otherChanges\":false}","metadata":{"page_id":60243349,"parent_id":986272026,"comment":"\/* wbeditentity-update:0| *\/ batch import from [[Q654724|SIMBAD]] reference \"1996ApJ...466..732G\"","rev_id":1021184890,"user_text":"Ghuron","central_user_id":2200842,"bot":0}}

2. The script then queries wb_changes_subscription to see which of these entities afwiki is actually subscribed to:

MariaDB [wikidatawiki_p]> select change_object_id, change_info from wb_changes join wb_changes_dispatch join wb_changes_subscription on change_object_id = cs_entity_id where wb_changes.change_time > wb_changes_dispatch.chd_touched and chd_site = 'afwiki' and cs_subscriber_id = 'afwiki' limit 5\G
*************************** 1. row ***************************
change_object_id: Q3180666
     change_info: {"compactDiff":"{\"arrayFormatVersion\":1,\"labelChanges\":[\"zh\"],\"descriptionChanges\":[],\"statementChanges\":[],\"siteLinkChanges\":[],\"otherChanges\":false}","metadata":{"page_id":3038120,"parent_id":999138832,"comment":"< A long comment>","rev_id":1021103671,"user_text":"LogainmBot","central_user_id":58945035,"bot":1}}
*************************** 2. row ***************************
change_object_id: Q469681
     change_info: {"compactDiff":"{\"arrayFormatVersion\":1,\"labelChanges\":[],\"descriptionChanges\":[\"hi\"],\"statementChanges\":[],\"siteLinkChanges\":[],\"otherChanges\":false}","metadata":{"page_id":442999,"parent_id":1021102387,"comment":"< A long comment>","rev_id":1021103680,"user_text":"Vidariv","central_user_id":5888,"bot":0}}

3. The script then queues a ChangeNotificationJob on afwiki with the items that have changed and their change_info values (i.e. telling afwiki "Hey, the Chinese label of Q3180666 and the Hindi description of Q469681 have changed"). That job checks the changes against the aspects afwiki actually uses, via the wbc_entity_usage table on afwiki:

MariaDB [afwiki_p]> select * from wbc_entity_usage where eu_entity_id in ('Q3180666', 'Q469681') limit 5;
+-----------+--------------+-----------+------------+
| eu_row_id | eu_entity_id | eu_aspect | eu_page_id |
+-----------+--------------+-----------+------------+
|    872799 | Q3180666     | C.P1015   |     224030 |
|    872807 | Q3180666     | C.P1048   |     224030 |
|    872798 | Q3180666     | C.P1053   |     224030 |
|    872815 | Q3180666     | C.P1157   |     224030 |
|    872813 | Q3180666     | C.P1222   |     224030 |
+-----------+--------------+-----------+------------+

This is to avoid triggering a refreshLinks or InjectRCRecords job when the used aspects of the changed entities haven't actually changed.

For example, if the page on afwiki only uses the English label, and the Persian aliases have changed, no action is needed here.

If there is a match (or matches), the job triggers jobs to refresh the page(s) so they use the new data, and injects rows into the client's recentchanges table. At the end, the dispatching script updates chd_seen and chd_touched, unlocks the wiki in Redis, and starts again from step 1.
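
A minimal sketch of the aspect check in step 3, assuming a simple prefix match between changed and used aspects (illustrative only; the real logic lives in Wikibase's change handling code and is more involved):

<?php
// Hypothetical helper illustrating the aspect matching idea; not the actual
// Wikibase implementation.

/**
 * @param string[] $changedAspects aspects touched by the edit, e.g. [ 'L.zh' ]
 * @param string[] $usedAspects    aspects a client page uses, e.g. [ 'L.en', 'C.P1015', 'T' ]
 * @return bool whether the page needs a refresh / RC entry
 */
function pageIsAffected( array $changedAspects, array $usedAspects ): bool {
    foreach ( $changedAspects as $changed ) {
        foreach ( $usedAspects as $used ) {
            // let a bare aspect like "C" match modified usages like "C.P1015", and vice versa
            if ( $changed === $used
                || strpos( $used, $changed . '.' ) === 0
                || strpos( $changed, $used . '.' ) === 0
            ) {
                return true;
            }
        }
    }
    return false;
}

// The example above: the page only uses the English label, the aliases ("O") changed.
var_dump( pageIsAffected( [ 'O' ], [ 'L.en' ] ) );         // bool(false) — nothing to do
// The Chinese label changed and the page shows the Chinese label:
var_dump( pageIsAffected( [ 'L.zh' ], [ 'L.zh', 'T' ] ) ); // bool(true)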

Note: There used to be an aspect called "X", meaning "All", which basically said "notify for any change on the given item", but it is deprecated now.