Wikilabels

From Wikitech

Wikilabels is a stand-alone service used to gather data from users to build AI models for ORES. It is maintained by the Wikimedia Scoring Platform team and is currently hosted on Nova_Resource:wikilabels (Cloud VPS).

Technical details

  • There are several instances:
    • wikilabels-02.eqiad.wmflabs: The main node, which uses PostgreSQL (db1004.eqiad.wmnet). It is accessible from labels.wmflabs.org
    • wikilabels-staging-01.eqiad.wmflabs: The staging node, which uses a similar setup and is accessible from labels-staging.wmflabs.org
    • wikilabels-experiment.eqiad.wmflabs: The node for tests and experiments. Accessible from labels-experiment.wmflabs.org
    • wikilabels-backups.eqiad.wmflabs: The node that keeps daily database backups of the main node. Accessible from wikilabels-dumps.wmflabs.org

Deployment guide

  • After changes are merged in the main repo, you need to update the deploy repo:
cd wikilabels-wmflabs-deploy/
git pull
cd submodules/wikilabels
git pull
cd ../..
git add submodules/wikilabels
git commit

When prompted for the commit message, write something like "Bumping wikilabels to HEAD", then push and deploy to staging:

git push
fab stage

Now it's on the staging node. Log it (using !log wikilabels in the #wikimedia-cloud channel on IRC). Test it, and if it works fine, move to prod:

git checkout deploy
git rebase origin/master
git push -f origin deploy
fab deploy

And log it!
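
Before moving from staging to prod, a quick reachability check of the staging site can catch an obviously broken deploy. The following is only a minimal sketch: the helper name staging_is_up is hypothetical, and the HTTPS URL is an assumption built from the staging hostname listed above. It confirms that the site answers without an HTTP error; it does not replace testing the labeling interface by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of the deploy repo): succeed only if the URL
# answers without an HTTP error status.
# Assumption: the staging site is served over HTTPS at the hostname listed above.
staging_is_up() {
    url="${1:-https://labels-staging.wmflabs.org/}"
    # -f: exit non-zero on HTTP errors (status >= 400)
    # -sS: no progress output, but still print error messages
    # --max-time 10: give up after 10 seconds
    curl -fsS --max-time 10 -o /dev/null "$url"
}
```

For example, `staging_is_up || echo "staging is not responding, do not deploy" >&2` could be run right after fab stage.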

A new labeling campaign

You need to first introduce a new campaign:

$ ssh wikilabels-02.eqiad.wmflabs
ladsgroup@wikilabels-02$ cd /srv/wikilabels/config
ladsgroup@wikilabels-02:/srv/wikilabels/config$ sudo -u www-data /srv/wikilabels/venv/bin/wikilabels new_campaign wikidatawiki "Edit quality (5k, 2018)" damaging_and_goodfaith DiffToPrevious 1 50
{'form': 'damaging_and_goodfaith', 'id': 38, 'view': 'DiffToPrevious', 'active': True, 'name': 'Edit quality (5k, 2018)', 'tasks_per_assignment': 50, 'labels_per_task': 1, 'wiki': 'wikidatawiki', 'info_url': None, 'created': datetime.datetime(2018, 7, 11, 13, 39, 54, 282569)}

Note the id (38 in this case). Now you need to load the data into the campaign. Download the file to your home directory, then pipe it into task_inserts:

ladsgroup@wikilabels-02:/srv/wikilabels/config$ cat ~/wikidatawiki.autolabeled_revisions.125k_2018.review.json | sudo -u www-data ../venv/bin/wikilabels task_inserts 38
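
Instead of copying the campaign id by hand, it can be extracted from the new_campaign output. This is a minimal sketch that assumes the output keeps the dictionary-style format shown above, with a single 'id' key; the helper name campaign_id is hypothetical and not part of the wikilabels utility.

```shell
#!/bin/sh
# Hypothetical helper: read new_campaign output on stdin and print the campaign id.
# Assumption: the output contains exactly one "'id': <number>" pair, as in the
# example output above.
campaign_id() {
    sed -n "s/.*'id': \([0-9][0-9]*\).*/\1/p"
}

# Demo with a shortened copy of the output shown above.
sample="{'form': 'damaging_and_goodfaith', 'id': 38, 'view': 'DiffToPrevious'}"
printf '%s\n' "$sample" | campaign_id   # prints 38
```

In practice the new_campaign command would be piped into campaign_id and the result stored in a shell variable, which is then passed to task_inserts in place of the literal 38.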

Restarting the service

Any time the connection to PostgreSQL is broken, we need to restart the wikilabels service:

service uwsgi-wikilabels-web restart
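
A restart only helps once the database is reachable again, so it can be worth checking PostgreSQL first. The sketch below makes several assumptions: that pg_isready (shipped with the PostgreSQL client tools) is installed on the node, that the database listens on the default port 5432, and that the host is the one named in the instance list above. The helper name restart_if_db_up is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: restart the web service only if PostgreSQL is accepting
# connections.
# Assumptions: pg_isready is installed; port 5432 is the PostgreSQL default and
# may not match the real setup; the host is taken from the instance list above.
restart_if_db_up() {
    db_host="${1:-db1004.eqiad.wmnet}"
    db_port="${2:-5432}"
    # pg_isready exits 0 when the server accepts connections; -t 5 waits at
    # most 5 seconds.
    if pg_isready -h "$db_host" -p "$db_port" -t 5 >/dev/null; then
        service uwsgi-wikilabels-web restart
    else
        echo "PostgreSQL at $db_host:$db_port is not accepting connections; not restarting" >&2
        return 1
    fi
}
```

Calling restart_if_db_up with no arguments uses the defaults; a different host and port can be passed as the first and second argument.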

Incidents

See also