Tool:Wikiloves

Toolforge tools
Wikiloves
Website https://wikiloves.toolforge.org/
Author(s) Danilo.mac, Jean-Frédéric
Maintainer(s) Jean-Frédéric
Source code https://github.com/JeanFred/wikiloves
License GNU General Public License 2.0
Issues Open tasks · Report a bug
Admin log Nova_Resource:Tools.wikiloves/SAL

wikiloves is a stats tool for Wiki Loves contests. Configuration is largely on-wiki at Module:WL data.

Definitions

The tool uses the following terms for its core concepts. They may be renamed in the future to improve clarity or to match an expanded scope.

Scope/Event
For example, Wiki Loves Monuments or Wiki Loves Africa
An event also has a slug, for example monuments or africa
Year
For example, 2015
Country
For example, Germany or Namibia

(The tool tries to avoid the terms competition and contest, which are often used interchangeably for any of these concepts.)

These terms combine as follows:

Edition
An occurrence of an event in a given year. For example, Wiki Loves Monuments 2015 or Wiki Loves Africa 2017.
Instance
An occurrence of an edition in a given country. For example, Wiki Loves Monuments 2015 in Germany or Wiki Loves Africa 2017 in Namibia.
An instance also has a start time and an end time.
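
As an illustration of how these terms nest, here is a minimal Python sketch. The class and attribute names are hypothetical and are not taken from the tool's source code; the dates are illustrative placeholders, not the real contest dates.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    name: str  # e.g. "Wiki Loves Monuments"
    slug: str  # e.g. "monuments"


@dataclass(frozen=True)
class Edition:
    """An occurrence of an event in a given year."""
    event: Event
    year: int

    @property
    def label(self) -> str:
        return f"{self.event.name} {self.year}"


@dataclass(frozen=True)
class Instance:
    """An occurrence of an edition in a given country, with a timeframe."""
    edition: Edition
    country: str
    start: datetime
    end: datetime

    @property
    def label(self) -> str:
        return f"{self.edition.label} in {self.country}"


wlm = Event(name="Wiki Loves Monuments", slug="monuments")
wlm_2015 = Edition(event=wlm, year=2015)
wlm_2015_de = Instance(
    edition=wlm_2015,
    country="Germany",
    start=datetime(2015, 9, 1),             # placeholder start time
    end=datetime(2015, 9, 30, 23, 59, 59),  # placeholder end time
)
print(wlm_2015_de.label)  # Wiki Loves Monuments 2015 in Germany
```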

Metrics

The tool is concerned with the following metrics:

Uploads
How many files were uploaded during the local competition timeframe
Images used in the wikis
How many of these uploads are in use in the wikis
Uploaders
How many distinct users uploaded files
Uploaders registered after competition start
How many uploaders registered their Wikimedia account after the competition started
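
The four metrics can be sketched as follows. This is a simplified illustration and not the tool's actual code: the Upload record and the function name are hypothetical, the uploads are assumed to be already restricted to the competition timeframe, and timestamps are assumed to be YYYYMMDDHHMMSS integers.

```python
from dataclasses import dataclass


@dataclass
class Upload:
    title: str
    uploader: str
    uploader_registered: int  # registration time, YYYYMMDDHHMMSS
    in_use: bool              # True if the file is used on at least one wiki


def compute_metrics(uploads, start):
    """Compute the four base metrics for one instance."""
    # One entry per distinct uploader, mapped to their registration time.
    registrations = {u.uploader: u.uploader_registered for u in uploads}
    return {
        "uploads": len(uploads),
        "images_used": sum(1 for u in uploads if u.in_use),
        "uploaders": len(registrations),
        "uploaders_registered_after_start": sum(
            1 for reg in registrations.values() if reg > start
        ),
    }


uploads = [
    Upload("A.jpg", "Alice", 20300902064618, True),
    Upload("B.jpg", "Alice", 20300902064618, False),
    Upload("C.jpg", "Bob", 20160903173639, False),
]
metrics = compute_metrics(uploads, start=20300901050000)
print(metrics)
```

With this sample data, Alice counts as registered after the start and Bob does not.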

How it works

Configuration

The configuration of the tool lives on-wiki at commons:Module:WL data, which lists events, editions and instances. The structure there is parsed at every update; however, only whitelisted scopes are taken into account.

Update

In update mode, the tool iterates over every instance of every edition of every event/scope and infers a Wikimedia Commons category for it, generally following the convention Images from <event> <year> in <country>. It then runs a database query on that category (via the Wiki Replicas) to retrieve information on the files (upload time, usage on wikis) and their uploaders (name, registration date). Files uploaded outside of the instance (i.e., before the start time or after the end time) are discarded. The data is then aggregated and stored in a local JSON-based database (see the JSON database schema section for an example).
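
The category convention and the timeframe filter can be sketched like this. The function names are hypothetical, and the sketch only covers the general convention, not the exceptions the tool may handle. Underscores replace spaces, as in the category value shown in the JSON database schema section.

```python
def infer_category(event_name, year, country):
    """Build a category name following the general naming convention."""
    title = f"Images from {event_name} {year} in {country}"
    return title.replace(" ", "_")


def within_instance(upload_time, start, end):
    """Keep only uploads made between the start and end times (inclusive)."""
    return start <= upload_time <= end


category = infer_category("Wiki Loves Dumplings", 2030, "Syldavie")
print(category)  # Images_from_Wiki_Loves_Dumplings_2030_in_Syldavie

# Timestamps are YYYYMMDDHHMMSS integers, as in the JSON database.
start, end = 20300901050000, 20301001045959
upload_times = [20300831235959, 20300901050000, 20300915120000, 20301001050000]
kept = [t for t in upload_times if within_instance(t, start, end)]
print(kept)  # [20300901050000, 20300915120000]
```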

Updates run regularly (via cron jobs):

  • One update every night (UTC) for all scopes/editions/instances
  • One high-frequency update, every 15 minutes, for in-progress editions.
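
One way to pick the editions for the high-frequency update is sketched below. It assumes that an edition is "in progress" when at least one of its instances is currently running, which is an assumption and not a documented rule. The function names are hypothetical, and the data layout follows the JSON database schema section.

```python
from datetime import datetime, timezone


def to_timestamp(moment):
    """Convert a datetime to a YYYYMMDDHHMMSS integer."""
    return int(moment.strftime("%Y%m%d%H%M%S"))


def in_progress_editions(db, now):
    """Return the editions with at least one instance running at `now`."""
    current = to_timestamp(now)
    return sorted(
        edition
        for edition, countries in db.items()
        if any(c["start"] <= current <= c["end"] for c in countries.values())
    )


db = {
    "dumplings2029": {"Syldavie": {"start": 20290901050000, "end": 20291001045959}},
    "dumplings2030": {"Syldavie": {"start": 20300901050000, "end": 20301001045959}},
}
now = datetime(2030, 9, 15, 12, 0, tzinfo=timezone.utc)
running = in_progress_editions(db, now)
print(running)  # ['dumplings2030']
```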

Display

In display mode (i.e., when browsing the website), data is read from the JSON file, then transformed and aggregated in various ways for display. The existing views are:

Main page
  • For every scope, displays a summary table of the editions: how many countries participated, and the aggregated four base metrics
  • A summary table of all countries and the editions they took part in
Event page
e.g. monuments
The same scope-level summary table
For every country, the summary table of their stats in every edition
Country page
e.g. country/Tunisia
A summary table per scope of all editions in which the country took part
Edition page
e.g. monuments/2017
A summary table of the metrics per country, and a graph of uploads over time (the 'country-race')
Instance page
e.g. monuments/2017/Tunisia
A day-by-day table of image uploads and joiners. A joiner is an uploader who uploaded their first image on that day.
Instance uploaders page
e.g. monuments/2017/Tunisia/users
A sorted table of all uploaders, their upload count (as part of the competition) and file usage, and their registration date
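
As an illustration of the kind of aggregation behind the summary tables, the sketch below adds up the per-country figures of one edition. The function name and the second country are hypothetical, and the field names follow the JSON database schema section. Simply summing user counts treats a user active in two countries as two users; whether the tool does the same is not stated here.

```python
def summarize_edition(countries):
    """Aggregate the per-country stats of one edition into a summary row."""
    totals = {"countries": len(countries), "count": 0, "usage": 0,
              "usercount": 0, "userreg": 0}
    for stats in countries.values():
        for key in ("count", "usage", "usercount", "userreg"):
            totals[key] += stats[key]
    return totals


edition = {
    "Syldavie": {"count": 26, "usage": 0, "usercount": 2, "userreg": 1},
    "Bordurie": {"count": 10, "usage": 3, "usercount": 4, "userreg": 2},
}
summary = summarize_edition(edition)
print(summary)
# {'countries': 2, 'count': 36, 'usage': 3, 'usercount': 6, 'userreg': 3}
```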

Scope

This tool gathers statistics on any Wiki Loves contest (whether or not it is actually called Wiki Loves X) that follows the scope/edition/instance pattern. If your project does not follow that pattern, for example because it is organised in only one country, then this tool is unfortunately not a good fit for it.

Future developments may relax these requirements, or expose the base functionality (extract data on one Commons category).

Technical aspects

Development

A docker-compose setup lets you easily spin up the webserver with some fixture data.

Administration

Deployment on Toolforge is done using an Ansible playbook.

JSON database schema

A quick annotated excerpt from the internal JSON database:

{
  "dumplings2030": {        # Edition (= event in a given year)
    "Syldavie": {           # Country
      "count": 26,               # Total image count
      "usage": 0,                # Count of the images in-use (subset of the previous)
      "usercount": 2,            # Count of the participating users
      "userreg": 1,              # Count of the participating users who registered after the competition started (subset of the previous)

      "category": "Images_from_Wiki_Loves_Dumplings_2030_in_Syldavie",
      "start": 20300901050000,
      "end": 20301001045959,

      "users": {
        "Alice": {
          "count": 4,             # Count of uploads (as part of the edition)
          "usage": 0,             # Count of this user's uploads that are in use
          "reg": 20300902064618   # Registration date
        },
        "Bob": {
          "count": 22,
          "usage": 0,
          "reg": 20160903173639
        }
      },

      "data": {
        "20300902": {           # A day of the competition
          "images": 4,          # Count of images uploaded that day
          "joiners": 1,         # Count of uploaders who uploaded their first image on that day
          "newbie_joiners": 1   # Among these, how many registered their account after the competition started
        },
        "20300903": {           # Another day of the competition
          "images": 22,
          "joiners": 1,
          "newbie_joiners": 0
        }
      }
    }
  }
}
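
The per-day "data" entries could be derived from raw upload records along these lines. This is a sketch under assumptions, not the tool's implementation: the function name and input shapes are hypothetical, and each upload is assigned to a day by the first eight digits of its YYYYMMDDHHMMSS timestamp.

```python
from collections import defaultdict


def daily_table(uploads, registrations, start):
    """Build the day-by-day table from (uploader, upload_time) pairs.

    `registrations` maps each uploader to their registration timestamp.
    """
    days = defaultdict(lambda: {"images": 0, "joiners": 0, "newbie_joiners": 0})
    seen = set()
    # Process uploads chronologically so the first upload defines the join day.
    for uploader, upload_time in sorted(uploads, key=lambda u: u[1]):
        day = str(upload_time)[:8]
        days[day]["images"] += 1
        if uploader not in seen:
            seen.add(uploader)
            days[day]["joiners"] += 1
            if registrations[uploader] > start:
                days[day]["newbie_joiners"] += 1
    return dict(days)


registrations = {"Alice": 20300902064618, "Bob": 20160903173639}
uploads = [
    ("Alice", 20300902070000),
    ("Alice", 20300902080000),
    ("Bob", 20300903090000),
    ("Bob", 20300903091500),
    ("Alice", 20300903100000),
]
table = daily_table(uploads, registrations, start=20300901050000)
print(table)
```

Here Alice joins on 20300902 as a newly registered user, while Bob joins on 20300903 with an account that predates the competition.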