Tool:InteGraality
Phabricator project: #tool-integraality
Appearance
| Website | https://integraality.toolforge.org/ |
| Author(s) | Jean-Frédérictalk |
| Maintainer(s) | Jean-Frédéric (View all) |
| Source code | https://github.com/JeanFred/integraality/ |
| License | MIT License |
| Issues | Open tasks · Report a bug |
| Admin log | Nova_Resource:Tools.integraality/SAL |
InteGraality is a tool generating dashboards of property coverage for a given part of Wikidata.
Modes
There are three possible modes:
- “Queries”, invoked from following the “🔍” in the cell. It uses the
queriesendpoint. - “Manual update”, invoked from the “update” link in the template header. It uses the
updateendpoint. - “Periodic update”, invoked by a cron job (currently weekly) which loops through all the dashboards on a given wiki and updates them.
Architecture
The tool has two main components:
- A Flask web app (served via uWSGI behind Toolforge's Kubernetes webservice) handling on-demand requests (
/update,/queries). - A CLI (
python -m integraality.pages_processor) that batch-updates all dashboards on a given wiki. This is invoked weekly by a scheduled Toolforge job.
Both share the same Python codebase and use pywikibot to read/write wiki pages.
Data flow:
- A user places
{{Property dashboard}}on a wiki page with config parameters PagesProcessorreads the page, parses the template configPropertyStatisticsruns SPARQL queries, one per column (via Wikidata Query Service or QLever) to compute coverage across all groupingsResultsFormatterformats the results as a wikitext table- The table is written back to the wiki page
Redis caches parsed template configs to speed up the /queries endpoint (which needs to reconstruct the SPARQL query without re-fetching the wiki page).
Endpoints
| Endpoint | Purpose |
|---|---|
/ |
Landing page |
/healthz |
Health check (used by Toolforge Kubernetes) |
/update?url=page_url&page=page_name |
Triggers the update of the given dashboard page |
/update/stream |
Server-Sent Events stream for live update progress |
/queries?url=page_url&page=page_name |
Returns the SPARQL queries behind the 🔍 links |
Deployment
Deployment is automated via an Ansible playbook, deploy/main.yml. It is invoked using:
./bin/deploy-to-toolforge.sh <username>
Logs
- Cron job:
logs/weekly-update-wikidata-[out|err].log - Webservice startup logs:
webservice logs -f - Webservice logs:
uwsgi.log