Obsolete:Geowiki

This page contains historical information. It may be outdated or unreliable.

This page describes the legacy Python-based Geowiki system. For the new system introduced in 2018, see Analytics/Systems/Geoeditors.


Geowiki is a set of scripts used to automatically analyze the active editors per project/country. The generated data is split into a public part (available through http://gp.wmflabs.org/ (cf. domain description)), and a "foundation-only" part (available through https://stats.wikimedia.org/geowiki-private/ ).

Source code

The source code for the geowiki scripts themselves is at https://gerrit.wikimedia.org/r/#/admin/projects/analytics/geowiki .

The repository holding the generated public data can be found at https://gerrit.wikimedia.org/r/#/admin/projects/analytics/geowiki-data . The repository holding the generated "foundation-only" data can be synced to machines by requiring puppet's misc::statistics::geowiki::data::private_bare::sync.

Generated data

The geowiki scripts generate several hundred files. To keep this overview manageable, we use ${WIKI_NAME} as a placeholder for the name of a wiki.

Foundation-only data

(Currently, no visualization is offered for this data.)

WIKI_NAME

${WIKI_NAME} is (as of 2013-09-15) any of ab, ace, af, ak, als, am, ang, an, arc, ar, arz, as, ast, av, ay, az, bar, ba, bat_smg, bcl, be, be_x_old, bg, bh, bi, bjn, bm, bn, bo, bpy, br, bs, bug, bxr, ca, cbk_zam, cdo, ceb, ce, chr, ch, chy, ckb, co, crh, cr, csb, cs, cu, cv, cy, da, de, diq, dsb, dv, dz, ee, el, eml, en, eo, es, et, eu, ext, fa, ff, fi, fiu_vro, fj, fo, frp, frr, fr, fur, fy, gan, ga, gd, glk, gl, gn, got, gu, gv, hak, ha, haw, he, hif, hi, hr, hsb, ht, hu, hy, ia, id, ie, ig, ik, ilo, io, is, it, iu, ja, jbo, jv, kaa, kab, ka, kbd, kg, ki, kk, kl, km, kn, koi, ko, krc, ksh, ks, ku, kv, kw, ky, lad, la, lbe, lb, lez, lg, lij, li, lmo, ln, lo, ltg, lt, lv, map_bms, mdf, mg, mhr, mi, mk, ml, mn, mrj, mr, ms, mt, mwl, my, myv, mzn, nah, nap, na, nds_nl, nds, ne, new, nl, nn, no, nov, nrm, nso, nv, ny, oc, om, or, os, pag, pam, pap, pa, pcd, pdc, pih, pi, pl, pms, pnb, pnt, ps, pt, qu, rm, rmy, rn, roa_rup, roa_tara, ro, rue, ru, rw, sah, sa, scn, sco, sc, sd, se, sg, sh, simple, si, sk, sl, sm, sn, so, sq, srn, sr, ss, stq, st, su, sv, sw, szl, ta, te, tet, tg, th, ti, tk, tl, tn, to, tpi, tr, ts, tt, tum, tw, ty, udm, ug, uk, ur, uz, vec, vep, ve, vi, vls, vo, war, wa, wo, wuu, xal, xh, yi, yo, za, zea, zh_classical, zh_min_nan, zh, zh_yue, zu .

To add further wikis, add them in the file geowiki/data/all_ids.tsv.
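
The ${WIKI_NAME} placeholder convention can be illustrated with a minimal Python sketch. The file-name pattern below is hypothetical and only shows how one pattern expands into one name per wiki; the actual geowiki file names and the format of all_ids.tsv are not described on this page.

```python
from string import Template

# A few names taken from the WIKI_NAME list above; the full list is much longer.
WIKI_NAMES = ["ab", "de", "en", "zh_min_nan"]

# Hypothetical file-name pattern, for illustration only.
# The real geowiki file names are not listed on this page.
PATTERN = Template("${WIKI_NAME}_example.csv")


def expand(pattern, wiki_names):
    """Return one concrete name per wiki by substituting ${WIKI_NAME}."""
    return [pattern.substitute(WIKI_NAME=name) for name in wiki_names]


if __name__ == "__main__":
    for file_name in expand(PATTERN, WIKI_NAMES):
        print(file_name)
```

Python's string.Template uses the same ${NAME} syntax as the placeholder on this page, which is why it is used here instead of str.format.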

On the hadoop cluster (foundation internal only)

Some of the data can be found on the analytics-hadoop cluster under the /wmf/data/archive/geowiki_legacy parent folder.
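
As a rough sketch, the archived folder could be listed from a machine with access to the cluster by calling the standard `hdfs dfs -ls` command. This assumes an `hdfs` client is installed and that the user has the necessary permissions; the helper names below are hypothetical and not part of geowiki.

```python
import shutil
import subprocess

# Parent folder named in the text above.
ARCHIVE_PATH = "/wmf/data/archive/geowiki_legacy"


def build_ls_command(path=ARCHIVE_PATH):
    """Build the argument list for `hdfs dfs -ls <path>`."""
    return ["hdfs", "dfs", "-ls", path]


def list_archive(path=ARCHIVE_PATH):
    """Run the listing and return its output, or None if no hdfs client is installed."""
    if shutil.which("hdfs") is None:
        return None
    result = subprocess.run(
        build_ls_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    listing = list_archive()
    print(listing if listing is not None else "hdfs client not found on this machine")
```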

See also