Obsolete:Geowiki
This page describes the legacy Python-based Geowiki system. For the new system introduced in 2018, see Analytics/Systems/Geoeditors
Geowiki is a set of scripts used to automatically analyze the active editors per project/country. The generated data is split into a public part (available through http://gp.wmflabs.org/ (cf. domain description)), and a "foundation-only" part (available through https://stats.wikimedia.org/geowiki-private/ ).
Source code
The source code for the geowiki scripts themselves is at https://gerrit.wikimedia.org/r/#/admin/projects/analytics/geowiki .
The repository holding the generated public data can be found at https://gerrit.wikimedia.org/r/#/admin/projects/analytics/geowiki-data .
The repository holding the generated "foundation-only" data can be synced to machines by requiring puppet's misc::statistics::geowiki::data::private_bare::sync.
Generated data
The geowiki scripts generate several hundred files. To keep the overview manageable, ${WIKI_NAME} stands for the name of a wiki in the file names below.
Foundation-only data
- ${WIKI_NAME}_all.csv
- ${WIKI_NAME}_top10.csv
- region.csv
- global_south.csv
- global_south_editor_fractions.csv
- country.csv
- overall_by_lang.csv
- overall_by_lang_monthly.csv
(Currently, no visualization is offered for this data.)
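As an illustration of the naming scheme above, the following minimal sketch expands the ${WIKI_NAME} placeholder for the two per-wiki file patterns. The function name `per_wiki_files` is hypothetical and is not part of the geowiki scripts; only the two file name patterns are taken from this page.

```python
from string import Template

# Per-wiki file name patterns, copied from the list above.
PER_WIKI_PATTERNS = ["${WIKI_NAME}_all.csv", "${WIKI_NAME}_top10.csv"]


def per_wiki_files(wiki_names):
    """Expand the ${WIKI_NAME} placeholder for each given wiki name.

    Hypothetical helper for illustration only; the real geowiki
    scripts may build these names differently.
    """
    return [
        Template(pattern).substitute(WIKI_NAME=name)
        for name in wiki_names
        for pattern in PER_WIKI_PATTERNS
    ]


if __name__ == "__main__":
    for file_name in per_wiki_files(["en", "zh_min_nan"]):
        print(file_name)
```

With roughly 270 wiki names and two per-wiki files each, this expansion alone accounts for the several hundred files mentioned above.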
WIKI_NAME
${WIKI_NAME} is (as of 2013-09-15) any of ab, ace, af, ak, als, am, ang, an, arc, ar, arz, as, ast, av, ay, az, bar, ba, bat_smg, bcl, be, be_x_old, bg, bh, bi, bjn, bm, bn, bo, bpy, br, bs, bug, bxr, ca, cbk_zam, cdo, ceb, ce, chr, ch, chy, ckb, co, crh, cr, csb, cs, cu, cv, cy, da, de, diq, dsb, dv, dz, ee, el, eml, en, eo, es, et, eu, ext, fa, ff, fi, fiu_vro, fj, fo, frp, frr, fr, fur, fy, gan, ga, gd, glk, gl, gn, got, gu, gv, hak, ha, haw, he, hif, hi, hr, hsb, ht, hu, hy, ia, id, ie, ig, ik, ilo, io, is, it, iu, ja, jbo, jv, kaa, kab, ka, kbd, kg, ki, kk, kl, km, kn, koi, ko, krc, ksh, ks, ku, kv, kw, ky, lad, la, lbe, lb, lez, lg, lij, li, lmo, ln, lo, ltg, lt, lv, map_bms, mdf, mg, mhr, mi, mk, ml, mn, mrj, mr, ms, mt, mwl, my, myv, mzn, nah, nap, na, nds_nl, nds, ne, new, nl, nn, no, nov, nrm, nso, nv, ny, oc, om, or, os, pag, pam, pap, pa, pcd, pdc, pih, pi, pl, pms, pnb, pnt, ps, pt, qu, rm, rmy, rn, roa_rup, roa_tara, ro, rue, ru, rw, sah, sa, scn, sco, sc, sd, se, sg, sh, simple, si, sk, sl, sm, sn, so, sq, srn, sr, ss, stq, st, su, sv, sw, szl, ta, te, tet, tg, th, ti, tk, tl, tn, to, tpi, tr, ts, tt, tum, tw, ty, udm, ug, uk, ur, uz, vec, vep, ve, vi, vls, vo, war, wa, wo, wuu, xal, xh, yi, yo, za, zea, zh_classical, zh_min_nan, zh, zh_yue, zu.
To add further wikis, add them to the file geowiki/data/all_ids.tsv.
On the Hadoop cluster (foundation internal only)
Some of the data can be found on the analytics-hadoop cluster under the /wmf/data/archive/geowiki_legacy parent folder.