
Data about how humans collaborate. Loads of it.

Data on wikistats right now

 page types
 namespace 0
 not namespace 0
 article (content)
 not talk or article
 Revert types
 sha matching
 unknown (comment mentions partial revert but no sha match)
 User categories
 unknown (? this appears in the charts with no definition)
 "start of project"
 per-year totals
 per-month totals
 top ranking
 reverted editors
 reverted articles
 reverted non-project pages
 5+ edits
 with speakers of a language
 with a lot of traffic to a project / language
 millions of speakers of a language
 article count
 views per hour
 truncate / exclude
 less than 10 edits per month
 less than 10 articles per project
 classic metrics
 article count
 new articles per day
 edits per month
 new editors
 active editors (5+ / month)
 very active editors (100+ / month)
 article size
 > 0.5kb
 > 2kb
 mean edits
 bytes per article
 database size
 quarterly rankings
  edits per month thresholds
 users: 1+, 3+, 5+, 10+, 100+, 250+, 1000+, 2500+, 10000+
 bots: 5+, 10+, 100+, 1000+, 10000+
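
The "N+" thresholds above read as cumulative cut-offs: an editor with 120 edits in a month counts towards 1+, 3+, 5+, 10+ and 100+. A minimal sketch of that counting follows, assuming a per-month mapping from editor to edit count. The input shape, the function name and the sample data are hypothetical and not taken from Wikistats.

```python
from bisect import bisect_left

# Thresholds copied from the list above.
USER_THRESHOLDS = [1, 3, 5, 10, 100, 250, 1000, 2500, 10000]
BOT_THRESHOLDS = [5, 10, 100, 1000, 10000]


def editors_at_or_above(edit_counts, thresholds):
    """Map each threshold N to the number of editors with N or more edits.

    edit_counts: hypothetical dict of editor name -> edits in one month.
    """
    counts = sorted(edit_counts.values())
    # bisect_left returns how many editors are strictly below the threshold.
    return {t: len(counts) - bisect_left(counts, t) for t in thresholds}


# Made-up sample month, for illustration only.
sample = {"alice": 1, "bob": 4, "carol": 5, "dave": 120}
levels = editors_at_or_above(sample, USER_THRESHOLDS)
print(levels[5])    # active editors (5+ / month)
print(levels[100])  # very active editors (100+ / month)
```

Sorting once and bisecting per threshold keeps the cost low when the same month is evaluated against both the user and the bot threshold lists.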

        - name
        - creation
        - language
        - Regions associated
        - project
        - creation
        - namespace
        - name
        - estimated population
        - name
        - Estimated number of speakers

Wikipedia page
For a given month: 
   --- per project: article count, views per hour, regions ?????, speakers in millions (probably summed over the regions listed), editors (5+ edits) per million speakers (for the regions), prim + sec speakers ????
            -- Only Wikipedias which contain 10 or more articles and which received 10 or more edits in the last month are listed, plus a list of those not included
   -- Links to other pages:
       -- Summary
                -- pageviews, article count, new articles per day, edits per month, active editors, very active editors, new editors, speakers (same as above), editors per million speakers + charts (namespace 0 only)
       -- tables
             -- Monthly counts & quarterly rankings
                 -- editors data, articles data, database data ????, links (internal, interwiki, image, external, redirects)
                -- Variation per month over last 6 month
                -- Absolute number per month since project creation
                 -- Rank for this project in comparison to other projects with more than 1000 articles, over every dimension, per month since project creation
           -- Edit activity levels per editor class and namespace
                -- By bucket of number of edits (1, 2, 5, 10, 25,  100 ...)
                -- Monthly since project creation
          --current Distribution of number of article edits over registered editors (no bots) (1 : √10 : 10 : 10√10 : 100 : 100√10 : 1000 ...scale)
          --current top 50 recently active wikipedians (no bot), by number of contributions
          --current top 20 recently absent top editors, by number of contributions
          --current anonymous users stats (not for enwiki - only percentage of total number of edits)
          --current top 50 bots, by number of contributions
          --monthly Articles containing at least one internal link, by number of characters (readable text, disregarding wiki- and html codes, hidden links, etc.; also headers do not count)
          --monthly  database records per namespace, categorised article and binaries (images, sound files...)
          -- monthly Most edited articles (out of date)
       -- charts
                  Charts starting at project start, one bar per month
                  Color coding (on/off toggle), % variation (toggle on/off)
                  Link to dedicated chart per wikipedia project + grand total 
         -- wikipedians
           -- contributors --
           -- new wikipedians --
           -- active wikipedians (5+ contribs this month) --
           -- very active wikipedians (100+ contribs this month) --
          -- count official
          -- count alternate (last updated jan 2010 for en, feb 2014 for most others)
          -- new articles per day
          -- edits per article
          -- bytes per article (last updated 2010 for en, feb 2014 for most others)
          -- articles over 0.5kb (last updated 2010 for en, feb 2014 for most others)
          -- articles over 2kb (%) (last updated 2010 for en, feb 2014 for most others)
        -- database
          -- edits per month
          -- database size (last updated 2010 for en, feb 2014 for most others)
          -- words (last updated 2010 for en, feb 2014 for most others)
        -- links
          -- internal links (last updated 2010 for en, feb 2014 for most others)
          -- links to other wikipedias (last updated 2010 for en, feb 2014 for most others)
          -- binaries (last updated 2010 for en, feb 2014 for most others)
          -- external links (last updated 2010 for en, feb 2014 for most others)
          -- redirects
    -- Other section: Comparisons
       List of the dedicated charts listed above
       -- Overview of recent months: summarizes every dedicated chart over 5 months for every project
                  Visits per day : (discontinued since 2004)
                  Page requests per day : (discontinued since 2004)
      -- pageviews
      -- projects current status
      -- bot activity - editing
      -- bot activity -- creation only
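
The 1 : √10 : 10 : 10√10 : 100 scale used for the distribution of article edits is a half-decade logarithmic scale: each step multiplies the previous one by √10. A minimal sketch of bucketing edit counts on that scale follows, assuming each editor is counted once, in the bucket whose lower bound is the largest scale point not exceeding their edit count. The actual Wikistats tables may report cumulative counts instead. All function names are hypothetical.

```python
from collections import Counter


def half_decade_bucket(edits):
    """Index k of the bucket [10**(k/2), 10**((k+1)/2)) that contains `edits`."""
    if edits < 1:
        raise ValueError("edit count must be at least 1")
    # 10**(k/2) <= edits  is equivalent to  10**k <= edits**2, so k is the
    # number of decimal digits of edits**2 minus one (exact integer arithmetic).
    return len(str(edits * edits)) - 1


def bucket_label(k):
    """Label of bucket k in the notation used above: 1, √10, 10, 10√10, ..."""
    whole, half = divmod(k, 2)
    if half == 0:
        return str(10 ** whole)
    return "√10" if whole == 0 else f"{10 ** whole}√10"


def distribution(edit_counts):
    """Count editors per bucket, given an iterable of per-editor edit counts."""
    return Counter(bucket_label(half_decade_bucket(n)) for n in edit_counts)


# Made-up edit counts, for illustration only.
print(distribution([1, 3, 4, 10, 31, 32, 100, 500]))
```

Squaring the edit count avoids floating-point logarithms, so counts that sit exactly on a power of ten always land in the intended bucket.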

Edits & Reverts
 --- Edits & reverts count
         -- Edits (on namespace 0) & reverts (using sha1 matching)
         -- since start of project, for each project
         -- Registered / anonymous / bots (editor class) + total
         -- Absolute and relative numbers - Charts
 ---- Edit and Revert Trends
      -- Chart of number of edits per month since beginning of project + smoothed
      -- Chart for revert ratio (#revert / #edits) per editor class + smoothed
      --  Distribution of reverts
             -- Namespace 0 + other namespaces (not detailed)
             -- Special Unknown category
             -- User class for: Reverted editor, reverted by, self revert
             -- Different views: percentages across subdivisions or global, plus grand totals per year for each type
    -- Top ranking
        -- Most active reverters
        -- most reverted editors
        -- most reverted articles
        -- most reverted non-project pages
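
For reference, sha1 matching and the revert ratio can be sketched as below. This assumes a revert is any revision whose content hash equals that of an earlier revision of the same page, with at least one other revision in between. It also assumes the ratio counts reverting edits made by a class divided by all edits made by that class, which is only one possible reading of #revert / #edits. The Revision record and the function names are hypothetical; the real Wikistats logic may differ, for example by limiting how far back it looks.

```python
from collections import Counter, namedtuple

# Hypothetical minimal revision record; editor_class is one of
# "registered", "anonymous" or "bot".
Revision = namedtuple("Revision", "rev_id editor_class sha1")


def find_reverts(revisions):
    """Yield (reverting_revision, reverted_revisions) pairs.

    `revisions` is the history of one page in chronological order.
    """
    last_index = {}  # sha1 -> index of the latest revision with that content
    for i, rev in enumerate(revisions):
        j = last_index.get(rev.sha1)
        # Same content as an earlier revision, with edits in between: a revert.
        if j is not None and i - j > 1:
            yield rev, revisions[j + 1:i]
        last_index[rev.sha1] = i


def revert_ratio(revisions):
    """Reverting edits divided by all edits, per editor class."""
    edits = Counter(r.editor_class for r in revisions)
    reverts = Counter(rev.editor_class for rev, _ in find_reverts(revisions))
    return {cls: reverts[cls] / edits[cls] for cls in edits}


# Made-up page history, for illustration only.
history = [
    Revision(1, "registered", "aaa"),
    Revision(2, "anonymous", "bbb"),
    Revision(3, "registered", "aaa"),  # restores the content of revision 1
    Revision(4, "bot", "ccc"),
]
print(revert_ratio(history))
```

A revert announced only in the edit comment, with no matching hash, is not detected by this approach, which is presumably what the "unknown" revert type covers.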

Thoughts and Conclusions
There are two main kinds of use cases I see in Wikistats:
* get lost clicking through stats and let your mind wander to new questions.  This kind of brainstorming is not possible with a rigid dashboard.
* find answers to specific questions.  This kind of search is hard on Wikistats unless you know where to go.

Other projects page