
Analytics/Archive/Dashboards - Limn

From Wikitech
This page contains historical information. It may be outdated or unreliable.

"I use EventLogging to collect data, and I'd like a dashboard to visualize it."

(If that's not you, see EventLogging, how to make a schema, and how to program it.)

OK then, you've landed on the right article. We'll describe the dashboarding process here, and update it as the process improves. For all requests to the analytics team, please use Phabricator and tag the "Analytics-Engineering" project.


Make repository for queries and configuration

A repository in Gerrit will be created by the analytics team, at your request (don't forget to tag us). This repository will be called limn-NAME-data, so please specify a name. It will contain:

  • a dashboards/reportcard.json dashboard configuration file. This will lay out your graphs into tabs. Here's an example:
{
    "id": "reportcard",
    "headline": "Mobile Contributions",
    "subhead": "Apps & Web",
    "tabs": [
        {
            "name": "Monthly reports",
            "graph_ids": [
                "http://stat1001.wikimedia.org/limn-public-data/mobile/datafiles/edits-monthly-new-active.csv",
                "http://stat1001.wikimedia.org/limn-public-data/mobile/datafiles/edits-monthly-unique-editors.csv",
                "http://stat1001.wikimedia.org/limn-public-data/mobile/datafiles/edits-monthly-successful.csv"
            ]
        }
    ]
}
  • a folder called <name> that will contain two things:
    • a config.yaml file that defines which database you're connecting to, where your output files will be stored, what SQL to execute, and how often to execute each SQL query. We will set up the basic structure for this file, along with an example SQL file. For an idea of what it looks like, see other examples, such as the Flow team's.
    • one SQL script for each graph you would like to display. The SQL can be as complicated as you like, but the outer-most select should look like this:
 select <<some-expression>> as Date,    -- the date format can be a part of YYYY.MM.DD.HH.mm.ss (ex: YYYY-MM, YYYY/MM/DD)
        <<some-expression>> as Column1, -- the value is a number, and the label should be human readable (limn will use it)
        <<some-expression>> as Column2
 ...
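Before asking for a dashboard configuration to be deployed, it can help to check it locally. The following is a minimal sketch, not part of Limn itself: the function name check_dashboard is hypothetical, and the set of required keys is an assumption inferred from the example above rather than from Limn's documentation.

```python
import json


def check_dashboard(config_text):
    """Return a list of problems found in a dashboard JSON string (empty if none).

    The required keys are an assumption based on the example in this article.
    """
    problems = []
    config = json.loads(config_text)  # raises ValueError on malformed JSON
    for key in ("id", "headline", "tabs"):
        if key not in config:
            problems.append(f"missing top-level key: {key}")
    for index, tab in enumerate(config.get("tabs", [])):
        if "name" not in tab:
            problems.append(f"tab {index} has no name")
        graph_ids = tab.get("graph_ids", [])
        if not graph_ids:
            problems.append(f"tab {index} has no graph_ids")
        for url in graph_ids:
            # In the example above, every graph id points at a CSV datafile.
            if not url.endswith(".csv"):
                problems.append(f"tab {index}: not a CSV datafile: {url}")
    return problems


# A shortened copy of the example configuration from this article.
EXAMPLE = """
{
    "id": "reportcard",
    "headline": "Mobile Contributions",
    "subhead": "Apps & Web",
    "tabs": [
        {
            "name": "Monthly reports",
            "graph_ids": [
                "http://stat1001.wikimedia.org/limn-public-data/mobile/datafiles/edits-monthly-successful.csv"
            ]
        }
    ]
}
"""

print(check_dashboard(EXAMPLE))
```

An empty list means the file has the same overall shape as the example; it does not guarantee that Limn will accept it.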

Make Limn instance

When we make the repository, we'll also set up a Limn instance for you, usually at <name>-reportcard.wmflabs.org unless you request something else. This instance will serve your graphs. It will be run from limn1.eqiad.wmflabs (part of the Analytics project).

Configure cron to generate data

Once your repository is ready, a cron job will be added on stat1003 at your request. This will run every 30 minutes, read your configuration, and execute your SQL. The output will be stored in CSV files and rsync-ed to http://stat1001.wikimedia.org/limn-public-data/<name>/datafiles/. Limn will read the data from here.
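To make the "execute your SQL, store the output as CSV" step concrete, here is a rough sketch of that one step. It is not the actual job that runs on stat1003: the run_report function, the edits table, and its columns are all made up for illustration, and an in-memory SQLite database stands in for the real database so the example can run anywhere. The query follows the shape described earlier, with a Date column first and numeric columns after it.

```python
import csv
import io
import sqlite3


def run_report(connection, sql, out):
    """Execute one report query and write the result to `out` as CSV.

    The header row is taken from the column labels in the SQL.
    """
    cursor = connection.execute(sql)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor.fetchall())


# Stand-in data; the real job would query your EventLogging tables instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edits (ts TEXT, editor TEXT)")
conn.executemany(
    "INSERT INTO edits VALUES (?, ?)",
    [("2014-01-03", "a"), ("2014-01-20", "b"), ("2014-02-11", "a")],
)

# Hypothetical report query: one row per month, Date first, numbers after.
SQL = """
select substr(ts, 1, 7)        as Date,
       count(*)                as Edits,
       count(distinct editor)  as Editors
  from edits
 group by substr(ts, 1, 7)
 order by Date
"""

buffer = io.StringIO()
run_report(conn, SQL, buffer)
print(buffer.getvalue())
```

Running a query this way against a small test table is a quick way to see the header and rows Limn would receive before the cron job is set up.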

Get SQL and configuration Just Right (TM)

Once your job is running, you'll likely find SQL syntax errors, or data that isn't quite right. The logs are on stat1003, so we'll help you debug. Our first priority with this process is to make this debugging step better.
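Some data problems can be caught by inspecting a generated CSV file yourself. The sketch below checks a datafile against the query shape described earlier: a first column labelled Date, and numeric values in the remaining columns. The check_datafile function is hypothetical, and the date pattern is an assumption (a four-digit year optionally followed by two-digit parts separated by "-", "/", or "."); Limn may accept other formats.

```python
import csv
import io
import re

# Assumed date shape: YYYY, optionally followed by up to five two-digit parts.
DATE_PATTERN = re.compile(r"^\d{4}([-/.]\d{2}){0,5}$")


def check_datafile(text):
    """Return a list of problems found in CSV text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows or not rows[0]:
        return ["file is empty"]
    header, data = rows[0], rows[1:]
    problems = []
    if header[0].lower() != "date":
        problems.append(f"first column is {header[0]!r}, expected Date")
    for line_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} fields")
            continue
        if not DATE_PATTERN.match(row[0]):
            problems.append(f"line {line_no}: unrecognised date {row[0]!r}")
        for name, value in zip(header[1:], row[1:]):
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}: {name} is not a number: {value!r}")
    return problems


print(check_datafile("Date,Edits\n2014-01,2\n2014-02,oops\n"))
```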

Disclose your dashboard

When your dashboard is ready to be shared, please add a link in the Dashboards directory. This directory is a place to find and organize all Limn dashboards used by WMF and the community. As of this moment, there is no easier way to discover other Limn dashboards.

For contributors, the code is in https://phabricator.wikimedia.org/diffusion/WDAS/browse/master/

Expected time spent

To give you a general idea, this is how long it normally takes to do everything:

  • Read this file: 1 hour for me, because I'm a really slow reader
  • Make Repository: 1/2 hour for us. Then you'll probably need about 5-60 minutes to write each query, depending on how complicated it is. If it takes you longer than that, we can help.
  • Make Limn instance: 1/2 hour for us, we do it when we make the repo.
  • Configure cron: 2 hours. We need to troubleshoot and get the Puppet change merged by ops.
  • Get SQL and Configuration Just Right: this can take a few hours or more, depending on how complicated your queries are.
  • Disclose Your Dashboard: 10 minutes for you to edit some wikitext.

Support

For all requests to the analytics team, please create tasks in Phabricator and tag the Analytics-Engineering project.

If you use EventLogging, you should subscribe to eventlogging-alerts@lists.wikimedia.org.