Analytics/Systems/EventLogging/TestingOnBetaCluster


The consumer side of event logging can be easily tested on Beta Cluster.

Instance

The instance name is configured in https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/CommonSettings-labs.php. The instance may change at any time, but the rest of this document should apply regardless of which instance is in use.

You need sudo on this instance to see the logs, so anyone who wants to test on Beta Cluster should ask for sudo on deployment-eventlog08. It is unfortunate that sudo is required, but that is the current state of affairs.

How to create test events

How to log a client-side event to Beta Cluster directly

Hit the Varnish endpoint on labs, for example (the URL is quoted so that the shell does not interpret the ? character):

curl -A "WikipediaApp/22.0.22-alpha-2017-11-01 (Android 8.0.0; Phone) Alpha Channel" 'https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22pageTitleSource%22%3A%22Main%20Page%22%2C%22namespaceIdSource%22%3A0%2C%22pageIdSource%22%3A1%2C%22isAnon%22%3Atrue%2C%22popupEnabled%22%3Atrue%2C%22pageToken%22%3A%223ec574813fadb97b%22%2C%22sessionToken%22%3A%228460c2e4d547b250%22%2C%22previewCountBucket%22%3A%220%20previews%22%2C%22hovercardsSuppressedByGadget%22%3Afalse%2C%22action%22%3A%22pageLoaded%22%7D%2C%22revision%22%3A16364296%2C%22schema%22%3A%22Popups%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D'


A second example URL, for the CentralNoticeImpression schema:

https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22country%22%3A%22US%22%2C%22region%22%3A%22WA%22%2C%22anonymous%22%3Atrue%2C%22project%22%3A%22wikipedia%22%2C%22db%22%3A%22enwiki%22%2C%22uselang%22%3A%22en%22%2C%22device%22%3A%22desktop%22%2C%22debug%22%3Afalse%2C%22randomcampaign%22%3A0.8838892205730462%2C%22randombanner%22%3A0.7340400211496478%2C%22recordImpressionSampleRate%22%3A0.01%2C%22impressionEventSampleRate%22%3A1%2C%22status%22%3A%22banner_shown%22%2C%22statusCode%22%3A%226%22%2C%22campaign%22%3A%22CN%20browser%20tests%22%2C%22campaignCategory%22%3A%22CNbrowsertests%22%2C%22campaignCategoryUsesLegacy%22%3Afalse%2C%22bucket%22%3A0%2C%22banner%22%3A%22browser_test_b3%22%2C%22bannerCategory%22%3A%22CNbrowsertests%22%2C%22result%22%3A%22show%22%2C%22testIdentifiers%22%3A%22popupsUnknown%22%7D%2C%22revision%22%3A17995347%2C%22schema%22%3A%22CentralNoticeImpression%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D
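
Hand-encoding the query string is error-prone. The sketch below percent-encodes a JSON capsule and appends it to the beacon path used in the examples above. The helper names (urlencode, beacon_url) are illustrative and not part of EventLogging. To stay POSIX-only, the helper encodes every byte, which is more than strictly necessary but should still decode to the same JSON.

```shell
#!/bin/sh
# Hypothetical helpers: percent-encode a JSON capsule and build a beacon URL.

# Encode every byte of $1 as %xx (lowercase hex). Over-encoding unreserved
# characters is allowed by percent-encoding rules and avoids bash-only features.
urlencode() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n' | sed 's/../%&/g'
}

# Build the full beacon URL for host $1 and JSON capsule $2.
beacon_url() {
    printf 'https://%s/beacon/event?%s\n' "$1" "$(urlencode "$2")"
}

# Example (not run here): $capsule holds the JSON text of your event.
# curl -A "my-test-agent" "$(beacon_url en.wikipedia.beta.wmflabs.org "$capsule")"
```

The user agent string in the commented example is a placeholder; use whatever agent your test requires.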

How to log via the website

For example, browse http://en.m.wikipedia.beta.wmflabs.org/wiki/Main_Page to generate events from the mobile site.

How to load test with a bunch of events

The eventlogging codebase includes a load-testing script that may be handy:

https://github.com/wikimedia/eventlogging/blob/master/bin/eventlogging-load-tester
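
For a quick smoke test that does not need the script, a plain loop around curl may be enough. The sketch below is not eventlogging-load-tester and makes no claim about how that script works; the function name send_events and the CURL override are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for quick checks; this is NOT eventlogging-load-tester.
# Requests URL $2 a total of $1 times. Set CURL=echo for a dry run that only
# prints the arguments instead of sending requests.
send_events() {
    i=0
    while [ "$i" -lt "$1" ]; do
        ${CURL:-curl} -s -o /dev/null "$2" || return 1
        i=$((i + 1))
    done
}

# Example (not run here): $url holds a complete, already-encoded beacon URL.
# send_events 100 "$url"
```

Every request carries the same payload, so this exercises throughput only, not variety in the events.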

How to verify events

You can tail the files in /srv/log/eventlogging on deployment-eventlog08.deployment-prep.eqiad1.wikimedia.cloud to verify that your event is coming through.

Unless noted otherwise, the files mentioned in this section and the subsections are in this directory.

ssh deployment-eventlog08.deployment-prep.eqiad1.wikimedia.cloud
cd /srv/log/eventlogging

Validated events

In MySQL

All events in beta should be written to the MySQL log database hosted on the beta eventlogging server.

 ssh deployment-eventlog08.deployment-prep.eqiad1.wikimedia.cloud
 sudo mysql --skip-ssl log
 show tables;
 ...
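
To check for one schema without an interactive session, a one-off query can be passed with -e. This sketch assumes that table names start with the schema name (for example Popups_16364296); confirm that against the show tables output before relying on it. The function name tables_query is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: build a query listing the tables for one schema.
# Assumes table names begin with the schema name, e.g. Popups_16364296.
tables_query() {
    printf "SHOW TABLES LIKE '%s%%';\n" "$1"
}

# Example (run on the eventlogging instance, not here):
# sudo mysql --skip-ssl log -e "$(tables_query Popups)"
```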

In files

  • all-events.log: schema-validated events that are inserted into MySQL appear in this file (the all* in the name is misleading).

Tail this file while you use the website and emit server-side or client-side events. If your events are valid, they should appear after a short while (seconds). This file holds the contents of the eventlogging_valid_mixed topic in Kafka.
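
Because this file mixes all schemas, filtering the tail output helps. The sketch below assumes one JSON event per line with a "schema" key, as in the capsules sent to the beacon; the function name only_schema is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines for one schema.
# Assumes one JSON event per line containing "schema": "<name>",
# with or without a space after the colon.
only_schema() {
    grep -E "\"schema\": *\"$1\""
}

# Example (run in /srv/log/eventlogging, not here):
# tail -f all-events.log | only_schema Popups
```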

In Kafka

You can consume valid events directly from Kafka. First, list the EventLogging topics:

 kafkacat -L -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 | grep 'topic "eventlogging_' | awk -F '"' '{print $2}'

Then consume from your topic, which should be named something like eventlogging_<schema>. If your events are valid, you should see them.

 kafkacat -C -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 -t eventlogging_NavigationTiming
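
When a topic already holds many messages, it can be easier to watch only new ones. The sketch below wraps the command above and adds kafkacat's -o end option to start at the end of the topic. The function names and the BROKER variable are illustrative.

```shell
#!/bin/sh
# Hypothetical helpers around the kafkacat commands shown above.
BROKER=deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092

# Topic name for a schema, following the eventlogging_<schema> pattern.
topic_for() {
    printf 'eventlogging_%s\n' "$1"
}

# Print only messages that arrive after startup (-o end skips existing ones).
consume_new() {
    kafkacat -C -b "$BROKER" -t "$(topic_for "$1")" -o end
}

# Example (not run here):
# consume_new NavigationTiming
```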

Raw stream of events (including unvalidated events)

  • client-side-events.log: client-side events appear in this file, whether valid or not.

If events do not appear, they might not be valid. Go to /srv/log/eventlogging/systemd and run tail -f with grep on the following logs:

eventlogging-processor@client-side-XX.log

Validation errors appear in those logs and are very descriptive. Note: you may see both a -00 log and a -01 log; they exist for parallelization, and you should monitor both.
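
A glob makes it possible to follow all the processor logs at once. This is a sketch: the function names are illustrative, and the string to search for (for example a schema name) is up to you, since the exact wording of the validation errors is not reproduced here.

```shell
#!/bin/sh
# Hypothetical helpers for the client-side processor logs.
# $1 = directory holding the logs, $2 = fixed string to look for.

# Follow every client-side processor log and keep only matching lines.
watch_processors() {
    tail -f "$1"/eventlogging-processor@client-side-*.log | grep -F -- "$2"
}

# List the log files that the glob above matches.
processor_logs() {
    ls "$1"/eventlogging-processor@client-side-*.log
}

# Example (not run here):
# watch_processors /srv/log/eventlogging/systemd Popups
```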

Where is eventlogging code?

 /srv/deployment/eventlogging/analytics/eventlogging


See all EventLogging schema Kafka topics

 kafkacat -L -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 | grep 'topic "eventlogging_' | awk -F '"' '{print $2}'

All EventLogging topics for which valid events are being sent should be listed here.

Database

To see events, you can use the eventlogging user, whose username and password are listed in:

/etc/eventlogging.d/consumers/mysql-m4-master

If you have sudo on the machine, the MySQL password for the root user is 'secret'. Otherwise:

mysql -h 127.0.0.1 --user=eventlogging --password=68QrOq220717816UycU1 --skip-ssl
(It's labs, so the password is not really a secret.)

If MySQL needs a restart:

systemctl restart mysql

The MySQL setup on beta leaves much to be desired. If MySQL does not start, check /var/log/mysql.err.

Keep in mind that the eventlogging_cleaner.py script runs periodically to purge/sanitize records according to its whitelist, as it does in production. Some useful commands:

elukey@deployment-eventlog08:~$ systemctl list-timers | grep sanitization
Wed 2018-10-24 11:00:00 UTC  20h left      Tue 2018-10-23 11:00:14 UTC  3h 19min ago eventlogging_db_sanitization.timer eventlogging_db_sanitization.service
elukey@deployment-eventlog08:~$ systemctl cat eventlogging_db_sanitization.service
# /lib/systemd/system/eventlogging_db_sanitization.service
[Unit]
Description=Apply Analytics data retetion policies to the Eventlogging database

[Service]
User=eventlogcleaner
ExecStart=/usr/local/bin/eventlogging_cleaner ...[cut]...

Two notable things:

  • --no-whitelist-sanity-check is not used in production but only in beta.
  • The systemd timer runs once a day and updates /srv/eventlogging/eventlogging_cleaner with the timestamp from which the next day's run should start sanitizing.

Admin

Give people access

Add them to the lists on these wikis (you need to be an admin to do that). Asking in #wikimedia-cloud might be a way to get help.

Nova_Resource:Deployment-prep

Special:NovaProject -> add users to deployment-prep

How to deploy code

# Log into the beta deploy server
ssh deployment-tin.deployment-prep.eqiad1.wikimedia.cloud

# cd to the EventLogging analytics deploy source
cd /srv/deployment/eventlogging/analytics

# Deploy using scap3 in the beta environment
scap deploy -e beta

You can run puppet with:

puppet agent -tv

Restart EventLogging

Check:

 sudo eventloggingctl status

Run:

 sudo eventloggingctl restart

Stop completely:

 sudo eventloggingctl stop