Analytics/Systems/EventLogging/TestingOnBetaCluster

From Wikitech

The consumer side of event logging can be easily tested on Beta Cluster.

Instance

The instance name is configured here: https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/CommonSettings-labs.php (this might change at any time). Apart from the instance name, the rest of the information in this document should still apply.

You need sudo on this instance to see logs. Anyone who wants to test on Beta Cluster should ask for sudo on deployment-eventlog05. It is unfortunate that sudo is required, but that is the current state of affairs.

How to create test events

How to log a client-side event to Beta Cluster directly

Just hit the Varnish endpoint on labs, for example:

curl -A "WikipediaApp/22.0.22-alpha-2017-11-01 (Android 8.0.0; Phone) Alpha Channel" 'https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22pageTitleSource%22%3A%22Main%20Page%22%2C%22namespaceIdSource%22%3A0%2C%22pageIdSource%22%3A1%2C%22isAnon%22%3Atrue%2C%22popupEnabled%22%3Atrue%2C%22pageToken%22%3A%223ec574813fadb97b%22%2C%22sessionToken%22%3A%228460c2e4d547b250%22%2C%22previewCountBucket%22%3A%220%20previews%22%2C%22hovercardsSuppressedByGadget%22%3Afalse%2C%22action%22%3A%22pageLoaded%22%7D%2C%22revision%22%3A16364296%2C%22schema%22%3A%22Popups%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D'
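The query string in the curl command above is a percent-encoded JSON object. The following is a minimal sketch of how such a URL could be built, assuming Python 3 is available; the helper name `beacon_url` is illustrative and not part of EventLogging. The event is the one from the curl example.

```python
import json
from urllib.parse import quote, unquote

# Endpoint taken from the curl example above.
BEACON = "https://en.wikipedia.beta.wmflabs.org/beacon/event"


def beacon_url(capsule, endpoint=BEACON):
    """Hypothetical helper: endpoint + '?' + percent-encoded compact JSON."""
    payload = json.dumps(capsule, separators=(",", ":"))
    # safe="" makes quote() encode every reserved character, including '/'.
    return endpoint + "?" + quote(payload, safe="")


capsule = {
    "event": {
        "pageTitleSource": "Main Page",
        "namespaceIdSource": 0,
        "pageIdSource": 1,
        "isAnon": True,
        "popupEnabled": True,
        "pageToken": "3ec574813fadb97b",
        "sessionToken": "8460c2e4d547b250",
        "previewCountBucket": "0 previews",
        "hovercardsSuppressedByGadget": False,
        "action": "pageLoaded",
    },
    "revision": 16364296,
    "schema": "Popups",
    "webHost": "en.wikipedia.beta.wmflabs.org",
    "wiki": "enwiki",
}

url = beacon_url(capsule)
# Decoding the query string gives back the original object.
decoded = json.loads(unquote(url.split("?", 1)[1]))
```

The resulting `url` can be passed to curl (quoted) in place of the hand-encoded one.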

How to log via the website

For example, browse http://en.m.wikipedia.beta.wmflabs.org/wiki/Main_Page to create events on mobile.

How to load test with a bunch of events

There's a script that may be handy. It's in the same eventlogging codebase:

https://github.com/wikimedia/eventlogging/blob/master/bin/eventlogging-load-tester

How to verify events

You can tail the files in /srv/log/eventlogging on deployment-eventlog05 to verify that your event is coming through.

Unless noted otherwise, the files mentioned in this section and the subsections are in this directory.

ssh deployment-eventlog05.eqiad.wmflabs
cd /srv/log/eventlogging

Validated events

  • all-events.log: schema-validated events that are inserted into MySQL appear in this file

Tail this file while you use the website and emit server-side or client-side events. If your events are valid, they should appear after a short while (seconds). This file holds the contents of the eventlogging_valid_mix topic in Kafka.
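When the file is busy, it can help to keep only the events for the schema you are testing. The following is a minimal sketch, assuming each line of all-events.log is one JSON object with a top-level "schema" field (as in the event sent with curl earlier); the function name and the sample lines are illustrative only.

```python
import json


def events_for_schema(lines, schema):
    """Yield parsed events whose top-level "schema" equals `schema`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if isinstance(event, dict) and event.get("schema") == schema:
            yield event


# Made-up sample lines standing in for the log contents.
sample = [
    '{"schema": "Popups", "wiki": "enwiki", "event": {"action": "pageLoaded"}}',
    '{"schema": "VirtualPageView", "wiki": "enwiki", "event": {}}',
    "not json",
]
popups = list(events_for_schema(sample, "Popups"))
```

On the instance, the same function could be fed from `sys.stdin` with `tail -f all-events.log` piped into the script.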

If events are not being stored in MySQL, you need to consume them directly from Kafka:

kafka-tools -b deployment-kafka-jumbo-2.deployment-prep.eqiad.wmflabs:9092 print_topics

This lists the topics.

Afterwards, consume from eventlogging_<schema> using kafkacat or kafka-tools, as in the command below. You should be able to see your events if they are valid. Note that the eventlogging_<schema> topics in Kafka are used only by the Hadoop pipeline, not the MySQL pipeline.

kafka-tools -b deployment-kafka-jumbo-2.deployment-prep.eqiad.wmflabs:9092 consume_topic eventlogging_VirtualPageView

Raw stream of events (including unvalidated events)

  • client-side-events.log: client side events appear in this file (valid and not)

If events do not appear, they might not be valid. Check /var/log/eventlogging/ and run tail -f with grep on the following files:

eventlogging-processor@client-side-XX.log

Validation errors appear in those logs and are very descriptive. Note: you may see a -00 log and a -01 log, which exist for parallelization; you should monitor both.
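Because an event may be handled by either processor, it is easy to miss an error by watching only one log. The following is a minimal sketch that searches every client-side processor log for a string such as your schema name. The function name is hypothetical, and the log lines in the demo are made-up placeholders, not the real log format.

```python
import glob
import os
import tempfile


def grep_processor_logs(needle, log_dir="/var/log/eventlogging"):
    """Return (file name, line) pairs from all client-side processor logs
    whose line contains `needle`."""
    pattern = os.path.join(log_dir, "eventlogging-processor@client-side-*.log")
    hits = []
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as log:
            for line in log:
                if needle in line:
                    hits.append((os.path.basename(path), line.rstrip("\n")))
    return hits


# Demo against a throwaway directory with placeholder log content.
with tempfile.TemporaryDirectory() as tmp:
    for suffix, text in (("00", "ok\n"),
                         ("01", "placeholder validation error for Popups\n")):
        name = "eventlogging-processor@client-side-%s.log" % suffix
        with open(os.path.join(tmp, name), "w") as f:
            f.write(text)
    hits = grep_processor_logs("Popups", log_dir=tmp)
```

On the instance, calling `grep_processor_logs("Popups")` with the default directory would need the same sudo access as reading the logs by hand.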

Where is eventlogging code?

 /srv/deployment/eventlogging/analytics/eventlogging


See all kafka topics

kafka-tools -b deployment-kafka-jumbo-2.deployment-prep.eqiad.wmflabs:9092 print_topics

All EventLogging topics for which valid events are being sent should be present here.
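To check the listing for several schemas at once, something like the following sketch could be used. It relies on the eventlogging_<schema> topic naming described earlier and assumes the listing has been captured as text with one topic name per line; that output format is an assumption, and the function names and sample listing are illustrative.

```python
def topic_for_schema(schema):
    """Topic name for a schema, following the eventlogging_<schema> pattern."""
    return "eventlogging_" + schema


def missing_topics(listing, schemas):
    """Return the schemas whose topic does not appear in `listing`.

    `listing` is assumed to contain one topic name per line.
    """
    present = {line.strip() for line in listing.splitlines() if line.strip()}
    return [s for s in schemas if topic_for_schema(s) not in present]


# Made-up listing standing in for the captured command output.
listing = """eventlogging_Popups
eventlogging_VirtualPageView
eventlogging_valid_mix
"""
missing = missing_topics(listing, ["Popups", "NavigationTiming"])
```

A schema reported as missing has most likely not produced any valid events yet.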

Database

The MySQL server stores events just as in production. To see them, use the eventlogging user, whose username and password are listed in:

/etc/eventlogging.d/consumers/mysql-m4-master

If you have sudo on the machine, the MySQL password for the root user is 'secret'. Otherwise:

mysql -h 127.0.0.1 --user=eventlogging --password=68QrOq220717816UycU1 --skip-ssl
(It's labs; the password is not really a secret.)

If MySQL needs a restart:

systemctl restart mysql

The MySQL setup on beta leaves much to be desired. If MySQL does not start, check /var/log/mysql.err

This might be of help: [1]

Please also keep in mind that the eventlogging_cleaner.py script runs periodically to purge or sanitize records according to its whitelist, as it does in production. Some useful commands:

elukey@deployment-eventlog05:~$ sudo -u eventlogcleaner crontab -l
# HEADER: This file was autogenerated at 2018-03-22 12:28:57 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: eventlogging_cleaner daily sanitization
MAILTO=analytics-alerts@wikimedia.org
0 11 * * * /usr/bin/flock --verbose -n /var/lock/eventlogging_cleaner /usr/local/bin/eventlogging_cleaner --whitelist /etc/eventlogging_cleaner/whitelist.tsv --older-than 90 --start-ts-file /var/run/eventlogging_cleaner --batch-size 10000 --sleep-between-batches 2 --no-whitelist-sanity-check >> /var/log/eventlogging_cleaner/eventlogging_cleaner.log

Two notable things:

  • --no-whitelist-sanity-check is used only in beta, not in production.
  • The cron job updates /var/run/eventlogging_cleaner (the --start-ts-file; /var/lock/eventlogging_cleaner is only the flock lock) with the timestamp from which the next run should start sanitizing. Runs happen once a day.

Admin

Give people access

Add them to the lists on these wikis (you need to be an admin to do that). Asking in #wikimedia-cloud might be a way to get help.

Nova_Resource:Deployment-prep

Special:NovaProject -> add users to deployment-prep

How to deploy code

# Log into the beta deploy server
ssh deployment-tin.deployment-prep.eqiad.wmflabs

# cd to the EventLogging analytics deploy source
cd /srv/deployment/eventlogging/analytics

# Deploy using scap3 in the beta environment
scap deploy -e beta

You can run puppet with:

puppet agent -tv

Restart EventLogging

Check:

 sudo eventloggingctl status

Run:

 sudo eventloggingctl restart

Stop completely:

 sudo eventloggingctl stop

Kafka

If you're testing Kafka on the beta cluster, you'll need a ZooKeeper. You can pass --zookeeper deployment-zookeeper02:2181/kafka/deployment-kafka to the Kafka tools, or set it once with export ZOOKEEPER_URL=deployment-zookeeper02:2181/kafka/deployment-kafka
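The ZooKeeper URL has the form host:port/chroot. A script that needs it could fall back to the beta value when ZOOKEEPER_URL is unset, as in this minimal sketch. Both function names are hypothetical, and the splitting handles only a single host, although ZooKeeper connect strings can list several.

```python
import os

# Beta cluster value quoted in the text above.
DEFAULT_ZOOKEEPER_URL = "deployment-zookeeper02:2181/kafka/deployment-kafka"


def zookeeper_url(environ=None):
    """Return ZOOKEEPER_URL from the environment, else the beta default."""
    environ = os.environ if environ is None else environ
    return environ.get("ZOOKEEPER_URL") or DEFAULT_ZOOKEEPER_URL


def split_zookeeper_url(url):
    """Split a single-host 'host:port/chroot' URL into (host, port, chroot)."""
    host_port, slash, chroot = url.partition("/")
    host, _, port = host_port.partition(":")
    # 2181 is ZooKeeper's default client port.
    return host, int(port) if port else 2181, slash + chroot


parts = split_zookeeper_url(zookeeper_url({}))
```

Passing an explicit dictionary instead of the real environment, as in the last line, keeps the example independent of whatever is exported in the current shell.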