User:Elukey/Analytics/Oozie

Unexpected values in pageview for workflow

Error from Oozie:

oozie@analytics1003.eqiad.wmnet via wikimedia.org 
12:46 PM (2 hours ago)

to analytics-aler. 
Values were found in pageview that were not in the whitelist.

Please have a look and take necessary action !
Thanks :)
-- Oozie

Check which values were flagged (adjust the date to match the alert):

ADD JAR /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;
use wmf;
select * from pageview_unexpected_values where year = 2016 and month = 10 and day = 7;

If you need to speed up your job, move it to the production queue:

yarn application --movetoqueue application_1476969128131_44799 --queue production

Data Loss ERROR - Workflow webrequest-load-check_sequence_statistics

Error from Oozie:

oozie@analytics1003.eqiad.wmnet via wikimedia.org 
8:26 PM (18 hours ago)

to analytics-aler. 
Please check wmf_raw.webrequest_sequence_stats_hourly.
This is an ERROR.
This job has failed, refine has not been launched.

Please have a look and take necessary action !
Thanks :)
-- Oozie

The first step is to investigate why this is happening. Useful resources:

- https://grafana.wikimedia.org/dashboard/db/varnish-aggregate-client-status-codes, since Varnish errors ending up in 503s might contribute.

- Check the percentage of loss registered:

ADD JAR /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;
use wmf_raw;
select webrequest_source, percent_lost from webrequest_sequence_stats_hourly where year = 2016 and month = 10 and day = 6 and hour = 17;
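
To check a different hour without hand-editing the query, the statement can be generated from the date parts. This is only a minimal sketch: `loss_query` is a hypothetical helper name, not part of any existing tooling, and it only prints the query text.

```shell
#!/bin/sh
# Hypothetical helper: prints the data-loss query for a single hour.
# Arguments: year month day hour, as plain integers without leading zeros.
loss_query() {
    printf 'ADD JAR /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar; '
    printf 'use wmf_raw; '
    printf 'select webrequest_source, percent_lost from webrequest_sequence_stats_hourly '
    printf 'where year = %d and month = %d and day = %d and hour = %d;\n' "$1" "$2" "$3" "$4"
}
```

The printed text can then be passed to Hive non-interactively, for example with hive -e "$(loss_query 2016 10 6 17)".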

If there is a valid reason to do so, please re-run the Oozie job from stat1004 with the following command (WARNING: the start/stop time values need to be changed!):

sudo -u hdfs oozie job --oozie $OOZIE_URL \
 -Drefinery_directory=hdfs://analytics-hadoop$(hdfs dfs -ls -d /wmf/refinery/2016* | tail -n 1 | awk '{print $NF}') \
 -Dqueue_name=production -Doozie_launcher_queue_name=production -Doozie_launcher_memory=256 \
 -Dstart_time=2016-10-06T17:00Z -Dstop_time=2016-10-06T17:59Z \
 -config coord_load_webrequest_upload.properties -run
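
Since the start/stop values are the part most likely to be left unchanged by mistake, they can be derived from the failed hour instead of typed by hand. A small sketch, assuming a hypothetical helper named `hour_window` (not an existing script); it covers exactly one hour, matching the 17:00Z to 17:59Z pattern used above.

```shell
#!/bin/sh
# Hypothetical helper: prints the two time options for a one-hour re-run.
# $1: date as YYYY-MM-DD
# $2: hour of the day, 0-23, without a leading zero
hour_window() {
    h=$(printf '%02d' "$2")   # zero-pad the hour, e.g. 5 -> 05
    printf '%s\n' "-Dstart_time=${1}T${h}:00Z -Dstop_time=${1}T${h}:59Z"
}
```

For the alert above, hour_window 2016-10-06 17 prints the same two -D options used in the command, so an unquoted $(hour_window 2016-10-06 17) could take their place.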