BounceHandler

From Wikitech

The BounceHandler extension is currently installed on every wiki in the production and beta clusters. Here are a few notes on where the knobs are and where to look to see what is happening.

How it works (Production)

Keeping it simple!

enwiki --> sends email to someuser@somedomain.com ( with 'return-path' => 'wiki-someuser.somedomain.com-{hash}@wikimedia.org' )
--> routes through polonium.wikimedia.org --> rejected midway, or rejected by mx.somedomain.com ( bounce created )
--> bounce ( 'To' => 'wiki-someuser.somedomain.com-{hash}@wikimedia.org' ) reaches polonium.wikimedia.org
--> bounce is HTTP POSTed to test2.wikipedia.org --> test2.wikipedia.org looks up the CentralAuth (CA) user table and adds a row to the 'bounce_records' table
--> if bounces > threshold, the user is unsubscribed.
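
The return-path step above can be illustrated with a minimal Python sketch. It follows only the simplified address shape shown in the flow (prefix, encoded recipient, hash). The secret, the HMAC-SHA256 algorithm, the 12-character truncation, and the function names are all assumptions made for illustration; the actual extension may build and verify its addresses differently.

```python
import hashlib
import hmac

# All three constants are hypothetical values chosen for this sketch.
SECRET = b"example-secret"
PREFIX = "wiki"
BOUNCE_DOMAIN = "wikimedia.org"


def _digest(token: str) -> str:
    """Keyed hash of the encoded recipient, truncated for readability (assumed scheme)."""
    return hmac.new(SECRET, token.encode(), hashlib.sha256).hexdigest()[:12]


def make_return_path(recipient: str) -> str:
    """Build 'wiki-<recipient with @ replaced by .>-<hash>@<bounce domain>'."""
    token = recipient.replace("@", ".")
    return f"{PREFIX}-{token}-{_digest(token)}@{BOUNCE_DOMAIN}"


def verify_return_path(address: str) -> bool:
    """Check that a bounce's 'To' address carries a hash we would have issued."""
    local, _, domain = address.rpartition("@")
    if domain != BOUNCE_DOMAIN or not local.startswith(PREFIX + "-"):
        return False
    # The hash is everything after the last '-' in the local part.
    token, sep, digest = local[len(PREFIX) + 1:].rpartition("-")
    if not sep:
        return False
    return hmac.compare_digest(digest.encode(), _digest(token).encode())
```

Verifying the hash before recording a bounce means a forged message sent to the bounce address cannot by itself count against a user.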

How it works (Beta)

enwiki --> sends email to someuser@somedomain.com ( with 'return-path' => 'wiki-someuser.somedomain.com-{hash}@beta.wmflabs.org' )
--> routes through mx.beta.wmflabs.org --> rejected midway, or rejected by mx.somedomain.com ( bounce created )
--> bounce ( 'To' => 'wiki-someuser.somedomain.com-{hash}@beta.wmflabs.org' ) reaches mx.beta.wmflabs.org
--> bounce is HTTP POSTed to meta.wikimedia.beta.wmflabs.org --> meta.wikimedia.beta.wmflabs.org looks up the CentralAuth (CA) user table and adds a row to the 'bounce_records' table
--> if bounces > threshold, the user is unsubscribed.

Main configuration

These settings are kept in wmf-config/InitialiseSettings.php.
Toggle the unsubscribe action:

$wgBounceHandlerUnconfirmUsers = true;

The maximum number of bounces allowed before a user is unsubscribed:

$wgBounceRecordLimit = 5;
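
How the two settings interact can be sketched in a few lines of Python. This is an illustration only: the class and function names are hypothetical, the strict "greater than" comparison is taken from the "if bounces > threshold" step in the flow description, and the sketch ignores any time window over which bounces might be counted.

```python
from dataclasses import dataclass


@dataclass
class BounceConfig:
    # Mirrors $wgBounceHandlerUnconfirmUsers (hypothetical Python stand-in).
    unconfirm_users: bool = True
    # Mirrors $wgBounceRecordLimit (hypothetical Python stand-in).
    record_limit: int = 5


def should_unsubscribe(bounce_count: int, config: BounceConfig) -> bool:
    """Unsubscribe only if the action is enabled and the limit is exceeded."""
    return config.unconfirm_users and bounce_count > config.record_limit
```

With the values shown above, a user with 5 recorded bounces is left alone and a user with 6 is unsubscribed; with the toggle set to false, bounces are still recorded but no one is unsubscribed.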

Deployment information

Database used: wikishared
Database cluster: extension1

View logs/records

To query the bounce_records table in production:

$ mwscript sql.php --wiki=mediawikiwiki --cluster extension1 --wikidb 'wikishared' --replicadb any
mysql> SELECT * from bounce_records;

That returns a very long list (possibly tens of thousands of rows). To print only the bounces recorded in the past 24 hours, run:

SELECT * from bounce_records where br_timestamp > date_format((now() - interval 1 day),'%Y%m%d%H%i%s');

For a count of the records from the past month, run:

SELECT count(*) from bounce_records where br_timestamp > date_format((now() - interval 1 month),'%Y%m%d%H%i%s');
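
The date_format(...) expression in these queries renders the cutoff as a 14-digit YYYYMMDDHHMMSS string, which suggests br_timestamp is stored in that form (an inference from the queries, not something stated on this page). Because the strings are fixed-width, plain string comparison orders them chronologically. A small Python sketch of the same cutoff computation, with a hypothetical function name:

```python
from datetime import datetime, timedelta


def mw_cutoff(now: datetime, days: int = 1) -> str:
    """Return the 14-digit timestamp for `days` before `now`.

    Python's %M (minutes) corresponds to MySQL's %i in the queries above.
    """
    return (now - timedelta(days=days)).strftime("%Y%m%d%H%M%S")


# Fixed-width digit strings compare in chronological order.
cutoff = mw_cutoff(datetime(2015, 3, 10, 12, 30, 45))
is_recent = "20150310080000" > cutoff
```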

To check the BounceHandler logs, log into mwlog1001 and run:

 
jgreen@mwlog1001:/a/mw-log$ cat BounceHandler.log

To get the number of unsubscribes for the past 80 days:

 
jgreen@mwlog1001:/a/mw-log$ gunzip -c BounceHandler*gz | grep -i un-sub | wc -l

To get the number of unsubscribes/day for the past 80 days:

 
mwlog1001:~$ gunzip -c /a/mw-log/archive/BounceHandler.log-2015* | grep Un-sub | awk '{print $1}' | sort | uniq -c | awk '{print $2 " " $1}' | sort -n
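
For readers less familiar with awk, the per-day pipeline above does roughly the following, shown here as a Python sketch. It assumes, as the pipeline itself does, that the first whitespace-separated field of each log line is the date; the function name and the sample log lines are hypothetical and do not reflect the real log format.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def unsubscribes_per_day(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Count lines containing 'Un-sub', grouped by their first field."""
    counts = Counter(
        line.split()[0]          # first field, assumed to be the date
        for line in lines
        if "Un-sub" in line      # same case-sensitive match as `grep Un-sub`
    )
    return sorted(counts.items())  # (date, count) pairs ordered by date


# Hypothetical sample lines, for illustration only.
sample = [
    "2015-03-09 12:00:01 Un-subscribed user A",
    "2015-03-09 18:42:10 Un-subscribed user B",
    "2015-03-10 07:15:33 Un-subscribed user C",
    "2015-03-10 07:16:02 bounce recorded for user D",
]
per_day = unsubscribes_per_day(sample)
```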