Cron jobs


Note: the job queue runs continuously on many servers and is not a cron job.

Manual cron jobs

QueryPage update

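hume:/etc/cron.d/mw-update-special-pages: Updates the special pages derived from QueryPage. Both jobs run on every third day of the month, the small-wiki run at 04:00 and the full run at 05:00; flock -n skips a run if a previous one still holds the lock.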

PATH=/usr/local/bin:/bin:/usr/bin
00 4 */3 * * apache flock -n /var/lock/update-special-pages-small /usr/local/bin/update-special-pages-small > /home/wikipedia/logs/norotate/updateSpecialPages-small.log 2>&1
00 5 */3 * * apache flock -n /var/lock/update-special-pages /usr/local/bin/update-special-pages > /home/wikipedia/logs/norotate/updateSpecialPages.log 2>&1
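
The effect of flock -n in the entries above can be tried out locally. The following is a minimal sketch, assuming util-linux flock and GNU sleep are available; run_exclusive and the temporary lock file are hypothetical names used only for illustration and are not part of the production setup.

```shell
#!/bin/bash
# Hypothetical helper: run a command only if nobody else holds the lock.
run_exclusive() {
    local lockfile=$1
    shift
    # -n: give up immediately (non-zero exit) instead of waiting for the lock.
    flock -n "$lockfile" "$@"
}

lock=$(mktemp)

# The first job holds the lock for a couple of seconds in the background.
run_exclusive "$lock" sleep 2 &
sleep 0.5

# The second job cannot get the lock, so it is skipped rather than queued.
if run_exclusive "$lock" true; then second=ran; else second=skipped; fi

# Once the first job has finished, the lock is free again.
wait
if run_exclusive "$lock" true; then third=ran; else third=skipped; fi

rm -f "$lock"
echo "second=$second third=$third"
```

With cron this means that a long-running update is never doubled up: an overlapping invocation exits straight away and the next scheduled one tries again.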

update-special-pages-small

#!/bin/bash

cd /home/wikipedia/common/multiversion
# Run the update once for every wiki listed in small.dblist
for db in $(</home/wikipedia/common/small.dblist); do
	echo "$db"
	php MWScript.php updateSpecialPages.php "$db"
	echo
	echo
done

update-special-pages

#!/bin/bash

cd /home/wikipedia/common/multiversion
# Run the update once for every wiki listed in all.dblist
for db in $(</home/wikipedia/common/all.dblist); do
	echo "$db"
	php MWScript.php updateSpecialPages.php "$db"
	echo
	echo
done
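
Both scripts above follow the same pattern: read a dblist file and run one maintenance call per wiki. A minimal sketch of that loop, written so it can be tested without a MediaWiki installation, might look like the following. The update_wiki function is a hypothetical stand-in for the php MWScript.php call, the dblist is a temporary file, and the skipping of blank and comment lines is an assumption about the file format rather than something this page states.

```shell
#!/bin/bash
# Hypothetical stand-in for "php MWScript.php updateSpecialPages.php $db".
update_wiki() {
    echo "updating $1"
}

# Illustrative dblist; the real lists live under /home/wikipedia/common/.
dblist=$(mktemp)
printf '%s\n' aawiki '' '# a comment' enwiki > "$dblist"

processed=()
while IFS= read -r db; do
    # Skip blank lines and comment lines (an assumption, see above).
    [[ -z $db || $db == '#'* ]] && continue
    update_wiki "$db"
    processed+=("$db")
done < "$dblist"

rm -f "$dblist"
echo "processed ${#processed[@]} wikis"
```

Reading line by line with read -r avoids word splitting and glob expansion on the file contents, which the for loop over the whole file does not.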

Tor exit list update

hume:/etc/cron.d/mw-tor-list: Loads the Tor exit list from check.torproject.org and saves it to memcached for later use by the TorBlock extension. Runs every 20 minutes.

PATH=/usr/local/bin:/bin:/usr/bin
*/20 * * * * apache php /home/wikipedia/common/multiversion/MWScript.php extensions/TorBlock/loadExitNodes.php aawiki 2>&1

FlaggedRevs stats update

hume:/etc/cron.d/mw-flagged-revs: Updates the flaggedrevs_stats table every two hours.

0 */2 * * * /home/wikipedia/common/php/extensions/FlaggedRevs/maintenance/wikimedia-periodic-update.sh 2>&1

wikimedia-periodic-update.sh

#!/bin/bash
for db in `</home/wikipedia/common/flaggedrevs.dblist`;do
	echo $db
	php -n /home/wikipedia/common/php/extensions/FlaggedRevs/maintenance/updateStats.php $db
done

Ganglia RRD commit

zwinger:/etc/cron.hourly/save-gmetad-rrds: The live RRD files for Ganglia are kept in a tmpfs for performance reasons. This hourly script copies them back to disk so that they survive a server restart.

#!/bin/sh
/usr/local/bin/save-gmetad-rrds >> /var/log/save-gmetad-rrds.log 2>&1

save-gmetad-rrds

#!/bin/bash
service gmetad_pmtpa stop
echo "Saving RRDs..."
time rsync -a /mnt/ganglia_tmp/rrds.pmtpa/ /var/lib/ganglia/rrds.pmtpa
echo "Done"
service gmetad_pmtpa start

LDAP server backups

nfs1/2:/usr/local/sbin/opendj-backup.sh: Runs OpenDJ backups and stores them in /var/opendj/backup for pickup by amanda; cleans up backups older than three days.

0 18 * * * /usr/local/sbin/opendj-backup.sh > /dev/null 2>&1
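
The contents of opendj-backup.sh are not reproduced on this page. As an illustration of the "older than three days" cleanup only, a step like the following could be used; it is a sketch under assumptions (GNU find and touch, a temporary directory instead of /var/opendj/backup, and the hypothetical function name prune_old_backups), not the production script.

```shell
#!/bin/bash
# Hypothetical cleanup step; the real opendj-backup.sh is not shown here.
prune_old_backups() {
    local dir=$1 days=$2
    # -mtime +N matches files whose age in whole days is greater than N.
    find "$dir" -type f -mtime +"$days" -delete
}

# Demonstration in a scratch directory with one old and one recent file.
backup_dir=$(mktemp -d)
touch -d '5 days ago' "$backup_dir/old.ldif"
touch -d '1 day ago' "$backup_dir/recent.ldif"

prune_old_backups "$backup_dir" 3

remaining=$(ls "$backup_dir")
rm -rf "$backup_dir"
echo "$remaining"
```

Because find counts age in whole days, -mtime +3 only removes files that are at least four days old, which is worth keeping in mind when sizing the backup volume.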

SVN crons

Host unknown (formerly formey; unclear whether this still runs):/usr/local/bin/svndump.php: Runs SVN dumps and stores them in /svnroot/bak for pickup by amanda; cleans up the previous dump.

0 18 * * * /usr/local/bin/svndump.php > /dev/null 2>&1

Host unknown (formerly formey; unclear whether this still runs):(mwdocs)/home/mwdocs/phase3/maintenance/mwdocgen.php: Updates the Doxygen documentation from SVN.

0 0 * * * (cd /home/mwdocs/phase3 && svn up && php maintenance/mwdocgen.php --all) >> /var/log/mwdocs.log 2>&1

antimony:(www-data)svn up: Updates the userinfo file.

0 0 * * * (cd /var/cache/svnusers && svn up) > /dev/null 2>&1

Puppetized cron jobs

Puppet configuration files can be found in the operations/puppet repo.

Apache

apaches::cron

class apaches::cron {
        cron {
                synclocalisation:
                        command =>"rsync -a --delete 10.0.5.8::common/php/cache/l10n/ /usr/local/apache/common/php/cache/l10n/",
                        user => root,
                        hour => 3,
                        minute => 0,
                        ensure => present;
                cleanupipc:
                        command => "ipcs -s | grep apache | cut -f 2 -d \\  | xargs -rn 1 ipcrm -s",
                        user => root,
                        minute => 26,
                        ensure => present;
                updategeoipdb:
                        environment => "http_proxy=http://brewster.wikimedia.org:8080",
                        command => "[ -d /usr/share/GeoIP ] && wget -qO - http://geolite.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz | gunzip > /usr/share/GeoIP/GeoIP.dat.new && mv /usr/share/GeoIP/GeoIP.dat.new /usr/share/GeoIP/GeoIP.dat",
                        user => root,
                        minute => 26,
                        ensure => absent;
                cleantmpphp:
                        command => "find /tmp -name 'php*'  -ctime +1 -exec rm -f {} \\;",
                        user => root,
                        hour => 5,
                        minute => 0,
                        ensure => present;
        }
}

Backup

backup::server

cron {
        amanda_daily:
                command =>      "/usr/sbin/amdump Wikimedia-Daily",
                require =>      Package["amanda-server"],
                user    =>      backup,
                hour    =>      2,
                minute  =>      0;

        amanda_weekly:
                command =>      "/usr/sbin/amdump Wikimedia-Weekly",
                require =>      Package["amanda-server"],
                user    =>      backup,
                hour    =>      6,
                minute  =>      0,
                weekday =>      Sunday;

        amanda_monthly:
                command =>      "/usr/sbin/amdump Wikimedia-Monthly",
                require =>      Package["amanda-server"],
                user    =>      backup,
                hour    =>      12,
                minute  =>      0,
                monthday =>     1;
}

backup::mysql

cron {
        snaprotate:
                command =>      "/usr/local/sbin/snaprotate.pl -a swap -V tank -s data -L 20G",
                user    =>      root,
                hour    =>      1,
                minute  =>      0;
}

Puppet

base::puppet

# Keep puppet running
cron {
        restartpuppet:
                require => File[ [ "/etc/default/puppet" ] ],
                command => "/etc/init.d/puppet restart > /dev/null",
                user => root,
                hour => 2,
                minute => 37,
                ensure => present;
        remove-old-lockfile:
                require => Package[puppet],
                command => "[ -f /var/lib/puppet/state/puppetdlock ] && find /var/lib/puppet/state/puppetdlock -ctime +1 -delete",
                user => root,
                minute => 43,
                ensure => present;
}

misc::puppetmaster

cron {
        updategeoipdb:
                environment => "http_proxy=http://brewster.wikimedia.org:8080",
                command => "wget -qO - http://geolite.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz | gunzip > /etc/puppet/files/misc/GeoIP.dat.new && mv /etc/puppet/files/misc/GeoIP.dat.new /etc/puppet/files/misc/GeoIP.dat; wget -qO - http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz | gunzip > /etc/puppet/files/misc/GeoIPcity.dat.new && mv /etc/puppet/files/misc/GeoIPcity.dat.new /etc/puppet/files/misc/GeoIPcity.dat",
                user => root,
                hour => 3,
                minute => 26,
                ensure => present;
}

DNS

dns::auth-server

# Update ip map file
cron { "update ip map":
        command => "rsync -qt 'rsync://countries-ns.mdc.dk/zone/zz.countries.nerd.dk.rbldnsd' /etc/powerdns/ip-map/zz.countries.nerd.dk.rbldnsd && pdns_control rediscover > /dev/null",
        user => pdns,
        hour => 4,
        minute => 7,
        ensure => present;
}

dns::recursor

cron { pdnsstats:
        command => "cd /var/www/pdns && /usr/local/powerdnsstats/update && /usr/local/powerdnsstats/makegraphs >/dev/null",
        user => root,
        minute => '*/5';
}

Image scaler

imagescaler::cron

cron { removetmpfiles:
        command => "for dir in /tmp /a/magick-tmp; do find \$dir -type f \\( -name 'gs_*' -o -name 'magick-*' \\) -cmin +60 -exec rm -f {} \\;; done",
        user => root,
        minute => '*/5',
        ensure => present
}

LDAP

ldap::server

cron {
        "opendj-backup":
                command =>      "/usr/local/sbin/opendj-backup.sh > /dev/null 2>&1",
                require =>      File["/usr/local/sbin/opendj-backup.sh"],
                user    =>      opendj,
                hour    =>      18,
                minute  =>      0;
}

MediaWiki

mediawiki::maintenance

To run a MediaWiki maintenance script regularly in production, create a Puppet manifest in modules/mediawiki/manifests/maintenance/ that defines a subclass of mediawiki::maintenance, include that class in modules/role/manifests/mediawiki/maintenance.pp, and add it to hieradata/role/codfw/mediawiki/maintenance.yaml. See https://gerrit.wikimedia.org/r/#/c/326856/ and https://gerrit.wikimedia.org/r/#/c/319892/ for examples.

Misc

misc::extension-distributor

cron { extdist_updateall:
        command => "cd $extdist_working_dir/mw-snapshot; for branch in trunk branches/*; do /usr/bin/svn cleanup \$branch/extensions; /usr/bin/svn up \$branch/extensions > /dev/null; done",
        minute => 0,
        user => extdist,
        ensure => present;
}

misc::nfs-server::home

cron { home-rsync:
        require => File["/root/.ssh/home-rsync"],
        command => '[ -d /home/wikipedia ] && rsync --rsh="ssh -c blowfish-cbc -i /root/.ssh/home-rsync" -azu /home/* db20@tridge.wikimedia.org:~/home/',
        user => root,
        hour => 2,
        minute => 35,
        weekday => 6,
        ensure => present;
}

ubuntu::mirror

# Mirror update cron entry
cron { update-ubuntu-mirror:
        require => [ Systemuser[mirror], File["update-ubuntu-mirror"] ],
        command => "/usr/local/sbin/update-ubuntu-mirror > /dev/null",
        user => mirror,
        hour => '*/6',
        minute => 43,
        ensure => present;
}

misc::kiwix-mirror

cron { kiwix-mirror-update:
        command => "rsync -vzrlptD  download.kiwix.org::download.kiwix.org/zim/0.9/ /data/kiwix/zim/0.9/ >/dev/null 2>&1",
        user => mirror,
        minute => '*/15',
        ensure => present;
}