10:54 labs-logs-bottie: petrb: found log files for master server in /var/spool/gridengine/qmaster yay
10:54 labs-logs-bottie: petrb: removed exec from gs
March 17
17:52 labs-logs-bottie: petrb: somehow apache was installed on ibnr1 and no one logged it - next time log it and discuss it beforehand; deleting
17:19 petan|wk: this will be very useful when debugging problems with SGE
17:19 petan|wk: replaced exim4 and set up local delivery so that mail now works on local system
17:05 labs-logs-bottie: petrb: deploying postfix
16:24 labs-logs-bottie: petrb: killing hung addshore processes from 1
March 16
18:43 labs-logs-bottie: petrb: btw when you create these things log it please and document them
18:43 labs-logs-bottie: petrb: fixed 1 security bug and 2 other bugs in scripts/mysql_backup.sh
18:38 labs-logs-bottie: petrb: /data/project/ is a mess we need to clean it up
18:06 labs-logs-bottie: petrb: +Jan to bots
13:56 labs-logs-bottie: petrb: temporarily changed some parameters so that load gets distributed better
March 15
21:44 mutante: Fox_Wilson/voxelbot could you check for cronspam from python: can't open file '/data/project/voxelbot/VandalismInformation/bot.py': [Errno 2] No such file or directory .. thanks
16:36 addshore: increase gid_range to 1000, bring avg loads down to 4 on all queues
15:36 addhappy: reset some crazy load thresholds that were used back to more normal values
15:26 addshore: increased gid_range by 100 to allow more simultaneous jobs
12:09 petan: deleting -nr1
12:03 petan: deleted bots-liwa
09:12 petan: deleting all qw jobs of addshore from queue
March 11
22:49 addshore: OG load formula to use short (1min avg) to enable faster submission after a recovery from high load
21:20 addshore: OG check load every 10 seconds instead of 40 seconds, also alter load check to 15 from 10, load decay adjust from 7mins to 1min, load weight from 1cpu to 0.7cpu 0.3mem
21:00 addshore: Changed OG administrator_mail to 'addshore' to check spam
19:12 labs-logs-bottie: petrb: changed email in /etc/gridengine/configuration
16:22 labs-logs-bottie: petrb: disabling MTA on whole bots project to resolve spam
12:16 labs-logs-bottie: petrb: giving root to beetstra on -liwa
12:13 labs-logs-bottie: petrb: rebooting -liwa per request
09:23 labs-logs-bottie: petrb: addshore to motd so that people know who to blame :)
09:03 labs-logs-bottie: petrb: nr1 only 100mb of free ram, needs fix
00:07 labs-logs-bottie: danmichaelo: bots installed libxml2-dev, libxslt-dev on bots-3
February 13
23:25 labs-logs-bottie: danmichaelo: installed exuberant-ctags on bots-3
23:08 labs-logs-bottie: danmichaelo: sudo pip-2.7 install ipython on bots-3
23:02 labs-logs-bottie: danmichaelo: apt-get install libfreetype6-dev, libpng12-dev on bots-3
23:01 labs-logs-bottie: danmichaelo: --help
22:19 danmichaelo: sudo apt-get install liblapack-dev on bots-3
22:18 danmichaelo: sudo apt-get install libatlas-base-dev on bots-3
22:11 danmichaelo: sudo apt-get install python2.7-dev on bots-3
21:19 danmichaelo: sudo pip-2.7 install virtualenvwrapper on bots-3
21:12 danmichaelo: sudo pip-2.7 install virtualenv on bots-3
21:11 danmichaelo: sudo python2.7 get-pip.py on bots-3
21:10 danmichaelo: sudo python2.7 distribute_setup.py on bots-3
21:07 danmichaelo: sudo apt-get install python2.7 on bots-3
21:06 danmichaelo: sudo add-apt-repository ppa:fkrull/deadsnakes on bots-3 (for python 2.7)
February 12
18:30 Vacation9: added both Makecat and Geraki; trusted users
12:44 labs-logs-bottie: petrb: bots-1 updating seen module in wm-bot
February 10
23:44 Ryan_Lane: installed libevent-dev and python-dev on bots-bnr1
23:39 Ryan_Lane: installed python-setuptools on bots-bnr1
February 9
20:03 labs-logs-bottie: addshore: added ceradon to bastion and bots
February 5
14:02 petan: +Darkdadaah
09:08 petan: recovered bots-3
February 4
11:35 labs-logs-bottie: root: reinstalling python-twisted on bots-bnr1
11:05 labs-logs-bottie: root: rebooting bots-bnr1
11:05 labs-logs-bottie: root: bots-bnr1 - process 14202 is stuck waiting for IO, unable to kill it, machine needs to be rebooted
11:03 labs-logs-bottie: root: because of gluster failure on bots-bnr1 it's necessary to kill all processes that access the fs, which are: gifti 10091 f.c.m (gifti)java; valhallasw 13912 ..c.. (valhallasw)bash; valhallasw 14202 ..c.. (valhallasw)file. Killing them now
10:54 labs-logs-bottie: petrb: on bnr1
10:54 labs-logs-bottie: petrb: installing python-twisted and php5-curl
February 3
21:28 labs-logs-bottie: addshore: rebooted bots-4
February 1
23:32 rschen7754: installed python-twisted
January 28
06:34 rschen7754: installed sqlite3 on bots-4
06:34 rschen7754: installed sqlite on bots-4
06:32 rschen7754: installed python-twisted on bots-4
January 27
12:30 labs-logs-bottie: petrb: sql2 is running
09:39 labs-logs-bottie: root: maintenance on bots-sql2 in 20 minutes
13:10 labs-logs-bottie: root: sql3 will reboot approximately in 1h
13:09 labs-logs-bottie: root: scheduling sql3 to reboot
12:57 labs-logs-bottie: petrb: moving the data to gluster until I recreate filesystem
12:46 labs-logs-bottie: petrb: disabling sql3 in order to change the device
January 18
19:03 giftpflanze: russblau created /etc/cron.daily/pywikipedia on bots-4 to update /data/project/pywikipedia/{trunk,rewrite}
18:24 russblau: Created shared Pywikipedia repositories at /data/project/pywikipedia/trunk and /data/project/pywikipedia/rewrite (you'll still need separate user-config files and so on for each bot)
January 16
12:58 Vacation9: Now running VoxelBot on bots-4
10:04 petan: inserting Vacation9 to project
10:02 petan: inserting Fox Wilson to project
January 15
22:32 labs-logs-bottie: petrb: the indians, of course
22:32 labs-logs-bottie: petrb: changing the apache configs per hints from apache people
16:50 labs-logs-bottie: petrb: inserting wikivoyage.org subdomains to wm-bot db
January 14
09:50 labs-logs-bottie: petrb: updating some configs in apache to fix broken urls
January 12
07:33 labs-logs-bottie: petrb: public_html is failing, changing to a+rx in a loop
18:38 andrewbogott: puppetized labs-morebots via role::logbot::wikimedia-labs
18:35 andrewbogott: testing
18:13 andrewbogott: do you work over here?
17:57 andrewbogott: taking down labs-morebots so that I can bring it back via puppet
December 10
23:55 andrewbogott: I am testing this bot.
November 26
04:14 jeremyb: [bots-labs] labs-morebots was gone after netsplit. booted
04:14 jeremyb: [bots-1] wm-bot was gone after netsplit. booted bouncer && wmib
November 25
05:56 jeremyb: bots-3 apparently beetstra's bots don't start on their own (no init script). started them all manually based on http://bots.wmflabs.org/~hydriz/minimanual.txt and root's bash history. they're running as beetstra's user
05:35 jeremyb: bots-3 rebooted from labsconsole. ganglia showed nearly 4 hrs unresponsive, couldn't connect myself, couldn't even view console log on labsconsole. (but console log was working for other instances)
November 9
23:34 andrewbogott: Maybe sort of fixed the adminbot for the time being. Is this in source-control someplace?
23:33 andrewbogott: This is my final log message for the day.
12:24 jeremyb: [bots-1 wm-bot] fixed metawiki's info in ./sites && cleaned up the channels that were watching the original one. tested 2 channels and they both work!
October 29
11:45 labs-logs-bottie: petrb: patch of wm-bot [critical]
03:47 jeremyb: [bots-1 - wm-bot] booted wmib and that didn't fix anything, booted bouncer and then wmib again and that worked. must have been hung post netsplit
October 28
12:52 legoktm: installed python-imaging on bots-3
01:48 jeremyb: (timestamps are UTC of course)
01:47 jeremyb: booted labs-morebots twice (it broke itself again in fairly short order after the first time) and then re!log'd the stuff it had missed in the interim from my own scrollback buffer
October 27
04:03 legoktm: installed sqlite3 on bots-3
04:01 legoktm: installed sqlite on bots-3
October 17
15:42 petan: +huji
October 15
14:28 labs-logs-bottie: petrb: rebooting nr1
October 12
00:47 Ryan_Lane: added feature to labs-morebots to check users against a trust list; we need user profiles before we can realistically enable this.
00:46 Ryan_Lane: updated labs-morebots to lower the cache from 30 minutes to 5 minutes
09:26 petan: waiting for someone from ops to let me create big instance...
09:22 petan: creating new bots-sql2r
09:21 petan: upgrading bots-sql2 to higher ram, going to take a while
08:37 Damianz: restarting mysql on bots-sql2 - going oom again
September 30
13:13 giftpflanze: installed tclsh8.6 with tip-389-impl for full unicode support on bots-4
September 29
02:30 mutante: adding new member cupco
September 28
16:07 Damianz: Said script is github.com/DamianZaremba/labs-bots-vhost-builder/blob/master/update.py
16:06 Damianz: Added script to bots-apache1's crontab to auto create member dirs in /data/project/public_html
15:35 Hydriz: New instance bots-salebot created dedicated for Gribeco to run Salebot (an antivandal bot) for frwiki and ptwiki.
15:34 Hydriz: Added Gribeco to the project. Created a public_html directory for him as well.
September 26
21:16 Damianz: Ryan fixed acls, apache1 can now talk to sql again
20:57 Damianz: bots-apache1 up again, issues with connections to sql at the moment. Everything else should be there, we need to puppetize this
20:05 Damianz: deleting bots-apache1, stuff will be down until I install another instance
09:17 petan: fixed bug in wm-bot
01:03 Damianz: test
00:51 mutante: added new member legoktm
September 25
21:35 Damianz: TEST
21:34 mutante: foobar
18:17 mutante: apt-get dist-upgrade on bots-2
18:16 Damianz: moved apache data from bots-nfs to /data/project/public_html so it's at least replicated
18:16 Damianz: rm -rf /tmp/.s on apache1
18:16 Damianz: Copied /tmp to /root/tmp on apache1
18:16 Damianz: rm -rf /var/tmp/* on apache1
18:16 Damianz: Killed 14654 on apache1
18:16 Damianz: Copied /var/tmp to /root/var-tmp on apache1
18:16 Damianz: Disabling phpmyadmin on apache1 due to new exploit
18:16 Damianz: apache1 running crazy shit processes under www-data, has at least 3 exploits and 1 udp processes downloaded.
September 14
10:59 labs-logs-bottie: petrb: performing a huge update of wm-bot, killing all core processes
September 13
17:05 Damianz: fixed puppet hostname on bots-apache1
September 10
16:30 labs-logs-bottie: petrb: let's try :D
16:17 labs-logs-bottie: petrb: performing big update of wm-bot
August 28
05:46 Ryan_Lane: live-hacked adminbot on bots-labs to match new DIT LDAP structure for keystone
August 21
00:35 Damianz: bots-sql2 oom, rebooting
August 3
19:12 andrewbogott: restarted 'Articles For Creation bot' on bots-1
19:11 andrewbogott: restarted wm-bot (bouncer.exe and wmib.exe) on bots-1
19:11 andrewbogott: restarted Log bot on bots-labs
19:01 andrewbogott: migrated all VMs to new hardware
August 1
23:30 Fastily: Installed jdk/jre 6 on bots-4
July 26
00:21 labs-logs-bottie: gifti: Installed fcron on bots-4
July 24
23:57 labs-logs-bottie: gifti: Installed tcl8.5, tclcurl, tcllib on bots-4
July 22
09:58 Hydriz: Restarted morebots on bots-2, seems to have been down for a really long time.
July 2
14:10 Damianz: chmod /mnt/public_html/damian/api.php to 000 on bots-apache1 - think the report/review sync for cbng is broken and looping, testing if this fixed the bw/spam issues.
08:48 Hydriz: Rebooting bots-sql3 as we humans are born evil (aka leap second bug/high CPU)
June 20
08:00 labs-logs-bottie: petrb: patching bot
June 18
14:19 labs-logs-bottie: petrb: done
14:18 labs-logs-bottie: petrb: patching bot
June 17
17:09 labs-logs-bottie: petrb: patching bouncer
10:45 labs-logs-bottie: petrb: installing new io cache for wmbot
June 16
17:48 labs-logs-bottie: petrb: patching bot
16:58 labs-logs-bottie: petrb: patching bot
16:35 labs-logs-bottie: petrb: fixing RC of wmib
16:04 labs-logs-bottie: wmib: wm-bot
16:04 labs-logs-bottie: wmib: patching bot
June 14
15:49 Ryan_Lane: moved adminbot to bots-labs
June 4
20:50 labs-logs-bottie: wmib: inserting wikimania to bot config and reloading it
20:42 labs-logs-bottie: wmib: updating wm-bot
May 30
17:08 labs-logs-bottie: jeremyb: [bots-1,bots-nfs] did some IRC log redaction surgery. booted wm-bot (wmib) a couple times. (same way as before. kill bot; do surgery; kill sleep; didn't touch restart.sh) had some weird permissions issue that ended up causing #wikimedia-tech's log to lose messages. will restore those from my personal log later (probably after midnight so I don't have to do any more bot killing)
08:54 labs-logs-bottie: wmib: done
08:52 labs-logs-bottie: petrb: patching wm bot
May 29
20:26 labs-logs-bottie: jeremyb: [bots-1] added hostmasks for krinkle and jeremyb to admin config. killed the mono proc and then killed the sleep. (didn't kill the restart.sh)
13:06 labs-logs-bottie: petrb: patching wm-bot
12:21 mutante: restarting labs-morebots on bots-2
May 25
14:01 petan|wk: I want to be Mr. Obvious
May 24
13:27 labs-logs-bottie: petrb: upgrading wm-bot
May 23
16:04 Damianz: Ran mysqladmin flush-hosts on bots-sql2 as it was blocking cbng's report interface.
May 22
15:49 labs-logs-bottie: wmib: restarting wm-bot
15:29 Thehelpfulone: added Sven_Manguard a few days ago to the *project* :P
15:25 Thehelpfulone: gave tanvir bots access, will be running interwiki bots
15:19 labs-logs-bottie: petrb: patching wm-bot
12:17 mutante: running puppet on bots-4
04:11 hashar: bots-apache1 has two defunct processes eating CPU: pdflushsh (pid 6382) and 10 (pid 6278)
May 21
21:11 hashar: restarted labs-morebot : root@bots-2:~# service adminbot restart
02:28 jeremyb: [bots-2] should find out what prod uses
02:28 jeremyb: [bots-2] could use some lockfiles... either in wrapper or in python itself
02:27 jeremyb: [bots-2] then investigated further (after the restart) and it turns out there were 3 adminlogbot.py procs (including the new one that had just been started). the other 2 were from May 9 and May 12. killed them all and started again from scratch
02:24 jeremyb: [bots-2] labs-morebots was running but not working. $ sudo service adminbot status; * logslogbot is running; $ sudo service adminbot restart; * Restarting IRC Logging bot for WMF labs logslogbot; ...done.
May 18
10:42 mutante: restarted labs-morebots on bots-2
10:21 mutante: restarting labs-morebots on bots-2
May 16
23:36 Damianz: Installed default-jre on bots-3 for svenmanguard's bot
May 13
18:39 labs-logs-bottie: petrb: patching wm-bot
May 9
12:01 Hydriz: Restarted morebots due to some connection issue on Freenode servers. log-bottie remains down due to sudo policy (If only I had sudo access...)
May 7
09:33 Thehelpfulone: made a little tweak to content.html
May 4
10:40 mutante: killed duplicate adminbot procs, restarted one
10:34 mutante: restarted labs-morebots on bots-2
May 1
11:02 Hydriz: Restarted morebots, was offline due to network-wide restart of virtual machine hosts.
April 30
14:16 labs-logs-bottie: petrb: this is a message I just logged
April 29
11:21 Beetstra: synchronizing databases 'coibot' and 'linkwatcher' from bots-sql3 to bots-sql2, adapting the bot-code to store/query on bots-sql2, will restart bots after transfer is complete
18:02 Beetstra: installation of POE on own account .. failed
17:08 Beetstra: installing perl module POE (+needed modules) for beetstra
16:24 Beetstra: connected to bots-2 for COIBot, LiWa3, XLinkBot
09:39 methecooldude: Given access to Beetstra
January 4
00:08 petan: deleted irc2
00:05 methecooldude: LDAP issue on bots-irc2 > "init: nss-ldap: do_open: do_start_tls failed:stat=-1" and "init: nss_ldap: could not search LDAP server - Server is unavailable"
January 3
23:30 petan: deleted irc1
January 1
21:26 petan: added jeremyb to nfs
December 19
20:20 Damianz: Move cluebot3 to the /mnt/share/cluebot/cluebot3 dir + added a new process group to supervisor for it on bots-cb.
12:15 methecooldude: Packages update which includes a new kernel
December 14
13:22 petan|wk: installed bots-sql3 db server
December 12
20:05 petan: reinstalled bots-sql1, it was totally broken :O
20:02 petan: installed mysql on sql1
19:59 petan: created new instance bots-sql3 for mariadb
December 10
20:13 petan: fixed permission of /home/*
19:53 petan: reconfigured apache; shared directories are now on nfs
19:23 petan: created nfs
00:27 Damianz: Damianz installed php5-cli on bots-apache1
00:05 petan: configured apache, back again :)
December 9
23:49 petan: created apache1 again hopefully ok
23:46 Damianz: Installed MediaWiki::API on bots-cb
23:45 petan: killed apache1
22:43 Damianz: Damianz symlinked /mnt/share/cluebot/cluebotng/{api,reviewinterface} to ~damian/public_html/cluebotng-{api,report} on bots-cb
21:22 Damianz: Damianz Installed php5 php5-curl php5-cli php5-mysql on bots-cb
21:11 Damianz: Damianz Added user 'cluebot' on bots-cb with the home dir /mnt/share/cluebot/
21:09 petan: Damianz installed supervisor on bots-cb
20:52 petan: rebooted apache1 for updates to take effect
20:21 petan: new instance bots-sql2 for mysql (80gb storage)
19:45 petan: deployed some more libraries for sql on bots-cb
15:14 petan|working: created /mnt/share on all servers in bots cluster
15:08 petan|working: sql is back up; data are in /mnt/data
14:44 petan|working: disabled on sql1 in order to finish data move
14:26 petan|working: moved sql data files to /mnt
11:00 petan|working: Petrb packaged logbot
03:02 methecooldude: Updated ldconfig to reflect package changes
December 8
23:55 hyperon: morebots packaged successfully
23:26 Ryan_Lane: test
23:18 hyperon: taking down morebot again for testing :)
23:15 hyperon: installed adminbot on bots-1, i know what i did wrong
23:13 hyperon: i think you like me better
23:12 hyperon: test
22:57 hyperon: bringing adminbot down for testing in a minute...
21:04 petan: deployed a bunch of libraries for cluebot
08:53 petan|w: reinstall done
08:36 petan|w: reinstalling bots-1
08:29 petan|w: killing bots-1 to reinstall to lucid
08:12 Ryan_Lane: killing the log bot for testing
December 7
13:41 petan|wk: Petrb deployed libconfig-dev to bots-cb
07:03 Ryan_Lane: bumping bots project to the top of the list
06:50 Ryan_Lane: test
03:50 Ryan_Lane: Adding categorization of the SALs
03:12 hyperon: another test
03:02 Ryan_Lane: made adminbot more configurable. moved config to /etc/adminbot/config.py, made identica and projects optional