Nova Resource:Tools

From Wikitech


Resource Type: project
Project Name: tools
Monitoring: ganglia, icinga
Admins:
Members:

Documentation

Description

The Tools project is one of two projects in the Tool Labs environment (the other being Toolsbeta).


Tool Labs is a reliable, scalable hosting environment for community developers working on tools and bots that help users maintain and use wikis. The cloud-based infrastructure was developed by the Wikimedia Foundation and is supported by a dedicated group of Wikimedia Foundation staff and volunteers. Tool Labs is a part of the Labs project, which is designed to make it easier for developers and system administrators to try out improvements to Wikimedia infrastructure, including MediaWiki, and to do analytics and bot work.

Tip: Confused about the terms Labs, Tool Labs, etc.? Read "Wikimedia Labs vs Tool Labs".

The Tool Labs environment provides:

  • Support for Web services, continuous bots, and scheduled tasks.
  • Access to replicated production databases.
  • Easily shared management of tool accounts, where tools and bots are stored.
  • A grid engine for dispatching jobs.
  • Support for mosh, SSH, SFTP without complicated proxy setup.
  • A shared pywikibot installation.
  • Time-travel backups for short-term data recovery.
  • Version control via Gerrit and Git.
  • Support for Redis.

In general, every tool maintainer should work primarily on the Tools project (not Toolsbeta, which is for experiments on the Tool Labs environment itself).

Getting access

After you fill in the form, your request will show up in the queue below and will be processed shortly by one of the Tool Labs administrators.

Current queue:

(No outstanding requests)

Help-page

Tools Resources Overview

Useful links




SSH Fingerprints

tools-login: Help:SSH Fingerprints/tools-login.wmflabs.org


Topology of tools on labs
Tool Labs design philosophy

Server admin log

April 13

April 12

  • 23:51 scfc_de: tools-mail: rm -f /var/log/exim4/paniclog ("unknown named domain list "+relay_domains"")

April 11

April 10

  • 18:20 scfc_de: tools-webgrid-01, tools-webgrid-02: "kill -HUP" all php-cgis that are not (grand-)children of lighttpd processes

April 8

  • 05:06 Ryan_Lane: restart nginx on tools-proxy-test
  • 05:03 Ryan_Lane: upgraded libssl on all nodes

April 4

  • 15:48 Coren: Moar powar!!1!one: added two exec nodes (-09 -10) and one webgrid node (-02)
  • 11:11 scfc_de: Set /data/project/.system/config/wikihistory.workers to 20 on apper's request

March 30

  • 18:16 scfc_de: Removed empty directories /data/project/{d930913,sudo-test{,-2},testbug{,2,3}}: Corresponding service groups don't exist (anymore)
  • 18:13 scfc_de: Removed /data/project/backup: Only empty dynamic-proxy backup files of January 3rd and earlier

March 29

  • 10:14 wm-bot: petrb: disabled 1 job in cron in -login of user tools.tools-info which was killing login server

March 28

  • 11:53 wm-bot: petrb: did the same on -mail server (removed /var/log/exim4/paniclog) so that we don't get spam every day
  • 11:51 wm-bot: petrb: removed content of /var/log/exim4/paniclog
  • 11:49 wm-bot: petrb: disabled default vimrc which everybody hates on -login

March 21

  • 16:50 scfc_de: tools-login: pkill -u tools.bene (OOM)
  • 16:13 scfc_de: rmdir /home/icinga (totally empty, "drwxr-xr-x 2 nemobis 50383 4096 Mär 17 16:42", perhaps artifact of mass migration?)
  • 15:49 scfc_de: sudo cp -R /etc/skel /home/csroychan && sudo chown -R csroychan.wikidev /home/csroychan; that should close [[bugzilla:62132]]
  • 15:15 scfc_de: sudo cp -R /etc/skel /home/annabel && sudo chown -R annabel.wikidev /home/annabel
  • 15:14 scfc_de: sudo chown -R torin8.wikidev /home/torin8

March 20

  • 18:36 scfc_de: Pointed tools-dev.wmflabs.org at tools-dev.eqiad.wmflabs; cf. [[Bugzilla:62883]]

March 5

  • 13:57 wm-bot: petrb: test

March 4

  • 22:35 wm-bot: petrb: uninstalling it from -login too
  • 22:32 wm-bot: petrb: uninstalling apache2 from tools-dev it has nothing to do there

March 3

  • 19:20 wm-bot: petrb: shutting down almost all services on webserver-02 in order to make system useable and finish upgrade
  • 19:17 wm-bot: petrb: upgrading all packages on webserver-02
  • 19:15 petan: rebooting webserver-01 which is totally dead
  • 19:07 wm-bot: petrb: restarting apache on webserver-02 it complains about OOM but the server has more than 1.5g memory free
  • 19:03 wm-bot: petrb: switched local-svg-map-maker to webserver-02 because 01 is not accessible to me, hence I can't debug that
  • 16:44 scfc_de: tools-webserver-03: Apache was swamped by request for /guc. "webservice start" for that, and pkill -HUP -u local-guc.
  • 12:54 scfc_de: tools-webserver-02: Rebooted, apache2/error.log told of OOM, though more than 1G free memory.
  • 12:50 scfc_de: tools-webserver-03: Rebooted, scripts were timing out
  • 12:42 scfc_de: tools-webproxy: Rebooted; wasn't accessible by ssh.

March 1

  • 03:42 Coren: disabled puppet in pmtpa tool labs

February 28

  • 14:46 wm-bot: petrb: extending /usr on tools-dev by 800mb
  • 00:26 scfc_de: tools-webserver-02: Rebooted; inaccessible via ssh, http said "500 Internal Server Error"

February 27

  • 15:28 scfc_de: chmod g-w ~fsainsbu/.forward

February 25

  • 22:48 rdwrer: Lol, so, something happened with grrrit-wm earlier and nobody logged any of it. It was yoyoing, Yuvi killed it, then aude did something and now it's back.

February 23

  • 20:46 scfc_de: morebots: labs HUPped to reconnect to IRC

February 21

  • 17:32 scfc_de: tools-dev: mount -t nfs -o nfsvers=3,ro labstore1.pmtpa.wmnet:/publicdata-project /public/datasets; automount seems to have been stuck
  • 15:24 scfc_de: tools-webserver-03: Rebooted, wasn't accessible by ssh and apparently no access to /public/datasets either

February 20

  • 21:23 scfc_de: tools-login: Disabled crontab for local-rezabot and left a message at User talk:Reza#Running bots on tools-login, etc. (fa:بحث_کاربر:Reza1615 is write-protected)
  • 20:15 scfc_de: tools-login: Disabled crontab for local-chobot and left a message at ko:사용자토론:ChongDae#Running bots on tools-login, etc.
  • 10:42 scfc_de: tools-mail: rm -f /var/log/exim4/paniclog ("User 0 set for local_delivery transport is on the never_users list", cf. [[bugzilla:61583]])
  • 10:30 scfc_de: tools-login: rm -f /var/log/exim4/paniclog (OOM)
  • 10:28 scfc_de: Reset error status of task@tools-exec-09 ("can't get password entry for user 'local-voxelbot'"); "getent passwd local-voxelbot" works on tools-exec-09, possibly a glitch

February 19

  • 20:21 scfc_de: morebots: Set "enable_twitter=False" in confs/labs-logbot.py and restarted labs-morebots
  • 19:14 scfc_de: tools-login: Disabled crontab and pkill -HUP -u fatemi127

February 18

  • 11:42 scfc_de: tools-mail: Rerouted queued mail (@tools-login.pmtpa.wmflabs => @tools.wmflabs.org)
  • 11:34 scfc_de: tools-exec-08: Rebooted due to not responding on ssh and SGE
  • 10:39 scfc_de: tools-mail: rm -f /var/log/exim4/paniclog ("User 0 set for local_delivery transport is on the never_users list" => probably artifacts from Coren's LDAP changes)
  • 10:37 scfc_de: tools-login: rm -f /var/log/exim4/paniclog (OOM)

February 14

  • 23:54 legoktm: restarting grrrit-wm since it disappeared
  • 08:19 scfc_de: tools-login: rm -f /var/log/exim4/paniclog (OOM)

February 13

  • 13:11 scfc_de: Deleted old job of user veblenbot stuck in error state
  • 13:08 scfc_de: Deleted old jobs of user v2 stuck in error state
  • 10:49 scfc_de: tools-login: Commented out local-shuaib-bot's crontab with a pointer to Tools/Help

February 12

  • 07:51 wm-bot: petrb: removed /data/project/james/adminstats/wikitools per request from james on irc

February 11

  • 15:47 scfc_de: Restarted webservice for geohack
  • 13:02 scfc_de: tools-login: rm -f /var/log/exim4/paniclog (OOM)
  • 13:00 scfc_de: Killed -HUP local-hawk-eye-bot's jobs; one was hanging with a stale NFS handle on tools-exec-05

February 10

  • 23:16 Coren: rebooting webproxy (braindead autofs)

February 9

February 6

February 4

January 31

  • 03:43 scfc_de: Cleaned up all exim queues
  • 01:26 scfc_de: chmod g-w ~{bgwhite,daniel,euku,fale,henna,hydriz,lfaraone}/.forward (test: sudo find /home -mindepth 2 -maxdepth 2 -type f -name .forward -perm /g=w -ls)
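
The check quoted in the 01:26 entry (find group-writable .forward files one level below /home, then "chmod g-w" them) can be sketched in Python roughly as follows. This is an illustrative equivalent, not a script used on Tool Labs; the function names are hypothetical, and the root directory is passed in as a parameter so the sketch can be tried on a scratch directory.

```python
import os
import stat
from pathlib import Path


def group_writable_forwards(home_root):
    """Rough equivalent of
    'find HOME_ROOT -mindepth 2 -maxdepth 2 -type f -name .forward -perm /g=w'.
    Returns the matching paths, sorted by home directory name."""
    hits = []
    for home in sorted(Path(home_root).iterdir()):
        # find does not descend into symlinked directories by default
        if home.is_symlink() or not home.is_dir():
            continue
        candidate = home / ".forward"
        try:
            st = os.lstat(candidate)
        except OSError:
            continue  # no .forward here, or it is not readable
        # -type f: regular files only; -perm /g=w: group-write bit set
        if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_IWGRP:
            hits.append(candidate)
    return hits


def drop_group_write(path):
    """Equivalent of 'chmod g-w PATH': clear only the group-write bit."""
    mode = stat.S_IMODE(os.lstat(path).st_mode)
    os.chmod(path, mode & ~stat.S_IWGRP)
```

Calling group_writable_forwards("/home") and passing each result to drop_group_write would mirror the logged action; an empty result afterwards corresponds to the "test" command printing nothing.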

January 30

  • 21:48 scfc_de: chmod g-w ~fluff/.forward
  • 21:40 scfc_de: local-betabot: Added "-M" option to crontab's qsub call and rerouted queued mail (freeze, exim -Mar, exim -Mmd, thaw)
  • 18:33 scfc_de: tools-exec-04: puppetd --enable (apparently disabled sometime around 2014-01-16?!)
  • 17:25 scfc_de: tools-exec-06: mv -f /etc/init.d/nagios-nrpe-server{.dpkg-dist,} (nagios-nrpe-server didn't start because start-up script tried to "chown icinga" instead of "chown nagios")

January 28

  • 04:27 scfc_de: tools-webproxy: Blocked Phonifier

January 25

  • 05:37 scfc_de: tools-webserver-02: rm -f /var/log/exim4/paniclog (OOM)

January 24

  • 01:07 scfc_de: tools-db: Removed /var/lib/mysql2, set expire_logs_days to 1 day
  • 00:11 scfc_de: tools-db: and restarted mysqld
  • 00:11 scfc_de: tools-db: Moved 4.2 GBytes of the oldest binlogs to /var/lib/mysql2/

January 23

  • 19:24 legoktm: restarting grrrit-wm now https://gerrit.wikimedia.org/r/#/c/109116/
  • 19:23 legoktm: ^ was for grrrit-wm
  • 19:23 legoktm: re-committed password to local repo, not sure why that wasn't committed already

January 21

  • 17:41 scfc_de: tools-exec-09: iptables-restore /data/project/.system/iptables.conf

January 20

  • 07:02 andrewbogott: merged a lint patch to the gridengine module. Should be a noop

January 16

  • 17:11 scfc_de: tools-exec-09: "iptables-restore /data/project/.system/iptables.conf" after reboot

January 15

  • 13:36 scfc_de: After reboot of tools-exec-09, all continuous jobs were successfully restarted ("Rr"); task jobs (1974113, 2188472) failed ("19  : before writing exit_status")
  • 13:27 scfc_de: tools-login: rm -f /var/log/exim4/paniclog (OOM)
  • 08:54 andrewbogott: rebooted tools-exec-09
  • 08:32 andrewbogott: rebooted tools-db

January 14

  • 15:10 scfc_de: tools-login: pkill -u local-mlwikisource: Freed 1 GByte of memory
  • 14:58 scfc_de: tools-login: Disabled local-mlwikisource's crontab with explanation
  • 13:57 scfc_de: tools-webserver-02: rm -f /var/log/exim4/paniclog (out of memory errors on 2014-01-10)

January 10

January 9

January 8

  • 13:44 scfc_de: Cleared error states of continuous@tools-exec-05, task@tools-exec-05, task@tools-exec-09

January 7

  • 18:59 scfc_de: tools-login, tools-mail: rm -f /var/log/exim4/paniclog (apparently some artifacts of the LDAP failure)

January 6

  • 14:06 YuviPanda: deleted instance tools-mc, didn't know it had come back from the dead

January 1

  • 13:24 scfc_de: tools-exec-02, tools-master, tools-shadow, tools-webserver-01: Commented out duplicate MariaDB entries in /etc/apt/sources.list and re-ran apt-get update
  • 11:27 scfc_de: tools-webserver-01, tools-webserver-01: rm -f /var/log/exim4/paniclog; out of memory errors
  • 11:18 scfc_de: Emptied /{data/project,home}/.snaplist as the snapshots themselves are not available

December 27

  • 07:39 legoktm: grrrit-wm restart didn't really work.
  • 07:38 legoktm: restarting grrrit-wm, for some reason it reconnected and lost its cloak

December 23

  • 18:30 marktraceur: restart grrrit-wm for subbu

December 21

  • 06:50 scfc_de: tools-exec-01: Commented out duplicate MariaDB entries in /etc/apt/sources.list and re-ran apt-get update

December 19

  • 17:22 marktraceur: deploying grrrit config change

December 17

  • 23:19 legoktm: rebooted grrrit-wm with new config stuffs

December 14

  • 18:13 marktraceur: restarting grrrit-wm to fix its nickname
  • 13:17 scfc_de: tools-exec-08: Purged packages libapache2-mod-suphp and suphp-common (probably remnants from when the host was misconfigured as a webserver)
  • 13:09 scfc_de: tools-dev, tools-login, tools-mail, tools-webserver-01, tools-webserver-02: rm /var/log/exim4/paniclog (mostly out of memory errors)

December 4

  • 22:15 Coren: tools-exec-01 rebooted to fix the autofs issue; will return to rotation shortly.
  • 16:33 Coren: rebooting webproxy with new kernel settings to help against the DDOS

December 1

  • 14:05 Coren: underlying virtualization hardware rebooted; tools-master and friends coming back up.

November 25

  • 21:03 YuviPanda: created tools-proxy-test instance to play around with the dynamicproxy
  • 12:16 wm-bot: petrb: deswapping -login (swapoff -a && swapon -a)

November 24

  • 07:19 paravoid: disabled crontab for user avocato on tools-login, see above
  • 07:17 paravoid: pkill -u avocato on tools-login, multiple /home/avocato/pywikipedia/redirect.py DoSing the bastion

November 14

  • 09:12 ori-l: Added aude to lolrrit-wm maintainers group

November 13

  • 22:36 andrewbogott: removed 'imagescaler' class from tools-login because that class hasn't existed for a year. Which, a year ago is before that instance even existed so what the heck?

November 3

  • 16:49 ori-l: grrrit-wm stopped receiving events. restarted it; didn't help. then restarted gerrit-to-redis, which seems to have fixed it.

November 1

  • 16:11 wm-bot: petrb: restarted terminator daemon on -login to sort out memory issues caused by heavy mysql client by elbransco

October 23

  • 15:19 Coren: deleted tools-tyrant and tools-exec-cyberbot (cleanup of obsoleted instances)

October 20

  • 18:52 wm-bot: petrb: everything looks better
  • 18:51 wm-bot: petrb: restarting apache server on tools-webproxy
  • 18:49 wm-bot: petrb: installed links on -dev and going to investigate what is wrong with apaches, documentation, Coren, please update it

October 15

  • 21:03 Coren: labs-login rebooted to fix the ownership/take issue with success.

October 10

  • 09:49 addshore: tools-webserver-01 is getting a 500 Internal Server Error again

September 23

  • 06:44 YuviPanda: remove unpuppetized install of openjdk-6 packages causing problems in -dev (for bug: 54444)
  • 05:15 legoktm: logging a log to test the log logging
  • 05:13 legoktm: logging a log to test the log logging

September 11

  • 09:39 wm-bot: petrb: started toolwatcher

August 24

  • 18:00 wm-bot: petrb: freed 1600mb of ram by killing yasbot processes on -login
  • 17:59 wm-bot: petrb: killing all python processes of yasbot on -login, this bot needs to run on grid, -login is constantly getting OOM because of this bot

August 23

  • 12:17 wm-bot: petrb: test
  • 12:15 wm-bot: petrb: making pv from /dev/vdb on new nodes
  • 11:49 wm-bot: petrb: syncing packages of -login with exec nodes
  • 11:48 petan: someone installed firefox on exec nodes, should investigate / remove

August 22

  • 01:24 scfc_de: tools-webserver-03: Installed python-oursql

August 20

  • 23:00 scfc_de: Opened port 3000 for intra-Labs traffic in execnode security group for YuviPanda's proxy experiments

August 19

  • 09:52 wm-bot: petrb: deleting fatestwiki tool, requested by creator

August 16

  • 00:16 scfc_de: tools-exec-01 doesn't come up again even after repeat reboots

August 15

  • 15:14 scfc_de: tools-webserver-01: Simplified /usr/local/bin/php-wrapper
  • 14:31 scfc_de: tools-webserver-01: "dpkg --configure -a" on apt-get's advice
  • 14:24 scfc_de: chmod 644 ~magnus/.forward
  • 03:07 scfc_de: tools-webproxy: Temporarily serving 403s to AhrefsBot/bingbot/Googlebot/PaperLiBot/TweetmemeBot/YandexBot until they reread robots.txt
  • 02:02 scfc_de: robots.txt: "Disallow: /"

August 11

  • 03:14 scfc_de: tools-mc: Purged memcached

August 10

  • 02:36 scfc_de: Disabled terminatord on tools-login and tools-dev
  • 02:24 scfc_de: chmod g-w ~whym/.forward

August 6

  • 19:26 scfc_de: Set up basic robots.txt to exclude Geohack to see how that affects traffic
  • 02:09 scfc_de: tools-mail: Enabled rudimentary Ganglia monitoring in root's crontab

August 5

  • 20:32 scfc_de: chmod g-w ~ladsgroup/.forward

August 2

  • 23:45 scfc_de: tools-dev: Installed dialog for testing

August 1

  • 19:57 scfc_de: Created new instance tools-redis with redis_maxmemory = "7GB"
  • 19:56 scfc_de: Added redis_maxmemory to wikitech Puppet variables

July 31

  • 10:50 HenriqueCrang: ptwikis added graph with mobile edits

July 30

  • 19:08 scfc_de: tools-webproxy: Purged popularity-contest and ubuntu-standard
  • 07:32 wm-bot: petrb: deleted local-addbot jobs
  • 02:01 scfc_de: tools-webserver-01: Symlinked /usr/local/bin/{job,jstart,jstop,jsub} to /usr/bin; were obsolete versions.

July 29

  • 15:15 scfc_de: tools-webserver-01: rm /var/log/exim4/paniclog
  • 15:10 scfc_de: Purged popularity-contest from tools-webserver-01.
  • 02:40 scfc_de: Restarted toolwatcher on tools-login.
  • 02:11 scfc_de: Reboot tools-login, was not responsive

July 25

  • 23:37 Ryan_Lane: added myself to lolrrit-wm tool
  • 12:06 wm-bot: petrb: test
  • 07:11 wm-bot: petrb: created /var/log/glusterfs/bricks/ to stop rotatelogs from complaining about it being missing

July 20

  • 15:19 petan: rebooting tools-redis

July 19

  • 07:06 petan: instances were rebooted for unknown reasons
  • 00:42 helderwiki: it works! :-)
  • 00:41 legoktm: test

July 10

  • 18:04 wm-bot: petrb: installing mysqltcl on grid
  • 18:01 wm-bot: petrb: installing tclodbc on grid

July 5

  • 19:38 AzaToth: test
  • 19:36 AzaToth: test for example
  • 18:23 Coren: brief outage of webproxy complete (back to business!)
  • 18:13 Coren: brief outage of webproxy (rollback 2.4 upgrade)

July 3

  • 13:44 scfc_de: Set "HostbasedAuthentication yes" and "EnableSSHKeysign yes" in tools-dev's /etc/ssh/ssh_config
  • 12:58 petan: rebooting -mc it's apparently OOM dying

July 2

  • 16:24 wm-bot: petrb: installed maria to all nodes so we can connect to db even from sge
  • 12:19 wm-bot: petrb: installing packages -- libmediawiki-api-perl libdatetime-format-strptime-perl libbot-basicbot-perl libdatetime-format-duration-perl

July 1

  • 18:39 wm-bot: petrb: started toolwatcher on -login
  • 14:22 wm-bot: petrb: installing following packages on grid: libdata-dumper-simple-perl libhtml-html5-entities-perl libirc-utils-perl libtask-weaken-perl libobject-pluggable-perl libpoe-component-syndicator-perl libpoe-filter-ircd-perl libsocket-getaddrinfo-perl libpoe-component-irc-perl libxml-simple-perl
  • 12:05 wm-bot: petrb: starting toolwatcher
  • 11:40 wm-bot: petrb: tools is back o/
  • 09:42 wm-bot: petrb: installing python -zmg -matplotlib @ dev
  • 03:33 scfc_de: Rebooted tools-login apparently out of memory and not responding to ssh

June 30

  • 17:58 scfc_de: Set ssh_hba to yes on tools-exec-06
  • 17:13 scfc_de: Installed python-matplotlib and python-zmq on tools-login for YuviPanda

June 26

  • 21:16 Coren: +Tim Landscheidt to project admins, local-admin
  • 14:23 wm-bot: petrb: updating several packages on -login
  • 13:43 wm-bot: petrb: killing old instance of redis: Jun15 ? 00:06:49 /usr/bin/redis-server /etc/redis/redis.conf
  • 13:42 wm-bot: petrb: restarting redis
  • 13:28 wm-bot: petrb: running puppet on -mc
  • 13:27 wm-bot: petrb: adding ::redis role to tools-mc - if anything will break, YuviPanda did it :P
  • 09:35 wm-bot: petrb: updated status.php to version which display free vmem as well

June 25

  • 12:34 wm-bot: petrb: installing php5-mcrypt on exec and web

June 24

  • 15:45 wm-bot: petrb: changed colors of root prompt productions vs testing
  • 07:57 wm-bot: petrb: 50527 4186 22830 1 Jun23 pts/41 00:08:54 python fill2.py eats 48% of ram on -login

June 19

  • 12:17 wm-bot: petrb: increasing limit on mysql connections

June 17

  • 17:34 wm-bot: petrb: /var/spool/cron/crontabs/ has -rw------- 1 8006 crontab 1176 Apr 11 14:07 local-voxelbot fixing

June 16

  • 21:23 Coren: 1.0.3 deployed (jobutils, misctools)

June 15

  • 21:40 wm-bot: petrb: there is no lvm on -db which we need as hell - therefore no swap either nor storage for binary logs :( I got a feeling that mysql will die oom soonish
  • 21:39 wm-bot: petrb: db has 5% free RAM eeeek
  • 18:36 wm-bot: root: removed lot of ?audit? logs from exec-04 they were eating too much storage
  • 18:23 wm-bot: petrb: temporarily disabling /tmp on exec-04 in order to set up lvm
  • 18:23 wm-bot: petrb: exec-04 96% / usage, creating a new volume
  • 12:33 wm-bot: petrb: installing redis on tools-mc

June 14

  • 12:35 wm-bot: petrb: updating logsplitter to new version

June 13

  • 21:59 wm-bot: petrb: replaced logsplitter on both apache servers with far more powerful c++ version thus saving a lot of resources on both servers
  • 12:43 wm-bot: petrb: tools-webserver-01 is running quite expensive python job (currently eating almost 1gb of ram) it may need to be fixed or moved to separate webserver, adding swap to prevent machine die OOM
  • 12:22 wm-bot: petrb: killing process 31187 sort -T./enwiki/target -t of user local-enwp10 for same reason as previous one
  • 12:21 wm-bot: petrb: killing process 31190 sort -T./enwiki/target of user local-enwp10 for same reason as previous one
  • 12:17 wm-bot: petrb: killing process 31186 31185 69 Jun11 pts/32 1-13:14:41 /usr/bin/perl ./bin/catpagelinks.pl ./enwiki/target/main_pages_sort_by_ids.lst ./enwiki/target/pagelinks_main_sort_by_ids.lst because it seems to be a bot running on login server eating too many resources

June 11

  • 07:36 wm-bot: petrb: installed libdigest-crc-perl

June 10

  • 13:05 wm-bot: petrb: installing libcrypt-gcrypt-perl
  • 08:45 wm-bot: petrb: updated /usr/local/bin/logsplitter on webserver-01 in order to fix !b 49383
  • 08:45 wm-bot: petrb: updated /usr/local/bin/logsplitter on webserver-01 in order to fix become afcbot 49383
  • 08:44 wm-bot: petrb: updated /usr/local/bin/logsplitter on webserver-01 in order to fix become afcbot 49383
  • 08:25 wm-bot: petrb: fixing missing packages on exec nodes

June 9

  • 20:44 wm-bot: petrb: moved logs on -login to separate storage

June 8

  • 21:24 wm-bot: petrb: installing python-imaging-tk on grid
  • 21:20 wm-bot: petrb: installing python-tk
  • 21:16 wm-bot: petrb: installing python-flickrapi on grid
  • 21:16 wm-bot: petrb: installing
  • 16:49 wm-bot: petrb: turned off wmf style of vi on tools-dev feel free to slap me :o or do cat /etc/vim/vimrc.local >> .vimrc if you love it
  • 15:33 wm-bot: petrb: grid is overloaded, needs to be either enlarged or jobs calmed down :o
  • 09:55 wm-bot: petrb: backporting tcl 8.6 from debian
  • 09:38 wm-bot: petrb: update python requests to version 1.2.3.1

June 7

  • 15:29 Coren: Deleted no-longer-needed tools-exec-cg node (spun off to its own project)

June 5

  • 09:52 wm-bot: petrb: on -dev
  • 09:52 wm-bot: petrb: moving /usr to separate volume expect problems :o
  • 09:41 wm-bot: petrb: moved /var/log to separate volume on -dev
  • 09:31 wm-bot: petrb: houston we have problem, / on dev is 94%
  • 09:28 wm-bot: petrb: installed openjdk7 on -dev
  • 09:00 wm-bot: petrb: removing wd-terminator service
  • 08:39 wm-bot: petrb: started toolwatcher
  • 07:04 wm-bot: petrb: installing maven on -dev

June 4

  • 14:49 wm-bot: petrb: installing sbt in order to fix b48859
  • 13:28 wm-bot: petrb: installing csh on cluster
  • 08:37 wm-bot: petrb: installing python-memcache on exec nodes

June 3

  • 21:40 Coren: Rebooting -login; it's thrashing. Will keep an eye on it.
  • 14:15 wm-bot: petrb: removing popularity contest
  • 14:11 wm-bot: petrb: removing /etc/logrotate.d/glusterlogs on all servers to fix logrotate daemon
  • 09:43 wm-bot: petrb: syncing packages on exec nodes to avoid troubles with missing libs on some etc

June 2

  • 08:39 wm-bot: petrb: installing ack-grep everywhere per yuvipanda and irc

June 1

  • 20:57 wm-bot: petrb: installed this to exec nodes because it was on some and not on others cpp-4.4 cpp-4.5 cython dbus dosfstools ed emacs23 ftp gcc-4.4-base iptables iputils-tracepath ksh lsof ltrace lshw mariadb-client-5.5 nano python-dbus python-egenix-mxdatetime python-egenix-mxtools python-gevent python-greenlet strace telnet time -y
  • 20:42 wm-bot: petrb: installing wikitools cluster wide
  • 20:40 wm-bot: petrb: installing oursql cluster wide
  • 10:46 wm-bot: petrb: created new instance for experiments with sasl memcache tools-mc

May 31

  • 19:17 petan: deleting xtools project (requested by Cyberpower678)
  • 17:24 wm-bot: petrb: removing old kernels from -dev because / is almost full
  • 17:17 wm-bot: petrb: installed lsof to -dev
  • 15:55 wm-bot: petrb: installed subversion to exec nodes 4 legoktm
  • 15:47 wm-bot: petrb: replacing mysql with maria on exec nodes
  • 15:46 wm-bot: petrb: replacing mysql with maria on exec nodes
  • 15:14 wm-bot: petrb: installing default-jre in order to satisfy its dependencies
  • 15:13 wm-bot: petrb: installing /data/project/.system/deb/all/sbt.deb to -dev in order to test it
  • 13:04 wm-bot: petrb: installing bashdb on tools and -dev
  • 12:27 wm-bot: petrb: removing project local-jimmyxu - per request on irc
  • 10:54 wm-bot: petrb: killing process 3060 on -login (mahdiz 3060 1964 88 May30 ? 21:32:51 /bin/nano /tmp/crontab.Ht3bSO/crontab) it takes max cpu and doesn't seem to be attached

May 30

  • 12:24 wm-bot: petrb: deleted job 1862 from queue (error state)
  • 08:26 wm-bot: petrb: updated sql command

May 29

  • 21:05 wm-bot: petrb: running sudo apt-get install php5-gd

May 28

  • 20:00 wm-bot: petrb: installing p7zip-full to -dev and -login

May 27

  • 08:46 wm-bot: petrb: changed config of mysql to use /mnt as path to save binary logs, this however requires server to be restarted

May 24

  • 08:44 petan: setting up lvm on new exec nodes because it is more flexible and allows us to change the size of volumes on the fly
  • 08:28 petan: created 2 more exec nodes, setting up now...

May 23

  • 09:20 wm-bot: petrb: process 27618 on -login is constantly eating 100% of cpu, changing priority to 20

May 22

  • 20:54 wm-bot: petrb: changing ownership of /data/project/bracketbot/ to local-bracketbot
  • 14:28 labs-logs-bottie: petrb: installed netcat as well
  • 14:28 labs-logs-bottie: petrb: installed telnet to -dev
  • 14:02 Coren: tools-webserver-02 now live; / and /cluebot/ moved there

May 21

  • 20:27 labs-logs-bottie: petrb: uploaded hosts to -dev

May 19

  • 13:40 labs-logs-bottie: petrb: killing that nano process seems to be some hang and unattached anyway
  • 12:59 labs-logs-bottie: petrb: changed priority of nano process to 19
  • 12:55 labs-logs-bottie: petrb: local-hawk-eye-bot /bin/nano /tmp/crontab.d4JhUj/crontab eat too much cpu
  • 12:50 petan: nvm previous line
  • 12:50 labs-logs-bottie: petrb: vul alias viewuserlang

May 14

  • 21:22 labs-logs-bottie: petrb: created a separate volume for /tmp on login so that temp files do not fragment root fs and it does not get filled up by them, it also makes it easier to track filesystem usage
  • 13:16 Coren: reboot -dev, need to test kernel upgrade

May 10

  • 15:08 Coren: create tools-webserver-02 for Apache 2.4 experimentation

May 9

  • 04:12 Coren: added -exec-03 and -exec-04. Moar power!!1!

May 6

  • 19:59 Coren: made tools-dev.wmflabs.org public
  • 08:04 labs-logs-bottie: petrb: created a small swap on -login so that users can not bring it to OOM so easily and so that unused memory blocks can be swapped in in order to use the remaining memory more effectively
  • 08:00 labs-logs-bottie: petrb: making lvm from unused disk from /mnt on -login so that we can eventually use it somewhere if needed

May 4

  • 17:50 labs-logs-bottie: petrb: foobar as well
  • 17:47 labs-logs-bottie: petrb: removing project flask-stub using rmtool
  • 15:33 labs-logs-bottie: petrb: fixing missing db user for local-stub
  • 12:51 labs-logs-bottie: petrb: creating mysql accounts by hand for alchimista and fubar

May 2

  • 20:49 labs-logs-bottie: petrb: uploaded motd to exec-N as well, with information which server users connected to

May 1

  • 16:59 labs-logs-bottie: petrb: fixed invalid permissions on /home

April 27

  • 18:54 labs-logs-bottie: petrb: installing pymysql using pip on whole grid because it is needed for greenrosseta (for some reason it is better than python-mysql package)

April 26

  • 23:55 Coren: reboot to finish security updates
  • 08:00 labs-logs-bottie: petrb: patching qtop
  • 07:57 labs-logs-bottie: petrb: added tools-dev to admin host list so that qtop works and fixing the bug of qtop
  • 07:28 labs-logs-bottie: petrb: installing GE tools to -dev so that we can develop new j|q* stuff there

April 25

  • 19:00 Coren: Maintenance over; systems restarted and should be working.
  • 18:18 labs-logs-bottie: petrb: we are getting in troubles with memory on tools-db there is only less than 20% free memory
  • 18:01 Coren: Begin maintenance (login disabled)
  • 13:21 petan: removing local-wikidatastats from ldap

April 24

  • 13:17 labs-logs-bottie: petrb: sudo chown local-peachy PeachyFrameworkLogo.png
  • 11:37 labs-logs-bottie: petrb: created new project stats and cloned acl from wikidatastats, which is supposed to be deleted
  • 11:32 legoktm: wikidatastats attempting to install limn
  • 11:15 labs-logs-bottie: petrb: installing npm to -login instance
  • 07:34 petan: creating project wikidatastats for legoktm addshore and yuvipandianablah :P

April 23

  • 13:32 labs-logs-bottie: petrb: changing permissions of cyberbot and peachy to 775 so that it is easier to use them
  • 12:14 labs-logs-bottie: petrb: qtop on -dev
  • 12:12 labs-logs-bottie: petrb: removed part of motd from login server that got there in a mysterious way

April 19

  • 22:38 Coren: reboot -login, all done with the NFS config. yeay.
  • 17:13 Coren: (final?) reboot of -login with the new autofs configuration
  • 16:24 Coren: (rebooted -login)
  • 16:24 Coren: autofs + gluster = fail
  • 14:45 Coren: reboot -login (NFS mount woes)

April 15

  • 22:29 Coren: also a test; note how said bot knows its place.  :-)
  • 22:14 andrewbogott: this is a test of labs-morebots.
  • 21:49 andrewbogott: this is a test
  • 15:41 labs-logs-bottie: petrb: installing p7zip everywhere
  • 08:00 labs-logs-bottie: petrb: installing dev packages needed for YuviPanda on login box

April 11

  • 22:39 Coren: rebooted tools-puppet-test (no end-user impact): hung filesystem prevents login
  • 07:42 labs-logs-bottie: petrb: removed reboot information from motd
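
Entries in the log above follow the pattern "HH:MM author: message", and entries relayed by a bot (wm-bot or labs-logs-bottie in this log) carry the real author as a second "name:" prefix. A minimal Python sketch for splitting such lines is shown below; the LogEntry type, the parse_entry function, and the RELAY_BOTS set are hypothetical names introduced for illustration, and the relay handling is inferred from the entries on this page rather than from any documented format.

```python
import re
from typing import NamedTuple, Optional

# Bots seen relaying entries in this log (an inference from the page itself).
RELAY_BOTS = {"wm-bot", "labs-logs-bottie"}

# Optional bullet, then "HH:MM author: message".
ENTRY_RE = re.compile(
    r"^\s*(?:•\s*)?(?P<time>\d{2}:\d{2}) (?P<author>[^\s:]+): (?P<message>.*)$"
)
INNER_RE = re.compile(r"^(?P<author>[^\s:]+): (?P<message>.*)$")


class LogEntry(NamedTuple):
    time: str
    author: str
    message: str
    relayed_by: Optional[str]


def parse_entry(line):
    """Parse one log line; return None for date headings and blank lines."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    author, message, relayed_by = m["author"], m["message"], None
    if author in RELAY_BOTS:
        inner = INNER_RE.match(message)
        if inner:
            # The bot only relayed the entry; credit the person it names.
            relayed_by, author, message = author, inner["author"], inner["message"]
    return LogEntry(m["time"], author, message, relayed_by)
```

For example, the line "13:57 wm-bot: petrb: test" would be attributed to petrb with wm-bot recorded as the relay, while a host prefix such as "tools-mail:" after a human author stays part of the message.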


Instances for this project

Resource | Instance Name | Instance Type | Project | Image Id | FQDN | Public IP | Launch Time | Puppet Class | Modification date | Number of CPUs | RAM Size | Amount of Storage
I-000002f1.eqiad.wmflabs | tools-webgrid-02 | m1.xlarge | tools | ubuntu-12.04-precise | i-000002f1.eqiad.wmflabs |  |  | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::webnode | 4 April 2014 13:15:20 | 8 | 16,384 | 160
I-000002f2.eqiad.wmflabs | tools-exec-09 | m1.large | tools | ubuntu-12.04-precise | i-000002f2.eqiad.wmflabs |  | 4 April 2014 12:56:33 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 4 April 2014 13:15:06 | 4 | 8,192 | 80
I-000002f3.eqiad.wmflabs | tools-exec-10 | m1.large | tools | ubuntu-12.04-precise | i-000002f3.eqiad.wmflabs |  | 4 April 2014 12:56:55 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 4 April 2014 13:14:56 | 4 | 8,192 | 80
I-000002e5.eqiad.wmflabs | tools-proxy-test | m1.medium | tools | ubuntu-12.04-precise | i-000002e5.eqiad.wmflabs |  | 1 April 2014 20:48:56 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::proxy | 1 April 2014 20:52:35 | 2 | 4,096 | 40
I-00000274.eqiad.wmflabs | tools-submit | m1.small | tools | ubuntu-12.04-precise | i-00000274.eqiad.wmflabs |  | 22 March 2014 13:54:22 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::submit | 24 March 2014 15:34:17 | 1 | 2,048 | 20
I-000000cb.eqiad.wmflabs | tools-login | m1.medium | tools | ubuntu-12.04-precise | i-000000cb.eqiad.wmflabs | 208.80.155.130 | 28 February 2014 04:25:41 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::bastion, role::labs::bastion | 21 March 2014 12:46:22 | 2 | 4,096 | 40
I-000000d9.eqiad.wmflabs | tools-exec-05 | m1.large | tools | ubuntu-12.04-precise | i-000000d9.eqiad.wmflabs |  | 28 February 2014 04:37:24 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 11 March 2014 00:06:08 | 4 | 8,192 | 80
I-000000cc.eqiad.wmflabs | tools-dev | m1.medium | tools | ubuntu-12.04-precise | i-000000cc.eqiad.wmflabs |  | 28 February 2014 04:28:13 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::bastion, role::labs::bastion | 9 March 2014 00:36:43 | 2 | 4,096 | 40
I-000000cd.eqiad.wmflabs | tools-master | m1.small | tools | ubuntu-12.04-precise | i-000000cd.eqiad.wmflabs |  | 28 February 2014 04:30:15 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::master | 9 March 2014 00:36:42 | 1 | 2,048 | 20
I-000000ce.eqiad.wmflabs | tools-shadow | m1.small | tools | ubuntu-12.04-precise | i-000000ce.eqiad.wmflabs |  | 28 February 2014 04:30:35 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::shadow | 9 March 2014 00:36:40 | 1 | 2,048 | 20
I-000000d0.eqiad.wmflabs | tools-redis | m1.large | tools | ubuntu-12.04-precise | i-000000d0.eqiad.wmflabs |  | 28 February 2014 04:32:28 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::redis | 9 March 2014 00:36:39 | 4 | 8,192 | 80
I-000000d1.eqiad.wmflabs | tools-mail | m1.small | tools | ubuntu-12.04-precise | i-000000d1.eqiad.wmflabs |  | 28 February 2014 04:32:37 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::mailrelay | 9 March 2014 00:36:38 | 1 | 2,048 | 20
I-000000d2.eqiad.wmflabs | tools-webgrid-01 | m1.xlarge | tools | ubuntu-12.04-precise | i-000000d2.eqiad.wmflabs |  | 28 February 2014 04:33:39 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::webnode | 9 March 2014 00:36:36 | 8 | 16,384 | 160
I-000000d3.eqiad.wmflabs | tools-webgrid-tomcat | m1.xlarge | tools | ubuntu-12.04-precise | i-000000d3.eqiad.wmflabs |  | 28 February 2014 04:34:30 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::webnode | 9 March 2014 00:36:35 | 8 | 16,384 | 160
I-000000d4.eqiad.wmflabs | tools-exec-01 | m1.large | tools | ubuntu-12.04-precise | i-000000d4.eqiad.wmflabs |  | 28 February 2014 04:35:02 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 9 March 2014 00:36:32 | 4 | 8,192 | 80
I-000000d5.eqiad.wmflabs | tools-exec-02 | m1.large | tools | ubuntu-12.04-precise | i-000000d5.eqiad.wmflabs |  | 28 February 2014 04:35:47 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 9 March 2014 00:35:57 | 4 | 8,192 | 80
I-000000d6.eqiad.wmflabs | tools-exec-03 | m1.large | tools | ubuntu-12.04-precise | i-000000d6.eqiad.wmflabs |  | 28 February 2014 04:35:43 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 9 March 2014 00:35:55 | 4 | 8,192 | 80
I-000000d8.eqiad.wmflabs | tools-exec-04 | m1.large | tools | ubuntu-12.04-precise | i-000000d8.eqiad.wmflabs |  | 28 February 2014 04:36:59 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 9 March 2014 00:35:52 | 4 | 8,192 | 80
I-000000da.eqiad.wmflabs | tools-exec-06 | m1.large | tools | ubuntu-12.04-precise | i-000000da.eqiad.wmflabs |  | 28 February 2014 04:37:46 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 9 March 2014 00:35:50 | 4 | 8,192 | 80
I-000000db.eqiad.wmflabs | tools-exec-07 | m1.medium | tools | ubuntu-12.04-precise | i-000000db.eqiad.wmflabs |  | 28 February 2014 04:38:11 | base, role::labs::instance, exim::simple-mail-sender, sudo::labs_project, role::labs::tools::execnode | 9 March 2014 00:35:47 | 2 | 4,096 | 40
… further results
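
In the instance listing above, each instance type maps to the same CPU, RAM, and storage figures wherever it appears, so the capacity of the listed instances can be tallied from the type alone. The Python sketch below does that for the 20 instances shown; the names FLAVORS, INSTANCES, totals, and by_type are hypothetical, the units (MB for RAM, GB for storage) are an assumption because the table does not state them, and the listing is truncated ("further results"), so the totals cover only the rows shown here.

```python
from collections import Counter

# (CPUs, RAM, storage) per instance type, copied from the rows above.
# Units are NOT stated in the table; MB and GB are assumed.
FLAVORS = {
    "m1.small": (1, 2048, 20),
    "m1.medium": (2, 4096, 40),
    "m1.large": (4, 8192, 80),
    "m1.xlarge": (8, 16384, 160),
}

# Instance name -> instance type, for the rows listed on this page only.
INSTANCES = {
    "tools-webgrid-02": "m1.xlarge", "tools-exec-09": "m1.large",
    "tools-exec-10": "m1.large", "tools-proxy-test": "m1.medium",
    "tools-submit": "m1.small", "tools-login": "m1.medium",
    "tools-exec-05": "m1.large", "tools-dev": "m1.medium",
    "tools-master": "m1.small", "tools-shadow": "m1.small",
    "tools-redis": "m1.large", "tools-mail": "m1.small",
    "tools-webgrid-01": "m1.xlarge", "tools-webgrid-tomcat": "m1.xlarge",
    "tools-exec-01": "m1.large", "tools-exec-02": "m1.large",
    "tools-exec-03": "m1.large", "tools-exec-04": "m1.large",
    "tools-exec-06": "m1.large", "tools-exec-07": "m1.medium",
}


def totals(instances, flavors=FLAVORS):
    """Sum CPUs, RAM, and storage over the given instances."""
    cpus = ram = storage = 0
    for flavor in instances.values():
        c, r, s = flavors[flavor]
        cpus, ram, storage = cpus + c, ram + r, storage + s
    return {"instances": len(instances), "cpus": cpus, "ram": ram, "storage": storage}


def by_type(instances):
    """Count how many instances use each instance type."""
    return Counter(instances.values())
```

Filtering INSTANCES by name prefix (for example "tools-exec-") before calling totals gives the same tally for just the grid execution nodes.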