Tool:DrTrigonBot/HowTo and Information

From Wikitech

General Information

HowTo's

Login

$ ssh drtrigon@tools-login.wmflabs.org
drtrigon@tools-login:~$ become drtrigonbot
local-drtrigonbot@tools-login:~$

Copy files

As an example, we want to copy the directory ~/source to tools-login.wmflabs.org:/data/project/drtrigonbot/dest so that it is owned by local-drtrigonbot.

Log in and create temporary directories:

$ ssh drtrigon@tools-login.wmflabs.org
drtrigon@tools-login:~$ mkdir temp
drtrigon@tools-login:~$ cd temp
drtrigon@tools-login:~/temp$ mkdir dest
drtrigon@tools-login:~/temp$ exit

Copy the files to the server using scp:

$ scp -r ~/source/* drtrigon@tools-login.wmflabs.org:~/temp/dest

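This may take a while. Note that the * glob skips hidden files (dotfiles). To include them, copy the whole directory instead (it then arrives as ~/temp/source rather than ~/temp/dest):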

$ scp -r ~/source drtrigon@tools-login.wmflabs.org:~/temp
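The dotfile caveat can be reproduced locally without any server. The following is a minimal sketch, assuming a POSIX shell with default globbing; the directory and file names are made up for illustration, and cp stands in for scp, since in both cases the local shell expands the * before the copy command runs.

```shell
# Local demonstration only: no network access, all names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/glob_dest" "$demo/dir_dest"
touch "$demo/source/visible.txt" "$demo/source/.hidden"

# The shell expands * before cp runs; names starting with a dot are not matched.
cp -r "$demo"/source/* "$demo/glob_dest"

# Copying the directory itself carries everything, but nests it as source/.
cp -r "$demo/source" "$demo/dir_dest"

# Count entries including dotfiles (ls -A omits only . and ..).
glob_count=$(ls -A "$demo/glob_dest" | wc -l | tr -d ' ')
dir_count=$(ls -A "$demo/dir_dest/source" | wc -l | tr -d ' ')
echo "glob copy: $glob_count entries, directory copy: $dir_count entries"
```

The glob copy ends up without .hidden, while the directory copy contains both files under an extra source/ level.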

Log in again and copy them into the project directory:

$ ssh drtrigon@tools-login.wmflabs.org
drtrigon@tools-login:~$ cp -r ~/temp /data/project/drtrigonbot
drtrigon@tools-login:~$ ls -la /data/project/drtrigonbot

As you can see, the files are there but have the wrong owner. Some of them may already be readable and even writable by the group, but to make sure, enter:

drtrigon@tools-login:~$ chmod -R g+rw /data/project/drtrigonbot/temp

From here on you can also use e.g. mc (Midnight Commander).

Take over ownership and finalize:

drtrigon@tools-login:~$ become drtrigonbot
local-drtrigonbot@tools-login:~$ cp -r ~/temp/dest ~

Copying also changes the ownership, so everything is fine now. It is important to copy the files in this step, not move them, so that new files are created under the correct ownership.
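Why copying matters can be illustrated locally: a move within the same file system only renames the existing file, so it keeps its inode and with it the original owner, while a copy creates a new file that belongs to whoever runs cp. The following is a minimal sketch with made-up file names; it compares inode numbers, since switching users is not possible in a standalone example.

```shell
# Local demonstration only: all names are hypothetical.
demo=$(mktemp -d)
echo "data" > "$demo/original"

# Print the inode number of a file (first field of "ls -i").
inode_of() { ls -i "$1" | awk '{print $1}'; }

orig_inode=$(inode_of "$demo/original")

# cp creates a brand-new file (new inode, owned by the user running cp).
cp "$demo/original" "$demo/copied"
copy_inode=$(inode_of "$demo/copied")

# mv inside one file system is a rename: same inode, hence same owner as before.
mv "$demo/original" "$demo/moved"
moved_inode=$(inode_of "$demo/moved")

echo "original: $orig_inode, copied: $copy_inode, moved: $moved_inode"
```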

Clean up:

local-drtrigonbot@tools-login:~$ rm -r ~/temp
local-drtrigonbot@tools-login:~$ exit
drtrigon@tools-login:~$ rm -r ~/temp

Alternatively, use mc for the clean-up, which is less risky!

Debugging

Work through the following steps in order, starting from the first, until the bug is solved.

  1. Check which error occurred and where. Use the panel (see status under DrTrigonBot#Information; compare #Status explanation) or the (debug) output of the web tools.
  2. Check SGE online (see server under DrTrigonBot#Information). If this does not help, log in (see #Login) and use qstat and qacct as described in Nova Resource:Tools/Help#Submitting, managing and scheduling jobs on the grid.
  3. ...
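The grid check from step 2 could look roughly like the sketch below. It assumes the standard SGE commands qstat and qacct are available after logging in; job_name is a hypothetical placeholder, not an actual DrTrigonBot job name, and the guard only exists so that the snippet does nothing harmful on a machine without SGE.

```shell
# job_name is a hypothetical placeholder; use the name or id of the job in question.
job_name="example-job"

if command -v qstat >/dev/null 2>&1; then
    # List the pending and running jobs of the current account.
    qstat
    # Show accounting records (exit status, run times) of finished jobs with that name.
    qacct -j "$job_name"
    grid_checked=yes
else
    echo "qstat not found; run this on tools-login after 'become drtrigonbot'"
    grid_checked=no
fi
```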

Status explanation

Current:

ok: everything is fine, the bot operates as expected (not running at the moment)
  • message: 'DONE'
  • logfile changed within the last 1d
n/a: the bot is in one of the following states:
  • running at the moment (default for continuously running bots)
  • finished with an error (but should start again next time)
  • in an unknown state that might need attention (e.g. crashed silently)
The conditions for n/a are:
  • message: any
  • logfile changed within the last 1d
off: the bot is down or switched off
  • message: any
  • logfile has not changed for more than 1d
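The rules for the current states can be written down as a small decision function. This is only a sketch of the logic described above, not the code the panel actually uses; the function name bot_status and its two arguments (the last message and the path to an existing logfile) are assumptions made for illustration.

```shell
# bot_status MESSAGE LOGFILE
# Hypothetical helper: prints ok, n/a or off according to the rules above.
# Assumes LOGFILE exists.
bot_status() {
    message=$1
    logfile=$2
    # find -mtime +0 matches files last modified at least 24 hours ago.
    if [ -n "$(find "$logfile" -mtime +0)" ]; then
        echo "off"      # logfile has not changed for more than 1d
    elif [ "$message" = "DONE" ]; then
        echo "ok"       # recent logfile and message 'DONE'
    else
        echo "n/a"      # recent logfile, any other message
    fi
}
```

For example, bot_status "DONE" /path/to/logfile prints ok only if the logfile changed within the last day.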

Proposal for a new scheme (not in use at the moment), intended to make debugging easier and thus faster, especially for 'n/a' states (debugging is the main purpose):

Usual operational states:
  • ok: everything is fine, the bot operates as expected (not running at the moment)
  • run: the bot is running at the moment (default for continuously running bots)
Issue and error states:
  • n/a: the bot is in an unknown state that might need attention (e.g. crashed silently)
  • err: the bot has finished with an error (but should start again next time)
Off and stand-by states:
  • off: the bot is down or switched off (has not run or been started for some time)

Alternatives for the 'n/a' icon: see commons:Category:Dots/Pog-Set. Maybe swap the colors of 'n/a' and 'err': which one is more severe? In which order do the states occur during a run?

Some of the interpreting code (e.g. subster) needs to change because of the image changes: off → err. Even worse, all image links have changed to the .svg.png variant.

bots architecture [partly obsolete]

Using Agent Forwarding

$ eval `ssh-agent`
$ ssh-add ~/.ssh/your_key_file_for_labs
$ ssh -A drtrigon@bastion.wmflabs.org

#once in bastion
$ ssh bots-4
#or
$ ssh bots-4.pmtpa.wmflabs

Using ProxyCommand

See Help:Getting Started#Using ProxyCommand.
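As a rough idea of what a ProxyCommand setup looks like, the sketch below writes a small SSH client configuration to a temporary file. The host names and user are taken from the agent forwarding example above; the exact settings recommended for this environment are an assumption here and should be checked against the linked help page.

```shell
# Write a hypothetical client configuration to a temporary file
# (in practice this would go into ~/.ssh/config).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host bots-4.pmtpa.wmflabs
    User drtrigon
    # Tunnel the connection through the bastion host (%h = target host, %p = port).
    ProxyCommand ssh -W %h:%p drtrigon@bastion.wmflabs.org
EOF

echo "Connect with: ssh -F $cfg bots-4.pmtpa.wmflabs"
```

With such an entry the instance is reached in one step, without logging in to the bastion host manually first.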

Logs / Logging

Every project has a Server Admin Log. All of these logs are also combined into a single Server Admin Log that shows the entries of all projects, ordered by most recent activity. You can add a log entry in the #wikimedia-cloud IRC channel with the following command:

!log <projectname> <message>
!log bots <message>

or from the terminal:

$ log text

It's best to log all changes to the instances in your project. Doing so makes it easier for others to follow your progress, and to more easily join your project and help out.

Server (OS)

In a root/sudo environment (e.g. bots-4), install software packages from the shell:

sudo apt-get install ...

In a non-root/sudo environment (e.g. bots-apache01), software packages are installed through gerrit; see bug 53704.

(git clone ...)
(adopt according sub-bugs of bug 53704)
(push for review)
(notify somebody and wait)
