Obsolete:Bots project documentation
All of this was at Nova_Resource:Bots/Documentation but that Labs project is obsolete now, replaced by the Tools project (Tool Labs).
- 1 Architecture
- 2 How to install and run your bot
- 3 How to request a project membership
- 4 How to add a project member
- 5 MySQL
- 6 Requesting packages to be installed
- 7 Getting help
- 8 Servers
- 9 IRC bots
- 10 Pywikipedia bot framework
Architecture
Bots run on application servers. They can write to shared Gluster storage at /data/project/userdata or to the shared NFS server at /mnt/secure. An Apache server serves bot user directories from /data/project/public_html. Bots also have MySQL storage available on bots-bsql01: every user has an account on this MySQL server and can access it by typing mysql -h bots-bsql01. To create a database there, execute call system.create_db("name"); in mysql, replacing name with the database name you want. The user who created the new database will have all SQL rights on it.
Please note: user accounts are created by a cron job that runs every hour. If your account on bots is newly created, it may take a while before you have SQL access. Please wait. If your account still doesn't work after more than 50 minutes, contact a sysadmin.
- bots-bnr(1,2,3) - application node of grid
- bots-4 - testing instance
- bots-cb - cluebot only
Each user has a directory in /data/project/public_html on each app server which can be accessed from outside via bots.wmflabs.org/~user.
How to install and run your bot
Use Oracle Grid Engine to schedule your tasks. Alternatively, you can run your bot manually on any application server.
You need to install your bot in a folder that is accessible on all instances, such as your home directory or /data/project.
Once you install your bot there, you need to create a start script. It can be a simple shell script:
#!/bin/dash
# Replace the variables below to suit your bot
logfile=/data/project/mybot/syslog
echo "Started bot on $(hostname) at $(date)" >> "$logfile"
# Start your bot here; an example is below, commented out
# mono /data/project/mybot/bot.exe >> "$logfile" 2>&1
echo "Stopped the bot at $(date)" >> "$logfile"
exit 0
This simple script starts the bot and exits once the bot has finished. Save it as start.sh, run chmod a+x start.sh, and then schedule it from bots-gs:
qsub -q main.q /data/project/mybot/start.sh
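Before submitting, it can help to confirm that the start script exists and is executable, since a job pointing at a missing script will not run. The following is only a minimal sketch: build_submit_cmd is a hypothetical helper name (it is not part of Grid Engine), and it only prints the qsub command rather than running it, so it can be tried on any machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): validate the start script,
# then print the qsub command that would submit it.
# Usage: build_submit_cmd /path/to/start.sh [queue]
build_submit_cmd() {
    script=$1
    queue=${2:-main.q}   # default queue taken from the example above

    if [ ! -x "$script" ]; then
        echo "error: $script is missing or not executable" >&2
        return 1
    fi

    echo "qsub -q $queue $script"
}
```

On bots-gs, the printed command could then be executed with something like `$(build_submit_cmd /data/project/mybot/start.sh)`.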
Now your bot should be running on the application nodes.
qstat will tell you whether it is running. If it isn't running, you won't see anything; check the log file set in $logfile to find out why:
# In our example, $logfile was set to this file
cat /data/project/mybot/syslog
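Because the example start script writes a "Started bot on ..." line before launching the bot and a "Stopped the bot at ..." line afterwards, the last such marker in the log gives a rough idea of the bot's state. The sketch below assumes exactly that log format; bot_state is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: report the bot's state from the markers that the
# example start script writes to its log file.
# Usage: bot_state /data/project/mybot/syslog
bot_state() {
    logfile=$1

    # A missing or empty log means the start script never ran.
    if [ ! -s "$logfile" ]; then
        echo "never started"
        return
    fi

    # Keep only the start/stop markers and look at the most recent one.
    last=$(grep -E '^(Started bot on|Stopped the bot at) ' "$logfile" | tail -n 1)
    case $last in
        Started*) echo "started, not yet stopped" ;;
        Stopped*) echo "stopped" ;;
        *)        echo "unknown" ;;
    esac
}
```

A state of "started, not yet stopped" only means that no stop marker has been written yet; the job may still have been killed, so it is worth reading the end of the log as well.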
To see an overall summary of the current state of the grid use the command
which displays the number of jobs, loads and any broken queues.
If you are running an interactive bot that must run on a server with minimal load (so that it responds quickly and isn't affected by a lagging system), submit it to the minimalload queue instead:
qsub -q minimalload /data/project/mybot/start.sh
Running by hand
You can install the bot to any storage, including /mnt/share (local storage, shared among users of server). Then, you can just run it as you like.
How to request a project membership
You need to ask a current member. It's easiest to go on IRC and ask there. Everyone who demonstrates even a modicum of trustworthiness can have access.
How to add a project member
Here is a rough guideline for project users for adding new project members.
- Ask nicely what the user wants to do in the bots project (e.g. which bot they want to run).
- Add the user to the project using Special:NovaProject. (scroll to the bots project)
... and you are done! Note that the public_html directories, MySQL login and /mnt/secure are created automatically by cron jobs on bots-apache01 and bots-secure.
MySQL
If you want to use MySQL, there is a shared server called bots-bsql01. A script that runs from cron every hour creates accounts for all users who don't have one yet.
mysql -h bots-bsql01
will connect you to it.
mysql> call system.create_db("quack");
creates a new database "quack" and gives you all grants on it.
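For scripted setups, the create_db call can be generated and piped into the mysql client instead of being typed interactively. This is a minimal sketch: create_db_sql is a hypothetical helper name, and the restriction to letters, digits and underscores is an assumption made here to keep the name safe to embed in the statement, not a documented rule of the server.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL statement that asks the shared server
# to create a database, after a basic sanity check on the name.
# Usage: create_db_sql quack
create_db_sql() {
    case $1 in
        # Assumption: only allow letters, digits and underscores.
        ''|*[!A-Za-z0-9_]*)
            echo "error: invalid database name: $1" >&2
            return 1
            ;;
    esac
    printf 'call system.create_db("%s");\n' "$1"
}
```

On a bots instance, the statement could then be sent to the server with `create_db_sql quack | mysql -h bots-bsql01`.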
There are NO LIMITS for users at the moment. That means you can create as big a database as you need, and your queries can use as many system resources as they need. Just don't abuse it, or we will need to change it. ;)
Requesting packages to be installed
If you need a package installed globally on the bots cluster, you have a few options:
- Ask a sysadmin - we live in #wikimedia-cloud. A list of active sysadmins can be found in the MOTD of each instance.
- Create a puppet class for it.
Getting help
The bots cluster is maintained by the community and the WMF. The list of sysadmins can be found in the MOTD, in the form Real Name (nickname). If you need a sysadmin for anything, just go ahead and ping one on IRC.
Servers
- bots-gs - Grid scheduler - this is the server you should control the grid from
- bots-bnr(1,2,3) - Grid node (4 cores, 8 GB RAM)
- bots-ibnr(1) - Grid node (smaller)
This server runs:
- DrTrigonBot by DrTrigon
- rewrite part, see log: rewrite
- trunk part follows (currently still on TS)
- BeneBot* (.NET) by Bene*
- FIXME by Fastily
- JohnFLBot by John F. Lewis
- VoxelBot by Fox Wilson and Vacation9
This is our main public web server. There is no need to log in to this server to change files in your http://bots.wmflabs.org/~<user>/ directory, as it is already shared across all servers at /data/project/public_html.
A shared SQL server (8 cores, 16 GB RAM)
This server is scheduled for removal. Please don't use it.
This server powers Cluebot, an antivandal bot on the English Wikipedia.
This server runs Labs-specific bots (i.e. labs-morebots, etc.).
- LinkWatcher (m:User:LiWa) is a bot that parses link additions out of every edit, reports them to freenode and stores them in a database on bots-sql2
This server runs Salebot, an antivandal bot by User:Gribeco on frwiki and ptwiki.
- needed for locals in some bot scripts
Another IRC logging bot, written in Python. labs-morebots is the instance providing the !log command. analytics-logbot is another instance.
It runs on bots-labs.
There is an init script for it called "adminbot". (/etc/init.d/adminbot).
It needs /var/run/adminbot as a cache directory and permissions to write to it. It also needs working LDAP to fetch the project list.
- Connect to bots-labs and run /etc/init.d/adminbot start (or restart; check first whether the process is already running)
- Bot dies on !log command
Check that the cache directory described above exists and that the bot user can write to it.
- Bot says <x> is not a valid project
Either you misspelled a project name or there is an LDAP connection issue. Does it also say "Can't contact LDAP for project list"? If yes, check with ops for possible LDAP and/or NFS issues.
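The cache-directory check from the troubleshooting steps above can be sketched as a small shell function. check_cache_dir is a hypothetical helper name, and the check only reflects the permissions of the user running it, so it would need to be run as the bot user.

```shell
#!/bin/sh
# Hypothetical helper: verify that the adminbot cache directory exists and
# is writable by the current user (run it as the bot user).
# Usage: check_cache_dir [directory]
check_cache_dir() {
    dir=${1:-/var/run/adminbot}   # default path taken from the text above

    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable"
        return 1
    fi
    echo "ok"
}
```

If the result is "missing" or "not writable", fixing the directory before restarting the init script is likely to be the first thing to try.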
Pywikipedia bot framework
For operators of Python bots, snapshots (updated daily) of the Pywikipedia framework trunk and rewrite versions are maintained at /data/project/pywikipedia/trunk and /data/project/pywikipedia/rewrite, respectively. Note that these are just the source files; each bot operator will need to create their own configuration files, such as user-config.py, and set up their PYTHONPATH and other environment variables.
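As an illustration of the PYTHONPATH setup mentioned above, a start script could select one of the two snapshots before launching the bot. This is only a sketch: pywikipedia_path is a hypothetical helper name, and any further environment variables or configuration your bot needs are not covered here.

```shell
#!/bin/sh
# Hypothetical helper: map a framework version name to the shared snapshot
# directory described above.
# Usage: pywikipedia_path trunk|rewrite
pywikipedia_path() {
    case $1 in
        trunk|rewrite)
            echo "/data/project/pywikipedia/$1"
            ;;
        *)
            echo "usage: pywikipedia_path trunk|rewrite" >&2
            return 1
            ;;
    esac
}

# Example: put the rewrite snapshot first on PYTHONPATH, keeping any
# existing entries after it.
PYTHONPATH="$(pywikipedia_path rewrite)${PYTHONPATH:+:$PYTHONPATH}"
export PYTHONPATH
```

Your own user-config.py still has to be created separately, as noted above.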