User:Russell Blau/Using pywikibot on Labs

From Wikitech

This page describes my process for setting up a tool to run the core branch of pywikipediabot on the 'Tools' project.

The files for the core branch (old name: rewrite) are available at /shared/pywikipedia/core. (Note: the trunk branch is also available, but I'm not using it, so this page doesn't describe it. Also, the techniques described here should work on the 'Bots' project, but the pathname formats there are different at this writing.)

In the following examples, "local-tool@tools-login$" is the system's shell prompt; what you actually see on your screen will use your actual tool name.

On Tools: See Nova Resource:Tools/Help and User:Magnus Manske/Migrating from toolserver to get an account and set up a tool project. Once you have done that, run "become toolname", replacing "toolname" with the name of your tool.

Step 1. Set path variable.

The easiest way to do this is with a file called .bash_profile in your home directory. Use emacs, vi, or your other favorite editor to create this file (or edit it if it already exists). Add the following line to the file (this should be all on one line, even if your screen width makes it appear to be multiple lines):

export PYTHONPATH=/shared/pywikipedia/core:/shared/pywikipedia/core/externals/httplib2:/shared/pywikipedia/core/scripts

Save the file. This will update your settings for all future shell sessions, but you also need to import these settings into your current session by typing:

local-tool@tools-login$ source .bash_profile
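To confirm that the setting took effect, a short Python check along the following lines may help. This is only a sketch: the helper name missing_pythonpath_entries is made up for this page, and the EXPECTED list simply repeats the three directories from the export line above.

```python
import os

# The three directories from the export line in Step 1.
EXPECTED = [
    "/shared/pywikipedia/core",
    "/shared/pywikipedia/core/externals/httplib2",
    "/shared/pywikipedia/core/scripts",
]


def missing_pythonpath_entries(pythonpath, expected=EXPECTED):
    """Return the expected directories that are absent from a PYTHONPATH string."""
    present = [p for p in pythonpath.split(os.pathsep) if p]
    return [d for d in expected if d not in present]


if __name__ == "__main__":
    missing = missing_pythonpath_entries(os.environ.get("PYTHONPATH", ""))
    if missing:
        print("Missing from PYTHONPATH: " + ", ".join(missing))
    else:
        print("PYTHONPATH looks complete.")
```

If the script reports missing entries, the most likely causes are a line break inside the export line or a session that has not yet sourced .bash_profile.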

Step 2. Create a subdirectory in your home directory for bot-related files:

local-tool@tools-login$ mkdir .pywikibot

Don't leave out the '.' at the beginning; it's important!

Step 3. Install user-config.py.

If you already have a user-config.py that works for you on another system, you can copy it into your .pywikibot directory. (If you have a user-fixes.py file that you want to use, you can also copy that to the same directory now.) If not, you can enter the following command at the main prompt:

local-tool@tools-login$ python /shared/pywikipedia/core/generate_user_files.py

Then follow the prompts to create the file.
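For orientation, a bare-bones user-config.py usually needs little more than a family, a language code, and a username entry. The sketch below generates such text; render_user_config is a hypothetical helper written for this page, "ExampleBot" is a placeholder account name, and the file that generate_user_files.py writes will probably contain more settings than this.

```python
def render_user_config(family, lang, username):
    """Return the text of a bare-bones user-config.py (illustrative only)."""
    lines = [
        "# -*- coding: utf-8 -*-",
        # Default wiki family and language code the bot works on.
        "family = {0!r}".format(family),
        "mylang = {0!r}".format(lang),
        # Account the bot logs in with on that family and language.
        "usernames[{0!r}][{1!r}] = {2!r}".format(family, lang, username),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_user_config("wikipedia", "en", "ExampleBot"))
```

Calling render_user_config("wikipedia", "en", "ExampleBot") returns the coding line followed by family = 'wikipedia', mylang = 'en', and usernames['wikipedia']['en'] = 'ExampleBot', one per line.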

Step 4. TEST.

At this point, you should be able to run the following command successfully:

local-tool@tools-login$ python /shared/pywikipedia/core/scripts/version.py

This should print something similar to the following on your terminal:
Pywikibot [http] branches/rewrite (r11526, 2013/05/12, 18:51:23, OUTDATED)
Python 2.7.3 (default, Aug  1 2012, 05:14:39)
[GCC 4.6.3]
unicode test: ok

The remaining steps are optional, but will make it much easier to run bot scripts from the shell prompt or from SGE jobs.

Step 5. Password file.

If you want your bot to run unattended, this is essential. Use a text editor to create a file in your .pywikibot directory; you can call it whatever you want, but ".passwd" or "passwords" is probably easiest. The format of this file is defined as follows:
       All lines should be valid Python tuples in the form
       (code, family, username, password) or (username, password)
       to set a default password for a username. Default usernames
       should occur above specific usernames.

       Example:

       ("my_username", "my_default_password")
       ("my_sysop_user", "my_sysop_password")
       ("en", "wikipedia", "my_en_user", "my_en_pass")

Important: After you save the file, you must run the following command to prevent other users from reading your password!

local-tool@tools-login$ chmod 600 .pywikibot/passwords

Replace "passwords" with whatever name you gave your password file.
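If you would rather create the file from a script than from an editor, one option is to give it restrictive permissions at the moment it is created, so it is never readable by others even briefly. The following is a minimal sketch of that idea; write_private_file and is_private are hypothetical helper names, not part of pywikibot.

```python
import os
import stat


def write_private_file(path, text):
    """Create (or overwrite) path with mode 0600 and write text to it."""
    # Creating with mode 0o600 means a new file is never group- or
    # world-readable, not even between creation and a later chmod.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    # If the file already existed with looser permissions, tighten them.
    os.chmod(path, 0o600)


def is_private(path):
    """Return True if neither group nor others have any access to path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

The is_private check is equivalent to confirming that "ls -l" shows -rw------- (or something stricter) for the file.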
Then you must also edit your .pywikibot/user-config.py file; add to the end a line like:

password_file = "/data/project/toolname/.pywikibot/passwords"

Again, replace "passwords" with the name you gave your password file, and replace "toolname" with the name of your tool. If you copied user-config.py from another installation, it may already have a "password_file" line in it, which you'll need to delete and replace with the format shown above.
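Before relying on the password file for unattended runs, it can be worth checking that every line really is a two-item or four-item tuple, as the format description in this step requires. The sketch below does that with ast.literal_eval, which evaluates literals only and never runs code from the file. It is not pywikibot's own parser; parse_password_lines is a hypothetical name, and skipping blank lines is a choice made for this sketch.

```python
import ast


def parse_password_lines(text):
    """Parse password-file text into a list of tuples.

    Raises ValueError for any line that is not a 2- or 4-item tuple.
    """
    entries = []
    for number, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored in this sketch
        try:
            entry = ast.literal_eval(line)
        except (SyntaxError, ValueError):
            raise ValueError("line {0}: not a valid Python tuple".format(number))
        if not isinstance(entry, tuple) or len(entry) not in (2, 4):
            raise ValueError("line {0}: expected 2 or 4 items".format(number))
        entries.append(entry)
    return entries
```

A file that passes this check can still contain a wrong password, of course; the check only catches formatting mistakes such as a missing quote or comma.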

Step 6. Link to scripts.

This step creates a shortcut to the directory that contains the standard bot scripts; you don't absolutely need this, but it saves a lot of typing. Without it, you have to type out the full /shared/pywikipedia/core/scripts every time you want to run a script. Enter the following command:

local-tool@tools-login$ ln -s /shared/pywikipedia/core/scripts

Now instead of python /shared/pywikipedia/core/scripts/replace.py, you can just type python scripts/replace.py.

Another approach, which can be even easier, is to edit the .bash_profile file from Step 1 above. Add a line like the following (or, if you already have a PATH= line, edit it to include the path to the scripts directory):

export PATH=$PATH:/shared/pywikipedia/core/scripts

By doing this, you can simply type replace.py at the shell prompt, instead of the longer command lines shown above.

Miscellaneous

The last five dumps of all Wikipedias and other projects are located beneath /public/datasets/public (cf. Help:Shared storage#Public datasets).

See Nova Resource:Tools/Help#The grid engine for info on how to submit jobs. You shouldn't be running your bots from the command line, except for trivial tests as in Step 4.