pdf1.wikimedia.org

The pdf1.wikimedia.org, pdf2.wikimedia.org and pdf3.wikimedia.org servers used to run the PediaPress / mwlib software for the Collection/Book tool. They were shut down on 2014-10-03 in favour of OCG.

Until rt 1100 is resolved, the most up-to-date documentation is on the PediaPress wiki.

Hardware

The service currently resides on pdf1.

Software

  • Python mini web server on port 8080 + Python backend
    • mwlib -> backend libraries
    • mwlib.rl -> PDF output tool + mw-serve mini-server
  • swap-watchdog to trigger reboot if/when swap death occurs from memory leak

Usage

  • Starting initial testing end of May 2008

Notes

  • The cache and logs may need manual cleanup; see Maintenance and cleanup.

Setup / Recovery

Dependencies:

apt-get install \
  build-essential \
  python-imaging python-dev python-flup python-setuptools python-simplejson \
  subversion mercurial \
  re2c \
  tetex-bin tetex-extra ploticus \
  mediawiki-math

Add the user mwlib. The cron job and the service should both run under this account.


Init script:

Get: http://svn.wikimedia.org/svnroot/mediawiki/trunk/tools/mw-serve/mw-serve.sh
Put it in /etc/init.d/mw-serve
Set up appropriate S* and K* symlinks in /etc/rc?.d/:

# Make the init script executable without leaving it world-writable.
# (chmod on a symlink changes the target, so one chmod here is enough.)
chmod 755 /etc/init.d/mw-serve
# Kill links for halt (0), single-user (1) and reboot (6).
for level in 0 1 6; do
  ln -s ../init.d/mw-serve /etc/rc$level.d/K25mw-serve
done
# Start links for the multi-user runlevels.
for level in 2 3 4 5; do
  ln -s ../init.d/mw-serve /etc/rc$level.d/S25mw-serve
done

DNS setup:
  pdf1.wikimedia.org -> pdf1 (or whatever server is running the service.)

mwlib stuff:

Set up directories for the cache and log files:
  mkdir -p /opt/mwlib/var/log
  mkdir -p /opt/mwlib/var/run
  mkdir -p /opt/mwlib/var/cache/pdfserver
  mkdir -p /opt/mwlib/var/cache/python-eggs
  chown -R mwlib:mwlib /opt/mwlib/
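
After creating the directories, a quick check can confirm the layout before the service is started. The following is a minimal sketch; check_mwlib_dirs is a hypothetical helper, not part of mwlib, and it only tests the four leaf directories listed above.

```shell
#!/bin/sh
# Sketch: report any mwlib directory that is missing or not writable
# by the current user. check_mwlib_dirs is a hypothetical helper.
check_mwlib_dirs() {
    base=${1:-/opt/mwlib}
    rc=0
    for dir in var/log var/run var/cache/pdfserver var/cache/python-eggs; do
        if [ ! -d "$base/$dir" ]; then
            echo "missing: $base/$dir"
            rc=1
        elif [ ! -w "$base/$dir" ]; then
            echo "not writable: $base/$dir"
            rc=1
        fi
    done
    return $rc
}
```

Run it as the account that will own the service, for example with su mwlib -c, so that the writability test reflects that account's permissions.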

Install the mwlib release version (into /usr):

  easy_install mwlib
  easy_install mwlib.rl
  easy_install mwlib.ext

Add a cron job for cache clearing in /etc/cron.hourly (the file must be executable, or run-parts will skip it):
   touch mw-serve
   chmod 755 mw-serve
   vim mw-serve
   Insert the following into the file:
      #!/bin/sh
      su mwlib -c "mw-serve --clean-cache 24 --cache-dir '/opt/mwlib/var/cache/pdfserver/'" www-data >> /opt/mwlib/var/log/cache-cleaning

Add log rotation job
   (TODO: add this info)
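
The log rotation setup was never documented here. As a starting point only, the sketch below writes a logrotate snippet for the one log file this page names, /opt/mwlib/var/log/cache-cleaning. The helper name write_logrotate_conf, the target path /etc/logrotate.d/mw-serve and the rotation policy (weekly, four rotations) are all assumptions, not the production configuration; any daemon log files would need to be added by name.

```shell
#!/bin/sh
# Sketch: write a logrotate snippet for the cache-cleaning log.
# write_logrotate_conf is a hypothetical helper; the policy below
# (weekly, keep four compressed rotations) is an assumed default.
write_logrotate_conf() {
    target=${1:-/etc/logrotate.d/mw-serve}
    cat > "$target" <<'EOF'
/opt/mwlib/var/log/cache-cleaning {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
EOF
}
```

The snippet names the file explicitly rather than using a wildcard, because a bare * pattern would also match files that logrotate has already rotated.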

Install the swap-watchdog script into /usr/local/bin
Add to the /etc/rc.local script:
   /usr/local/bin/swap-watchdog &
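
The swap-watchdog script itself is not reproduced on this page. To illustrate the idea of rebooting when swap fills up, here is a rough sketch of how such a watchdog could measure swap usage. It is not the real script: swap_used_percent is a hypothetical helper, and the 90% threshold and 60-second interval in the commented loop are assumed values.

```shell
#!/bin/sh
# Sketch only: this is NOT the real swap-watchdog script.
# swap_used_percent is a hypothetical helper that reads a file in
# /proc/meminfo format and prints the percentage of swap in use
# (truncated to an integer; 0 when no swap is configured).
swap_used_percent() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END {
             if (total > 0) printf "%d\n", (total - free) * 100 / total
             else print 0
         }' "${1:-/proc/meminfo}"
}

# Hypothetical main loop, left commented out because it reboots the host:
# while sleep 60; do
#     [ "$(swap_used_percent)" -ge 90 ] && /sbin/reboot
# done
```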

Upgrading mwlib

We're currently running release versions of the mwlib code, which means they're all nicely packaged up for us and can be installed and upgraded via the Python 'easy_install' tool:

  /etc/init.d/mw-serve stop && easy_install -U mwlib && easy_install -U mwlib.rl && easy_install -U mwlib.ext && /etc/init.d/mw-serve start
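
One caveat with the one-liner above: because the steps are chained with &&, a failed easy_install leaves mw-serve stopped. The sketch below is one possible wrapper that stops at the first failed upgrade but still restarts the service. upgrade_mwlib and the RUN variable are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Sketch: upgrade the three packages, then start mw-serve again even
# when an upgrade step fails. upgrade_mwlib is a hypothetical wrapper;
# set RUN=echo to print the commands instead of running them.
upgrade_mwlib() {
    run=${RUN:-}
    rc=0
    $run /etc/init.d/mw-serve stop || return 1
    for pkg in mwlib mwlib.rl mwlib.ext; do
        # Stop upgrading at the first failure, but fall through to start.
        $run easy_install -U "$pkg" || { rc=1; break; }
    done
    $run /etc/init.d/mw-serve start || rc=1
    return $rc
}
```

Setting RUN=echo before calling upgrade_mwlib gives a dry run that lists the stop, upgrade and start commands in order.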

Software reference

Starting the server

To start:

/etc/init.d/mw-serve start

To stop:

/etc/init.d/mw-serve stop

To restart:

/etc/init.d/mw-serve restart

Should work. :D

Log files and PID files under /opt/mwlib/var/ must be writable by www-data, the user the daemon runs as.

Maintenance and cleanup

The daemon does not appear to garbage-collect cached output while it runs. Cleanup can instead be done by a batch job that runs mw-serve with the --clean-cache and --cache-dir options. This isn't quite set up yet, so be warned that the server may run out of disk space.
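
Until cache cleanup is reliable, it may help to watch disk usage on the filesystem that holds the cache. The sketch below is one way to do that with df; the helper names and the default 90% limit are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: print the percentage of disk space used on the filesystem
# that holds the cache. cache_fs_used_percent is a hypothetical helper.
cache_fs_used_percent() {
    # df -P prints one POSIX-format line per filesystem; field 5 is "NN%".
    df -P "${1:-/opt/mwlib/var/cache/pdfserver}" |
        awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds when usage is at or above the limit (default 90, an assumed
# threshold); returns 2 when the usage could not be determined.
cache_fs_nearly_full() {
    used=$(cache_fs_used_percent "$1")
    [ -n "$used" ] || return 2
    [ "$used" -ge "${2:-90}" ]
}
```

A cron job could call cache_fs_nearly_full and send a warning when it succeeds.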

The logs should also be rotated.