Search/UDP Logger

From Wikitech
This page contains historical information. It is probably no longer true.

UDP Logger

A UDP logger running on searchidx1 collects messages from each of the search servers. Every search query is logged in the following format:

db timestamp search

The UDP logger was added in r51097 and r51324.
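As an illustration of the `db timestamp search` format, here is a minimal parsing sketch. It assumes the three fields are separated by single spaces and that neither the database name nor the timestamp contains a space, so everything after the second space is the search string. The page does not specify the separator or the timestamp layout, and the names `QueryRecord` and `parse_line` are hypothetical.

```python
from typing import NamedTuple


class QueryRecord(NamedTuple):
    """One logged search query (hypothetical helper type)."""
    db: str
    timestamp: str
    search: str


def parse_line(line: str) -> QueryRecord:
    # Assumption: fields are separated by single spaces. Split at most twice
    # so that spaces inside the search string are preserved.
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) != 3:
        raise ValueError(f"not a 'db timestamp search' line: {line!r}")
    return QueryRecord(*parts)
```

Splitting at most twice is the important choice here: a search such as `udp logger` contains a space and would otherwise be cut into separate fields.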

Host

searchidx1

Storage

Files are written to

/a/search/udplogger

Scripts

startlogger.sh, stoplogger.sh - start and stop the process
udplogger.py - main worker for collection
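The source of udplogger.py is not reproduced on this page. The following is only a rough sketch of how a collector of this kind could work, assuming each search server sends one query per UDP datagram as UTF-8 text. The function names, the default address, and the buffer size are illustrative assumptions, not the actual implementation.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Bind a UDP socket. Port 0 lets the OS pick a free port (assumed default)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect(sock, out, max_datagrams=None, bufsize=65535):
    """Append each received datagram to `out` as one line.

    Runs forever when max_datagrams is None; otherwise stops after that
    many datagrams and returns the count.
    """
    count = 0
    while max_datagrams is None or count < max_datagrams:
        data, _addr = sock.recvfrom(bufsize)
        # Assumption: one query per datagram, UTF-8 encoded.
        line = data.decode("utf-8", errors="replace").rstrip("\r\n")
        out.write(line + "\n")
        out.flush()  # keep the file current in case the process is stopped
        count += 1
    return count
```

In a real deployment the listener would bind to an address reachable by the search servers and `out` would be the searchqueries file opened in append mode.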

Logs

All queries are written to searchqueries, including queries from both public and private wikis.

Additions Needed

  • Public/Private filter - add a script to filter out searches from private wikis
  • Log rotation/archiving - rainman_ suggested that, ideally, another script would do the following at midnight: 1) run stoplogger.sh 2) rename searchqueries to searchqueries-CURRENTDATE 3) upload/move this file to some public location 4) run startlogger.sh
    • Wrote a quick log splitter to break up the huge 40GB query file into per-day chunks. Threw away the private searches, archived the chunks in 7za format, and will put them on download after getting approval. --Tomasz Finc 23:57, 31 July 2009 (UTC)
  • Publish data to download.wikimedia.org
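The rotation steps and the day splitter described in the list above could be sketched roughly as follows. This is not the script that was actually written. It assumes that the timestamp field starts with `YYYYMMDD`, that startlogger.sh and stoplogger.sh live in the log directory, and that private wikis can be recognised by database name. `PRIVATE_DBS`, `split_by_day`, and `rotate` are hypothetical names, and the publish step is left out because private searches have to be filtered first.

```python
import datetime
import subprocess
from pathlib import Path

# Hypothetical placeholder; the real list of private wikis is not given here.
PRIVATE_DBS = {"privatewiki"}


def split_by_day(lines, out_dir, private_dbs=PRIVATE_DBS):
    """Write public queries to out_dir/searchqueries-YYYYMMDD, one file per day."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    handles = {}
    try:
        for line in lines:
            parts = line.split(" ", 2)
            if len(parts) != 3 or parts[0] in private_dbs:
                continue  # skip malformed lines and private-wiki searches
            day = parts[1][:8]  # assumption: timestamp starts with YYYYMMDD
            if day not in handles:
                path = out_dir / f"searchqueries-{day}"
                handles[day] = open(path, "a", encoding="utf-8")
            handles[day].write(line if line.endswith("\n") else line + "\n")
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)


def rotate(log_dir, today=None, run=subprocess.check_call):
    """Stop the logger, rename searchqueries to a dated file, restart the logger."""
    log_dir = Path(log_dir)
    today = today or datetime.date.today()
    dated = log_dir / f"searchqueries-{today:%Y%m%d}"
    run([str(log_dir / "stoplogger.sh")])  # assumed script location
    try:
        (log_dir / "searchqueries").rename(dated)
    finally:
        # Restart even if the rename failed, so logging is not left stopped.
        run([str(log_dir / "startlogger.sh")])
    return dated
```

A nightly job could call `rotate("/a/search/udplogger")`, pass the dated file's lines through `split_by_day`, and archive the per-day output once publication has been approved.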