Obsolete:Ms1 troubles

This page contains historical information. It may be outdated or unreliable.

Mysterious extra load of some sort on ms1?

  • NFS very slow
    • -> Apaches getting bogged down by processes which needed to access the files -- "very slow"
    • -> Image scalers getting insanely bogged down, unable to do anything
  • Turning off the web server on ms1 relieved the pressure.
    • -> Old images still served from cache
    • -> New images just not loading, though...
    • -> NFS things now loading fast, so regular Apaches fine.
  • Attempted to switch in lighttpd from the Sun Java Web Server that was in there; had lots of trouble.
    • Served files out, but very ssssllloowwwwlllyyyy.
    • Connections filled up quickly.
    • Got a mysterious error message (tried patching it); unsure whether it was relevant.
  • Switching webserver7 (the Sun Java Web Server) back in just slammed the load and broke everything again.
  • After some fiddling back and forth, eventually...
    • Switched styles back to Apaches & disabled CentralNotice temporarily to reduce load
    • Waited until after peak and switched back to JWS
  • It was pretty harsh for a few minutes, but then settled down.
  • Not really sure what was wrong. :(


Diagnosis tools we want:

  • "web top" -> a good tool to filter the live squid proxy logs and show the top hits, top requesters, etc., so we can spot suspicious input patterns. This is very hard to get from just grepping the raw log.
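
A rough sketch of what a "web top" could look like, in Python. This is only an illustration, not an existing tool: the names `tally` and `report` are made up here, and the field positions assume Squid's native access.log layout (time, elapsed, client, action/code, size, method, URL, ...). If the squids log in a custom format, the two field indexes would need adjusting.

```python
import sys
from collections import Counter

# ASSUMPTION: Squid native access.log layout, whitespace-separated:
# time elapsed client action/code size method URL ident hierarchy type
CLIENT_FIELD = 2
URL_FIELD = 6


def tally(lines):
    """Count requests per URL and per client address."""
    urls, clients = Counter(), Counter()
    for line in lines:
        fields = line.split()
        if len(fields) <= URL_FIELD:
            continue  # skip truncated or malformed lines
        clients[fields[CLIENT_FIELD]] += 1
        urls[fields[URL_FIELD]] += 1
    return urls, clients


def report(counter, title, n=10):
    """Format the n most frequent keys of a Counter as a small text table."""
    rows = [title]
    for key, count in counter.most_common(n):
        rows.append(f"{count:8d}  {key}")
    return "\n".join(rows)


# Only runs when a log file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        urls, clients = tally(log)
    print(report(urls, "Top URLs"))
    print(report(clients, "Top requesters"))
```

This works on a snapshot of the log (for example, the last few thousand lines saved to a file) rather than a truly live stream; a live version would re-run the tally over a sliding window.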