ServiceOwnership

From Wikitech

As the mobile gateway at en.m becomes more popular, it has become necessary to draft guidelines on what is required from our dedicated operations staff to better support the service. These guides are meant to be a quick go-to reference for anyone who needs to administer the service. Separate pages can be used for anything more specific.

At a bare minimum, each guide should cover the following:

Software

  • Software Stack
    • OS if relevant
    • LAMP
    • Scary Solaris

This should be brief, as it is an operations guide, not a detailed service spec.

Hardware

  • What hosts do we live on?

Link to the hardware pages of the servers.

This list can go out of date very quickly.

Usage

  • Who are the clients?

Health

  • How do we easily identify the state of the application and whether it is meeting its SLAs?
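As an illustration of the kind of check this section could point to, here is a minimal sketch that turns a batch of health probe results into a yes/no answer against an SLA. The `ProbeResult` type, the function names, and the threshold values are all assumptions made for this example; they do not describe an existing tool or an agreed SLA for the service.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ProbeResult:
    """One health probe: did the request succeed, and how long did it take?"""
    ok: bool
    latency_ms: float


def availability(results: Sequence[ProbeResult]) -> float:
    """Fraction of probes that succeeded, between 0.0 and 1.0."""
    if not results:
        raise ValueError("need at least one probe result")
    return sum(1 for r in results if r.ok) / len(results)


def meets_sla(
    results: Sequence[ProbeResult],
    min_availability: float = 0.999,   # hypothetical target, not a real SLA
    max_latency_ms: float = 1000.0,    # hypothetical limit, not a real SLA
) -> bool:
    """True if enough probes succeeded and no successful probe was too slow."""
    too_slow = any(r.ok and r.latency_ms > max_latency_ms for r in results)
    return availability(results) >= min_availability and not too_slow


if __name__ == "__main__":
    # Made-up sample data: nine fast successes and one failure.
    sample = [ProbeResult(True, 120.0)] * 9 + [ProbeResult(False, 0.0)]
    print(f"availability: {availability(sample):.1%}")
    print("meets SLA:", meets_sla(sample, min_availability=0.95))
```

The actual guide should state the real SLA targets and say where the probe data comes from.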

Capacity

  • Can capacity be easily added and if so how?

Dependencies

  • Clearly identify what other services this piece of software requires and what other services require it
    • What happens if the service is intermittent or goes down hard?
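One way to make the dependency question concrete is to record the "requires" relationships as data and derive the "is required by" side from them. The sketch below assumes a hand-maintained map; the service names and the `affected_by` helper are hypothetical and only illustrate the idea.

```python
from collections import deque

# Hypothetical map: each service -> the services it requires directly.
REQUIRES = {
    "mobile-gateway": ["app-servers", "cache"],
    "app-servers": ["database", "cache"],
    "cache": [],
    "database": [],
}


def affected_by(outage, requires=REQUIRES):
    """Return every service that depends, directly or indirectly, on `outage`."""
    # Invert the map so we can look up "who requires this service?".
    dependents = {}
    for service, needs in requires.items():
        for needed in needs:
            dependents.setdefault(needed, set()).add(service)

    # Breadth-first walk outward from the failed service.
    affected = set()
    queue = deque([outage])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in affected:
                affected.add(service)
                queue.append(service)
    return affected


if __name__ == "__main__":
    print(sorted(affected_by("database")))
```

A list like this answers "what breaks if X goes down" at a glance, but it still has to be kept current by hand.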

Start/Restart

  • How do we bring it up and down?
  • How do we migrate quickly if the server is on fire?

Configuration

  • Where do the configuration files exist, both live and in source control?
  • What (additional) actions are needed to make configuration changes go live?

Typical Problems

  • What commonly breaks?
    • And how are we fixing it?

This should not be a section of "this sucks and we just live with it".

Activity

  • How do I see the running activity of the service?
    • Logs
    • Process stats
    • Black Magic

Monitoring

  • How do I track the history of the service's lifetime?

Backups

  • How are backups of any data handled, in addition to the configuration of the service itself?

Escalation Path

  • Who do I call when the problem is far more than I can handle?

Reference Links

  • Pointers to more detailed installation/configuration instructions if needed

Template is available at Template:ServiceOperations