Portal:Cloud VPS/Admin/Alerts

Alerts that may be routed to WMCS-team (or, as of now, WMCS-bots):

Nova:

    • Nova-Fullstack (labnet) - Launch a "full" test of instance creation
    • nova-network (labnet) - handle dynamic NAT and networking gateway
    • nova-api (labnet) - main API gateway for interacting with nova (creation, deletion, etc)
    • nova-scheduler (labcontrol) - schedule and launch instances
    • nova-compute - handles setup and tear down of instances on hypervisor
    • nova-conductor - DB broker for nova components other than nova-api

Glance:

    • glance-api-http (control) - image management for instances

Keystone:

    • projects and users
      • check-novaobserver-membership - Make sure 'novaobserver' has 'observer' everywhere
      • check-novaadmin-membership - Make sure 'novaadmin' has 'projectadmin' and 'user' everywhere
      • check-keystone-projects - Verify service projects
    • services
      • keystone-http-${auth_port} - admin API port availability (the alert gives little context)
      • keystone-http-${public_port} - public API port availability (the alert gives little context)
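
The membership checks above come down to "this user must hold this role in every project". A minimal sketch of that logic follows, assuming the role assignments have already been fetched into a plain dictionary. The function name, the data layout, and the sample project names are hypothetical; this is not the actual check script.

```python
def projects_missing_role(assignments, user, role):
    """Return the sorted names of projects where `user` does not hold `role`.

    `assignments` is an assumed layout: project name -> user name -> set of roles.
    """
    return sorted(
        project
        for project, members in assignments.items()
        if role not in members.get(user, set())
    )


# Hypothetical sample data, for illustration only.
assignments = {
    "tools": {
        "novaobserver": {"observer"},
        "novaadmin": {"projectadmin", "user"},
    },
    "deployment-prep": {
        "novaobserver": set(),
        "novaadmin": {"projectadmin"},
    },
}

# Projects where 'novaobserver' is missing the 'observer' role.
print(projects_missing_role(assignments, "novaobserver", "observer"))
```

An empty result would mean the check passes. Any project names returned are the ones an alert would need to report.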

Designate (these services can be restarted):

    • check_designate_api_process: service api for DNS changes
    • designate-api-http: api external monitoring
    • check_designate_sink_process
    • check_designate_central_process
    • check_designate_mdns
    • check_designate_pool-manager

Labstore:

    • nfsd-exports - sets up /etc/export.d/ files for instances in cloud
    • interfaces - network saturation, inbound and outbound
    • ldap - there is a scheme to use LDAP for groups without making the entire system an LDAP client.
    • secondary - checks specific to the 'secondary' Toolforge DRBD/NFSd cluster

Toolforge:

   modules/icinga/manifests/monitor/toollabs.pp
    • tools-proxy - reverse proxy for all web tools
    • tools-checker-self - reverse proxy for the host that actually runs the checks. This is how Toolforge is currently monitored from production Icinga.
    • tools-checker-ldap - without LDAP, Toolforge crumbles.
    • tools-checker-labs-dns-private - verify resolution for internal DNS from within Toolforge
    • tools-checker-nfs-home - NFS /home test (really a subpath of a single export shared by project and home)
    • tools-checker-grid-start-trusty - starting and running a process on grid
    • tools-checker-etcd-flannel - etcd is the backend for flannel which is our networking overlay for k8s
    • tools-checker-etcd-k8s - etcd is the persistent data store for k8s itself
    • tools-checker-k8s-node-ready - check whether k8s thinks its worker nodes are healthy
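
As an illustration of what a node readiness check has to decide, here is a small sketch that scans a Kubernetes node list for workers whose "Ready" condition is not "True". It assumes the input is a dictionary shaped like the JSON from `kubectl get nodes -o json`. The function name and the sample node names are hypothetical, and this is not the tools-checker implementation.

```python
def not_ready_nodes(node_list):
    """Return names of nodes that do not report a Ready condition of "True"."""
    failing = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        is_ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in conditions
        )
        if not is_ready:
            failing.append(node["metadata"]["name"])
    return failing


# Hypothetical sample, trimmed to the fields the function reads.
sample = {
    "items": [
        {
            "metadata": {"name": "worker-a"},
            "status": {"conditions": [{"type": "Ready", "status": "True"}]},
        },
        {
            "metadata": {"name": "worker-b"},
            "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]},
        },
    ]
}

print(not_ready_nodes(sample))
```

A node with no "Ready" condition at all is treated as unhealthy. That choice is deliberate: a check like this should fail loudly instead of assuming health.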

Docs:

   * https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Infrastructure
   * https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin