SRE Data Center Operations
DC Operations | About | Projects & Workboards | IRC: #wikimedia-dcops
HW Troubleshooting | HW Specific Documentation
This landing page serves as an index of DC Ops team-specific Wikitech pages. Data Center Operations also maintains a landing page in Phabricator.
SLAs
We make every attempt to resolve all tasks and requests in a timely manner. We've implemented the following SLA targets.
Please note that none of these SLAs start until both conditions are met: the start event listed in the table has occurred, and the task carries the proper project tags. See the details for each type of task request in its section below. Please use the templates listed below.
Project | Business Days to Resolve | SLA start |
---|---|---|
Procurement | 90 | Date of Task filing |
Racking/Installation | 30 | Arrival of Hardware to DC site |
Hardware Failure / Repair | 10 | Date of Task filing |
Decommission | 45 | When all sub-team steps are complete and task is assigned to on-site |
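As an illustration of how the targets in the table translate into dates, the following is a minimal sketch that counts business days forward from an SLA start date. It assumes that a business day is Monday through Friday, that holidays are ignored, and that the start day itself is not counted; none of these rules are stated on this page. The names `SLA_BUSINESS_DAYS`, `add_business_days`, and `sla_due_date` are hypothetical and are not part of any DC Ops tooling.

```python
from datetime import date, timedelta

# SLA targets in business days, copied from the table above.
SLA_BUSINESS_DAYS = {
    "Procurement": 90,
    "Racking/Installation": 30,
    "Hardware Failure / Repair": 10,
    "Decommission": 45,
}


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Assumption: business days are Monday-Friday and holidays are
    ignored. The start day itself is not counted.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


def sla_due_date(project: str, sla_start: date) -> date:
    """Return the target resolution date for a project type.

    `sla_start` is the event from the "SLA start" column, for example
    the task filing date or the hardware arrival date.
    """
    return add_business_days(sla_start, SLA_BUSINESS_DAYS[project])


if __name__ == "__main__":
    # A hardware repair task filed on Monday 2024-01-01.
    print(sla_due_date("Hardware Failure / Repair", date(2024, 1, 1)))
```

Under these assumptions, a repair task filed on a Monday would be due on the Monday two weeks later. A real calculation would also need to account for site holidays and for the project-tag condition described above.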
Other Info
Common Data center Specifications
SRE/Infrastructure Naming Conventions - shared SRE Department document of host-name standards.
Insetup roles per team:
- Core Platform Team: role::insetup::core_platform
- Data Engineering: role::insetup::data_engineering
- Data Persistence: role::insetup::data_persistence
- Infrastructure Foundations: role::insetup::infrastructure_foundations
- Machine Learning: role::insetup::machine_learning
- Observability: role::insetup::observability
- Search Platform: role::insetup::search_platform
- Serviceops: role::insetup::serviceops
- Traffic: role::insetup::traffic
- WMCS SREs: role::insetup::wmcs
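The team-to-role mapping above can also be kept as a small lookup table, for example in a script that checks which insetup role a newly racked host should receive. This is only a sketch: the dictionary contents are copied from the list above, while the names `INSETUP_ROLES` and `insetup_role` are hypothetical and do not refer to existing tooling.

```python
# Insetup Puppet role per team, copied from the list above.
INSETUP_ROLES = {
    "Core Platform Team": "role::insetup::core_platform",
    "Data Engineering": "role::insetup::data_engineering",
    "Data Persistence": "role::insetup::data_persistence",
    "Infrastructure Foundations": "role::insetup::infrastructure_foundations",
    "Machine Learning": "role::insetup::machine_learning",
    "Observability": "role::insetup::observability",
    "Search Platform": "role::insetup::search_platform",
    "Serviceops": "role::insetup::serviceops",
    "Traffic": "role::insetup::traffic",
    "WMCS SREs": "role::insetup::wmcs",
}


def insetup_role(team: str) -> str:
    """Return the insetup role for a team, or raise ValueError if unknown."""
    try:
        return INSETUP_ROLES[team]
    except KeyError:
        known = ", ".join(sorted(INSETUP_ROLES))
        raise ValueError(f"Unknown team {team!r}; known teams: {known}") from None


if __name__ == "__main__":
    print(insetup_role("Traffic"))
```

Raising an error for an unknown team, rather than guessing a role name from the team name, avoids silently assigning a role that does not exist.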
Runbooks
- Hardware Troubleshooting Runbook
- Securely Erasing Media
- Rebuilding SW Raid with SGDISK & MDADM
- Platform Specific Documentation - vendor specific hardware details (Dell, HP, OpenGear, ServerTech)