User:Ottomata/Storage for serving wish
2025 Storage for serving, what Andrew wishes for.
(These are from rough notes I took in a meeting)
- Storage for serving team(s) (Data Persistence, etc.?) do platform product management to understand the most important product needs for storage, ideally even before storage is requested.
- Each year (or quarter?), Data Persistence (DP) reviews OKRs to look for dependencies. They then reach out to teams early to understand what requests will have in common, and to prioritize work that fills platform gaps.
- Similar to FY 25/26 APP Review for Second Order Dependencies on DPE (copy from Virginia's original)
- Storage for serving teams operate a short menu of standardized storage technologies. These storage systems must be readable by MediaWiki, either directly or via something like Data Gateway.
- E.g. Maybe WMF provides:
- “NoSQL” (Cassandra) ✅
- RDBMS (MariaDB) ⛔
- Search (OpenSearch) ❓
- Simple K/V store ⛔❓
- Unstructured / object / file store (Ceph) ⛔
- NOTE: ^ these are just possibilities. Actual menu and techs are up to Storage for serving teams.
- Data Engineering works with Storage for serving teams to automate (“derived”) data transfer to storage systems, in batch and/or real time (as needed).
- Existing automated data transfer capabilities:
- Hive -> Cassandra insert - batch ✅
- Page Search index tags updates - batch & stream ✅
- …anything else is bespoke.
- Storage for Serving Intake process prompts developers to:
- Start from feature requirements and/or query patterns.
- Ask about granularity, timeliness needs, write patterns, etc.
- Include an “expiration date” for the data product. The product team can extend this at any time, but it allows SRE to remove data & services if a product is no longer owned.
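
To make the intake prompts above a bit more concrete, here is a rough sketch of what an intake record with an expiration date might look like. Everything in it is an assumption for illustration only: the `StorageIntakeRequest` class, its field names, and the `removal_candidates` helper are hypothetical and do not describe any existing WMF form or tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class StorageIntakeRequest:
    """Hypothetical intake record; all field names are illustrative."""
    product: str
    owner_team: str
    feature_requirements: List[str]  # start from what the feature needs
    query_patterns: List[str]        # how the data will be read
    granularity: str                 # e.g. "per page per day"
    timeliness: str                  # e.g. "daily batch" or "near real time"
    write_pattern: str               # e.g. "bulk load" or "streaming updates"
    expires_on: date                 # the "expiration date" for the data product

    def is_expired(self, today: date) -> bool:
        # The product is still valid on its expiration date itself.
        return today > self.expires_on

    def extend(self, new_expiry: date) -> None:
        # The product team can push the date out at any time.
        if new_expiry <= self.expires_on:
            raise ValueError("new expiry must be later than the current one")
        self.expires_on = new_expiry


def removal_candidates(requests: List[StorageIntakeRequest], today: date) -> List[str]:
    """List products whose expiration date has passed without an extension."""
    return [r.product for r in requests if r.is_expired(today)]
```

In this sketch, an expired record is only a signal that SRE may follow up on; whether removal is automatic or needs a human check is left open, as in the notes above.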