Research/Archive
This page is currently a draft. More information and discussion about changes to this draft can be found on the talk page.
Services and applications
- Revision scoring
- Article recommendations
  - Project documentation
  - Application: https://recommend.wmflabs.org
- Wiki Labels
  - Project documentation
  - Application: https://labels.wmflabs.org
Productization
In coordination with Ops, we defined a series of stages, each with corresponding requirements, for turning a service like Revision Scoring into a "productized service", i.e., a service that, in addition to its other use cases, can be used by the Wikimedia Foundation's products and features.
Stage 0 (Architectural discussion) | |
---|---|
Horizontal scalability | Allow additional load to be taken up by simply adding new instances |
Caching | Decide on a caching and cache invalidation strategy |
SPOF spotting / planning | Draw out the general architecture, find single points of failure (SPOFs), and think of ways to mitigate them |
Stage 1 (Implementation) | |
Actually build the code! | This is actual development, start building stuff! |
Staging environment | Provide an environment with the same setup as the production environment, for testing purposes |
Deployment system | Allow deploying new changes with confidence that they can be rolled back if they fail |
Puppetized setup | Allow spinning up new instances quickly |
Comprehensive logging | Identify bugs and errors more easily |
Stage 2 | |
Metrics monitoring | Define the metrics that should be sent to graphite-labs.wikimedia.org (for example, the number of revisions processed per minute, per wiki, the percentage cached, etc.) |
Scale hardening | Measures that reduce the number of pages, for example, controlling the acceptance of new web requests when the celery queue is full. (This step is not required for all services.) |
Stage 3 | |
API documentation | Document the API endpoints and their usage more comprehensively |
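
The scale hardening example in Stage 2 (controlling the acceptance of new web requests when the celery queue is full) can be sketched roughly as follows. This is a minimal illustration, not the Revision Scoring implementation: the function names, the `max_pending` threshold, and the choice of HTTP status codes are all assumptions, and the queue depth is supplied by a caller-provided function rather than read from a real Celery broker.

```python
from typing import Callable, Tuple


def admit_request(pending_tasks: int, max_pending: int) -> bool:
    """Return True if a new web request may still be enqueued."""
    return pending_tasks < max_pending


def handle_request(get_pending_tasks: Callable[[], int],
                   enqueue: Callable[[], None],
                   max_pending: int = 100) -> Tuple[int, str]:
    """Enqueue the work, or shed load when the queue is full.

    get_pending_tasks and enqueue are hypothetical hooks; in a real
    service they would talk to the task queue.
    """
    if not admit_request(get_pending_tasks(), max_pending):
        # Fail fast instead of letting the backlog grow without bound.
        return 503, "queue full, retry later"
    enqueue()
    return 202, "accepted"


if __name__ == "__main__":
    queue = []  # stand-in for the real task queue
    for _ in range(3):
        print(handle_request(lambda: len(queue),
                             lambda: queue.append("job"),
                             max_pending=2))
```

Rejecting requests early keeps the backlog bounded, so a traffic spike degrades into fast, explicit failures rather than a growing queue.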
WIP
Guidelines for the following areas are a work in progress:
- Dependencies: We should define best practices around dependencies. Some considerations: installing them system-wide vs. isolating them, using virtual environments, debianizing packages, etc.
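
As one illustration of the "isolating them" option, the sketch below creates a virtual environment with Python's standard `venv` module. It assumes a Python service; the helper name `create_isolated_env` and the directory name are hypothetical, and this is not a recommendation of one approach over the others listed above.

```python
import os
import tempfile
import venv


def create_isolated_env(env_dir: str) -> str:
    """Create a virtual environment and return the path to its pyvenv.cfg."""
    # system_site_packages=False keeps system-wide packages out of the environment.
    # with_pip=False avoids depending on ensurepip being available on the host.
    venv.create(env_dir, system_site_packages=False, with_pip=False)
    return os.path.join(env_dir, "pyvenv.cfg")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_isolated_env(os.path.join(tmp, "service-env"))
        with open(cfg) as f:
            print(f.read())
```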