User talk:Giuseppe Lavagetto/MicroServices


Development

All “upstream” services needed to run a specific service should be deployable automatically on the developer’s workstation, preferably via vagrant (as we do currently with mediawiki-vagrant). The number of “upstream” services cannot thus be big, or we’d ask developers to use supercomputers - this also works well with the suggestion of avoiding horizontal layers.

Not necessarily true: services can be deployed to vagrant in "mock" mode, emitting canned responses. Once you want to work on a service, you override the mock with a version that is able to return real data. Nuria (talk) 16:51, 4 February 2015 (UTC)
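To make the "mock mode" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ServiceRegistry` class, the `make_mock` helper and the service name are hypothetical and are not part of mediawiki-vagrant or any existing tooling.

```python
from typing import Callable, Dict

# A handler takes a request string and returns a response dict.
Handler = Callable[[str], dict]


def make_mock(canned: dict) -> Handler:
    """Build a handler that ignores the request and emits a canned response."""
    def handler(request: str) -> dict:
        # Copy the canned payload and flag it so callers can tell it is fake.
        return dict(canned, mock=True)
    return handler


class ServiceRegistry:
    """Hypothetical registry mapping upstream service names to handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register_mock(self, name: str, canned: dict) -> None:
        # Default state on a developer workstation: every upstream is a mock.
        self._handlers[name] = make_mock(canned)

    def override(self, name: str, handler: Handler) -> None:
        # Swap in a real implementation for the one service being worked on.
        self._handlers[name] = handler

    def call(self, name: str, request: str) -> dict:
        return self._handlers[name](request)


def real_parser(request: str) -> dict:
    """Stand-in for a real service; here it just upper-cases the input."""
    return {"html": request.upper(), "mock": False}


registry = ServiceRegistry()
registry.register_mock("parser", {"html": "<p>canned</p>"})
mock_response = registry.call("parser", "hello")

registry.override("parser", real_parser)
real_response = registry.call("parser", "hello")
```

With this shape only the service under development needs real resources; every other upstream stays a cheap canned responder.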

Monitoring

Also, a global “transaction ID” should be attached to any request from a user, and be propagated along in the cluster, and be logged.

+10,000. I cannot stress enough how important this is. Nuria (talk)
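A rough sketch of what propagating and logging such an ID could look like, in Python. The header name `X-Transaction-Id` and the helper functions are assumptions made for this example, not an existing convention in our cluster.

```python
import logging
import uuid
from typing import Dict, Optional

# Hypothetical header name, chosen for this sketch only.
TXN_HEADER = "X-Transaction-Id"


def ensure_transaction_id(headers: Dict[str, str]) -> str:
    """Reuse the incoming transaction ID, or mint one at the edge."""
    txn_id = headers.get(TXN_HEADER)
    if not txn_id:
        txn_id = uuid.uuid4().hex  # 32 hex characters
    return txn_id


def outgoing_headers(txn_id: str,
                     extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Headers for a call to another service, carrying the same ID along."""
    headers = dict(extra or {})
    headers[TXN_HEADER] = txn_id
    return headers


def handle_request(headers: Dict[str, str],
                   log: logging.Logger) -> Dict[str, str]:
    """Log the ID for this hop and return headers for the next hop."""
    txn_id = ensure_transaction_id(headers)
    log.info("txn=%s handling request", txn_id)
    return outgoing_headers(txn_id)


# The edge service sees no ID and creates one; the backend reuses it.
edge_out = handle_request({}, logging.getLogger("edge"))
backend_out = handle_request(edge_out, logging.getLogger("backend"))
```

Because every hop logs the same `txn=` value, a single user request can be traced across services by searching the logs for that one string.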

As part of this effort, teams must own their services from the top down, and should be the first level of oncall paging for their own services - this is by the way already happening at least for parsoid, so we’re on a not-so-bad track with this.

Agreed. This is crucial.

References

Hmm... I'm not sure how Amazon's problems migrating to SOA in 2003 would apply to us. Amazon's scale at the time in terms of building dynamic pages was, even then, orders of magnitude bigger than Wikipedia's is now, as most of our content is cached with a high TTL. That said, I think one thing from Steve's article does apply to us: for Amazon the hardest part was not developing the services but developing the dependency, deployment and monitoring systems needed to keep 'service building' separated from 'service infrastructure'. Nuria (talk)
I tend to agree that our scale is nowhere near comparable in terms of traffic on the backends (luckily...), but I think those principles still apply to us, such as the risk of DoSing one another; for instance, parsoid jobs regularly almost killed our API cluster pre-HHVM. Giuseppe

Steve Yegge rant

Thanks for that link... it was interesting and fun reading! Subramanya Sastry (talk) 03:12, 11 February 2015 (UTC)