Data Engineering/Show your work

From Wikitech

When working remotely as a team, it is especially important to 'show your work'. At Wikimedia tech, we have several channels for collaboration: Phabricator, code review (Gerrit or GitLab), Slack, IRC, email, mailing lists (with public archives), wikis (mediawiki.org, wikitech.wikimedia.org, meta.wikimedia.org), Miro boards, Asana, Google Docs, calls and meetings, etc. It can be difficult to follow all of these channels, and even more difficult to find past work and decisions.

This page describes how we in Data Engineering (and perhaps WMF Engineering more broadly) should use these channels to collaborate effectively with each other.

tl;dr

  • Showing and documenting our work should be part of doing that work.
  • Phabricator and wikis are the only valid places to document our work.
  • Google docs, Slack, IRC, email, Miro boards, Asana, meetings and code review are only for collaboration, not for documentation.

Why

  • Having the context allows a task to be handed over to someone else or, if it is paused, to be resumed later.
  • The task shows up in search results when others have similar issues or want to know more about a topic.
  • It highlights the scope of work for current and future planning.
  • It shows how close the task is to completion, which helps with planning.
  • It allows others to join the conversation.

Where

Wikis

wikitech.wikimedia.org and mediawiki.org are the canonical places for technical documentation. All final documentation should be on a wiki. These wiki docs can and should link to Phabricator tasks for folks who want to see the context around how decisions were made and solutions implemented.

Phabricator

Phabricator should be used for project and task tracking, and as the main place to document our day-to-day work. When working on a Phabricator task, we should use it to post status updates on how the task is going, and as a discussion log between the people working on it. Because Phabricator conversations are public and available forever, they allow people in the future to find the context of how and why certain work was done.

Code review

Code review should be used for directly reviewing and collaborating on code that we commit. Sometimes, conversations in code review also end up being the kind of conversations someone not involved in the code review might want to follow or find in the future. When this happens, a summary of the conversation should be posted on the Phabricator task.

Realtime and email

The same applies to email, Slack, IRC, and meetings. Realtime collaboration is necessary, but conversations in Slack and IRC rooms are siloed and difficult to find later. Summaries of important conversations and decisions should be posted on the Phabricator task.

Avoid direct messages for discussing and deciding on work. Direct messages are useful when you want to discuss sensitive things, or perhaps for early vetting of an idea. Because direct messages are private, using them to make decisions hides the context and complexity involved in the decision, and removes the opportunity for someone with more context to weigh in. Instead, use a relevant public IRC room, a shared Slack workspace, or, even better, a Phabricator ticket.

Google docs

Google Docs can be used for technical design documents and project planning, but ultimately the final version of a design document should be put on either wikitech or mediawiki.org. Once it has been moved to a wiki, the Google Doc should have a header with a note that the final doc has been migrated to a wiki. The wiki can link back to the Google Doc as well.

Example

Let's take a real example of a Phabricator task where the folks involved 'showed their work'.

T321088 - Add support for jupyterhub on conda-analytics

  • This task's description of work is clear. This clear description made it possible for concerned users to ask questions.
  • The code review has conversations about the code itself, but those conversations don't reflect anything meaningful to the task in question.
  • After some code review has happened, Xabriel summarizes the status and next steps. This allows anyone following along to quickly understand.
  • Xabriel documents a testing plan that he discussed elsewhere with Ben. There was realtime back and forth to draft this plan, but the final version was posted to Phabricator. This might allow a future engineer to find out how this had been done in the past, and repeat it.
  • While deploying, a problem was encountered. The problem is described and a workaround is noted. Ben later fixes the problem and describes how.
  • Another user following along was able to refer to a past discussion in a different Phabricator ticket, in order to make an improvement to this work. If this past discussion had not been in Phabricator, it would have been harder to summarize and link to the context.
  • Ben and Xabriel continue to collaborate and document their work on Phabricator. The next time this kind of work needs to be done, or if someone needs to figure out how and why this was deployed in this way, they will have the information they need.