Search Platform/Weekly Updates/2024-04-19

From Wikitech

Summary

We had an incident on WDQS in the codfw datacenter, which required manual investigation and intervention, taking up significant time this week. The user impact was mitigated by depooling this data center and serving all traffic from the eqiad datacenter.

A few highlights for the week:

  • Communication about the graph split was sent, we're waiting for feedback until May 15, after which we will freeze the graph split strategy
  • We have a few stability fixes to the Search Update Pipeline. It has been deployed on all wikis on Cloudelastic and we will shortly start enabling it for the production search clsuter.

What we've accomplished

Improve multilingual zero-results rate

WDQS Graph Split

Search Update Pipeline

Operations

  • Incident on WDQS@codfw where the streaming updater failed and reverted to an expensive reconciliation process for all changes, reducing the overall throughput enough that the updater could not catch up with the edit rate. We don't have a full understanding of the root cause, but we do have a few improvements to the general stability of the system - https://phabricator.wikimedia.org/T362508