Search Platform/Weekly Updates/2023-08-25

Summary

The Search Update Pipeline is on track for a production deployment for end-to-end testing, but a full deployment to all wikis is unlikely before the end of Q1.

All major decisions about deployment dependencies have been made, with approval from the supporting teams (which Kafka cluster to use, the shape of events, storage of state in Swift / Zookeeper). We are experimenting with keeping a decision log on wiki: https://wikitech.wikimedia.org/wiki/Search/Update_Pipeline#Decision_records. Major thanks to the Service Ops, Data Persistence and Data Engineering teams for their support!

What we've accomplished

Search Update Platform

Operations / SRE

  • Multiple reboots for Bullseye and Buster - https://phabricator.wikimedia.org/T344671 / https://phabricator.wikimedia.org/T344587
  • Migration to Debian Bullseye for W[CD]QS servers (29 / 35 done). wdqs[1003-1007,1009].eqiad.wmnet still need migration; all other servers are completed - https://phabricator.wikimedia.org/T343124
  • Failure of the update pipeline for WCQS, caused by someone editing the Main_Page, which is in the entity namespace. The situation is currently resolved, but further work is necessary to ensure that this does not happen again - https://phabricator.wikimedia.org/T344882
  • The Lexeme dump failed this week, which caused some downstream processing to fail. This has been recovered manually. No further action will be taken to improve stability; we are waiting on the general work on dumps.
  • An Elasticsearch reindex was started 10 days ago, but it has only processed up to enwiki. This is much slower than usual and needs investigation. Note that this does not impact users, but it reduces our ability to deploy changes to analyzer chains.

Misc

  • Supported data analysis for the ChatGPT plugin to help find which topics are being returned.