Search Platform/Weekly Updates/2023-03-31

From Wikitech

Summary

We are wrapping up the quarter, doing some cleanup, and getting ready for next quarter. Overall, I'm very happy with what we got done this quarter. We were overly ambitious with our Search Update Pipeline goal, a problem we identified early in the quarter.

Some highlights:

  • We are well over our SLO for WDQS uptime over the last 90 days (https://grafana.wikimedia.org/d/l-3CMlN4z/wdqs-uptime-slo). We still have a few struggles with the official SLO dashboard and its performance (https://grafana.wikimedia.org/d/slo-WDQS/wdqs-slo-s).
  • The plan around scaling WDQS has been communicated (https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update/March_2023_scaling_update) and discussed in 2 office hours meetings with our communities. These raised very important questions about the possibility of optimizing how we use item descriptions, which could significantly reduce the amount of data stored in WDQS.
  • More analyzers have been unpacked, including Japanese, Armenian, Latvian, Hungarian, Bulgarian, Lithuanian, Persian, Romanian and Sorani. Turkish and Brazilian are the only 2 analyzers left to unpack. The main goal of this work was to have a coherent way of configuring analysis chains across languages, which will allow us to apply future improvements across all the languages we support. A side benefit is that when we unpack an analyzer, we already introduce some standard behaviours that improve language support. You can review the full unpacking notes for more details about individual language improvements: https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Unpacking_Notes
  • The migration to Spark 3 is complete (on the last day of the quarter - we've never been that precise with any estimate). This was a good collaboration exercise with the Data Engineering team. We fumbled a little bit with coordination at first, but the results are there. We are now running all of our DAGs on Spark 3 and on a brand new Airflow 2 instance. This should help Data Engineering keep their platform up to date, and reduce our operating burden, since this new Airflow instance is aligned with what Data Engineering provides and its operation is handed over to the Data Engineering team.
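
For readers unfamiliar with the term, "unpacking" an analyzer (see the highlight above) means replacing a built-in, monolithic language analyzer with an equivalent custom analyzer whose tokenizer and filters are listed explicitly, so that shared filters can be slotted into the chain. The following is only a minimal sketch, assuming the generic Elasticsearch pattern of a standard tokenizer, a stop filter and a stemmer. The function name, the "text" analyzer name and the choice of decimal_digit as the shared filter are illustrative assumptions, not the actual CirrusSearch configuration, and some languages need additional filters that are not shown here.

```python
import json


def unpacked_analyzer(language, extra_filters=()):
    """Build index settings for a custom analyzer that spells out the
    components of a built-in language analyzer (hypothetical helper)."""
    stop_name = f"{language}_stop"
    stemmer_name = f"{language}_stemmer"
    return {
        "analysis": {
            "filter": {
                # Built-in stopword list for the language, e.g. "_romanian_".
                stop_name: {"type": "stop", "stopwords": f"_{language}_"},
                # Built-in stemmer for the language.
                stemmer_name: {"type": "stemmer", "language": language},
            },
            "analyzer": {
                # "text" is an assumed analyzer name for this sketch.
                "text": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Shared filters are inserted after lowercasing; the
                    # position is illustrative only.
                    "filter": ["lowercase", *extra_filters, stop_name, stemmer_name],
                }
            },
        }
    }


settings = unpacked_analyzer("romanian", extra_filters=["decimal_digit"])
print(json.dumps(settings, indent=2))
```

Because every language goes through the same builder, a change to the shared part of the chain only has to be made once, which is the coherence benefit described above.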

What we've accomplished

Search - Analysis

Spark 3 Upgrade

Search Update Pipeline

Operations / SRE

Misc

  • OKRs for Q4 are almost ready, pending a final review by the team.
  • The WDQS Stabilization plan has been communicated, with 2 follow-up office hours to answer questions. One of those was at an Australia-compatible time, which was very well received by a few Australian community members (we too rarely schedule office hours at a time that works for them).
  • WDQS Office hours raised an interesting data duplication issue that might be reasonable to address. A number of item descriptions could be automatically inferred instead of being entered into Wikidata. See https://phabricator.wikimedia.org/T303677