Search Platform/Weekly Updates/2023-02-24

From Wikitech

Summary

We finally have a successful data reload for WDQS. It still needs to be copied over to all servers. The process remains fragile and problematic: it took 2.5 months to complete a task that should take at most a couple of weeks.


What we've accomplished

Spark 3 migration

  • RDF (Java / Scala) jobs are running on Spark 3. This was one of the more complex jobs to migrate - https://phabricator.wikimedia.org/T327381
  • Migrated drop_old_data_daily from Airflow 1 to 2; this needed some additional fixes to date handling.

Search Update Pipeline

Operations / SRE

Misc