Search Platform/Weekly Updates/2023-12-15

From Wikitech

Summary

We're fixing a few more bugs in the Search Update Pipeline and improving our understanding of its operations.

Graph split is well on track, with data loaded onto test servers and the starting point of a test framework in place.

Next week is expected to be very quiet, with multiple team members already starting their end-of-year vacation.

What we've accomplished

Search Update Pipeline

Improve multilingual zero-results rate

  • Reviewing real-world data for the ICU tokenizer and the required token repairs. Most results are good, but we might need to exclude some specific languages or scripts.
  • Thinking about the internal representation of allowed and denied language combinations for ICU token repair, and how to specify them in the config - https://phabricator.wikimedia.org/T332337

WDQS graph splitting

Operations

Misc