Search Platform/Weekly Updates/2026-01-09

From Wikitech

Highlights

CirrusSearch dumps (the contents of the OpenSearch index) are no longer sourced via a MediaWiki maintenance script. Instead, we use Airflow/Hadoop/Spark to produce the dumps, similar to the content dumps (dumps 2.0).

Search via Action API now supports natural sorting by title. However, UI support on Special:Search is still pending.
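For anyone who wants to try the new sort order before the UI catches up, here is a minimal sketch of building such a request. It assumes the standard `list=search` module and its `srsort` parameter. The sort value `title_natural_asc`, the enwiki endpoint, and the helper name `build_search_url` are illustrative assumptions, not confirmed by this update. Check the API help page of the target wiki for the actual accepted values.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; any wiki running CirrusSearch should work.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

# ASSUMPTION: placeholder name for the natural title sort order.
# Verify the accepted srsort values on the wiki's API help page.
NATURAL_TITLE_SORT = "title_natural_asc"


def build_search_url(query, sort=NATURAL_TITLE_SORT, limit=10, endpoint=API_ENDPOINT):
    """Return an Action API search URL (hypothetical helper, no request is sent)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,   # full-text search query
        "srsort": sort,      # requested sort order
        "srlimit": limit,    # maximum number of results
        "format": "json",
        "formatversion": 2,
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("Chapter"))
```

The sketch only builds the URL, so it can be pasted into a browser or passed to any HTTP client.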

Besides that, work on WE3.1.17 (semantic/vector search) continues. So far, we have processed enwiki, dewiki, and simplewiki with different LLMs and tokenization strategies. Thanks to the ML team, we are able to offload the creation of embeddings for content and queries to a dedicated service on Lift Wing.

Shipped: Did we release anything this week?


Blockers: Does this essential workstream have any unresolved blockers or dependencies? Is anything preventing us from doing our work?

N/A

Lessons learned: Did we learn anything in the course of doing this work that can be applied to other work?

N/A

Community collab: Did we do anything in this essential workstream this week in collaboration with the community or because the community asked us to?

N/A

What else was accomplished in this essential workstream this week?