Search Platform/Weekly Updates/2025-03-21

From Wikitech

In Progress

Abandoned Fulltext Queries

  • T375554 Classify fulltext search abandonment: English, French, Spanish
    • Looking at these again. It is hard to find meaningful interpretations of user behavior and categories to group the queries into, but re-running each user's searches and re-enacting their clicks and page visits has given me some ideas.
    • Chugging through the English queries. I've developed an ontology for the aspects I can generally figure out: features of the query, aspects of user behavior, and some aspects of user intent. If all goes well, I hope to finish reviewing and tagging the English sample tomorrow.

MLR Improvements

  • T388549 Vector Search PoC
    • Set up a local testbed (OpenSearch 1.3) that indexes Article Topics vectors. I'm comparing the results of vector search with morelike recommendations. WIP code: https://gitlab.wikimedia.org/gmodena/vector_search
    • Open question (maybe for the Wednesday meeting?): how do we evaluate the results?
      • I'm working on building a benchmark dataset of morelike queries using CirrusSearch request logs.
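One possible starting point for the evaluation question above, sketched here as an illustration rather than as the approach used in the PoC: measure how much the top-k vector search results agree with the top-k morelike results for the same source page. The function names and the sample page titles are hypothetical, and treating morelike as the reference list is an assumption. These scores only measure agreement between the two systems, not relevance.

```python
from typing import Sequence


def overlap_at_k(reference: Sequence[str], candidate: Sequence[str], k: int) -> float:
    """Fraction of the top-k reference results also present in the top-k candidate results."""
    if k <= 0:
        raise ValueError("k must be positive")
    ref_top = set(reference[:k])
    if not ref_top:
        return 0.0
    return len(ref_top & set(candidate[:k])) / len(ref_top)


def jaccard_at_k(a: Sequence[str], b: Sequence[str], k: int) -> float:
    """Symmetric agreement between two top-k result lists (intersection over union)."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_a, top_b = set(a[:k]), set(b[:k])
    union = top_a | top_b
    if not union:
        return 0.0
    return len(top_a & top_b) / len(union)


# Hypothetical result lists for one source page (assumed data, not from the logs).
morelike_results = ["Page A", "Page B", "Page C", "Page D"]
vector_results = ["Page B", "Page E", "Page A", "Page F"]

print(overlap_at_k(morelike_results, vector_results, k=3))
print(jaccard_at_k(morelike_results, vector_results, k=3))
```

Averaging these scores over a benchmark set of morelike queries would give a rough picture of how different the two result sets are. Low agreement is not necessarily bad, so relevance judgments or click data would still be needed to say which system is better.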

Misc / Operations

Done

Elasticsearch -> OpenSearch migration

Airflow -> k8s migration

WDQS Graph Split

Misc / Operations