Search Platform/Weekly Updates/2025-03-21
In Progress
Abandoned Fulltext Queries
- T375554 Classify fulltext search abandonment: English, French, Spanish
- Looking at these again. Struggling to find meaningful interpretations of user behavior and categories to group them in, but I have been getting some ideas by re-searching like the user did and re-enacting their clicks and page visits
- Chugging through English queries. I've developed an ontology for aspects I can generally figure out, including features of the query, aspects of user behavior, some aspects of user intent. If all goes well, I hope to be done reviewing and tagging the English sample tomorrow.
MLR Improvements
- T360536 Increase retention of training data
- All patches have been merged and deployed. I'm waiting for the next mjolnir DAG run to validate that everything works as expected.
Vector based search
- T388549 Vector Search PoC
- Set up a local testbed (OpenSearch 1.3) that indexes Article Topics vectors. I'm comparing the results of vector search with morelike recommendations. WIP code: https://gitlab.wikimedia.org/gmodena/vector_search
- Open question (maybe for the Wednesday meeting?): how do we evaluate results?
- I'm working on building a benchmark dataset of morelike queries using CirrusSearch request logs.
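One possible starting point for the evaluation question above, before any relevance labels exist, is to measure how much the two result lists agree per query. The sketch below is an assumption for illustration only, not the approach used in the PoC: the function names, the choice of Jaccard overlap at k, and the sample page titles are all hypothetical.

```python
from statistics import mean


def jaccard_at_k(results_a, results_b, k=10):
    """Jaccard overlap between the top-k items of two ranked result lists."""
    top_a, top_b = set(results_a[:k]), set(results_b[:k])
    union = top_a | top_b
    if not union:
        # Two empty result lists are treated as identical.
        return 1.0
    return len(top_a & top_b) / len(union)


def mean_overlap(pairs, k=10):
    """Average Jaccard@k over (morelike_results, vector_results) pairs."""
    return mean(jaccard_at_k(a, b, k) for a, b in pairs)


# Hypothetical page titles standing in for results of one benchmark query.
morelike = ["A", "B", "C", "D"]
vector = ["B", "C", "E", "F"]
score = jaccard_at_k(morelike, vector, k=4)  # 2 shared / 6 distinct
```

Overlap only says how different the two systems are, not which one is better, so it would complement rather than replace a judged benchmark.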
Misc / Operations
Done
Elasticsearch -> OpenSearch migration
- T388611 Add version checking to wmf-search opensearch plugins repo
- T386945 The export_queries_to_relforge dag is not compatible with opensearch - The output of this DAG isn't actually used, so we're dropping this DAG.
- T386870 Regression Test OpenSearch Language Analysis
- T380752 Migrate Relforge to Opensearch
Airflow -> k8s migration
WDQS Graph Split
- T388860 Kartotherian should use wdqs-internal-main
- T374021 Make WikibaseQualityConstraints use split-graph query service
Misc / Operations
- T387863 Repartition eqiad|codfw.cirrussearch.update_pipeline.update.v1 topics in kafka-main@eqiad|codfw
- T387865 Repartition eqiad|codfw.mediawiki.cirrussearch.page_weighted_tags_change.v1 topics in kafka-main@eqiad|codfw
- T380572 SUP: Reduce Metrics
- T381909 WCQS updates for miscweb migration to k8s
- T388733 PHP Warning: MediaWiki\Parser\Sanitizer::normalizeWhitespace: Failed to normalize whitespace: 6 [Called from MediaWiki\Parser\Sanitizer::normalizeWhitespace in /srv/mediawiki/php-1.44.0-wmf.20/includes/parser/Sanitizer.php
- T388728 PHP Deprecated: trim(): Passing null to parameter #1 ($string) of type string is deprecated