Search Platform/Weekly Updates/2024-10-25

From Wikitech

Summary

We made steady progress on several initiatives, though most remain in progress: the release of a public RDF stream, improvements for Japanese wikis, and consolidation of data flows for weighted tags.

What's ongoing

WDQS: Public RDF Stream

Language Stuff

  • Continued analysis and testing of Kuromoji, a Japanese language analyzer; see T318269

Spark Kafka Writer (Weighted tags/Dumps 2.0)

  • Completed the first test runs of the Kafka wrapper with schema validation. It still needs work to simplify usage; see T374341 and T372912

What we've completed

  • T375557 Reindex all wikis to enable folding harmonization and new functionality
  • T372904 Use page_weighted_tags_changed stream
  • T376715 TypeError: Argument 3 passed to CirrusSearch\DataSender::sendWeightedTagsUpdate() must be of the type array, null given, called in /srv/mediawiki/php-1.43.0-wmf.25/extensions/CirrusSearch/includes/Job/ElasticaWrite.php on line
  • T376161 Classify fulltext search abandonment: sampling