Help:Toolforge/Elasticsearch

From Wikitech

Elasticsearch is a full-text search system built on Apache Lucene. It can be used to index and search data stored as JSON documents. It is the technology that powers Wikimedia's CirrusSearch system.

An Elasticsearch cluster that can be used by all tools is available on tools-elastic-0[123], on the non-standard port 80. This cluster is a shared resource, and all documents indexed in it can be read by anonymous users from within Toolforge. Write access, which is needed to create new indexes and to store or update documents, requires a username and password.


Read-only access

The Elasticsearch servers allow anyone to read any of the indexes that they contain. This access is limited to other hosts in the Toolforge project (e.g. the OGE job grid, Kubernetes containers, and the bastion servers).

The Elasticsearch service is available on port 80 on the following servers:

  • http://tools-elastic-01.tools.eqiad.wmflabs/
  • http://tools-elastic-02.tools.eqiad.wmflabs/
  • http://tools-elastic-03.tools.eqiad.wmflabs/

Note that the default Elasticsearch port (9200) is not used.
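
As an illustration of read-only access, the following is a minimal Python sketch that queries an index from a host inside Toolforge. It relies on Elasticsearch's URI search (the _search endpoint with a q parameter) and uses only the standard library. The index name mytool_docs is hypothetical, and trying the three servers in turn is an assumption about how a client might handle an unreachable host, not a documented requirement.

```python
import json
import urllib.parse
import urllib.request

# Servers listed above. The service listens on port 80, not 9200,
# so no explicit port is needed in the URL.
HOSTS = [
    "http://tools-elastic-01.tools.eqiad.wmflabs",
    "http://tools-elastic-02.tools.eqiad.wmflabs",
    "http://tools-elastic-03.tools.eqiad.wmflabs",
]


def search_url(host, index, query):
    """Build a URI-search URL for one index on one host."""
    params = urllib.parse.urlencode({"q": query})
    return "{}/{}/_search?{}".format(host, urllib.parse.quote(index, safe=""), params)


def search(index, query, timeout=10):
    """Try each host in turn and return the decoded JSON response.

    Falling back to the next host on error is an assumption made for
    this sketch. No credentials are sent because reads are anonymous.
    """
    last_error = None
    for host in HOSTS:
        try:
            with urllib.request.urlopen(search_url(host, index, query), timeout=timeout) as resp:
                return json.load(resp)
        except OSError as err:  # URLError and HTTPError are OSError subclasses
            last_error = err
    raise last_error


if __name__ == "__main__":
    # "mytool_docs" is a hypothetical index name.
    print(search_url(HOSTS[0], "mytool_docs", "hello world"))
```

Calling search("mytool_docs", "hello world") only works from a host inside the Toolforge project, since access is limited to those hosts.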

Write access

Elasticsearch does not offer multi-tenant access control in its open source version, so write access to the indexes is only loosely protected. PUT, POST, or DELETE requests sent to the Elasticsearch servers require HTTP Basic Authentication using a username and password specific to each tool.
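
The sketch below shows one way a PUT request with HTTP Basic Authentication could be built in Python using only the standard library. The index name, document ID, and credentials are placeholders. The _doc path segment assumes a reasonably recent Elasticsearch version; the version running on this cluster is not stated here, so the URL layout may need adjusting.

```python
import base64
import json
import urllib.request


def build_index_request(host, index, doc_id, document, username, password):
    """Return a PUT request that stores `document` under `doc_id`.

    The "/_doc/" URL layout is an assumption about the cluster's
    Elasticsearch version. The Authorization header is standard
    HTTP Basic Authentication: base64("username:password").
    """
    token = base64.b64encode(
        "{}:{}".format(username, password).encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        "{}/{}/_doc/{}".format(host, index, doc_id),
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    req = build_index_request(
        "http://tools-elastic-01.tools.eqiad.wmflabs",
        "mytool_docs", "1", {"title": "Example"},
        "example-user", "example-password",
    )
    print(req.get_method(), req.full_url)
    # To actually send it from inside Toolforge:
    #   with urllib.request.urlopen(req, timeout=10) as resp:
    #       print(json.load(resp))
```

Building the request separately from sending it makes it easy to inspect the method, URL, and headers before anything is written to the shared cluster.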

Requests for write access are made by filing a Phabricator task. Once credentials have been created, they are placed in /data/project/$TOOL/.elasticsearch.ini. Access requests are currently processed manually and may take a few days to be fulfilled.
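
A tool could load its credentials from that file along the following lines. The layout of .elasticsearch.ini is not described here, so the section name elasticsearch and the keys user and password are assumptions made for this sketch; check the actual file and adjust the arguments to match.

```python
import configparser


def credentials_path(tool):
    """Return the credentials file location for a tool name."""
    return "/data/project/{}/.elasticsearch.ini".format(tool)


def load_credentials(path, section="elasticsearch",
                     user_key="user", password_key="password"):
    """Read (username, password) from an INI file.

    The default section and key names are hypothetical; the real
    file may use different ones.
    """
    # Interpolation is disabled so a "%" in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(path)
    return parser[section][user_key], parser[section][password_key]


if __name__ == "__main__":
    # "mytool" is a placeholder tool name.
    print(credentials_path("mytool"))
```

The returned pair can then be supplied as the username and password for HTTP Basic Authentication on PUT, POST, or DELETE requests.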