Maps/v2/Common tasks
Common maps tasks
Deploy new code
Kartotherian
The service is currently deployed on the Wikikube Kubernetes cluster, so any change follows the same process as any other Kubernetes service.
Tegola
The service is currently deployed on the Wikikube Kubernetes cluster, so any change follows the same process as any other Kubernetes service.
Configuration changes
Tegola and Kartotherian
The tegola and kartotherian configs are maintained using helm in the deployment-charts repo and populated by the helm values per env.
Postgres
Postgres config is maintained using puppet. Here are the relevant sections:
Imposm
OSM import process is configured using puppet. Here are the relevant sections:
EventGate
- Schema definitions
Run a planet import
When running a planet import, make sure that the following checklist is followed:
- Set profile::maps::osm_master::disable_replication_timer in Hiera to true (systemctl status imposm should come up empty)
- Set profile::maps::osm_master::disable_waterlines_import_timer in Hiera to true (the corresponding waterlines systemd timer should likewise be inactive)
- Execute the imposm-initial-import script using the following command (the -d parameter is a YYMMDD date block matching a dump file name on https://planet.osm.org/pbf/)
# change -d accordingly when performing the import with the most updated values
sudo -s
screen -DR
imposm-initial-import -d 210906 -x webproxy.eqiad.wmnet:8080
- Post a log message in IRC #wikimedia-operations
!log maintenance: trigger full planet re-import for maps eqiad
- monitor the full planet import
- Is there any disk space issue?
- Did the script finish properly?
- Are the logs sufficient?
- In case of rollback, run:
sudo -u osmupdater imposm-rollback-import
- The grants for Tegola are configured in the imposm-initial-import script, but mysteriously don't work as part of the initial import. Following the initial import run the following:
sudo -u postgres psql -d gis
And then within the psql shell run:
GRANT SELECT ON ALL TABLES IN SCHEMA public TO tegola;
GRANT SELECT ON ALL TABLES IN SCHEMA import TO tegola;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO tegola;
- Enable the replication service by setting profile::maps::osm_master::disable_replication_timer in Hiera to false. Imposm will start and catch up with all the OSM data published after the imported snapshot. Progress can be followed with journalctl -u imposm.
- Enable the waterlines import by setting profile::maps::osm_master::disable_waterlines_import_timer in Hiera to false
- remove the imposm backup to free disk space; clean up the old backup data with:
imposm-removebackup-import
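The -d value passed to imposm-initial-import above is a YYMMDD date block matching a planet dump file name (e.g. planet-210906.osm.pbf). A small sketch, assuming GNU date, to derive it from a dump date:

```shell
# hypothetical helper: turn a dump date into the YYMMDD block expected by -d
dump_date='2021-09-06'                      # pick a dump that actually exists on planet.osm.org/pbf/
dateblock=$(date -d "$dump_date" +%y%m%d)   # -> 210906
echo "imposm-initial-import -d $dateblock -x webproxy.eqiad.wmnet:8080"
```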
Get a DB shell
> ssh maps10XX.eqiad.wmnet
> sudo -u postgres psql
postgres=# \c gis
Run an example query
Select all landuse geometries included in the bounding box:
- SRID 3857 (the one we use in our layers config)
Box: left=2632692 bottom=4572168 right=2648887 top=4581940
This bounding box is a rectangle that includes the center of Athens, Greece.
gis=# SELECT osm_id, ST_AsMVTGeom(geometry, ST_MakeEnvelope(2632692,4572168,2648887,4581940, 3857)) AS geom FROM layer_landuse(ST_MakeEnvelope(2632692,4572168,2648887,4581940, 3857), 9);
Reindex DB
WARNING: this operation may take up to ~10 hours to complete
From a psql shell, connect to the `gis` DB and run:
gis=# REINDEX DATABASE gis;
Example HTTP requests
Kartotherian
Get the raster tile x=0 y=0 z=0 for source osm-intl
curl https://maps.wikimedia.org/osm-intl/0/0/0.png
Tegola
Tegola is currently only used internally. To run the following examples, SSH to the deployment node (e.g. in eqiad):
> ssh deployment.eqiad.wmnet
To get the endpoint information for different maps and layers:
> curl https://tegola-vector-tiles.svc.eqiad.wmnet:4105/capabilities | jq .
To get the vector tile x=0 y=0 z=0 for map osm:
> curl https://tegola-vector-tiles.svc.eqiad.wmnet:4105/maps/osm/0/0/0.pbf
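The z/x/y path segments in the tile URLs above are standard Web Mercator tile coordinates. A minimal sketch (not part of the maps stack) of how to compute them from latitude/longitude, here for the centre of Athens used in the bounding-box example:

```shell
# hypothetical helper: lat/lon -> z/x/y Web Mercator tile coordinates
lat=37.9838; lon=23.7275; z=9   # centre of Athens
awk -v lat="$lat" -v lon="$lon" -v z="$z" 'BEGIN {
  pi = atan2(0, -1)
  n = 2 ^ z                      # number of tiles per axis at zoom z
  x = int((lon + 180) / 360 * n)
  rad = lat * pi / 180           # degrees -> radians
  # awk has no tan(); use sin/cos
  y = int((1 - log(sin(rad)/cos(rad) + 1/cos(rad)) / pi) / 2 * n)
  printf "%d/%d/%d\n", z, x, y   # -> 9/289/197
}'
```

The resulting tile can then be requested as .../maps/osm/9/289/197.pbf.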
EventGate
Post an event:
curl -X POST -H 'Content-Type: application/json' -d "
{
  \"\$schema\": \"/maps/tile_change/1.0.0\",
  \"meta\": {
    \"dt\": \"$(date --iso-8601=s)\",
    \"stream\": \"maps.tile_expiration\"
  },
  \"tile\": \"1/0/2\",
  \"state\": \"expired\"
}" https://eventgate-main.svc.eqiad.wmnet:4492/v1/events/
Note that the payload is double-quoted so that $(date --iso-8601=s) is expanded by the shell, and that a comma is required after the "dt" field for the JSON to be valid.
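The timestamp in the event payload relies on shell interpolation, which is easy to get wrong (inside single quotes, $(date ...) is not expanded). A quick local sanity check of the generated body, using python3 -m json.tool instead of posting to EventGate:

```shell
# build the event body with the same interpolation as the curl command above
body="
{
  \"\$schema\": \"/maps/tile_change/1.0.0\",
  \"meta\": {
    \"dt\": \"$(date --iso-8601=s)\",
    \"stream\": \"maps.tile_expiration\"
  },
  \"tile\": \"1/0/2\",
  \"state\": \"expired\"
}"
# fails with a parse error if the quoting or the commas are wrong
echo "$body" | python3 -m json.tool >/dev/null && echo 'valid JSON'
```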
Example tegola commands
- Run server
tegola -c </path/to/config> serve
- Cache a tile
tegola -c /path/to/config cache seed tile-name 0/0/0
- Purge a tile from cache
tegola -c /path/to/config cache purge tile-name 0/0/0
Cleanup old Swift buckets
Once a new bucket is created and warmed up, it is good to remove the old data to reduce the load on Swift. Since we use the S3 API, it is convenient to use a tool like s3cmd.
Connect to a host like stat1010 (or any host that offers the command) and create a config file like the following:
[default]
encrypt = false
host_base = thanos-swift.discovery.wmnet
host_bucket = thanos-swift.discovery.wmnet
progress_meter = True
use_https = True
access_key = tegola:prod
secret_key = CHANGE-ME-WITH-THE-REAL-PASSWORD
You can find the real password in Puppet private or on a thanos-fe host (look into /etc). Please remember to chmod the above file appropriately (e.g. chmod 600) to avoid leaking secrets on multi-tenant hosts. Assuming that you named the above file s3-thanos-tegola.s3cmd, execute the following command:
s3cmd -c s3-thanos-tegola.s3cmd del --recursive --debug s3://tegola-swift-codfw-v002 --force
It will take days to complete if the bucket holds around 90M objects, so use a tmux session and leave it running until it finishes. If you want to track progress, you can check the Prometheus metric swift_account_stats_objects_total{account="AUTH_tegola", cluster="thanos"}.
Clean-up Kafka queue
Reset the queue:
kafka-commit-last-message --topic "eqiad.maps.tiles_change" --broker kafka-main1001.eqiad.wmnet:9092 --group-id "poppy-eqiad.maps.tiles_change"
Verify that there are no messages to be consumed from the last offset:
kafka-consume-messages --topic "eqiad.maps.tiles_change" --broker kafka-main1001.eqiad.wmnet:9092 --offset earliest --group-id "poppy-eqiad.maps.tiles_change"
List tiles on Openstack Swift
cd /root
source .tegola_credentials
swift -A $ST_AUTH -U $ST_USER -K $ST_KEY list tegola-swift-container
Load balance maps requests
In the past, in order to perform safe switchovers in maps we had to manually adjust the load balancer configuration for Kartotherian. Nowadays Kartotherian runs on Kubernetes and is able to run in a single data center if needed (for maintenance etc.).
Check tile pregeneration status
Currently the only way we have to check for pregeneration status is through tile pregeneration logs.
Here are the links to logstash:
Codfw is running every day at 00:00
Eqiad is running every day at 12:00
To filter for errors we can just query for "errors".
Warm up the Tegola tiles cache from scratch
There are use cases in which we want to bootstrap the Tegola's tile cache on Swift from scratch, for example after a major upgrade of Postgres to a new version or after a full planet import. These are the overall steps needed:
- Add a new EventGate stream for the maps.tiles_change schema and deploy its MediaWiki config (there is a recent example change that can be used as reference). Please use a meaningful name, which will eventually correspond to the Kafka topic containing the events.
- Depool the target datacenter from traffic. This includes the related traffic handled by Kartotherian, to completely drain the DC of maps traffic.
- Update Tegola's k8s configuration to read from the new stream/Kafka topic. Tegola runs CronJobs in k8s that, every day at the same hour, pull new events from a target Kafka topic and act upon them (for example, generate a new tile and store it on Swift).
- In Puppet we have a script called bootstrap-tiles-storage.sh that reads from a Swift bucket and generates EventGate events to the right stream/topic. This in turn will cause the new tiles to be generated by the aforementioned CronJobs. Please note that you need to configure two important bits: the source Swift bucket and the target EventGate stream name. The former can also be the data from the other DC, if considered more up-to-date and complete, while the latter needs to be the new stream created during the first step.
- Wait a couple of days, since the events to generate are likely in the order of millions.
- Verify that the Swift bucket contains the number of tiles that you expect.