Scap

From Wikitech
thcipriani@tin:~$ scap fortune | scap say
 ------------------------------------------------
/                                                \
|   S.C.A.P.: scatter crap around production    |
\                                                /
 ------------------------------------------------
    \
     \
      \
           ___ ____
         ⎛   ⎛ ,----
          \  //==--'
     _//|,.·//==--'    ____________________________
    _OO≣=-  ︶ ᴹw ⎞_§ ______  ___\ ___\ ,\__ \/ __ \
   (∞)_, )  (     |  ______/__  \/ /__ / /_/ / /_/ /
     ¨--¨|| |- (  / ______\____/ \___/ \__^_/  .__/
         ««_/  «_/ jgs/bd808                /_/

Scap is the Wikimedia deployment tool. Before 2004, code was loaded from an NFS mount ("insta-deployment") on the Apache servers. Scap started life around 2004 as a series of shell scripts and was ported from Bash to Python in 2013. Starting in late 2015 it was extended to deploy software other than MediaWiki. Previously, other software was deployed with a Salt-based git deployment tool called Trebuchet.

Scap is used for:

Usage

As of July 2017, there are two sets of operations that can be performed by scap:

  1. MediaWiki deployments (including backports)
  2. Other software deployments

MediaWiki Deployments

Basic commands.

scap lock

This is a deployment-server-script!
Holds a lock open for the given repository
  • Can be used to stop other deploys. Specify --all to lock all repositories, e.g. scap lock --all "incident in progress, blocking deploys Txxxxx"
  • When a lock is placed, scap lock continues running in the foreground. Once Enter is pressed, the lock is removed. If this does not work (or the command is interrupted, e.g. because the console disconnected), you can run the command again with --unlock-all to release the global lock set with --all.
  • To check if there are any locks in place try to lock all repositories with --all. If it fails due to a lock, it will show you any existing locks. Remember to release the global lock afterwards:
scap-deploy@deploy:~$ scap lock --all 'setting global lock'
12:36:26 lock-manager is locked by scap-deploy (pid 3825) on Fri Apr 28 12:36:23 2023; reason is "Jenkins deployment".
Will wait up to 10 minute(s) for the lock(s) to be released.
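
The locking behaviour described above (a lock that records who holds it and why, and that reports the holder when a second deployer tries to take it) can be illustrated with a minimal Python sketch. This is not Scap's implementation: the lock-file layout and the names acquire_lock, release_lock and LockHeld are assumptions made purely for illustration.

```python
import json
import os
import time


class LockHeld(Exception):
    """Raised when another process already holds the lock."""


def acquire_lock(path, user, reason):
    """Create the lock file atomically; fail if it already exists."""
    info = {"user": user, "pid": os.getpid(), "time": time.ctime(), "reason": reason}
    try:
        # O_EXCL makes creation fail if the file is already present.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        # Report who holds the lock and why (hypothetical file layout).
        with open(path) as f:
            holder = json.load(f)
        raise LockHeld(
            '%s is locked by %s (pid %d); reason is "%s"'
            % (path, holder["user"], holder["pid"], holder["reason"])
        )
    with os.fdopen(fd, "w") as f:
        json.dump(info, f)
    return info


def release_lock(path):
    """Remove the lock file so other deploys can proceed."""
    os.remove(path)
```

Creating the file with O_EXCL means two deployers cannot both believe they hold the lock, which is the property the lock command relies on in this sketch.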

scap pull

Performs the action only on the one server on which the command is run
  • syncs /srv/mediawiki-staging from a deployment rsync server -> /srv/mediawiki on the local server
  • This action was formerly run as sync-common

scap sync-file

This is an all-script!
  • for a single file
  • checks PHP syntax/validates JSON files
  • builds new MediaWiki image
  • syncs /srv/mediawiki-staging/(some file) -> /srv/mediawiki
  • Deploys MediaWiki image to Kubernetes via helm
  • Depools, restarts PHP, and repools all non-Kubernetes targets

scap sync-wikiversions

This is an all-script!
  • Compiles /srv/mediawiki-staging/wikiversions.php from /srv/mediawiki-staging/wikiversions.json
  • Ensures each version exists and has l10n
  • Creates new MediaWiki image
  • syncs /srv/mediawiki-staging/wikiversions.{json,cdb} -> /srv/mediawiki
  • Deploys MediaWiki image to Kubernetes via helm
  • Depools, restarts PHP, and repools all non-Kubernetes targets
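
The first step of scap sync-wikiversions, compiling wikiversions.php from wikiversions.json, can be sketched in Python as follows. This is a rough illustration rather than Scap's own code: the function name compile_wikiversions, the exact shape of the generated PHP, and the assumption that the JSON file is a flat map from wiki database name to version directory are all hypothetical.

```python
import json


def compile_wikiversions(json_path, php_path):
    """Write a PHP file returning the wiki -> version map read from JSON."""
    with open(json_path) as f:
        # Assumed layout: {"<dbname>": "<version directory>", ...}
        versions = json.load(f)

    lines = ["<?php", "return ["]
    for dbname in sorted(versions):
        lines.append("\t'%s' => '%s'," % (dbname, versions[dbname]))
    lines.append("];")

    with open(php_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    # Return the number of wikis written, handy for a sanity check.
    return len(versions)
```

Generating a PHP file that simply returns an array lets the web servers load the map with a plain include instead of parsing JSON on every request; whether Scap emits exactly this form is an assumption of the sketch.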

scap sync-world

A diagram of wikimedia 'scap' deployment tool
This is an all-script!
"sync-common-all-php"
  • lints PHP files in ./wmf-config, ./multiversion
  • Sync deploy directory (/srv/mediawiki-staging) on localhost with staging area (/srv/mediawiki)
  • Compiles /srv/mediawiki/wikiversions[-realm].{php,json} on the deployment host
  • for all mediawiki versions currently deployed (usually 2) rebuilds localization caches of core and extensions:
    • creates l10n cache directory /srv/mediawiki-staging/php-[version]/cache/l10n
    • generates /srv/mediawiki-staging/wmf-config/ExtensionMessages-[version].php via MergeMessageFileList.php for each version
    • in parallel call RebuildLocalisationCache.php for each version to generate cdb files in l10n cache directory
    • Calls sudo -u l10nupdate scap cdb-json-refresh --directory=[version l10n cache directory]
      • creates l10n-cache upstream directory: /srv/mediawiki-staging/php-[version]/cache/l10n/upstream
      • reads all cdb-files in l10n-cache directory
      • In parallel:
        • Generate md5 of cdb file
        • If the cdb md5 does not match the existing md5 on disk, regenerate the upstream json from cdb
        • Write the json file, write the md5 file
        • If a json file becomes corrupted, delete the md5 file and re-run cdb-json-refresh as l10nupdate
  • Creates new MediaWiki image
  • runs scap pull on co-master servers to update /srv/mediawiki-staging
  • runs scap pull on rsync fanout servers to update /srv/mediawiki
  • runs scap pull on "all" servers to update /srv/mediawiki from nearest rsync proxy
  • rebuilds localization caches on "all" servers from json files that were synced by scap pull
    • rsync is not the optimal tool for cdb files, so json is synced and cdb files are built from those
  • compiles wikiversions.json to wikiversions.cdb on localhost
  • Deploys MediaWiki image to Kubernetes via helm
  • runs scap sync-wikiversions
  • Depools, restarts PHP, and repools all non-Kubernetes targets
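
The md5 comparison performed by the cdb-json-refresh step in the list above can be sketched like this. It is a simplified illustration under stated assumptions, not Scap's implementation: the ".json" and ".MD5" file suffixes, the function names, and the read_cdb callback (a stand-in for a real cdb reader) are hypothetical.

```python
import hashlib
import json
import os


def file_md5(path):
    """Return the hex md5 digest of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_json(cdb_path, upstream_dir, read_cdb):
    """Regenerate the upstream JSON for one cdb file if its md5 changed.

    read_cdb is a caller-supplied (hypothetical) function that turns a
    cdb file into a dict. Returns True when the JSON was rewritten.
    """
    os.makedirs(upstream_dir, exist_ok=True)
    base = os.path.basename(cdb_path)
    json_path = os.path.join(upstream_dir, base + ".json")
    md5_path = os.path.join(upstream_dir, base + ".MD5")

    current = file_md5(cdb_path)
    if os.path.exists(md5_path) and os.path.exists(json_path):
        with open(md5_path) as f:
            if f.read().strip() == current:
                return False  # unchanged: keep the existing JSON

    # Write the JSON first, then the md5 that marks it as up to date.
    with open(json_path, "w") as f:
        json.dump(read_cdb(cdb_path), f, sort_keys=True)
    with open(md5_path, "w") as f:
        f.write(current)
    return True
```

Because a missing md5 file always forces regeneration, deleting it and re-running the refresh is enough to recover from a corrupted JSON file, which matches the recovery step described in the list.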

The 3.14.0 release of Scap (current version is 3.15.0) added the --canary-wait-time option to the sync-world subcommand. It sets how long Scap waits for code to run on the canary servers. The default is 20 seconds; you can now adjust it if that is too long or too short.

As usual, please use it carefully. Don't shorten the wait just to make your deployment go faster.

Backport Deployments

Backport windows are small code deployment windows that happen 3 times per day in #wikimedia-operations on libera.chat.

scap backport sequence diagram

scap backport

Handles one or more backports
  • Use the --list flag to list available backports
  • Merges the patches supplied to the backport command if unmerged
  • Deploys to testservers and waits for user confirmation
  • Runs scap sync-world

scap backport --revert

Handles one or more reverts
  • Use the --list flag to list available backports to revert
  • Creates one or more revert patches and pushes to gerrit
  • Runs scap backport on the revert patches

More detailed instructions are available on the page that describes how to run a backport deployment.

Other software deployments

Scap is capable of deploying software other than MediaWiki. These commands are meant to be run from the active deployment server in the directory for the repo you are deploying. These repos are all sub-directories of /srv/deployment. For example, ORES is deployed from the directory /srv/deployment/ores/deploy.

The primary documentation for scap3 is available as auto-generated HTML docs; see the scap documentation index.

scap deploy

This is a deployment-server-script!
  • Uses git to sync deployment-server:/srv/deployment/[repo] -> targets:/srv/deployment/[repo]
  • Syncs in 5 separate stages
    1. config-deploy (optional/requires configuration) - uses templates in the /srv/deployment/[repo]/scap directory to deploy a configuration file on the targets
    2. fetch - Uses git fetch to fetch code from the deployment-server to a local cache repo (for example: target:/srv/deployment/ores/deploy-cache/cache)
    3. promote
      • Uses git clone to create a checkout of the current repo in the state it's in on the deployment-server (for example: it will make a checkout in target:/srv/deployment/ores/deploy-cache/revs/xxxxxxxxxxxxxxxxxxxxx)
      • Creates symlink on target from revision to final location (for example: it will make a symlink target:/srv/deployment/ores/deploy-cache/revs/xxxxxxxxxxxxxxxxxxxxx -> target:/srv/deployment/ores/deploy)
    4. service_restart (optional) - if a service_name is specified in scap/scap.cfg that service will be restarted
    5. finalize - removes files on targets used for rollback.
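
The promote stage described above, which makes the final location a symlink to a per-revision checkout, can be sketched in Python as follows. This is an illustration, not Scap's code: the function name promote and the temporary-link-then-rename approach are assumptions.

```python
import os


def promote(revs_dir, rev, final_path):
    """Point final_path at revs_dir/rev, replacing any previous symlink."""
    target = os.path.join(revs_dir, rev)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)

    # Build the new link under a temporary name first.
    tmp_link = final_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)

    # os.replace renames atomically on POSIX, so readers never see a
    # missing or half-updated link.
    os.replace(tmp_link, final_path)
    return target
```

Keeping each revision in its own directory and only switching a symlink is what makes rollback cheap: the previous checkout stays on disk until the finalize stage removes it.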

Usage

usage: scap deploy [-h] [-v] [--environment ENVIRONMENT]
                   [-r REV] [-l LIMIT_HOSTS] [-f] [--dry-run]
                   [--service-restart] [-i] [message [message ...]]

Sync new service code across cluster.

positional arguments:
message               Log message for SAL

optional arguments:
  -h, --help            show this help message and exit
  -r REV, --rev REV     Revision to deploy
  -l LIMIT_HOSTS, --limit-hosts LIMIT_HOSTS
                        Limit deploy to hosts matching expression
  -f, --force           force re-fetch and checkout
  --dry-run             Compile and deploy config files to a temp location and
                        output a diff against the previously deployed config
                        files.
  --service-restart     Restart service
  -i, --init            Setup a repo for initial deployment

Examples

# show verbose output:
scap deploy -v "Updating repo"
# deploy using configuration inside scap/environments/staging/scap.cfg
scap deploy --environment staging
# deploy only to the scap-target-01 host
scap deploy -l 'scap-target-01'
# deploy only to the 5 hosts [scap-target-01..scap-target-05]
scap deploy -l 'scap-target-[01:05]'
# deploy to all hosts EXCEPT those in eqiad
scap deploy -l '!scap-target-*.eqiad.wmnet'
deployer@tin:/srv/deployment/foo/deploy$ scap deploy
20:46:12 Started deploy_foo/deploy
Entering 'foo'
20:46:12
== DEFAULT ==
:* scap-target-07
:* scap-target-08
:* scap-target-09
:* scap-target-04
:* scap-target-05
:* scap-target-06
:* scap-target-10
:* scap-target-01
:* scap-target-02
:* scap-target-03
deploy_foo/deploy_config_deploy: 100% (ok: 10; fail: 0; left: 0)
deploy_foo/deploy_fetch: 100% (ok: 10; fail: 0; left: 0)
deploy_foo/deploy_promote: 100% (ok: 10; fail: 0; left: 0)
20:46:42 Finished deploy_foo/deploy (duration: 00m 29s)
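
The -l / --limit-hosts expressions used in the examples above (an exact name, a numeric range such as [01:05], a glob, and a leading ! for negation) can be modelled with a short Python sketch. It only covers those four forms and is not Scap's actual matcher; the function names expand_range and limit_hosts are hypothetical.

```python
import fnmatch
import re

_RANGE = re.compile(r"\[(\d+):(\d+)\]")


def expand_range(pattern):
    """Expand one 'name-[01:05]' range into explicit names, keeping zero padding."""
    match = _RANGE.search(pattern)
    if not match:
        return [pattern]
    start, end = match.group(1), match.group(2)
    width = len(start)
    return [
        pattern[: match.start()] + str(n).zfill(width) + pattern[match.end():]
        for n in range(int(start), int(end) + 1)
    ]


def limit_hosts(expression, hosts):
    """Return the hosts selected by a limit expression.

    Supports an exact name, a shell-style glob, one [a:b] numeric range,
    and a leading '!' that inverts the selection.
    """
    negate = expression.startswith("!")
    if negate:
        expression = expression[1:]
    patterns = expand_range(expression)
    matched = [
        h for h in hosts
        if any(fnmatch.fnmatchcase(h, p) for p in patterns)
    ]
    if negate:
        return [h for h in hosts if h not in matched]
    return matched


hosts = ["scap-target-%02d" % i for i in range(1, 11)]
print(limit_hosts("scap-target-[01:05]", hosts))
```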

scap deploy is capable of:

scap deploy --service-restart

This is a deployment-server-script!
  • If a service_name is defined in scap/scap.cfg, the --service-restart option can be used to restart the service on all targets

Usage

deployer@tin:/srv/deployment/foo/deploy$ scap deploy --service-restart -v "Restarting for Upgrade"
17:17:30 Started restart [foo_deploy/deploy@61368b4]
17:17:30 Deploying Rev: 61368b4253c9f6a5d2d62d482319d644b31791f0
17:17:30 Prepare config deploy
17:17:30 Config deploy file: /srv/deployment/foo/deploy/scap/config-files.yaml
17:17:30 Update DEPLOY_HEAD
17:17:30 Creating /srv/deployment/foo/deploy/.git/DEPLOY_HEAD
17:17:30 Update server info
Entering 'foo'
17:17:30 Started restart [foo/deploy@61368b4]: Restarting for Upgrade
17:17:30
== DEFAULT ==
:* scap-target-07
:* scap-target-08
:* scap-target-09
:* scap-target-04
:* scap-target-05
:* scap-target-06
:* scap-target-10
:* scap-target-01
:* scap-target-02
:* scap-target-03
17:17:30 Running remote deploy cmd ['scap', 'deploy-local', '-v', '--repo', 'foo/deploy', '-g', 'default', 'restart_service', '--refresh-config']
foo/deploy: restart_service stage(s): 100% (ok: 10; fail: 0; left: 0)
17:17:44
== DEFAULT ==
:* scap-target-07
:* scap-target-08
:* scap-target-09
:* scap-target-04
:* scap-target-05
:* scap-target-06
:* scap-target-10
:* scap-target-01
:* scap-target-02
:* scap-target-03
17:17:44 Running remote deploy cmd ['scap', 'deploy-local', '-v', '--repo', 'foo/deploy', '-g', 'default', 'finalize', '--refresh-config']
foo/deploy: finalize stage(s): 100% (ok: 10; fail: 0; left: 0)
17:17:55 Finished restart [foo/deploy@61368b4] (duration: 00m 24s)

scap deploy-log

scap deploy-log running
This is a deployment-server-script!

The scap deploy-log command provides powerful filters for the scap deploy logs.

deploy-log is meant to run during or after a deploy, potentially in a separate terminal. Log entries can be filtered on one or more fields using a given free-form expression. By default scap deploy-log will periodically scan the scap/log directory for new files and immediately begin tailing any newly discovered log file.

To learn more about deploy-log see info about structured logging.

Usage

scap deploy-log [--file <file>] [--latest] [-v] [expr]

-f <file>, --file <file>

    Used to explicitly specify the log file to be parsed. If no file is specified then deploy-log will automatically open any newly created log files and immediately begin outputting any matching log messages.

-l, --latest

    Parse and filter the latest log file, exiting once the entire file has been processed.

-v, --verbose

    Produce verbose output

expr

    Optional filter expression which is used to match log entries in <file>

Examples

# show verbose output:
scap deploy-log -v
# parse the most recent log file, then exit:
scap deploy-log --latest
# show log messages for the host named scap-target-01
scap deploy-log 'host == scap-target-01'
# show log messages matching a regex pattern:
scap deploy-log 'msg ~ "some important (message|msg)"'
# show messages at WARNING level or above for hosts whose name begins with "scap-target-"
scap deploy-log 'levelno >= WARNING host == scap-target-*'
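
To make the filter examples above more concrete, here is a minimal Python sketch of how such conditions could be evaluated against structured log records. It is not the deploy-log parser: it skips expression parsing entirely and takes conditions as pre-parsed (field, operator, value) tuples, and the record layout and the function name matches are assumptions.

```python
import fnmatch
import logging
import re


def matches(record, conditions):
    """Return True if a log record (a dict) satisfies every condition.

    Each condition is a (field, operator, value) tuple, a pre-parsed
    stand-in for an expression such as 'host == scap-target-*'.
    """
    for field, op, value in conditions:
        actual = record.get(field)
        if actual is None:
            return False
        if op == "==":
            # Treat the value as a shell-style glob, as in 'scap-target-*'.
            ok = fnmatch.fnmatchcase(str(actual), str(value))
        elif op == "~":
            ok = re.search(value, str(actual)) is not None
        elif op == ">=":
            ok = actual >= value
        else:
            raise ValueError("unsupported operator: %r" % op)
        if not ok:
            return False
    return True


# Hypothetical record, mirroring the fields used in the examples.
record = {"host": "scap-target-01", "levelno": logging.WARNING, "msg": "some important msg"}
print(matches(record, [("levelno", ">=", logging.WARNING), ("host", "==", "scap-target-*")]))
```

All conditions are combined with a logical AND here, which mirrors how the last example lists two conditions side by side.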

Deployment-server-script

A deployment-server-script is a script that is meant to be run from the active deployment server that operates on target boxes via ssh.

All-script

An all-script is a deployment-server-script that operates on mediawiki-installation boxes via ssh. These scripts perform the action on all (relevant) servers.

The specific servers include:

Wholesome Fun

bd808@silver:~$ scap fortune | scap say
 ------------------------------------------------
/                                                \
|     S.C.A.P.: someone can always pontificate    |
\                                                /
 ------------------------------------------------
    \
     \
      \
           ___ ____
         ⎛   ⎛ ,----
          \  //==--'
     _//|,.·//==--'    ____________________________
    _OO≣=-  ︶ ᴹw ⎞_§ ______  ___\ ___\ ,\__ \/ __ \
   (∞)_, )  (     |  ______/__  \/ /__ / /_/ / /_/ /
     ¨--¨|| |- (  / ______\____/ \___/ \__^_/  .__/
         ««_/  «_/ jgs/bd808                /_/

xkcd comic #2565, titled "Latency", depicts typical process latency as a graph: automated processes of 800ms at the beginning and at the end, but in the middle someone copies and pastes data from a thing into another thing, which takes 2 to 15 minutes (more if the person on call is busy).

SCAPDFATIAT

https://xkcd.com/2565/

Make a new Scap release

Scap/Release