Help:Toolforge/Performance

PHP webservice running OOM with large MySQL result sets

The issue

When fetching large MySQL result sets, a PHP webservice can run into out-of-memory (OOM) errors. The default limit for webservices is currently 4 GB vmem, but that limit is not what matters here.

Question: What is causing PHP to consume this much memory, and what can be done about it?

Hypothesis: SQL result sets are too large. To save PHP memory, it's better to use unbuffered queries to retrieve one data row after another from the server, instead of fetching all at once into client memory (= buffered = default for plain mysqli queries).

Conclusion

The high memory consumption of PHP is mainly caused by the generated array, not by the retrieved SQL result set.
Enforcing the use of unbuffered queries doesn't pay off as long as the results are stored in variables. Memory usage just moves from memory (buffer) to memory (variable).
As an alternative, an unbuffered query can be processed row by row without storing the rows. But this will slow things down considerably.
Buffering is a way to execute SQL queries quickly and free up resources on the MySQL server, as the server is much faster than the client.
The initial difference between a normal mysql query and a prepared statement is caused by the way datatypes are retrieved. With a normal mysql query, the MySQL server converts all SQL datatypes to strings before sending them to the client, while a prepared statement tries to keep the SQL datatypes (as long as they map to PHP datatypes). After casting the variables, this difference is gone.

* Include only columns you really need
* Reduce dimensions/size of storage array(s)
* Cast variables to numeric type (if applicable)

Test setup

randomly chosen editor with 107,803 edits on dewiki
result stored in 2-dimensional PHP array
script as shown below

Test results

size of query result exported as plain file: 11 MB

Memory usage with different modifications
Test modifications	normal mysql query	prepared statement
buffered (store_result)	161.00 MB	155.25 MB
unbuffered (use_result)	161.00 MB	155.25 MB

casting variables in PHP result array ¹	148.00 MB	148.00 MB
reducing result array to 1 dimension ²	38.25 MB	38.25 MB
no array as result, but one large string ³	11.25 MB	11.25 MB

adding a column with blank content ⁴	187.50 MB	181.75 MB

¹ Modification: casting suitable variables to number type integer

  $foo[] = array(
	(int)$row->rev_timestamp,
	$row->page_title,
	(int)$row->page_namespace,
	$row->rev_comment,
    );

² Modification: reducing result array to 1 dimension

  $foo[] =  $row->rev_timestamp . $row->page_title .
	    $row->page_namespace . $row->rev_comment;

³ Modification: no array as result, but one large string

  $foo .=  $row->rev_timestamp . $row->page_title .
	    $row->page_namespace . $row->rev_comment;

⁴ Modification: adding a column with blank content

   SELECT  UNIX_TIMESTAMP(rev_timestamp) as rev_timestamp, page_title, page_namespace, rev_comment, '' as blank_column
  
   $foo[] = array(
	$row->rev_timestamp,
	$row->page_title,
	$row->page_namespace,
	$row->rev_comment,
	$row->blank_column,
     );

Script

For simplicity, all error handling has been removed.

<?php 

// Set memory limit ( default = 128 MB )
   ini_set("memory_limit", "512M" );


// Read DB credentials
   $inifile = "../../replica.my.cnf";
   $iniVal = parse_ini_file($inifile);
   $dbUser = $iniVal["user"];
   $dbPwd  = $iniVal["password"];
   unset($iniVal);


// Create new mysqli Object
   $mysqli = new mysqli("s5.labsdb",$dbUser, $dbPwd, "dewiki_p");
   $mysqli->set_charset("utf8");


// Define the SQL query
   $query = "
	SELECT  UNIX_TIMESTAMP(rev_timestamp) as rev_timestamp,  page_title, page_namespace, rev_comment
	FROM revision_userindex 
	JOIN page ON page_id = rev_page 
	WHERE rev_user = '226562'
	ORDER BY rev_timestamp ASC
   ";

// Get memory usage before query
   $m1 = memory_get_usage(true);


// The alternative calls. Run exactly one per test; the others stay commented out.
   $ff = sqli_query( $mysqli, $query, MYSQLI_USE_RESULT );
//   $ff = sqli_query( $mysqli, $query, MYSQLI_STORE_RESULT );
//   $ff = sqli_stmt( $mysqli, $query, false );
//   $ff = sqli_stmt( $mysqli, $query, true );


// Get memory usage after query
   $m2 = memory_get_usage(true);


// Output the results
   printf("\n num: ".count($ff)." records \n\n");
   printf("$m1 \n $m2");


// Functions

// A. Plain vanilla mysqli query
   function sqli_query( $mysqli, $query, $mode ){
	$foo = array();  // initialise, so an empty result still returns an array
	if ( $result = $mysqli->query( $query, $mode) ){
		while( $row = $result->fetch_object() ){
			$foo[] = array(
				$row->rev_timestamp,
				$row->page_title,
				$row->page_namespace,
				$row->rev_comment,
			   );	
		}
		$result->close();
	}
	
	return $foo;
   }


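// B. prepared statement ($mode: true = buffered via store_result, false = unbuffered)
   function sqli_stmt( $mysqli, $query, $mode ){
	$foo = array();  // initialise, so an empty result still returns an array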
	if ( $stmt = $mysqli->prepare( $query ) ){
#		$stmt->bind_param('i', $user_id );  // in this example no param to bind
		$stmt->execute();
		$stmt->bind_result($rev_timestamp, $page_title, $page_namespace, $rev_comment);
		if ( $mode ) {
		    $stmt->store_result();
		}
		while( $stmt->fetch() ){
			$foo[] = array(
				$rev_timestamp,
				$page_title,
				$page_namespace,
				$rev_comment
			   );
		}
		$stmt->free_result();
		$stmt->close();
	}

	return $foo;
   }


Webservice stalling, OOM or otherwise unresponsive

The issue

The goal is to provide a high-performance, reliable webservice, since in most cases it is the business card of Toolforge, its tool authors and their services, and last but not least the main interface to the customers, the Wikipedia community.
A second goal is to provide an easy-to-use standard webservice that is at the same time highly configurable by users.

Lighttpd is a high-performance, low-footprint webserver designed to serve 100+ requests per second with ease. It was all the more surprising to see that, after the switch from Apache, many webservices suddenly suffered from similar problems, such as:

high memory footprints (vmem) and unexplained OOMs
stalled services (lighttpd running, but no response, mostly due to dead or disabled backends)

To put that in proportion: many tools serve below 1 /sec on average and failed at rates of ~5-10 /sec, mostly due to spider access. Admittedly, there have always been some, let's say, non-optimally written tools which exhausted their resources (and in the old Apache environment the resources of all tools), but the accumulation of very similar problems made it necessary to dig deeper.

Note: Lighty has been a new technology for Labs since the migration to eqiad. With this in mind, these lines are not intended to blame anyone, but to improve things for the sake of all.

Conclusion

Improve code (always do that)
Change lighttpd default settings. See: #Proposed changes for default config

Proposed configuration settings (except proxy_connect_timeout) have been successfully tested with several highly used tools (catscan2, wikihistory, xtools, zoomviewer) - currently running as lighty-xxx webservices.

Definitions & figures

Lighttpd: lighttpd is a single-threaded, single-process, event-based, non-blocking-IO web server.
CGI / FastCGI: (Fast) Common Gateway Interface. Methods used to generate dynamic content on web pages. The main difference: FastCGI uses persistent processes to handle a series of requests, while CGI creates new process(es) on each invocation. Referred to in this text as cgi and fcgi, respectively.
Worker / children: There are two kinds of workers in this context: lighttpd-workers and fcgi-workers (= fcgi-children). They must not be confused. As their names imply, these sub-processes do the "real" work, while the corresponding parent process is in control (queueing, respawning).
Backend: In terms of fcgi, the unit of controller and worker processes. Each backend has its own cache.
Parameter: Parameters are set either admin-side (default config) or user-side (.lighttpd.conf). A parameter can only be set once (=) or added to a parameter array (+=). It cannot be overwritten, unset or removed from a parameter array.[1] Trying to do so will result in an error like: (configfile.c.912) source: /var/run/lighttpd/newwebtest.conf line: 592 pos: 1 parser failed somehow near here: (EOL).
(As an aside: the high line number is caused by the start script, which adds ~500 lines of mime-type definitions).

Whenever possible (security, goals), parameters should not be set in the default config, to allow customization in .lighttpd.conf.

Respawning: A controlled event where an old process gets terminated and replaced by a new one. This greatly helps to clean things up: all connections and references (possibly still open from previous errors) are terminated and memory gets freed. Things start from scratch.
Memory consumption:

1 lighty process takes about ~100 MB vmem, constantly (which is negligible)
1 php process takes ~280 MB vmem, idle. A current standard config with 4 php-fcgi workers takes ~1.6 GB vmem to start. Even if this is shared or ballooned in some way, it is important as it counts towards your maxvmem_limit on the grid engine.
Almost every fcgi process is leaking memory (especially php, but also python). If respawning doesn't work correctly (wrong event signaling, unsuitable max-requests parameters or multiple lighty processes (see below)), things eventually pile up until the h_vmem threshold is exceeded (=> killed by grid) or until backends stall.
Plain cgi naturally doesn't suffer much from leaking issues, as every call ends in process destruction, but it causes high latency and overhead for process creation on each call.

Visualization of an example configuration:
(server.max-worker = 3, fcgi: "max-procs" => 2, "PHP_FCGI_CHILDREN" => "2") (not optimal, but for demonstration)

#pstree
lighttpd(7928)─┬─lighttpd(8002)				# lighttpd main process with 3 lighttpd sub-processes (=workers)
               ├─lighttpd(8003)
               ├─lighttpd(8004)
               ├─php-cgi(7984)─┬─php-cgi(7994)		# php-fcgi process with 2 sub-processes (=workers) (=backend 0)
               │               └─php-cgi(7995)
               └─php-cgi(7998)─┬─php-cgi(8000)		# php-fcgi process with 2 sub-processes (=workers) (=backend 1)
                               └─php-cgi(8001)		

#server-statistics
fastcgi.active-requests: 0				# number of currently active requests for all backends
fastcgi.backend.PHPMAIN.0.connected: 1			# total number of requests served from this backend
fastcgi.backend.PHPMAIN.0.died: 0			# number of times this backend has died and was respawned
fastcgi.backend.PHPMAIN.0.disabled: 0			# currently disabled; see disable-time
fastcgi.backend.PHPMAIN.0.load: 0			# number of currently active requests for this backend
fastcgi.backend.PHPMAIN.0.overloaded: 0			# backend could not recover without restart
fastcgi.backend.PHPMAIN.1.connected: 28
fastcgi.backend.PHPMAIN.1.died: 0
fastcgi.backend.PHPMAIN.1.disabled: 0
fastcgi.backend.PHPMAIN.1.load: 0
fastcgi.backend.PHPMAIN.1.overloaded: 0
fastcgi.backend.PHPMAIN.load: 0
fastcgi.requests: 29					# total number of requests served from all backends

Proposed changes for default config

Proxy

#	param	current	new	nginx default	docs
1.	proxy_connect_timeout	–	5	60	[1]

If the proxy can't reach lighttpd within 5 sec (it usually takes a few ms), the target lighttpd service is considered stalled or overloaded. The connection should be terminated so as not to add to this overload.
Also, if the target service is stalled, it is a relief to receive a quick response, especially if the service is called from within Wikipedia. It's a pain to wait for a 60 sec timeout while the script/page keeps loading.

Beyond that, if this error is monitored, it'll give a good picture of services that potentially need optimization.

Should be tested with backup/test proxy.

Lighttpd

(default $workers = 5)

#	param	current	new	redmine default	docs	remark / consider
1.	h_vmem	4 GB	7 GB	–		webstart script
2.	server.max-connections	$[$workers*4]	~250-500	1024	[2]
3.	server.max-keep-alive-idle	60	5	5	[3]
4.	server.event-handler	–	"linux-sysepoll"	poll (unix generic)	[4]
5.	server.network-backend	–	–	"linux-sendfile"	[5]	"linux-aio-sendfile" async-io, requires libaio
6.	server.max-worker	$workers	–	0	[6]

1. h_vmem: It's an occasionally reached peak limit. The vast majority of tools run constantly far below this limit, but the current 4 GB limit (of which ~1.6 GB is taken by the idle standard config) repeatedly causes annoyance and prevents some tools from using the normal lighttpd config.
2. server.max-connections: The upper limit for file descriptors for all services per web-server host (hard limit) is currently 4096. So 250+ (~25 parallel requests à 10 files) should be feasible.
3. server.max-keep-alive-idle: Reducing this to the default of 5 sec frees up used file descriptors in time. Keep-alive idle is the state where all requests in this connection have been handled. There is no need to wait a further 60 seconds.
4. server.event-handler: Recommended event handler for Linux 2.6+.
5. server.network-backend: "linux-aio-sendfile" should be considered, if feasible.
6. server.max-worker: Should be removed completely!
   1. The lighttpd process serves static (and almost completely cached) content (html, css etc.) and offloads the crucial part of the work to cgi/fcgi handlers (= vast majority of use cases). There is no real gain in forking this process if you are running a dynamic website (php, python, perl etc.). These lighttpd-workers can't work in a synced parallel manner, as they have to wait for and/or put arbitrary load on the cgi/fcgi processes. On the other hand, you get massive disadvantages:
   2. If you use server.max-worker, the mod_status module will not show the statistics of all the server's children combined. You will see different stats with each request. Other modules are affected by this, too; every worker works for his own and there is no communication between them. Your log files may get screwed. - This is all true. (see full text in docs)
   3. Respawning gets erratic and flawed. See #Respawning with one lighttpd process vs. multiple lighttpd processes
   4. There might be rare use cases for this (high load on a plain static webservice), but the vast majority of (almost all) tools serve dynamic content, and load balancing across vCPUs is done by cgi/fcgi.

Default php-fcgi

#	param	current	new	recomm.	docs	remark
1.	default php-fcgi	mandatory	optional	–
2.	"max-procs"	1	2	1	[7]
3.	"PHP_FCGI_CHILDREN"	4	2	4	[8]	= php-fcgi worker processes
4.	"PHP_FCGI_MAX_REQUESTS"	10000	500	500	[9]	parameter for respawning; not to confuse with max. parallel request or similar

1. Make the default php-fcgi handler optional for those who are more experienced and either want to run their own PHP handler or don't need PHP at all (saves ~1.6 GB vmem). If you use the lighttpd-plain webservice type, the PHP handler will not be loaded.
2. Set up a second backend as load balancer. If one backend dies, it switches over. Currently, with only one backend, there's nothing to switch over to; messages like the ones below occur in a row and the webservice is stalled.

2014-07-21 20:53:30: (mod_fastcgi.c.3001) backend is overloaded; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 79

2014-07-21 20:53:31: (mod_fastcgi.c.3001) backend is overloaded; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 51

3. Together with 2., this gives 4 php-fcgi workers in total, which is the current standard and fine as a default.
4. As mentioned above, this auto-respawning is a great help in cleaning things up and preventing memory leaks from piling up.
Beyond that, from the docs: This problem seems to stem from a little-known issue with PHP: PHP stops accepting new FastCGI connections after handling 500 requests; unfortunately, there is a potential race condition during the PHP cleanup code in which PHP can be shutting down but still have the socket open,...

Current

...
fastcgi.server += ( ".php" =>
        ((
                "bin-path" => "/usr/bin/php-cgi",
                "socket" => "/tmp/php.socket.$tool",
                "max-procs" => 1,
                "bin-environment" => (
                        "PHP_FCGI_CHILDREN" => "4",
                        "PHP_FCGI_MAX_REQUESTS" => "10000"
                ),
                "bin-copy-environment" => (
                        "PATH", "SHELL", "USER"
                ),
                "broken-scriptfilename" => "enable",
                "allow-x-send-file" => "enable"
        ))
)
EOF

if [ -r $home/.lighttpd.conf ]; then
  cat $home/.lighttpd.conf >>$runbase.conf~
fi

New

...
EOF

cat <<EOF >$runbase-php.conf~

fastcgi.server += ( ".php" =>
        ((
                "bin-path" => "/usr/bin/php-cgi",
                "socket" => "/tmp/php.socket.$tool",
                "max-procs" => 2,
                "bin-environment" => (
                        "PHP_FCGI_CHILDREN" => "2",
                        "PHP_FCGI_MAX_REQUESTS" => "500"
                ),
                "bin-copy-environment" => (
                        "PATH", "SHELL", "USER"
                ),
                "broken-scriptfilename" => "enable",
                "allow-x-send-file" => "enable",
         ))
)
EOF

if [ -r $home/.lighttpd.conf ]; then
  cat $home/.lighttpd.conf >>$runbase.conf~
  if [[ ! $(cat "$home/.lighttpd.conf"|
     grep -P '^(?:[ \t]*fastcgi.server[ \t\+\(=]+".php"|#no-default-php)') ]]; then
     cat $runbase-php.conf~ >>$runbase.conf~
  fi
fi
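
The decision made by the grep check can be tried out in isolation. The following is a minimal sketch, not part of the start script: the helper name wants_default_php is hypothetical, it uses grep -E with a slightly stricter pattern (dots escaped) instead of grep -P, and it assumes that a missing or unreadable .lighttpd.conf means "keep the default handler".

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the start script: succeeds (exit 0) when the
# default php-fcgi handler should still be appended for the given config file.
wants_default_php() {
  local conf="$1"
  # Assumption: no user config at all means "keep the default handler".
  [ -r "$conf" ] || return 0
  # The default is skipped if the user defines a ".php" fastcgi.server entry
  # or puts the magic comment #no-default-php at the start of a line.
  if grep -Eq '^([[:space:]]*fastcgi\.server[[:space:]+(=]+"\.php"|#no-default-php)' "$conf"; then
    return 1
  fi
  return 0
}
```

Usage: wants_default_php "$home/.lighttpd.conf" && echo "default PHP handler will be added"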

Additional Info

Respawning with one lighttpd process vs. multiple lighttpd processes

One lighttpd process

Testcase:

Starting a custom webservice with $ ./webstart
Watching processes and processtree

lighttpd(3448)─┬─php-cgi(3484)─┬─php-cgi(3486)
               │               └─php-cgi(3487)
               └─php-cgi(3488)─┬─php-cgi(3490)
                               └─php-cgi(3491)

sending SIGTERM (15) to 3488
make new http request to induce respawning

lighttpd(3448)─┬─php-cgi(3484)─┬─php-cgi(3486)
               │               └─php-cgi(3487)
               └─php-cgi(3960)─┬─php-cgi(3962)
                               └─php-cgi(3963)

sending SIGTERM (15) to 3960
make new http request to induce respawning

lighttpd(3448)─┬─php-cgi(3484)─┬─php-cgi(3486)
               │               └─php-cgi(3487)
               └─php-cgi(4203)─┬─php-cgi(4205)
                               └─php-cgi(4206)

Multiple lighttpd processes

Testcase:

Starting a plain vanilla webservice with $ webservice start
Watching processes and processtree

lighttpd(17923)─┬─lighttpd(17957)
                ├─lighttpd(17958)
                ├─lighttpd(17959)
                ├─lighttpd(17960)
                ├─lighttpd(17962)
                └─php-cgi(17951)─┬─php-cgi(17953)
                                 ├─php-cgi(17954)
                                 ├─php-cgi(17955)
                                 └─php-cgi(17956)

sending SIGTERM (15) to 17951
make new http request to induce respawning

lighttpd(17923)─┬─lighttpd(17957)
                ├─lighttpd(17958)
                ├─lighttpd(17959)
                ├─lighttpd(17960)
                ├─lighttpd(17962)───php-cgi(18542)─┬─php-cgi(18544)
                │                                  ├─php-cgi(18545)
                │                                  ├─php-cgi(18546)
                │                                  └─php-cgi(18547)
                └─lighttpd(18511)

sending SIGTERM (15) to 18542
make new http request to induce respawning

lighttpd(17923)─┬─lighttpd(17957)
                ├─lighttpd(17958)
                ├─lighttpd(17959)───php-cgi(18809)─┬─php-cgi(18811)
                │                                  ├─php-cgi(18812)
                │                                  ├─php-cgi(18813)
                │                                  └─php-cgi(18814)
                ├─lighttpd(17960)
                ├─lighttpd(17962)───php-cgi(18542)
                └─lighttpd(18511)
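
When repeating this test, the interesting question is whether a respawned php-cgi backend still hangs off the main lighttpd process. The following is a minimal sketch under stated assumptions: the helper name stray_backends is hypothetical, and it expects "PID PPID COMMAND" lines on stdin (for example from ps -eo pid=,ppid=,comm=). It prints the PIDs of php-cgi processes whose parent is a lighttpd process other than the main one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: $1 is the PID of the main lighttpd process.
# Reads "PID PPID COMMAND" lines on stdin and prints php-cgi PIDs whose
# parent is a lighttpd process other than the main one.
stray_backends() {
  local main_pid="$1"
  awk -v main="$main_pid" '
    { pid[NR] = $1; ppid[NR] = $2; comm[NR] = $3 }   # remember every process
    $3 == "lighttpd" { is_lighty[$1] = 1 }           # collect all lighttpd PIDs
    END {
      for (i = 1; i <= NR; i++)
        if (comm[i] == "php-cgi" && (ppid[i] in is_lighty) && ppid[i] != main)
          print pid[i]
    }'
}
```

Usage: ps -eo pid=,ppid=,comm= | stray_backends 17923
With one lighttpd process this should print nothing; with the last process tree shown above it would report 18809 and 18542.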

Current default config: lighttpd-starter

...
spool="/var/run/lighttpd"
runbase="$spool/$tool"
workers=5
if [ -r "/data/project/.system/config/$tool.workers" ]; then
  workers=$(cat "/data/project/.system/config/$tool.workers")
fi
...
cat <<EOF >$runbase.conf~
server.modules = (
  "mod_setenv",
  "mod_access",
  "mod_accesslog",
  "mod_alias",
  "mod_compress",
  "mod_redirect",
  "mod_rewrite",
  "mod_fastcgi",
  "mod_cgi",
)
 
server.port = $port
server.use-ipv6 = "disable"
server.username = "$prefix.$tool"
server.groupname = "$prefix.$tool"
server.core-files = "disable"
server.document-root = "$home/public_html"
server.pid-file = "$runbase.pid"
server.errorlog = "$home/error.log"
server.breakagelog = "$home/error.log"
server.follow-symlink = "enable"
server.max-connections = $[$workers*4]
server.max-keep-alive-idle = 60
server.max-worker = $workers
server.stat-cache-engine = "fam"
ssl.engine = "disable"
 
alias.url = ( "/$tool" => "$home/public_html/" )
 
index-file.names = ( "index.php", "index.html", "index.htm" )
dir-listing.encoding = "utf-8"
server.dir-listing = "disable"
url.access-deny = ( "~", ".inc" )
static-file.exclude-extensions = ( ".php", ".pl", ".fcgi" )
 
accesslog.use-syslog = "disable"
accesslog.filename = "$home/access.log"
 
cgi.assign = (
  ".pl" => "/usr/bin/perl",
  ".py" => "/usr/bin/python",
  ".pyc" => "/usr/bin/python",
)
 
fastcgi.server += ( ".php" =>
        ((
                "bin-path" => "/usr/bin/php-cgi",
                "socket" => "/tmp/php.socket.$tool",
                "max-procs" => 1,
                "bin-environment" => (
                        "PHP_FCGI_CHILDREN" => "4",
                        "PHP_FCGI_MAX_REQUESTS" => "10000"
                ),
                "bin-copy-environment" => (
                        "PATH", "SHELL", "USER"
                ),
                "broken-scriptfilename" => "enable",
                "allow-x-send-file" => "enable"
        ))
)
 
EOF
 
/usr/share/lighttpd/create-mime.assign.pl >>$runbase.conf~
if [ -r $home/.lighttpd.conf ]; then
  cat $home/.lighttpd.conf >>$runbase.conf~
fi
 
mv $runbase.conf~ $runbase.conf
 
exec /usr/sbin/lighttpd -f $runbase.conf -D >>$home/error.log 2>&1

References

[1] Using variables

Tweaking your webservice

(work in progress)

General

Monitoring

To enable custom monitoring, add the following lines to your .lighttpd.conf:

server.modules += ("mod_auth","mod_status")
status.status-url = "/server-status"
status.statistics-url = "/server-statistics"

This will provide two monitoring URLs:

https://<toolname>.toolforge.org/server-status
https://<toolname>.toolforge.org/server-statistics
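
The statistics page is plain text and easy to filter. The following is a minimal sketch, assuming the "name: value" line format shown in the #server-statistics listing earlier on this page; the helper name unhealthy_backends is hypothetical. It prints every backend counter for "died" or "overloaded" that is greater than zero.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the plain-text output of /server-statistics on
# stdin and prints each "died" or "overloaded" backend counter that is > 0.
unhealthy_backends() {
  awk -F': *' '
    $1 ~ /^fastcgi\.backend\..*\.(died|overloaded)$/ && $2 + 0 > 0 {
      print $1 "=" ($2 + 0)
    }'
}
```

Usage: curl -s "https://<toolname>.toolforge.org/server-statistics" | unhealthy_backends
No output means that no backend has died or been flagged as overloaded since lighttpd was started.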

Max-keep-alive-requests

default = 16, max = 100 (current proxy limit)
Keep-alive is a technique to handle multiple requests within a single connection, rather than opening a new connection for every file. This can speed things up considerably and save resources.
1. Check your website

Chrome/Mozilla: Right mouse -> Inspect element -> Network -> F5 (reload)

Internet Explorer: F12 -> Ctrl + 4 (network) -> F5 (enable network capturing) -> reload page

If the number of files is <= 16, you're all set. Otherwise add the parameter

server.max-keep-alive-requests = x

to your .lighttpd.conf and adjust x to a suitable value.

Disable default PHP

If you are a dedicated Python/Perl/TCL/plain-HTML/whatever person and don't need PHP at all, just add the magic comment

#no-default-php

to your .lighttpd.conf. (This will return 1.6 GB to your vmem account.)

FCGI

Custom php-fcgi configuration

In most cases the default PHP configuration should fit the needs of your tool.
If not, enable monitoring and observe server-status. If requests keep piling up over a longer period of time, go ahead.

Basic considerations

1 php process takes ~280 MB vmem
your grid limit is 4 GB in principle (~4000 MB)
do the math (worst case): memory limit per script (MB) = (4000 - (280 x <all_php_processes>)) / <php_children>

(if you are brave, you can overcommit)

by default there are 2 backends (parameter: max-procs) for load-balancing and failover
each backend has its own cache
with backend-cache in mind, best thing is to raise the number of PHP_FCGI_CHILDREN step-by-step (3, 4, 5)

2 backends à 2 fcgi children: (4000 - ( 280 * 6 )) / 4 = 580 MB limit per script (default)

2 backends à 3 fcgi children: (4000 - ( 280 * 8 )) / 6 = 293 MB limit per script

2 backends à 4 fcgi children: (4000 - ( 280 * 10 )) / 8 = 150 MB limit per script

2 backends à 5 fcgi children: (4000 - ( 280 * 12 )) / 10 = 64 MB limit per script
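
The figures above can be reproduced with a few lines of shell arithmetic. This is a minimal sketch, assuming the 4000 MB grid limit and 280 MB per php process stated above, and counting one parent process plus its children per backend; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Worst-case memory limit per script in MB, following the formula above.
# $1 = number of backends ("max-procs"), $2 = PHP_FCGI_CHILDREN per backend.
# Assumptions: 4000 MB grid limit, 280 MB vmem per php process.
per_script_limit_mb() {
  local backends="$1" children="$2"
  local all_php=$(( backends * (children + 1) ))   # each backend: 1 parent + its children
  local workers=$(( backends * children ))         # processes that actually run scripts
  echo $(( (4000 - 280 * all_php) / workers ))     # integer division, rounds down
}

# Idle vmem of all php processes in MB for the same two parameters.
idle_vmem_mb() {
  echo $(( 280 * $1 * ($2 + 1) ))
}

for c in 2 3 4 5; do
  echo "2 backends x $c children: $(per_script_limit_mb 2 "$c") MB per script"
done
```

Called with other values, for example per_script_limit_mb 4 2, the same functions give a quick estimate before changing "max-procs" or PHP_FCGI_CHILDREN.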

A good^TM script will take 1-3 seconds to run, a very good^TM one less than a second.

Advanced considerations

Use case: Very high request rate + quickly responding script (< 1 sec) + low to moderate memory consumption per script

You can achieve higher parallel throughput by raising the number of backends. Example: "max-procs" => 4, "PHP_FCGI_CHILDREN" => "2"

(always keep in mind your total memory limit)

There may be other use cases where you have a high memory footprint / high latency / very high load, which will require more resources. In this case, contact Coren to raise the limit and/or to track down the problem in depth and find an optimal solution.

##
## CAUTION! Don't forget to replace <toolname> in the "socket" parameter
##

fastcgi.server += ( ".php" =>
        ((
            "bin-path" => "/usr/bin/php-cgi",
            "socket" => "/tmp/php.socket.<toolname>",
            "max-procs" => 2,
            "bin-environment" => (
                    "PHP_FCGI_CHILDREN" => "4",
                    "PHP_FCGI_MAX_REQUESTS" => "500"
             ),
            "bin-copy-environment" => (
                    "PATH", "SHELL", "USER"
             ),
             "broken-scriptfilename" => "enable",
             "allow-x-send-file" => "enable"
        ))
)

Important:

replace <toolname> with your tool's name !
customize max-procs and PHP_FCGI_CHILDREN values to fit your needs. See #FCGI considerations
be sure what you're doing, if you change/add other parameters.

Example: "max-procs" => 2, "PHP_FCGI_CHILDREN" => "2" (default configuration)
Memory consumption, idle: 6 x 280 MB = 1.6 GB vmem

lighttpd(3448)─┬─php-cgi(3484)─┬─php-cgi(3486)
               │               └─php-cgi(3487)
               └─php-cgi(3488)─┬─php-cgi(3490)
                               └─php-cgi(3491)

Example: "max-procs" => 2, "PHP_FCGI_CHILDREN" => "4"
Memory consumption, idle: 10 x 280 MB = 2.8 GB vmem

lighttpd(29949)─┬─php-cgi(29980)─┬─php-cgi(29982)
                │                ├─php-cgi(29983)
                │                ├─php-cgi(29984)
                │                └─php-cgi(29985)
                └─php-cgi(29986)─┬─php-cgi(29988)
                                 ├─php-cgi(29989)
                                 ├─php-cgi(29990)
                                 └─php-cgi(29991)

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support

Chat in real time in the IRC channel #wikimedia-cloud or the bridged Telegram group
Discuss via email after you have subscribed to the cloud@ mailing list

Stay aware of critical changes and plans

Subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)
Read the News wiki page

Track work tasks and report bugs

Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself

Read stories and WMCS blog posts

Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)
