Data Platform/Systems/AQS/Scaling/LoadTesting

Cassandra Info

See: https://etherpad.wikimedia.org/p/analytics-aqs-cassandra

TL;DR

Our new compression scheme greatly reduces the number of SSTables read on every request, and that, together with SSDs, has a strong impact on the throughput the cluster can sustain. We are able to operate at ~400 transactions per second (TPS) per machine. Whether we have 3-4 months of data loaded or 10-12, our per-host throughput doesn't change.

We did not investigate the bottleneck, so it may be that it has moved to the HTTP layer.

While tests are running, CPU usage doesn't go beyond 20%. The machine is not taking production traffic, so CPU usage is due only to compaction, Cassandra itself, and the load testing.

Test setup

We use siege to load test from inside the firewall against RESTBase deployed on aqs100[4,5,6].

Siege doesn't seem to be able to load a file with more than 100,000 URLs, so every test uses a file of at most that size.

  • Source files to test with have about 11 million URLs, so we split them into 100,000-line files with split (see the sketch after this list).
  • Cassandra is writing debug logs while these tests run, so it is likely we could run with a more efficient configuration.
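
A minimal sketch of that file preparation; the source file name (part-00000) matches the chunk names used in the siege commands below, but is otherwise an assumption:

split -d -l 100000 part-00000 part-00000_   # 100,000-line chunks: part-00000_00, part-00000_01, ...
wc -l part-00000_*                          # sanity-check the chunk sizes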

Graphs

CPU load in aqs1004: https://ganglia.wikimedia.org/latest/?r=2hr&cs=&ce=&m=bytes_out&c=Analytics+Query+Service+eqiad&h=aqs1004.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=small&metric_group=

SSTables read per query, new cluster:

https://grafana.wikimedia.org/dashboard/db/aqs-cassandra-cf-sstables-per-read?panelId=5&fullscreen&from=1471377813796&to=1471464213797&var-node=aqs1004-a&var-node=aqs1004-b&var-node=aqs1005-a&var-node=aqs1005-b&var-node=aqs1006-a&var-node=aqs1006-b&var-keyspace=local_group_default_T_pageviews_per_article_flat&var-columnfamily=data&var-quantiles=99percentile [Broken Link T211982]

Druid Info

We added Druid as a back-end for AQS, to serve metric data to Wikistats. We need to siege the new end-points to figure out how Druid's throughput differs from Cassandra's. We first run with the same configuration as the Cassandra load tests, then re-evaluate. The URLs used here represent every query that Wikistats would issue to AQS if users loaded every single project's main dashboard page: 9 widgets for each of about 800 projects, roughly 7,200 URLs. The first run, using siege's normal regression mode, resulted in about 75 transactions per second:

milimetric@aqs1004:~$ siege -c 100 -r 100 -f siege-druid-from-aqs.txt 
** SIEGE 3.0.8
** Preparing 100 concurrent users for battle.
The server is now under siege...
<<snip a few timeout alerts>>
done.

Transactions:                   9990 hits
Availability:                  99.90 %
Elapsed time:                 132.46 secs
Data transferred:              13.68 MB
Response time:                  0.51 secs
Transaction rate:              75.42 trans/sec
Throughput:                     0.10 MB/sec
Concurrency:                   38.81
Successful transactions:        9770
Failed transactions:              10
Longest transaction:           27.67
Shortest transaction:           0.00

And this is what the servers looked like during this test:

[Graphs: druid1004, druid1005, and druid1006 during the 100,000-transaction load test]

It seems Druid is able to cache about half of the responses needed to serve all dashboards. When I re-run the same test within 30 minutes, but with -i (random "internet" mode), it runs almost twice as fast and the cache miss rate in Prometheus is about half what it was during the first run:

milimetric@aqs1004:~$ siege -i -c 100 -r 100 -f siege-druid-from-aqs.txt 
** SIEGE 3.0.8
** Preparing 100 concurrent users for battle.
The server is now under siege..      done.

Transactions:                  10000 hits
Availability:                 100.00 %
Elapsed time:                  75.26 secs
Data transferred:              13.42 MB
Response time:                  0.11 secs
Transaction rate:             132.87 trans/sec
Throughput:                     0.18 MB/sec
Concurrency:                   14.96
Successful transactions:        9768
Failed transactions:               0
Longest transaction:            8.50
Shortest transaction:           0.00

For reference, here are the URLs used to track performance:

The CPU only reaches about 25% during the peak of the siege, which is pretty good.

Re-running this test (the random internet one) the next day results in a similar figure, 139.20 transactions per second. The cached-objects metric in Prometheus shows that Druid never let go of the previous day's cache, and added about 10% more cached objects during this latest test. Running the siege with normal (non-random) access to the URLs results in 144 transactions per second, so now I'm curious what happens if I request a different time period. I changed all the time ranges to 2014-12 (instead of 2015-12) through 2017-12 and ran in internet mode, which resulted in 102 transactions per second. That most likely means the two cached years were served from cache and the rest from storage; the cached objects/misses/hits for this last test seem to confirm it, since they all increased. Then, for the last test, I wanted to also look at daily resolution, so I loaded these two URLs for all projects and ran the same kind of siege:

milimetric@aqs1004:~$ tail siege-druid-from-aqs.txt -n2
https://wikimedia.org/api/rest_v1/metrics/editors/aggregate/zu.wiktionary.org/all-editor-types/all-page-types/25..99-edits/daily/2017091309/2017121309
https://wikimedia.org/api/rest_v1/metrics/bytes-difference/net/aggregate/zu.wiktionary.org/all-editor-types/non-content/daily/2017091309/2017121300

milimetric@aqs1004:~$ siege -i -c 100 -r 100 -f siege-druid-from-aqs.txt 
** SIEGE 3.0.8
** Preparing 100 concurrent users for battle.
The server is now under siege..      done.

Transactions:                   9998 hits
Availability:                  99.98 %
Elapsed time:                  78.81 secs
Data transferred:               4.77 MB
Response time:                  0.15 secs
Transaction rate:             126.86 trans/sec
Throughput:                     0.06 MB/sec
Concurrency:                   19.30
Successful transactions:        9998
Failed transactions:               2
Longest transaction:           18.41
Shortest transaction:           0.00

Druid should not have had any of this data in memory, but maybe it cached daily data while serving the monthly queries. The only other test I can think of running is a longer, extended siege. My conclusion so far is that Druid can serve at least 50 requests per second from a cold cache without trouble. And if it's pushed further than that, it doesn't work much harder; it just takes a little longer to answer.
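
That longer extended siege could use siege's time-based mode instead of a fixed repetition count; a minimal sketch reusing the URL file from above (the 30-minute duration is an arbitrary choice):

siege -i -c 100 -t 30M -f siege-druid-from-aqs.txt   # run for 30 minutes in internet mode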

Realization: Druid caches segments of time. That's why warming up the caches with monthly queries allowed it to easily answer daily queries. In that case, one more test is needed to see whether requesting the full history for the datasource evicts enough of the cache to break the monthly queries. The results indicate that we basically have enough memory to cache the full history. The URLs queried were the same as in the first test (all the queries required to serve all the dashboards for all the projects), but I injected about 3% of queries requesting the full history, spread throughout the file. Whether we run over these randomly or in order, Druid actually gets faster, probably because the full-history cache lets it answer the other queries even faster:

Transactions:                  10000 hits
Availability:                 100.00 %
Elapsed time:                  69.66 secs
Data transferred:              13.34 MB
Response time:                  0.03 secs
Transaction rate:             143.55 trans/sec
Throughput:                     0.19 MB/sec
Concurrency:                    4.78
Successful transactions:        9504
Failed transactions:               0
Longest transaction:            3.06
Shortest transaction:           0.00
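
For reference, a minimal sketch of the ~3% injection described above; the full-history URL is a placeholder, since the exact query isn't recorded here:

awk -v full='https://wikimedia.org/api/rest_v1/...full-history-query...' \
    '{ print } NR % 33 == 0 { print full }' \
    siege-druid-from-aqs.txt > siege-druid-with-history.txt   # one injected query per ~33 lines, ~3%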

For reference, this is an example Hive query that generates the URLs sieged above:

set hive.mapred.mode=nonstrict;

 select regexp_replace(template, ',,hostname,,', hostname) url
   from (select distinct hostname
           from wmf_raw.mediawiki_project_namespace_map
        ) projects
            cross join
        (select 'http://localhost:7232/analytics.wikimedia.org/v1/editors/aggregate/,,hostname,,/all-editor-types/all-page-types/all-activity-levels/monthly/2015121200/2017121200' template
            union all
         select 'http://localhost:7232/analytics.wikimedia.org/v1/edited-pages/aggregate/,,hostname,,/all-editor-types/all-page-types/all-activity-levels/monthly/2015121200/2017121200' template
            union all
         select 'http://localhost:7232/analytics.wikimedia.org/v1/edits/aggregate/,,hostname,,/all-editor-types/all-page-types/monthly/2015121200/2017121200' template
            union all
         select 'http://localhost:7232/analytics.wikimedia.org/v1/registered-users/new/,,hostname,,/monthly/2015121200/2017121200' template
            union all
         select 'http://localhost:7232/analytics.wikimedia.org/v1/pageviews/aggregate/,,hostname,,/all-access/user/monthly/2015121200/2017121200' template
            union all
         select 'http://localhost:7232/analytics.wikimedia.org/v1/unique-devices/,,hostname,,/all-sites/monthly/2015121200/2017121200' template
            union all
         select 'http://localhost:7232/analytics.wikimedia.org/v1/pageviews/top/,,hostname,,/all-access/2015/10/all-days' template
            union all
         select 'http://localhost:7232/analytics.wikimedia.org/v1/bytes-difference/absolute/aggregate/,,hostname,,/all-editor-types/all-page-types/monthly/2015121200/2017121200' template
            union all
         select 'http://localhost:7232/analytics.wikimedia.org/v1/bytes-difference/net/aggregate/,,hostname,,/all-editor-types/all-page-types/monthly/2015121200/2017121200' template
        ) templates
;
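
One way to run the query and capture its output as a siege URL file (the .hql file name is hypothetical):

hive -f generate_siege_urls.hql > siege-druid-from-aqs.txt
wc -l siege-druid-from-aqs.txt   # expect roughly 9 templates x ~800 projects = ~7,200 URLs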

Load Tests

Data and tests are at: nuria@aqs1004:~/load-testing

TL;DR

  • From what I can see, compaction doesn't affect throughput, and we are capped at about 450 transactions per second per host. This number doesn't change with the amount of data loaded thus far; the throughput cap is the same with 4 and 7 months of data.

  • SSTables per read is about 3; on our old cluster this number is 15. This means the compaction strategy we are using is more efficient for our type of reads.

  • CPU doesn't even get to 20% while testing at a throughput of > 300 TPS for minutes at a time.
  • We have debug logging on; if logging is in any way a bottleneck, our throughput might be even better.

Tests 2016-08-16. Three months loaded. Compaction has finished

Results

  • We test aqs1004 with siege once compaction has finished, before starting the next load of data.

The maximum throughput one box is able to sustain is about 350 transactions per second (TPS). Not much of a blip on CPU; it doesn't go above 20%.

Details

Tests 2016-08-16, file with 1e6 URLs

A hundred concurrent users doing 100 repetitions each, i.e. 10,000 transactions:

-c100 -r100
** SIEGE 3.0.8
** Preparing 100 concurrent users for battle.
The server is now under siege..      done.

Transactions:		       10000 hits
Availability:		      100.00 %
Elapsed time:		       62.82 secs
Data transferred:	        2.95 MB
Response time:		        0.01 secs
Transaction rate:	      159.18 trans/sec
Throughput:		        0.05 MB/sec
Concurrency:		        2.30
Successful transactions:        9135
Failed transactions:	           0
Longest transaction:	        0.09
Shortest transaction:	        0.00



******* Longer test: -c100 -r1000

** SIEGE 3.0.8
** Preparing 100 concurrent users for battle.
The server is now under siege..      done.

Transactions:		      100000 hits
Availability:		      100.00 %
Elapsed time:		      548.67 secs
Data transferred:	       28.91 MB
Response time:		        0.01 secs
Transaction rate:	      182.26 trans/sec
Throughput:		        0.05 MB/sec
Concurrency:		        2.55
Successful transactions:       92589
Failed transactions:	           0
Longest transaction:	        0.22
Shortest transaction:	        0.00

********* Bumping up concurrency
CPU still not even at 20%

nuria@aqs1004:~/load-testing$ siege -c 200 -r 1000 --file=part-00000_1_1e6.txt --log
** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:                 200000 hits
Availability:                 100.00 %
Elapsed time:                 551.68 secs
Data transferred:              57.82 MB
Response time:                  0.01 secs
Transaction rate:             362.53 trans/sec
Throughput:                     0.10 MB/sec
Concurrency:                    5.04
Successful transactions:      185178
Failed transactions:               0
Longest transaction:            0.22
Shortest transaction:           0.00

*******   siege -c 400 -r 1000 --file=part-00000_1_1e6.txt --log
Not working.

******** siege -c 300 -r 1000 --file=part-00000_1_1e6.txt --log
Not working either!!!

Looks like our threshold is around 360 TPS; -c 300 and above break.
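
The threshold hunt above was done by hand; the same scan can be scripted by stepping the concurrency level and pulling out the headline numbers (siege prints its summary on stderr). A minimal sketch:

for c in 100 200 250 300 400; do
    echo "== concurrency $c =="
    siege -c "$c" -r 1000 --file=part-00000_1_1e6.txt --log 2>&1 \
        | grep -E 'Transaction rate|Availability'
done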

 

Tests 2016-08-18. Four months loaded. Compaction is happening while testing

Results

Compaction doesn't seem to affect throughput. Average response time is slower, twice as slow in fact (it is unclear whether siege's "response time" is a percentile or an average).

The number of SSTables read is still around 3, and CPU stays low (< 20%) while tests run at a throughput of about 300 TPS.


Details


200 concurrent users doing 1000 repetitions

** siege -c 200 -r 1000 --file=../part-00000_1_1e6.txt --log
** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:		      199870 hits
Availability:		       99.94 %
Elapsed time:		      563.90 secs
Data transferred:	       57.81 MB
Response time:		        0.02 secs
Transaction rate:	      354.44 trans/sec
Throughput:		        0.10 MB/sec
Concurrency:		        7.01
Successful transactions:      185055
Failed transactions:	         130
Longest transaction:	        5.23
Shortest transaction:	        0.00

It took about the same time to complete this test as when compaction was not happening;
average response time is higher, 0.02 s versus 0.01 s.

** 2nd time same set of urls:

nuria@aqs1004:~/load-testing/2016-08-18$ siege -c 200 -r 1000 --file=../part-00000_1_1e6.txt --log
** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:		      200000 hits
Availability:		      100.00 %
Elapsed time:		      559.88 secs
Data transferred:	       57.82 MB
Response time:		        0.02 secs
Transaction rate:	      357.22 trans/sec
Throughput:		        0.10 MB/sec
Concurrency:		        5.84
Successful transactions:      185178
Failed transactions:	           0
Longest transaction:	        0.33
Shortest transaction:	        0.00

Average response time is the same, and so is the transaction rate.


** 3rd time, different set of URLs: results are nearly identical to the 1st run; the slowest transaction is slower,
but the averages are about the same

nuria@aqs1004:~/load-testing/2016-08-18$ siege -c 200 -r 1000 --file=../part-00000_2_1e6.txt --log
** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:		      199999 hits
Availability:		      100.00 %
Elapsed time:		      560.76 secs
Data transferred:	       58.84 MB
Response time:		        0.02 secs
Transaction rate:	      356.66 trans/sec
Throughput:		        0.10 MB/sec
Concurrency:		        7.32
Successful transactions:      186271
Failed transactions:	           1
Longest transaction:	        5.01
Shortest transaction:	        0.00

Same values for response time and throughput

 

Tests 2016-08-26. Seven months loaded. Compaction is happening while testing

Results

Compaction doesn't seem to affect throughput (consistent with our first test on 2016-08-16). We can bump TPS up to 450 per second.

Reusing the same set of URLs does not let us increase our throughput; it just seems the slowest transaction is not as slow. This might be a red herring if those numbers are affected by outliers.

CPU usage doesn't get to 20% while tests are going on (keep in mind that compaction is happening). SSTables per read is at 3.
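
To compare runs like these without eyeballing scrollback, each run's summary can be captured to a file and the headline numbers pulled out afterwards; a minimal sketch, with hypothetical output file names (siege writes its summary to stderr):

siege -c 200 -r 1000 --file=../part-00000_1_1e6.txt 2>&1 | tee run1.out
siege -c 200 -r 1000 --file=../part-00000_1_1e6.txt 2>&1 | tee run2.out
grep -H 'Transaction rate' run*.out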

Details

siege -c 200 -r 1000 --file=../part-00000_1_1e6.txt --log

** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:                 200000 hits
Availability:                 100.00 %
Elapsed time:                 566.03 secs
Data transferred:              57.82 MB
Response time:                  0.02 secs
Transaction rate:             353.34 trans/sec
Throughput:                     0.10 MB/sec
Concurrency:                    5.41
Successful transactions:      185178
Failed transactions:               0
Longest transaction:            0.27
Shortest transaction:           0.00

** 2nd time same set of urls:
Results are about the same, less of a blip on CPU

nuria@aqs1004:~/load-testing/2016-08-24$ siege -c 200 -r 1000 --file=../part-00000_1_1e6.txt --log
** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:                 200000 hits
Availability:                 100.00 %
Elapsed time:                 558.60 secs
Data transferred:              57.82 MB
Response time:                  0.01 secs
Transaction rate:             358.04 trans/sec
Throughput:                     0.10 MB/sec
Concurrency:                    5.09
Successful transactions:      185178
Failed transactions:               0
Longest transaction:            0.26
Shortest transaction:           0.00

** 3rd run, different set of URLs
Pretty much the same results as round 1

siege -c 200 -r 1000 --file=../part-00000_2_1e6.txt --log

** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:                 200000 hits
Availability:                 100.00 %
Elapsed time:                 561.96 secs
Data transferred:              58.84 MB
Response time:                  0.01 secs
Transaction rate:             355.90 trans/sec
Throughput:                     0.10 MB/sec
Concurrency:                    5.23
Successful transactions:      186272
Failed transactions:               0
Longest transaction:            0.21
Shortest transaction:           0.00



** 4th run, trying to bump up concurrency:
siege -c 300 -r 1000 --file=../part-00000_2_1e6.txt --log

Test breaks; the transaction rate is too high.

** 5th run, bumping concurrency up again but a bit less (250 users); the transaction rate rises past 400 per second

nuria@aqs1004:~/load-testing/2016-08-24$ siege -c 250 -r 1000 --file=../part-00000_2_1e6.txt --log
** SIEGE 3.0.8
** Preparing 250 concurrent users for battle.
The server is now under siege..      done.

Transactions:                 250000 hits
Availability:                 100.00 %
Elapsed time:                 557.41 secs
Data transferred:              73.59 MB
Response time:                  0.01 secs
Transaction rate:             448.50 trans/sec
Throughput:                     0.13 MB/sec
Concurrency:                    5.98
Successful transactions:      232779
Failed transactions:               0
Longest transaction:            0.21
Shortest transaction:           0.00

** 6th run, using a fresh, different set of URLs
nuria@aqs1004:~/load-testing/2016-08-24$ siege -c 250 -r 1000 --file=../part-00000_3_1e6.txt --log
** SIEGE 3.0.8
** Preparing 250 concurrent users for battle.
The server is now under siege..      done.

Transactions:                 250000 hits
Availability:                 100.00 %
Elapsed time:                 560.55 secs
Data transferred:              73.35 MB
Response time:                  0.02 secs
Transaction rate:             445.99 trans/sec
Throughput:                     0.13 MB/sec
Concurrency:                    7.04
Successful transactions:      233363
Failed transactions:               0
Longest transaction:            0.24
Shortest transaction:           0.00

 


Tests 2016-09-17. All data loaded; fix for nulls-to-zero deployed. No compaction happening

Results

The throughput limit does not change with data size. It is likely that we are now running into a limit in our HTTP layer rather than the storage layer.

CPU usage doesn't get to 20% while tests are going on. SSTables per read is at 3.


Details



Test 1, 1e6 distinct URLs. 200 users, 1000 repetitions, which means URLs are hit more than once

siege -c 200 -r 1000 --file=../part-00000

** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:		      200000 hits
Availability:		      100.00 %
Elapsed time:		      556.29 secs
Data transferred:	       58.90 MB
Response time:		        0.02 secs
Transaction rate:	      359.52 trans/sec
Throughput:		        0.11 MB/sec
Concurrency:		        6.78
Successful transactions:      186310
Failed transactions:	           0
Longest transaction:	        0.50
Shortest transaction:	        0.00


Test 2, same setup but with 200,000 distinct URLs. Results are about identical.

nuria@aqs1004:~/load-testing/2016-09-17$ siege -c 200 -r 1000 --file=./urls_2e6.txt.2
** SIEGE 3.0.8
** Preparing 200 concurrent users for battle.
The server is now under siege..      done.

Transactions:		      200000 hits
Availability:		      100.00 %
Elapsed time:		      561.82 secs
Data transferred:	       58.04 MB
Response time:		        0.02 secs
Transaction rate:	      355.99 trans/sec
Throughput:		        0.10 MB/sec
Concurrency:		        6.71
Successful transactions:      185881
Failed transactions:	           0
Longest transaction:	        0.26
Shortest transaction:	        0.00


Test 3. Bumping up the transaction rate.

siege -c 300 -r 1000 --file=./urls_2e6.txt.2

This results in more than 500 transactions per second, and things break.

Test 4. Reduced concurrent users to 250. We actually get to > 400 TPS.


siege -c 250 -r 1000 --file=./urls_2e6.txt.2


** SIEGE 3.0.8
** Preparing 250 concurrent users for battle.
The server is now under siege..      done.

Transactions:		      250000 hits
Availability:		      100.00 %
Elapsed time:		      562.85 secs
Data transferred:	       72.71 MB
Response time:		        0.02 secs
Transaction rate:	      444.17 trans/sec
Throughput:		        0.13 MB/sec
Concurrency:		        7.84
Successful transactions:      232861
Failed transactions:	           0
Longest transaction:	        0.29
Shortest transaction:	        0.00

Test 5. Same users and repetitions as Test 4 but with URLs that span a larger time range.
Results are similar to the prior test.
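
Judging by its suffix, urls_2e6.txt.2.late_end_date was presumably derived by pushing each URL's end timestamp forward; a minimal sketch of such a rewrite, where both timestamps are hypothetical:

sed 's|/2016090100$|/2017090100|' urls_2e6.txt.2 > urls_2e6.txt.2.late_end_date   # extend every range's end date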

nuria@aqs1004:~/load-testing/2016-09-17$ siege -c 250 -r 1000 --file=./urls_2e6.txt.2.late_end_date
** SIEGE 3.0.8
** Preparing 250 concurrent users for battle.
The server is now under siege..      done.

Transactions:		      250000 hits
Availability:		      100.00 %
Elapsed time:		      556.65 secs
Data transferred:	       83.56 MB
Response time:		        0.02 secs
Transaction rate:	      449.12 trans/sec
Throughput:		        0.15 MB/sec
Concurrency:		        8.72
Successful transactions:      232609
Failed transactions:	           0
Longest transaction:	        0.29
Shortest transaction:	        0.00
 


Calibration. Performance Tests on Old Cluster

We used siege to run tests on the old cluster in order to get a baseline of siege results to compare against what we see in the live system. We know from experience that our old cluster couldn't sustain more than 30 requests per second, due to hardware (lack of SSDs) but also software (suboptimal Cassandra compaction).

We tested the old cluster with data up to 2016-09-23 but without it receiving production traffic, so the setup is identical to the performance tests we ran on the newer cluster.

TL;DR

Latencies are far higher at much smaller concurrency levels. Response times are approximately 2 seconds at fewer than 60 transactions per second. This means that latencies (as measured by our tests) are about 100 times higher on the old cluster at similar, and even lower, concurrency levels (roughly 2 s versus 0.02 s on the new cluster).

The cluster cannot even reach 100 transactions per second. In theory the load tests tell us it could support almost 60, with CPU at 20%. In reality we know its threshold is around 30, so we can safely conclude that siege finds the right order of magnitude; the actual threshold has to be found experimentally.

Details

nuria@aqs1001:~/load-testing$ siege -c 120 -r 1000 --file=urls_2e6.txt.2.late_end_date
** SIEGE 3.0.8
** Preparing 120 concurrent users for battle.
The server is now under siege..      done.

Transactions:                 119593 hits
Availability:                  99.66 %
Elapsed time:                2278.04 secs
Data transferred:              40.07 MB
Response time:                  1.59 secs
Transaction rate:              52.50 trans/sec <-
Throughput:                     0.02 MB/sec
Concurrency:                   83.61
Successful transactions:      111310
Failed transactions:             407
Longest transaction:            5.48
Shortest transaction:           0.00



nuria@aqs1001:~/load-testing$ siege -c 150 -r 1000 --file=urls_2e6.txt.2.late_end_date
** SIEGE 3.0.8
** Preparing 150 concurrent users for battle.
The server is now under siege..      done.
siege aborted due to excessive socket failure; you
can change the failure threshold in $HOME/.siegerc

Transactions:                  42589 hits
Availability:                  97.63 %
Elapsed time:                 723.67 secs
Data transferred:              14.13 MB
Response time:                  2.03 secs
Transaction rate:              58.85 trans/sec <-
Throughput:                     0.02 MB/sec
Concurrency:                  119.26
Successful transactions:       39333
Failed transactions:            1035
Longest transaction:            5.89
Shortest transaction:           0.00
 


Trying to bump up concurrency beyond this level runs into errors.
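
The abort message above points at the failure threshold in $HOME/.siegerc; a sketch of the relevant setting, assuming siege 3.x's siegerc format (the raised value is arbitrary):

# $HOME/.siegerc
failures = 4096   # allow more socket failures before siege aborts (default is 1024)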