OSM-Test
Summary
This test was done in July 2008. The idea was to run a Squid proxy in front of the OSM tile servers, {a,b,c}.tile.openstreetmap.org. The test server was knsq30 in KNAMS.
The system configuration:
- 1 Intel Xeon 5130, 2 GHz, dual-core
- 16 GB memory
- 4 Seagate ST973451SS 73 GB SAS disks, 15k rpm
- 1 GBit/s Ethernet port
General remarks
The backend was an Apache server running OSM's mod_tile module, which keeps its own disk cache of tiles and renders tiles on demand. Each request goes through the module. As far as I understood, there is a weekly database export from the core MySQL DB into Mapnik's PostgreSQL DB. Based on the next scheduled export, mod_tile decides whether a tile needs to be re-rendered and computes the Expires: header.
The module does not honor the If-Modified-Since: header; it always responds with status code 200 and never uses 304 (Not Modified). Since every revalidation against the backend therefore transfers the full tile again, the refresh_pattern was tuned aggressively.
First test
The following squid config was used:
visible_hostname knsq30.knams.wikimedia.org
unique_hostname knsq30.knams.wikimedia.org
http_port 80 vhost
negative_ttl 0 minutes
request_header_max_size 15 KB
uri_whitespace deny
request_body_max_size 7 MB
half_closed_clients off
pipeline_prefetch on
read_timeout 1 minute
request_timeout 1 minute
persistent_request_timeout 1 minutes
quick_abort_min 0 KB
quick_abort_max 0 KB
cache_mem 3000 MB
maximum_object_size 600 KB
maximum_object_size_in_memory 150 KB
cache_dir coss /dev/sda6 30000 max-size=524288 max-stripe-waste=32768 block-size=2048
cache_dir coss /dev/sdb 30000 max-size=524288 max-stripe-waste=32768 block-size=2048
cache_dir coss /dev/sdc 30000 max-size=524288 max-stripe-waste=32768 block-size=2048
cache_dir coss /dev/sdd 30000 max-size=524288 max-stripe-waste=32768 block-size=2048
cache_swap_log /var/spool/squid/%s
refresh_pattern . 120 500% 6000 ignore-reload override-lastmod
dead_peer_timeout 10 seconds
client_db off
digest_generation off
error_directory /usr/share/squid/errors/Wikimedia
log_mime_hdrs off
logfile_rotate 1
strip_query_terms off
logformat wikimedia knsq30.knams.wikimedia.org %sn %ts.%03tu %tr %>a %Ss/%03Hs %<st %rm %ru %Sh/%<A %mt %{Referer}>h %{X-Forwarded-For}>h %{User-Agent}>h
cache_access_log /var/log/squid/access.log
cache_log /var/log/squid/cache.log
cache_store_log none
acl all src 0.0.0.0/0.0.0.0
acl localsrc src 127.0.0.1/255.255.255.255
acl purge method PURGE
acl post method POST
acl tiertwo src 127.0.0.1
acl tile_openstreetmap_org dstdomain tile.openstreetmap.org
cache_peer 128.40.58.204 parent 80 0 originserver no-query connect-timeout=5 login=PASS
cache_peer_access 128.40.58.204 allow all
never_direct allow all
htcp_access allow all
htcp_clr_access allow tiertwo
htcp_clr_access deny all
http_access allow all
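To make the effect of the refresh_pattern line in this config easier to follow, here is a minimal Python sketch of the freshness decision as I understand it from Squid's documented refresh rules. It is a simplified model, not Squid's code: the function name `is_fresh` and its parameters are hypothetical, the `ignore-reload` option (which ignores client reload requests) is not modelled, and the rule that an explicit Expires: header takes precedence over the heuristics is an assumption based on the documented behaviour without `override-expire`.

```python
from typing import Optional

# Values from "refresh_pattern . 120 500% 6000"; Squid reads min and max in minutes.
MIN_AGE = 120 * 60    # seconds
PERCENT = 5.0         # 500% expressed as a factor
MAX_AGE = 6000 * 60   # seconds


def is_fresh(age: int,
             lm_age: Optional[int] = None,
             expires_in: Optional[int] = None,
             override_lastmod: bool = True) -> bool:
    """Simplified freshness check for one cached object (hypothetical helper).

    age        -- seconds since the object was fetched from the backend
    lm_age     -- seconds between Last-Modified and the fetch, if known
    expires_in -- seconds until the Expires time (negative if past), if sent
    """
    # An explicit expiry time is assumed to win over the heuristics below.
    if expires_in is not None:
        return expires_in > 0
    # Anything older than max is stale.
    if age > MAX_AGE:
        return False
    if lm_age is not None:
        # Last-Modified factor: fresh while age < percent * lm_age.
        if age < PERCENT * lm_age:
            return True
        # override-lastmod: the min age still applies to recently modified objects.
        return override_lastmod and age < MIN_AGE
    # No Last-Modified information: fresh until min is reached.
    return age < MIN_AGE


if __name__ == "__main__":
    print(is_fresh(age=3600))                    # one hour old, no validators
    print(is_fresh(age=3 * 3600))                # three hours old, no validators
    print(is_fresh(age=3 * 3600, lm_age=3600))   # LM factor keeps it fresh
```

In this model a tile without Expires: stays fresh for at least two hours and at most 100 hours, which is what "aggressive" means here for a backend that cannot answer revalidation requests with 304.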
After a few days the cache reached a high hit ratio, with about half of the cache hits served from memory and the other half from disk. Depending on the time of day, about 5-10% of the requests were cache misses and had to be fetched from the origin server.
Connection information for squid:
    Number of clients accessing cache: 0
    Number of HTTP requests received: 46800892
    Number of ICP messages received: 0
    Number of ICP messages sent: 0
    Number of queued ICP replies: 0
    Number of HTCP messages received: 0
    Number of HTCP messages sent: 0
    Request failure ratio: 0.00
    Average HTTP requests per minute since start: 5547.8
    Average ICP messages per minute since start: 0.0
    Select loop called: 251333892 times, 2.014 ms avg
Cache information for squid:
    Request Hit Ratios: 5min: 78.7%, 60min: 81.5%
    Byte Hit Ratios: 5min: 91.1%, 60min: 92.7%
    Request Memory Hit Ratios: 5min: 54.0%, 60min: 53.5%
    Request Disk Hit Ratios: 5min: 45.5%, 60min: 45.4%
    Storage Swap size: 36762950 KB
    Storage Mem size: 3072072 KB
    Mean Object Size: 4.31 KB
    Requests given to unlinkd: 0
Median Service Times (seconds) 5 min 60 min:
    HTTP Requests (All): 0.00286 0.00286
    Cache Misses: 0.01035 0.00919
    Cache Hits: 0.00179 0.00179
    Near Hits: 0.00000 0.00000
    Not-Modified Replies: 0.00091 0.00179
    DNS Lookups: 0.00000 0.00000
    ICP Queries: 0.00000 0.00000
Resource usage for squid:
    UP Time: 506159.565 seconds
    CPU Time: 7310.760 seconds
    CPU Usage: 1.44%
    CPU Usage, 5 minute avg: 2.72%
    CPU Usage, 60 minute avg: 2.28%
    Process Data Segment Size via sbrk(): 1034232 KB
    Maximum Resident Size: 0 KB
    Page faults with physical i/o: 10
Memory usage for squid via mallinfo():
    Total space in arena: -1748292 KB
    Ordinary blocks: 1629838 KB 0 blks
    Small blocks: 0 KB 0 blks
    Holding blocks: 0 KB 0 blks
    Free Small blocks: 815713 KB
    Free Ordinary blocks: 460 KB
    Total in use: 1629838 KB -92%
    Total free: 816173 KB -46%
    Total size: -1748292 KB
Memory accounted for:
    Total accounted: 4795460 KB
    memPoolAlloc calls: 230165238
    memPoolFree calls: 191604546
File descriptor usage for squid:
    Maximum number of file descriptors: 32768
    Largest file desc currently in use: 521
    Number of file desc currently in use: 430
    Files queued for open: 0
    Available number of file descriptors: 32338
    Reserved number of file descriptors: 100
    Store Disk files open: 0
    IO loop method: epoll
Internal Data Structures:
    8522920 StoreEntries
    531370 StoreEntries with MemObjects
    531367 Hot Object Cache Items
    8522892 on-disk objects
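A few derived figures make the cache manager output above easier to read. The following short Python sketch only re-computes values from numbers that appear in that output; the variable names are my own, and nothing here is measured independently.

```python
# Figures copied from the cache manager output above.
http_requests = 46_800_892
uptime_s = 506_159.565
cpu_time_s = 7_310.760
swap_size_kb = 36_762_950
on_disk_objects = 8_522_892
mem_hit_pct = 54.0    # Request Memory Hit Ratios, 5min
disk_hit_pct = 45.5   # Request Disk Hit Ratios, 5min

uptime_days = uptime_s / 86_400
req_per_min = http_requests / (uptime_s / 60)
req_per_sec = http_requests / uptime_s
cpu_usage_pct = 100 * cpu_time_s / uptime_s
mean_object_kb = swap_size_kb / on_disk_objects
hit_split_total = mem_hit_pct + disk_hit_pct

print(f"uptime:           {uptime_days:.2f} days")
print(f"request rate:     {req_per_min:.1f}/min ({req_per_sec:.1f}/s)")
print(f"CPU usage:        {cpu_usage_pct:.2f} %")
print(f"mean object size: {mean_object_kb:.2f} KB")
print(f"memory + disk share of hits: {hit_split_total:.1f} %")
```

The recomputed request rate, CPU usage and mean object size agree with the values Squid reports, and the memory and disk shares add up to nearly all cache hits, which matches the roughly even memory/disk split described above.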
The squid process:
USER   PID %CPU %MEM     VSZ     RSS TTY STAT START   TIME COMMAND
proxy 9229  1.4 40.7 6746012 6711288 ?   Sl   Jul09 122:21 (squid) -D -sYC -f /etc/squid/osm.conf
Disk I/O was very low.
# iostat -x 10
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.85    0.00    0.30    3.89    0.00   94.96

Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda       40.12    3.89  6.19  0.60   92.61   35.93    18.94     0.02   3.24   3.09   2.10
sdb       38.22    0.00  6.49  0.00   89.42    0.00    13.78     0.03   4.15   3.38   2.20
sdc       47.01    0.00  7.68  0.00  109.38    0.00    14.23     0.03   3.77   3.77   2.89
sdd       27.15  101.20  4.39  1.00   63.07  204.39    49.63     0.03   5.19   2.96   1.60
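To put "very low" into numbers, the sector rates in the iostat sample above can be converted to throughput. This is a rough Python sketch that assumes iostat's usual 512-byte sector; the dictionary layout and the helper name `kib_per_s` are mine.

```python
SECTOR_BYTES = 512  # assumption: iostat reports 512-byte sectors

# (rsec/s, wsec/s, %util) per device, copied from the iostat sample above.
samples = {
    "sda": (92.61, 35.93, 2.10),
    "sdb": (89.42, 0.00, 2.20),
    "sdc": (109.38, 0.00, 2.89),
    "sdd": (63.07, 204.39, 1.60),
}


def kib_per_s(sectors_per_s: float) -> float:
    """Convert a sector rate to KiB per second."""
    return sectors_per_s * SECTOR_BYTES / 1024


total_read_kib = sum(kib_per_s(r) for r, _, _ in samples.values())
total_write_kib = sum(kib_per_s(w) for _, w, _ in samples.values())
max_util = max(u for _, _, u in samples.values())

print(f"total read:  {total_read_kib:.1f} KiB/s")
print(f"total write: {total_write_kib:.1f} KiB/s")
print(f"busiest disk utilisation: {max_util:.2f} %")
```

Under that assumption the four disks together move only a few hundred KiB per second, and no disk is busy more than about 3% of the time.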
Second test
Squid's memory cache (cache_mem) was increased to 8 GB:
cache_mem 8000 MB
Results
to be updated
Summary
The CPU utilization was very low. The server could probably run the mod_tile Apache alongside Squid on the same machine.
The overall network traffic of the OSM server farm was reduced significantly. The remaining network traffic was used for
- serving cache misses to the squid server
- API access to the OSM database
- serving the Wiki, SVN, mailing lists
- Tiles@Home, the second map renderer used by the OSM project