Obsolete:Caching servers out of Florida

From Wikitech
This page contains historical information. It may be outdated or unreliable.

Rationale

Previously, all content was served exclusively out of servers based in Florida. The Wikimedia Foundation pays for traffic out of these servers.

As a consequence:

  • the 100 megabit/s connection was occasionally saturated (it has since been upgraded to a 1000 megabit/s connection)
  • due to network latencies, access times for non-North American users can be high.

Having some caches (see below) hosted in other locations, closer to the users, would have three advantages:

  • it would free some network bandwidth at the Florida facility; this is especially important if the bandwidth at the other location is donated;
  • it would provide better response times for the users in those locales (especially for users who are not logged in);
  • it would reduce the pressure on the Florida squids (especially important if the hardware is donated; it is probably not cost-effective to send it to Florida).

An opportunity arose to obtain both some free (albeit outdated) hardware in France and some free hosting in a French colocation facility.

Technical definitions

Squid caching

The first line of servers reached by clients requesting Wikimedia pages consists of Squid caches. These neither hold the authoritative copy of wiki content nor process it; instead, they keep the last version of recently accessed pages (or images etc.), and if the same page is requested again, this version is sent immediately, without any further request to the backend. If no recent version is available, the page is requested from the database/Apache web servers (which is a heavy and slow process). A purging mechanism ensures that obsolete pages (those that have since been updated) are not kept in the cache.
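The hit, miss and purge behaviour described above can be illustrated with a toy model. This is only a minimal sketch, not how Squid is implemented: the class name PageCache, its methods and the origin callable are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict


class PageCache:
    """Toy model of a caching proxy in front of slow origin servers.

    All names here are illustrative assumptions; real Squid also handles
    expiry, memory limits, disk storage and HTTP cache headers.
    """

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch_from_origin = fetch_from_origin  # slow path (Apache/database)
        self._store: Dict[str, str] = {}             # url -> last copy served
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        """Return the cached copy if present, else fetch and remember it."""
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        body = self._fetch_from_origin(url)
        self._store[url] = body
        return body

    def purge(self, url: str) -> None:
        """Drop the cached copy, for example after the page has been edited."""
        self._store.pop(url, None)
```

In this model, a second get() for the same URL never reaches the origin, and a purge() forces the next request to fetch the updated page.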

DNS service

When users request a page beginning with http://fr.wikipedia.org, their computer has to turn fr.wikipedia.org into an IP address. This is done through the domain name system (DNS) using Wikimedia's DNS servers.

Currently, all of Wikimedia's DNS service is hosted on Wikimedia and Bomis hardware in the Florida colocation facility. It is normally recommended, especially for big popular sites, that DNS be provided from multiple sites.

GeoDNS

To be able to direct users from certain countries or networks to different Wikimedia server locations, the GeoDNS system is used. Depending on their country or network of origin, users can be directed to the Squid caches in Florida, in Paris, or in possible future facilities elsewhere. This transparently makes them use the servers to which, with very high probability, they have the fastest connection.
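The selection logic can be sketched as a simple lookup from the requester's country to a cache cluster. This is a hedged illustration only, not Wikimedia's actual GeoDNS configuration: the cluster names, the country table and the addresses (taken from the reserved documentation ranges) are all assumptions.

```python
from typing import Dict

# Hypothetical values for illustration; not the real configuration.
DEFAULT_CLUSTER = "florida"
COUNTRY_TO_CLUSTER: Dict[str, str] = {
    "FR": "paris",  # requests from French networks go to the Paris squids
}
CLUSTER_ADDRESSES: Dict[str, str] = {
    "florida": "192.0.2.10",    # placeholder address (documentation range)
    "paris": "198.51.100.10",   # placeholder address (documentation range)
}


def pick_cluster(country_code: str) -> str:
    """Return the cache cluster for a two-letter country code."""
    return COUNTRY_TO_CLUSTER.get(country_code.upper(), DEFAULT_CLUSTER)


def resolve(country_code: str) -> str:
    """Return the address a GeoDNS server would hand out to this requester."""
    return CLUSTER_ADDRESSES[pick_cluster(country_code)]
```

Any country not listed in the table falls back to the default cluster, so adding or removing a country is a one-line change to the table.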

Paris experiment

Description

Three machines, each with a Celeron 600 CPU, 18 gigabytes of disk and 640 megabytes of RAM (128 + 512 upgrade), were installed in racks belonging to Lost Oasis in the Telecity colocation facility in Aubervilliers, near Paris, France. Lost Oasis donates the rack space and the bandwidth.

One of the three machines had RAM problems and was taken off-line. The faulty RAM is expected to be replaced soon.

The machines exclusively provide Squid service. It is suggested that they could also be used for secondary DNS service.

The machines were put into active use on Sunday, January 9, 2005. Initially, they served French-, Dutch- and German-language content to users in France, Germany, Belgium, Luxembourg and the Netherlands. It soon became apparent that while French users were perfectly content with the setup, non-French users experienced extreme slowness. All countries except France were therefore taken out of the scheme. When asked, Lost Oasis explained that it pays a lot for outgoing transit to some foreign networks, which it is not prepared to sponsor for us, and that it had therefore implemented rate limiters for these destinations. These rate limiters explain the problems reported by non-French users. Lost Oasis says it will soon obtain better rates for its transit traffic and will then lift the restrictions.

As of January 12, the 2 active machines serve fr:, en:, upload (images) and commons (multimedia) content to French users. (The system will be taken off-line on the evening of January 12 for servicing of the machine with faulty memory.)

Results

(Numbers are based on several experiments at different hours during both the evening and the daytime, thus rather loaded hours in Europe.)

Typical access times for cached pages on the fr: (or en:) sites, measured by timing wget, are 0.3s (min: 0.15s) from the Paris squids, and 0.9s to more than 1s from the Florida squids. Access times for popular pages are thus reduced by a factor of about three.
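A comparable measurement could be scripted as follows. This is a minimal sketch and not the procedure actually used for the numbers above: the helper name time_requests and its parameters are assumptions, and the caller supplies whatever fetch operation is to be timed.

```python
import statistics
import time
from typing import Callable, Dict


def time_requests(fetch: Callable[[], object], repeats: int = 5) -> Dict[str, float]:
    """Call fetch() several times and report wall-clock timings in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()  # e.g. download one page through a given squid
        samples.append(time.perf_counter() - start)
    return {"min": min(samples), "median": statistics.median(samples)}
```

For a real page, fetch could be something like lambda: urllib.request.urlopen(url).read(), with url chosen by the person running the test; the first request should be discarded if it might have been a cache miss.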

One partial explanation for this is the latency of the connections. A network round trip (measured using ping) to the Paris squids takes 4-5 ms from a French network node and about 25-35 ms from a DSL client with basic settings (some DSL lines will be faster); compare this to 110 ms (and sometimes 150 ms) to the Florida cluster from a French network node (140 ms from a DSL client). Seen from a typical French DSL connection, Florida has about four times the latency of the Paris squids. Note that this latency seems inherent in retrieving data over a transatlantic link: a ballpark estimate of the physical round-trip time in optical fiber gives 80 ms, which seems consistent.
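The ballpark figure can be reproduced with a short calculation. This is a rough sketch under stated assumptions: the one-way path length of about 7,500 km (close to the great-circle distance between Paris and Florida) and the refractive index of 1.47 are assumed values, not figures from the measurements above.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47      # assumed typical value for silica fiber
PATH_LENGTH_KM = 7_500.0           # assumed one-way Paris-Florida path length


def fiber_round_trip_ms(path_km: float,
                        refractive_index: float = FIBER_REFRACTIVE_INDEX) -> float:
    """Lower bound on round-trip time over a fiber path, in milliseconds."""
    speed_in_fiber_km_s = SPEED_OF_LIGHT_KM_S / refractive_index
    return 2.0 * path_km / speed_in_fiber_km_s * 1000.0


if __name__ == "__main__":
    print(f"{fiber_round_trip_ms(PATH_LENGTH_KM):.0f} ms")
```

With these assumptions the result lands in the 70-80 ms range. Real cable routes are longer than the great-circle path and add equipment delays, so the measured 110 ms is plausible.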

At peak activity during the day, the CPU load is about 50% and the output is about 600 kilobytes per second. This indicates that, CPU-wise, it should be possible to support 3 countries with traffic like that of France using the 3 machines (or 2 countries if one wants to be on the safe side). Of course, these machines are outdated, so much better performance is to be expected from recent hardware.
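The capacity estimate can be written out explicitly. This sketch rests on two assumptions that the text does not state: that the 50% figure is the load on each of the 2 active machines, and that CPU load scales linearly with traffic. The function name and the 70% safety target are likewise hypothetical.

```python
import math

ACTIVE_MACHINES = 2     # machines in service during the measurement
PEAK_CPU_LOAD = 0.5     # assumed to be the per-machine load at peak
TOTAL_MACHINES = 3      # once the faulty machine is repaired


def countries_supported(machines: int,
                        machines_per_country: float,
                        max_utilisation: float = 1.0) -> int:
    """Number of France-sized countries that fit within the CPU budget."""
    return math.floor(machines * max_utilisation / machines_per_country)


# One France-sized country needs 2 machines * 50% = 1 machine's worth of CPU.
FRANCE_EQUIVALENT = ACTIVE_MACHINES * PEAK_CPU_LOAD

if __name__ == "__main__":
    print(countries_supported(TOTAL_MACHINES, FRANCE_EQUIVALENT))       # fully loaded
    print(countries_supported(TOTAL_MACHINES, FRANCE_EQUIVALENT, 0.7))  # with headroom
```

Under these assumptions, three machines running flat out could carry three such countries, and two if each machine is kept at or below about 70% load.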

According to the statistics, the hit rate is about 75-80%, climbing to 95% for multimedia content.

Prospects

The machines currently have 640 megabytes of RAM each (the faulty one, when repaired, may end up with 768 megabytes). This is a limitation, because the efficiency of caches depends on their ability to keep large amounts of data in RAM. It might be desirable to upgrade the machines to 1 gigabyte each, at a total additional cost of about 390€.

Another solution would be to add a recent machine, which would probably outperform the three existing machines combined.

Caveats

France and some other European countries (UK...) have overseas possessions, which should be distinguished from the mainland European territory with respect to routing. In particular, the French Antilles (Guadeloupe and Martinique) and French Guiana should be routed directly to Florida. This is currently being investigated.

Other sites

See Volunteer Squid Sites.