HHVM

This page contains historical information.
HHVM was used by the Wikimedia wikis from 2015 to 2019.
It was replaced with PHP 7.

2019

Useful reading material

http://hhvm.com/blog/6323/the-journey-of-a-thousand-bytecodes - A clear, very high-level explanation of how PHP code is eventually turned into machine code, and of the optimizations applied along the way.

https://docs.hhvm.com/hhvm/advanced-usage/fastCGI - We run Apache httpd in front of HHVM and they exchange data through the FastCGI protocol.

Light processes

HHVM runs requests in different threads but in a single process. fork() has to lock the entire memory space of the process in order to copy it, which blocks all requests that are currently in flight. If HHVM forks in advance, it can keep the child process sitting idle, and when a thread needs to shell out, the already-forked child can just call exec*.

There are five pre-forked light processes so that worker threads (serving live traffic) don't all have to contend for a single one.

Usage of APC

HHVM is configured to use APC, but not as a translation cache (PHP source to bytecode). The MediaWiki PHP code uses APC only as a local cache, managed through the ObjectCache.php class. HHVM has its own translation cache on disk (an SQLite 3 database), and it also implements a JIT compiler.

https://grafana.wikimedia.org/dashboard/db/hhvm-apc-usage

Packaging

Adding patches

Any patch you add should follow the DEP-3 format and live in debian/patches. It should also be listed in debian/patches/series, in the Wikimedia-specific section.
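
As a rough illustration, the following shell sketch creates a patch stub with a DEP-3 header and registers it in the series file. The function name new_dep3_patch and its arguments are hypothetical, and the header fields shown are only a common subset of what DEP-3 allows. The entry is appended to the end of series, so moving it into the Wikimedia-specific section remains a manual step.

```shell
# Hypothetical helper (not part of the packaging repository).
# Usage: new_dep3_patch <package-root> <patch-name> <description> <author>
new_dep3_patch() {
    root=$1
    name=$2
    desc=$3
    author=$4
    mkdir -p "$root/debian/patches"
    # Write a minimal DEP-3 header; the actual diff goes after the "---" line.
    cat > "$root/debian/patches/$name" <<EOF
Description: $desc
Author: $author
Forwarded: no
Last-Update: $(date +%Y-%m-%d)
---
EOF
    # Assumption: appending is good enough for a first draft; the entry
    # still has to be moved by hand into the Wikimedia-specific section.
    echo "$name" >> "$root/debian/patches/series"
}
```

For example, new_dep3_patch . fix-build.patch "Fix the build" "Jane Doe <jane@example.org>" would create debian/patches/fix-build.patch under the current directory; the patch name and author here are placeholders.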

Upgrade HHVM to a new upstream version

Upgrading our packages to a new HHVM version is usually a long and painful process if you're moving between major versions (for example 3.3.x => 3.6.x), and much less so if you're moving between minor versions. The formal process to follow, however, is exactly the same in both cases.

Clone both our repository and Facebook's HHVM repository.
In the Facebook repository, don't forget to update the submodules recursively.
Look inside hhvm/third-party, which contains a lot of third-party software. Most of it is already packaged in Debian and should not be used during compilation. Remove everything that's not needed from there (see our own repository for reference).
Create a tar archive (excluding the .git directories) of the upstream repository at the tag you want, and name it hhvm-orig_<upstream_version>+dfsg1.
Use git import-orig to import it into the repository. I'd advise against merging automatically, as it can get ugly.
Try to build the package; rinse and repeat. If you test the build anywhere other than on a machine with pbuilder, do a fresh clone before compiling: the HHVM build process is messy and modifies or clobbers a lot of files. After one compilation, throw that clone away.
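
The tarball step above could be sketched roughly as follows. make_orig_tarball is a hypothetical helper, the .tar.gz extension is an assumption (the naming scheme above does not specify one), and the commented commands use placeholders for the repository URL and tag.

```shell
# Preparation, shown as comments because URL and tag are placeholders:
#   git clone <facebook-hhvm-repository-url> hhvm
#   cd hhvm && git checkout <tag> && git submodule update --init --recursive
#   (then prune hhvm/third-party as described above)

# Hypothetical helper: pack an upstream checkout without any .git entries.
# Usage: make_orig_tarball <upstream-checkout> <upstream-version> <output-dir>
make_orig_tarball() {
    src=$1
    version=$2
    out_dir=$3
    # Assumption: gzip compression and a .tar.gz suffix.
    tarball="$out_dir/hhvm-orig_${version}+dfsg1.tar.gz"
    # --exclude=.git skips the top-level .git directory as well as the
    # .git entries of every submodule below it.
    tar --exclude=.git -czf "$tarball" \
        -C "$(dirname "$src")" "$(basename "$src")"
    # Print the path so callers can pass it on to the import step.
    echo "$tarball"
}

# Afterwards, from our packaging repository:
#   git import-orig <path-to-tarball>
```

Writing the tarball to a directory outside the checkout keeps the archive from ending up inside itself.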

Rebuild the packages

Once you have the repository set up and have created your patch, building a new package is fairly easy:

On any server that has the Puppet role package::builder (at the moment, that is deneb in production), you can simply clone the repository, check out both the upstream and master branches, and build your package:

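# Make sure both branches exist locally; the build runs from master
git checkout upstream
git checkout master
# Build an unsigned package (-us -uc) for jessie inside pbuilder
GIT_PBUILDER_AUTOCONF=no DIST=jessie WIKIMEDIA=yes git-buildpackage -us -uc --git-builder=git-pbuilder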

Once that is done, upload the package to Reprepro and start the deployment. Typically, this proceeds as follows:

Deploy to the beta cluster (deployment-prep)
After 2-3 business days, deploy to production, but only on the canary appservers and API appservers
After another 2-3 business days, deploy cluster-wide

Of course, check the logs and metrics carefully, including crashes, and most importantly log everything you do in the SAL.

Remember to clear HHVM's bytecode repository (typically /var/cache/hhvm/fcgi.hhbc.sq3) when upgrading packages. There is no automatic pruning mechanism, and the file can easily become very large. This is especially an issue on Cloud VPS instances, where disk space is scarcer.
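
A minimal sketch of that cleanup, assuming HHVM is not running while the file is removed (the text above does not say how the service is stopped and restarted, so that part is left out). The function name clear_hhbc_repo is hypothetical; the default path is the typical location mentioned above.

```shell
# Hypothetical helper: report the size of the bytecode repository, then
# delete it.
# Usage: clear_hhbc_repo [path-to-repository-file]
clear_hhbc_repo() {
    repo=${1:-/var/cache/hhvm/fcgi.hhbc.sq3}
    if [ -f "$repo" ]; then
        # Show how much space is about to be reclaimed.
        du -h "$repo"
        # Assumption: HHVM is stopped at this point and recreates the
        # file on its next start.
        rm -f -- "$repo"
    else
        echo "no bytecode repository at $repo" >&2
    fi
}
```

Called without arguments, the function acts on the default path; passing another path is useful on hosts where the repository lives elsewhere.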

Troubleshooting

HHVM/Troubleshooting