User:JMeybohm/Docker-Registry-P2P

From Wikitech

Simple evaluation of Kraken and Dragonfly as docker-registry P2P layers

| Category | Criterion | Weight | Dragonfly | Kraken | Notes |
|---|---|---|---|---|---|
| Basic operations | Ease of installation/setup | 3 | 7 | 6 | Dragonfly needs N supernodes (TBD) per DC as well as a client (dfget and dfdaemon) on each of the k8s nodes. Kraken, I think (the docs are bad), needs at least 3 components: "origins" (seeders), probably running on each docker-registry node, N trackers per DC, and an agent on every k8s node. |
| | Ease of client configuration | 5 | 8 | 6 | IIUC Kraken needs to be used as the docker registry, e.g. image tags would need to change (to kraken-registry.w.wmnet), which then points to localhost (for Kubernetes nodes). Dragonfly can be configured to act as an HTTP(S) proxy which docker can then use, making the change in registry transparent to the user/infra. |
| | Resource consumption | 9 | - | - | I have no real idea for either of them. I feel like Dragonfly will need some more (dedicated) resources, as supernodes do quite some coordination and overall network performance is bound to their QPS. |
| | HA capabilities | 5 | 7 | 10 | Dragonfly supernodes are not HA. Clients can be configured with multiple supernodes, but the supernodes don't know about each other, which leads to degraded performance (clients connect to a random supernode, which might not have other relevant clients connected). Kraken is fully distributed and does not rely on a supernode; multiple trackers should be possible (no definitive answer). Multiple seeders are also possible (they could run on every docker registry). |
| | Monitoring | 6 | 10 | 6 | Native Prometheus metrics in Dragonfly. Kraken uses github.com/uber-go/tally for metrics, which can produce Prometheus-compatible output, but this does not seem to be enabled in Kraken: https://github.com/uber/kraken/pull/168 |
| | Distribution/Resilience | 7 | - | - | Dragonfly relies on its supernode(s) to coordinate data transfer; the supernodes also act as caches (seeders) for the cluster. With Kraken, transfer happens between clients, and tracker(s) are used to orchestrate that. Dedicated seeders can be used to speed up distribution. |
| Politics | License | 5 | 6 | 6 | Apache 2.0 for both. |
| | Backing/support | 7 | 6 | 5 | Dragonfly commits come mostly from 4 Alibaba employees, Kraken commits from two Uber employees. It looks like Kraken has been around for longer (main development 2017-2019); Dragonfly 2018-2020, probably due to a complete rewrite from Java in Go. In raw numbers, Dragonfly has more than twice as many contributors as Kraken. |
| | Adoption | 7 | 6 | 5 | Dragonfly lists a bunch of, mainly Chinese, companies using it. Kraken does not, and things like Google Trends don't work very well with either of the project names. Dragonfly was accepted to the CNCF incubator in April 2020 and has been in the sandbox since October 2018. |
| | Community | 4 | 7 | 5 | Hard to tell. The open/close ratios for GitHub issues and pull requests are in favour of Dragonfly (but it has been open source for longer). Kraken has very limited documentation (a bunch of md files), whilst Dragonfly even has docs on its webpage (which are quite outdated, so don't go there, see git). Kraken also seems a bit abandoned; some pull requests are totally unhandled, for example. |
| TOTAL | | | 298 | 254 | |
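
The TOTAL row appears to be the sum of weight times score over the criteria that have scores for both tools (Resource consumption and Distribution/Resilience have no scores, so they are left out). A minimal Python sketch that reproduces the totals, with weights and scores copied from the table above; the names `SCORES` and `weighted_total` are made up for illustration:

```python
# (weight, dragonfly score, kraken score) per scored criterion, copied from the table.
# Resource consumption and Distribution/Resilience carry no scores and are omitted.
SCORES = {
    "Ease of installation/setup": (3, 7, 6),
    "Ease of client configuration": (5, 8, 6),
    "HA capabilities": (5, 7, 10),
    "Monitoring": (6, 10, 6),
    "License": (5, 6, 6),
    "Backing/support": (7, 6, 5),
    "Adoption": (7, 6, 5),
    "Community": (4, 7, 5),
}


def weighted_total(scores, column):
    """Sum weight * score; column 1 is dragonfly, column 2 is kraken."""
    return sum(row[0] * row[column] for row in scores.values())


print("dragonfly:", weighted_total(SCORES, 1))  # 298
print("kraken:", weighted_total(SCORES, 2))     # 254
```

Changing a weight or filling in one of the unscored rows only requires editing the `SCORES` mapping, which makes it easy to see how sensitive the ranking is to a single criterion.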

Kraken as a docker-registry:

  • Automatic replication across different storage backends (via the P2P network)
  • Deletion of images is not possible
  • Mutating tags are not really supported (no persistence guarantee for example)
  • Swift is not currently supported as a storage backend
  • Trackers use Redis as peer store (no longer true, see https://github.com/uber/kraken/pull/270)
  • There is also nginx in the mix, probably for TLS termination of the HTTP APIs
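
Because Kraken would have to be addressed as a registry in its own right (see the client-configuration note in the table), image references would need their registry host swapped for kraken-registry.w.wmnet. A rough sketch of that rewrite in Python, assuming every reference starts with an explicit registry host; the source host `docker-registry.example`, the image names and the function name are placeholders, not taken from any real setup:

```python
KRAKEN_HOST = "kraken-registry.w.wmnet"  # name mentioned in the table notes
SOURCE_HOST = "docker-registry.example"  # hypothetical current registry host


def to_kraken_ref(image_ref, source_host=SOURCE_HOST, kraken_host=KRAKEN_HOST):
    """Swap the registry host of an image reference, leaving repo and tag alone."""
    host, sep, rest = image_ref.partition("/")
    if not sep or host != source_host:
        # Not served by the registry we would front with Kraken: keep it unchanged.
        return image_ref
    return f"{kraken_host}/{rest}"


# Hypothetical image names, for illustration only.
print(to_kraken_ref("docker-registry.example/wikimedia/mediawiki:1.2.3"))
print(to_kraken_ref("docker.io/library/nginx:latest"))
```

With the Dragonfly proxy approach no such rewrite would be needed, which is the reason it scores higher on client configuration.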


Dragonfly