Event Platform/Producer Requirements

From Wikitech

To ensure that Event Platform event streams integrate automatically with all consumers and downstream data systems, producers must ensure that events satisfy specific requirements before producing them. If you are using a supported WMF Event Platform producer service or library, these requirements should already be satisfied. However, if you are working in a language or area that does not have an Event Platform client library, then you will be producing events directly to Kafka yourself.

This page describes the requirements that any Event Platform producer should satisfy.

Requirements

Events

Producer libraries

A producer will need to interact with, and look up data in, Event Stream Configuration and the WMF Schema Repositories. The URI locations from which to look up event stream configuration and event schemas should be configurable. For example, a user should be able to give your producer library a local schema repository base path for development.
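One way to keep these locations configurable is sketched below. This is a minimal illustration, not an existing Event Platform API: the ProducerConfig class, its field names, the candidate_schema_urls helper, and the example paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProducerConfig:
    """Hypothetical settings object; the field names are illustrative only."""

    # Base URIs searched, in order, when resolving an event's $schema URI.
    schema_base_uris: list[str] = field(default_factory=list)
    # Location from which event stream configuration is loaded.
    stream_config_uri: str = ""


def candidate_schema_urls(schema_uri: str, config: ProducerConfig) -> list[str]:
    """Join a relative $schema URI onto each configured base URI."""
    return [
        base.rstrip("/") + "/" + schema_uri.lstrip("/")
        for base in config.schema_base_uris
    ]


# A developer points the library at a local checkout (assumed path) for development.
dev_config = ProducerConfig(
    schema_base_uris=["file:///home/dev/schemas/jsonschema"],
    stream_config_uri="file:///home/dev/stream_config.json",
)
urls = candidate_schema_urls("/example/event/1.0.0", dev_config)
```

Because the base URIs are a list, the same code path can serve a local file path during development and remote repositories in production.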

All generic producer clients and libraries:

  • MUST look up the event's schema via its $schema URI field in the configured base schema repository URIs.
  • MUST look for any event stream configuration defined for the stream the event will be produced to. The stream the event will be produced to should be in the event's meta.stream field.
  • MUST ensure the event is allowed in its destination stream, as specified in the meta.stream field. This can be determined by checking that the stream's configured schema_title matches the title of the event's schema.
  • MUST ensure that the event has a dt field set specifying its ISO-8601 event time.
  • SHOULD set the meta.dt field to the ISO-8601 time at which the library ingested the event (its 'system ingestion time').
  • MUST ensure the event is valid according to the schema it declares in its $schema field.
  • SHOULD use the message_key_fields event stream config setting to hoist event field values into a JSON object, and set this object as the Kafka message key. (Example EventGate JavaScript code that does this).
  • MAY automatically add datacenter prefixes to the stream name to make the destination Kafka topic name.
  • MAY produce to the datacenter-specific Kafka topic.
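The checks in the list above can be sketched as a single function. This is an illustrative sketch under stated assumptions, not the behaviour of any existing client: prepare_event and lookup_path are hypothetical names, plain dicts stand in for the schema repository and stream config lookups, message_key_fields is assumed to map key names to dotted event field paths, the datacenter prefix is assumed to be joined with a dot, and a required-fields check stands in for full JSON Schema validation.

```python
import json
from datetime import datetime, timezone


def lookup_path(event, dotted_path):
    """Follow a dotted path such as 'page.page_id' into nested dicts."""
    value = event
    for part in dotted_path.split("."):
        value = value[part]
    return value


def prepare_event(event, stream_configs, schemas, datacenter=None):
    """Check one event and return (topic, key, value) ready for Kafka."""
    # Look up the schema named by the event's $schema URI.
    schema = schemas.get(event.get("$schema"))
    if schema is None:
        raise ValueError("no schema found for the event's $schema URI")

    # Look up the configuration of the stream named in meta.stream.
    stream = event.get("meta", {}).get("stream")
    stream_config = stream_configs.get(stream)
    if stream_config is None:
        raise ValueError(f"no stream config found for stream {stream!r}")

    # The event is allowed only if the schema titles match.
    if stream_config["schema_title"] != schema["title"]:
        raise ValueError("event schema is not allowed in this stream")

    # dt (event time) must already be set; meta.dt is set here.
    if "dt" not in event:
        raise ValueError("event is missing its dt event time")
    now = datetime.now(timezone.utc)
    event["meta"]["dt"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")

    # Stand-in for validation: a real producer would run a full
    # JSON Schema validator against the looked-up schema here.
    missing = [name for name in schema.get("required", []) if name not in event]
    if missing:
        raise ValueError(f"event is missing required fields: {missing}")

    # Hoist the configured fields into a JSON object for the message key.
    key = None
    key_fields = stream_config.get("message_key_fields")
    if key_fields:
        key_object = {name: lookup_path(event, path) for name, path in key_fields.items()}
        key = json.dumps(key_object, sort_keys=True)

    # Optionally prefix the stream name with the datacenter (assumed "." separator).
    topic = f"{datacenter}.{stream}" if datacenter else stream
    return topic, key, json.dumps(event)


# Usage with made-up schema, stream, and datacenter names.
schemas = {"/example/event/1.0.0": {"title": "example/event", "required": ["$schema", "meta", "dt"]}}
stream_configs = {
    "example.stream": {
        "schema_title": "example/event",
        "message_key_fields": {"page_id": "page.page_id"},
    }
}
event = {
    "$schema": "/example/event/1.0.0",
    "meta": {"stream": "example.stream"},
    "dt": "2023-05-01T00:00:00Z",
    "page": {"page_id": 42},
}
topic, key, value = prepare_event(event, stream_configs, schemas, datacenter="example_dc")
```

The returned topic, key, and value would then be handed to whichever Kafka client the producer uses. Serializing the key with sorted field names keeps it stable for a given set of key values, which matters when consumers rely on key-based partitioning or compaction.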

Supported Event Platform producer clients

  • EventGate - An HTTP producer proxy. At WMF, it accepts events over HTTP and produces them to Kafka.
  • wikimedia-event-utilities Java library - A Java library for working with Event Platform events and streams. Has APIs for preparing events for production to Kafka.
  • eventutilities-python - A Python library mostly (as of 2023-05) for building PyFlink + Event Platform applications.

As new clients and libraries are implemented, please link to them here.