Event Platform/Event Utilities
Wikimedia Event Utilities are code libraries for interacting with Wikimedia's Event Platform. They cover Stream Config, schemas, and producing events to Kafka, and they include stream processing abstractions that aid in building and deploying stream processing applications.
Event Utilities libraries that are used to produce events must follow Event Platform/Producer Requirements.
This library has logic and abstractions for working with streams and schemas using configured Stream Config endpoints and schema repositories / schema services.
- EventStreamFactory - Uses Stream Configuration and Schemas to build EventStreams.
- EventStream - Helps with getting information (schemas, config, etc.) about an Event Platform stream.
- JsonEventGenerator - Creates Jackson JSON ObjectNodes that are suitable for producing to an Event Platform stream following Event Platform/Producer Requirements.
This library has Flink + Event Platform integration abstractions. It contains converters to convert from Event Platform event JSONSchemas to Flink types, and classes that help instantiate Flink DataStreams or Tables representing Event Platform streams.
The main entrypoints here are:
- EventDataStreamFactory - helps instantiate Flink DataStreams
- EventTableDescriptorBuilder - helps instantiate Flink Tables
- EventStreamCatalog - Flink SQL Catalog + Event Platform stream integration. See the Event Flink Catalog docs on Wikitech.
A Python library that aims to wrap the eventutilities Java libraries. As of 2023-05, most of the work here is done to enable and ease building simple PyFlink applications.
This library is built and deployed to its GitLab package registry using GitLab CI. To install a release via pip, you must add the GitLab pip package registry as a pip package index. This is most easily done using the --extra-index-url pip option:
pip install eventutilities-python --extra-index-url https://gitlab.wikimedia.org/api/v4/projects/1014/packages/pypi/simple
Or, in a requirements.txt file, e.g:
--extra-index-url https://gitlab.wikimedia.org/api/v4/projects/1014/packages/pypi/simple
eventutilities-python==0.8.0
Installing eventutilities-python via a pip wheel like this will include some Java dependency jars in its 'lib/' path. When using the eventutilities-python Flink helpers, these dependencies will automatically be added to the Flink environment.
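To check which jars a given install actually shipped, one option is to inspect the file list that pip records for the wheel. The following is a minimal sketch using only the Python standard library; the helper name bundled_jars is hypothetical, and it assumes the distribution name matches the pip package name (eventutilities-python) and that the jars appear in the installed file record.

```python
from importlib import metadata


def bundled_jars(dist_name="eventutilities-python"):
    """Return the recorded paths of .jar files in an installed distribution.

    Hypothetical helper for illustration; not part of eventutilities-python.
    """
    try:
        files = metadata.files(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    # metadata.files() returns None when no file record is available.
    return sorted(str(f) for f in (files or []) if f.name.endswith(".jar"))


if __name__ == "__main__":
    for jar in bundled_jars():
        print(jar)
```

An empty result means either that the package is not installed in the active environment or that no jars were recorded for it.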
A Flink distribution must be provided manually. There are several ways to do this, but the easiest is to install PyFlink via pip:
pip install apache-flink
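After both installs, a quick sanity check is to confirm that the two distributions are visible in the active Python environment. This is a sketch using the standard library only; the helper name installed_versions is hypothetical, and the distribution names are taken from the pip commands above.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names as used in the pip commands above.
    found = installed_versions(["eventutilities-python", "apache-flink"])
    for name, version in found.items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If either entry is reported as not installed, re-run the corresponding pip command in the same virtual environment that the PyFlink application will use.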