How to manage data automatically with custom Backbeat extensions

Backbeat, a key Zenko microservice, dispatches work to long-running background tasks. Backbeat uses Apache Kafka, the popular open-source distributed streaming platform, for scalability and high availability. This gives Zenko features such as:

Asynchronous multi-site replication
Lifecycle policies
Metadata ingestion (supporting Scality RING today, with other backends coming soon)

Written By Dasha Gurova

On February 12, 2019


As with the rest of the Zenko stack, Backbeat is an open-source project, with code organized to let you use extensions to add features. Using extensions, you can create rules to manipulate objects based on metadata logs. For example, an extension can recognize music files by artist and move them into buckets named after the artist. Or an extension can automatically move objects to separate buckets, based on data type (zip, jpeg, text, etc.) or on the owner of the object.

All Backbeat interactions go through CloudServer, which means they are not restricted to one backend and you can reuse existing solutions for different backends.

The Backbeat service publishes a stream of bucket and object metadata updates to Kafka. Each extension applies its own filters to the metadata stream, picking only metadata that meets its filter criteria. Each extension has its own Kafka consumers that consume and process metadata entries as defined.

To help you develop new extensions, we’ve added a basic extension called “helloWorld.” This extension filters the metadata stream to select only objects whose key name is “helloworld” (case insensitive). When processing each metadata entry, it applies a basic AWS S3 putObjectTagging call, where the tag key is “hello” and the value is “world.”

This example extension shows:

How to add your own extension using the existing metadata stream from a Zenko 1.0 deployment
How to add your own filters for your extension
How to add a queue processor to subscribe to and consume from a Kafka topic

There are two kinds of Backbeat extensions: populators and processors. The populator receives all the metadata logs, filters them as needed, and publishes them to Kafka. The processor subscribes to the extension’s Kafka topic, thus receiving these filtered metadata log entries from the populator. The processor then applies any required changes (in our case, adding object tags to all “helloworld” object keys).
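The division of labor between the two halves can be sketched without Kafka at all. In the minimal sketch below, a plain array stands in for the extension’s Kafka topic. The entry shape (`bucket`, `key`), the function names, and the choice of an exact key match are assumptions made for illustration, not Backbeat’s actual API.

```javascript
// A plain array stands in for the extension's Kafka topic (assumption:
// real Backbeat publishes to and consumes from Kafka).
const topic = [];

// Populator side: sees every metadata log entry, keeps only the ones
// whose object key is "helloworld" (case insensitive), and publishes them.
function populate(logEntries) {
    for (const entry of logEntries) {
        if (entry.key.toLowerCase() === 'helloworld') {
            topic.push(JSON.stringify(entry));
        }
    }
}

// Processor side: consumes the filtered entries and applies the change,
// here by recording the tag it would put on each object.
function processTopic() {
    const tagged = [];
    while (topic.length > 0) {
        const entry = JSON.parse(topic.shift());
        tagged.push({
            bucket: entry.bucket,
            key: entry.key,
            tags: { hello: 'world' },
        });
    }
    return tagged;
}

populate([
    { bucket: 'demo', key: 'HelloWorld' },
    { bucket: 'demo', key: 'photo.jpeg' },
]);
const result = processTopic();
console.log(result.length); // 1
```

Only the populator needs to see the full metadata stream. The processor never sees entries that failed the filter, which keeps its topic small.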


Example

Begin by working on the populator side of the extension. Within Backbeat, add all the configs needed to set up a new helloWorld extension, following the examples in this commit. These configurations are placeholders. Zenko will overwrite them with its own values, as you’ll see in later commits.

Every extension must have an index.js file in its extension directory (“helloWorld/” in the present example). This file must contain the extension’s definitions in its name, version, and configValidator fields. The index.js file is the entry point for the main populator process to load the extension.
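As a rough illustration, an index.js with these three fields might look like the sketch below. The field names come from the description above. The validator’s signature, its return value, and the version string are assumptions, so check them against the extensions that ship with Backbeat.

```javascript
// Hypothetical helloWorld/index.js. The validator's arguments and return
// value are assumptions made for this sketch.
function helloWorldConfigValidator(backbeatConfig, extConfig) {
    if (typeof extConfig.topic !== 'string' || extConfig.topic === '') {
        throw new Error('helloWorld extension requires a "topic" string');
    }
    return extConfig;
}

const helloWorldExtension = {
    name: 'helloWorld',
    version: '1.0.0', // placeholder version
    configValidator: helloWorldConfigValidator,
};

// The populator process loads the extension through this export.
module.exports = helloWorldExtension;
```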

Add filters for the helloWorld extension by creating a new class that extends the existing architecture defined by the QueuePopulatorExtension class. It is important to add this new filter class to the index.js definition as “queuePopulatorExtension”.
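The sketch below shows the general shape of such a filter class. To keep it runnable on its own, it defines a small stand-in for QueuePopulatorExtension. The `filter` and `publish` method names, the constructor parameters, and the entry fields are assumptions, and the class name HelloWorldQueuePopulator is hypothetical.

```javascript
// Stand-in for Backbeat's QueuePopulatorExtension so the sketch runs on
// its own. Its constructor and publish() method are assumptions.
class QueuePopulatorExtension {
    constructor(params) {
        this.extConfig = params.config;
        this.published = [];
    }

    // The real base class would hand the message to Kafka; this one
    // only records it.
    publish(topic, key, message) {
        this.published.push({ topic, key, message });
    }
}

class HelloWorldQueuePopulator extends QueuePopulatorExtension {
    // Called once per metadata log entry; publishes only matching keys.
    filter(entry) {
        if (typeof entry.key !== 'string') {
            return;
        }
        if (entry.key.toLowerCase() !== 'helloworld') {
            return;
        }
        this.publish(
            this.extConfig.topic,
            `${entry.bucket}/${entry.key}`,
            JSON.stringify(entry));
    }
}

const populator = new HelloWorldQueuePopulator({
    config: { topic: 'backbeat-hello-world' },
});
populator.filter({ bucket: 'demo', key: 'HELLOWORLD' });
populator.filter({ bucket: 'demo', key: 'notes.txt' });
console.log(populator.published.length); // 1
```

In the real extension, this class would then be referenced from index.js under the “queuePopulatorExtension” field.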

On the processor side of the extension, you need to create service accounts in Zenko to be used as clients to complete specific S3 API calls. In the HelloWorldProcessor class, this._serviceAuth is the credential set we pass from Zenko to Backbeat to help us perform the putObjectTagging S3 operation. For this demo, borrow the existing replication service account credentials.
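The tagging request itself is small. The following sketch builds the parameters for an S3 PutObjectTagging call and sends them through whatever client is passed in. The parameter names (`Bucket`, `Key`, `Tagging.TagSet`) follow the AWS S3 API, while the helper names and the entry shape are hypothetical. A stub client is used so the example runs without credentials or a network.

```javascript
// Builds the request for the S3 PutObjectTagging call.
function buildTaggingParams(entry) {
    return {
        Bucket: entry.bucket,
        Key: entry.key,
        Tagging: {
            TagSet: [{ Key: 'hello', Value: 'world' }],
        },
    };
}

// `s3Client` is any object exposing putObjectTagging(params, callback),
// such as an AWS SDK S3 client created with the service account's
// credentials. Passing it in keeps the sketch testable.
function tagHelloWorld(s3Client, entry, done) {
    s3Client.putObjectTagging(buildTaggingParams(entry), done);
}

// Stub client that records the request instead of sending it.
const calls = [];
const stubClient = {
    putObjectTagging(params, callback) {
        calls.push(params);
        callback(null, {});
    },
};

tagHelloWorld(stubClient, { bucket: 'demo', key: 'helloworld' }, err => {
    console.log(err === null ? 'tagged' : 'failed'); // tagged
});
```

Note that PutObjectTagging replaces the object’s entire tag set, so a processor that must preserve existing tags would need to read and merge them first.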

Create an entry point for the new extension’s processor by adding a new script in the package.json file. This part may be a little tricky, but the loadManagementDatabase method helps keep Backbeat extensions in sync with the latest changes in the Zenko environment, including config changes and service account information updates.

Instantiate the new extension processor class and finish the setup of the class by calling the start method, defined here.

Update the docker-entrypoint.sh file. These variables point to specific fields in the config.json file. For example, “.extensions.helloWorld.topic” points to the config.json value currently defined as “topic”: “backbeat-hello-world”.

These variables (for example, EXTENSION_HELLOWORLD_TOPIC) are set when Zenko is upgraded or deployed as a new Kubernetes pod, which updates these config.json values in Backbeat.
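The actual docker-entrypoint.sh is a shell script, but the idea behind it can be illustrated in a few lines of JavaScript: each environment variable is paired with a dotted path into config.json, and any variable that is set overwrites the placeholder value at that path. The helper names and the replacement topic value below are hypothetical.

```javascript
// Sets a nested field addressed by a dotted path such as
// ".extensions.helloWorld.topic", creating intermediate objects as needed.
function setByPath(config, path, value) {
    const parts = path.split('.').filter(part => part !== '');
    let node = config;
    for (const part of parts.slice(0, -1)) {
        if (typeof node[part] !== 'object' || node[part] === null) {
            node[part] = {};
        }
        node = node[part];
    }
    node[parts[parts.length - 1]] = value;
    return config;
}

// Applies every mapped environment variable that is actually set.
function applyEnv(config, env, mapping) {
    for (const [varName, path] of Object.entries(mapping)) {
        if (env[varName] !== undefined) {
            setByPath(config, path, env[varName]);
        }
    }
    return config;
}

// Placeholder config, as committed in Backbeat.
const config = {
    extensions: { helloWorld: { topic: 'backbeat-hello-world' } },
};
const mapping = {
    EXTENSION_HELLOWORLD_TOPIC: '.extensions.helloWorld.topic',
};

// Hypothetical value injected by the deployment.
applyEnv(config, { EXTENSION_HELLOWORLD_TOPIC: 'zenko-hello-world' }, mapping);
console.log(config.extensions.helloWorld.topic); // zenko-hello-world
```

Variables that are not set leave the placeholder untouched, which is why the defaults committed in Backbeat still matter for local runs.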

Finally, add the new extension to Zenko. You can see the variables defined by the Backbeat docker-entrypoint.sh file in these Zenko changes.

Some of the required environment variables are not obvious, because they do not appear in our extension configs, but they are necessary for running some of Backbeat’s internal processes. Also, because this demo borrows the replication service account, its variables (EXTENSIONS_REPLICATION_SOURCE_AUTH_TYPE, EXTENSIONS_REPLICATION_SOURCE_AUTH_ACCOUNT) must be defined as well.

Upgrade the existing Zenko deployment with:

$ helm upgrade --set ingress.enabled=true --set backbeat.helloworld.enabled=true zenko zenko

In this command, “zenko” is the name of the Kubernetes deployment. You must also update the “backbeat” Docker image with the new extension changes.

With the Helm upgrade, you’ve added a new Backbeat extension! Now, whenever you create an object with the key name “helloworld” (case insensitive), Backbeat automatically tags the object with a “hello” key and a “world” value.

Have any questions or comments? Please let us know on our forum. We would love to hear from you.

Photo by Jan Antonin Kolar on Unsplash
