SMILA/Documentation/HowTo/How to write a Worker


This HowTo describes the necessary steps for writing a worker in SMILA.

Preconditions

  • Set up your development environment, see How to set up the development environment.
  • You should have read and understood the documentation about the JobManager, especially the configuration of workers and workflows if you want to create new workers.
  • You should have at least an idea about the OSGi framework and OSGi services. For links to introductory articles and tutorials see [1]. For a quite comprehensive overview on OSGi see [2].

Project templates

Before writing your own worker, we recommend that you take a look at the sample workers. You can get them by importing the bundles from the examples directory of SMILA's repository into your workspace and using them as templates.

You can also download these examples from the release downloads or the nightly build downloads. (SMILA-...-integrator-examples.zip)

If you do not want to use the example bundles as templates, you can also start with a new bundle by following the two HowTos:

Now you have two (completely empty) bundles, one to develop your worker(s), and one to test it/them.

Hint: The following steps apply only when you use the example bundles, not if you start with fresh bundles!

Integration

We now need to integrate our new workers in the SMILA application. Do the following steps to enable this:

Adding the worker description

The workers' descriptions are read by the jobmanager on startup. A worker that does not provide a description will not get any tasks, so you have to provide one.

  • Edit the workers.json file in SMILA.application/configuration/org.eclipse.smila.jobmanager and add the following worker description to the json array (don't forget the comma):
{ 
  "name": "HelloWorldWorker",
  "input": [ 
    {   "name": "inputRecords",
        "type": "recordBulks"
    } ],
  "output": [ 
    {   "name": "outputRecords",
        "type": "recordBulks"
    } ]
}

Adding the bundles to the configuration

Next we need to make sure the bundle is started.

config.ini

To start the bundle in the built application, add the following line to SMILA.application/configuration/config.ini as the second-to-last line:

  • org.eclipse.smila.integration.worker@4:start, \

(In fact, it does not matter where exactly you add your bundle in the file, as long as the syntax is correct: the line end must be escaped on every line except the last one.)

launcher

You also have to adapt your launcher:

  • Click on Run configurations...
  • Select the OSGi Framework-->SMILA configuration
  • In the Bundles page, check the box before org.eclipse.smila.integration.worker, leave Start Level on default, set Auto-Start to true.
  • Click Apply

Scale up

Finally you should add the scale up limits (see ScaleUp) to the cluster configuration file (if you use the standard simple clusterconfig service, you will find the configuration file as org.eclipse.smila.clusterconfig.simple/clusterconfig.json).

For example, add the following snippet to the existing entries in the workers map to limit the scale-up of the worker to a maximum number of concurrent tasks (make sure that the worker name is the same as in workers.json). If you do not add a scale-up entry for your worker here, the worker is limited to one concurrent task.

Example to limit the worker HelloWorldWorker to a maximum of 4 concurrent tasks:

    "HelloWorldWorker":{
      "maxScaleUp":4
    },

Running

You should now test your workspace setup to make sure that everything works with the prepared examples.

Run the application

  • Select "Run" -> "Run Configurations" or "Debug Configurations"
  • Select "OSGi Frameworks" -> "SMILA".
  • Click "Run" or "Debug" and SMILA should start just like when started from the command line.

When starting the SMILA.launch in eclipse, you should see something like the following output in the console window:

...
Added worker HelloWorldWorker to WorkerManager.
...

You should also be able to read the worker definition using the jobmanager HTTP API now: Go to http://localhost:8080/smila/jobmanager/workers/ to see something like this:

{
  "workers" : [ ...,
     {
       "name" : "HelloWorldWorker",
       "url" : "http://localhost:8080/smila/jobmanager/workers/HelloWorldWorker/"
     }, ... ] 
}

You can now click on the link to the worker description and you should see the description of the HelloWorldWorker:

{
  "name" : "HelloWorldWorker",
  "input" : [ {
    "name" : "inputRecords",
    "type" : "recordBulks"
  } ],
  "output" : [ {
    "name" : "outputRecords",
    "type" : "recordBulks"
  } ]
}

Run the test case

To run the JUnit test case for the HelloWorldWorker

  • Stop the SMILA.launch if it is running.
  • Select "Run" -> "Run Configurations".
  • Select "JUnit Plugin Test" -> "TestHelloWorldWorker".
  • Click "Run".
  • You should find the following message in the "Console" view:
TestHelloWorldWorker: Value of attribute 'greeting' = 'HelloWorldWorker was here :-)'

This shows that the HelloWorldWorker has done something. Of course, the test also contains an assertion, so it will fail if the attribute does not have the expected value.

Create your own worker

Use template

The easiest way to create a new worker is by implementing it in the bundle org.eclipse.smila.integration.worker (see Project Templates). There you can just place your new worker beside the HelloWorldWorker example worker, or replace it. Things you have to do when renaming the bundle/package or creating your own worker bundle are described later on.

Bundle dependencies

The dependencies of the bundle are managed by the OSGi framework and have to be configured explicitly in the MANIFEST.MF file so that the OSGi framework can resolve them (in the correct versions) when the services are started.

To create a worker that reads and writes Records, we need at least the following bundles imported as packages (see META-INF -> "Dependencies" -> "Imported Packages"):

  • org.eclipse.smila.datamodel: For the Record class.
  • org.eclipse.smila.objectstore: Possible exceptions when accessing input/output streams.
  • org.eclipse.smila.taskworker: The TaskWorker bundle containing the Worker and TaskContext interfaces.
  • org.eclipse.smila.taskworker.input: Input streams of the TaskWorker bundle.
  • org.eclipse.smila.taskworker.output: Output streams of the TaskWorker bundle.

This is already configured. If access to other packages is needed, just extend the MANIFEST.MF file in section "Imported Packages" accordingly.

Worker Implementation Java Class

Create a worker class which implements org.eclipse.smila.taskworker.Worker. Have a look at the example worker org.eclipse.smila.integration.worker.HelloWorldWorker that comes with the SDK in the org.eclipse.smila.integration.worker bundle. You must implement two methods:

  • getName() must return a unique name for your worker. Exactly the same name (case sensitive) must be used later in the worker descriptions and workflow definitions.
  • perform() does the actual work. It is called with a TaskContext object that provides access to the task properties, input and output objects, and counters.

Thread safety: Make sure that your worker implementation is thread-safe! Otherwise you cannot use it for scale-up.
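The following is a minimal, self-contained sketch of the two methods and of a thread-safe (stateless) implementation. It does not use the real org.eclipse.smila.taskworker interfaces: SketchWorker and SketchTaskContext are hypothetical, simplified stand-ins introduced only for illustration, and records are modeled as plain maps. For the real API usage, refer to the HelloWorldWorker example class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the SMILA Worker interface (assumption).
interface SketchWorker {
  String getName();

  void perform(SketchTaskContext context) throws Exception;
}

// Hypothetical stand-in for a task context: input and output records of one task.
final class SketchTaskContext {
  final List<Map<String, String>> inputRecords = new ArrayList<>();
  final List<Map<String, String>> outputRecords = new ArrayList<>();
}

// The worker has no mutable fields. All per-task state lives in local variables
// or in the context, so several threads can call perform() on the same instance
// as long as each call receives its own context.
final class GreetingWorkerSketch implements SketchWorker {
  static final String NAME = "HelloWorldWorker";

  @Override
  public String getName() {
    // Must match the name used in workers.json and in the workflow definition.
    return NAME;
  }

  @Override
  public void perform(final SketchTaskContext context) {
    for (final Map<String, String> record : context.inputRecords) {
      // Copy the record, add one attribute, and write it to the output.
      final Map<String, String> copy = new HashMap<>(record);
      copy.put("greeting", NAME + " was here :-)");
      context.outputRecords.add(copy);
    }
  }
}

class WorkerSketchDemo {
  public static void main(final String[] args) {
    final SketchTaskContext context = new SketchTaskContext();
    final Map<String, String> record = new HashMap<>();
    record.put("id", "record-1");
    context.inputRecords.add(record);
    new GreetingWorkerSketch().perform(context);
    System.out.println(context.outputRecords);
  }
}
```

Keeping the worker free of instance state is only one way to achieve thread safety; if your worker needs shared state, it has to be synchronized or otherwise protected.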

OSGi Declarative Service

Every worker must be declared as an OSGi Declarative Service (DS) in order to be registered properly to the worker framework. To configure your worker as DS, you have to add an appropriate XML file to the folder org.eclipse.smila.integration.worker/OSGI-INF.

The file can be created either manually or using the Component Definition wizard.

Have a look at helloworldworker.xml as an example:

<?xml version="1.0" encoding="UTF-8"?>
<scr:component xmlns:scr="http://www.osgi.org/xmlns/scr/v1.1.0"
  name="HelloWorldWorker" immediate="true">
  <implementation class="org.eclipse.smila.integration.worker.HelloWorldWorker"/>                           
   <service>
     <provide interface="org.eclipse.smila.taskworker.Worker"/>
   </service>                  
</scr:component>

The file describes (1) the interface that the worker has to implement (and through which it will be accessed in the OSGi application by means of dependency injection), (2) the class being the concrete implementor of that interface, (3) the services that it references (our simple worker does not reference any, you can find a description later on), (4) and the name of the service.

To describe your own worker, just create a copy of the OSGI-INF/helloworldworker.xml file in the same directory. Then change at least the "name" attribute in the root element and the "class" attribute in the "implementation" element.

When you don't need the HelloWorldWorker anymore you may want to remove at least its component definition file from the bundle. Otherwise, it will always be running and asking for tasks in the final deployment. While it should not really be a problem, it causes some unnecessary overhead that can easily be avoided.

Check that your component definition is included in the build and listed as a Service-Component: your MANIFEST.MF should contain a line such as Service-Component: OSGI-INF/*.xml, and the bin.includes entry of the build.properties file should contain OSGI-INF/.

Register your worker in jobmanager configuration

These are the steps to use your new worker with the jobmanager framework.

Worker definition

Edit workers.json from <WORKSPACE>/SMILA.application/configuration/org.eclipse.smila.jobmanager folder and add the definition for the new worker.

Important: The name in the worker definition has to be the same that is returned by the getName() method in the worker implementation!

For the example worker HelloWorldWorker we want to use one input slot and one output slot. We use recordBulks as the data object type because we want to modify (bulks of) records with this worker:

{ "name": "HelloWorldWorker",
  "input": [ 
         {  "name": "inputRecords",
            "type": "recordBulks"
         } ],
  "output": [ 
         {  "name": "outputRecords",
            "type": "recordBulks"
         } ]
}

Workflow definition

To use your worker in a workflow you have to add a new workflow or change an existing one. You can either use the jobmanager API to add a workflow definition to the running system, or you can edit workflows.json from <WORKSPACE>/SMILA.application/configuration/org.eclipse.smila.jobmanager folder and add/change a workflow.

This example is a test workflow that uses the HelloWorldWorker to manipulate all records which were pushed into the system using the bulkbuilder. Because it is pretty useless as such, we did not add it to SMILA.application/configuration/org.eclipse.smila.jobmanager/workflows.json, but it is used in the unit test bundle org.eclipse.smila.integration.worker.test: the test case reads the output bulk created by the HelloWorldWorker to check whether it has been running.

{
   "name":"HelloWorldWorkflow",
   "startAction":{
      "worker":"bulkbuilder",
      "output":{
         "insertedRecords":"importBucket"
      }
   },
   "actions":[
      {
         "worker":"HelloWorldWorker",
         "input":{
            "inputRecords":"importBucket"
         },
         "output":{
            "outputRecords":"helloWorldExportBucket"
         }
      }
   ]
}

Bucket definition

If you want to use a new persistent bucket for your workflow (see jobmanager documentation), you have to add it via the jobmanager API or add it to the configuration: edit buckets.json in the SMILA.application/configuration/org.eclipse.smila.jobmanager folder and create the desired bucket.

Here's an example from the test bundle org.eclipse.smila.integration.worker.test for the workflow above that makes the final bucket helloWorldExportBucket persistent. For the unit test, the output bucket of the worker must be persistent so that the test case can still read the result records when the workflow has ended. Otherwise the jobmanager would remove the transient object immediately after the HelloWorldWorker has finished.

{
   "name":"helloWorldExportBucket",
   "type":"recordBulks"
}

Activate the Worker

Add the worker bundle to the configuration and set the scale-up limit as described above, if you have not done so already.

Testing

Use the launcher

If everything was done correctly and you start the SMILA.launch in Eclipse, you should see the same output as described above, but for your own worker.

You should also check whether your new workflow definition is visible in [3]. If not, you may have mistyped a worker name or a similar identifier. If there is no workflow at all, the workflows.json file has invalid syntax.

Create worker unit test

You can use the test bundle template org.eclipse.smila.integration.worker.test to add a test for your worker. Have a look at the example test class org.eclipse.smila.integration.worker.test.TestHelloWorldWorker that comes with the SDK.

All configuration files for the test are in org.eclipse.smila.integration.worker.test/configuration. This is similar to SMILA.application/configuration, but contains only the configuration files necessary to run the tests, not all files needed by a complete system. Also, some configuration files may differ from those in SMILA.application, e.g. some components may be configured with smaller limits to make tests run quicker. However, if you create a new worker, you must add its description to the workers.json in the test bundles and define persistent buckets and workflows required to run the test. Additionally make sure that the config.ini contains the names of your worker bundles and those of services your worker needs to access.

To start the test in eclipse you have to copy the launch for TestHelloWorldWorker and adapt it to your new test class.

Manually installing the worker in SMILA

In the following we describe the steps to deploy your worker manually to an existing SMILA installation.

Alternatively, you could also add your worker bundle to the SMILA build process and build a new SMILA application.

Create a feature project

A feature project is a container project that defines the plug-ins needed for a specific feature. In our case the feature is to provide a worker, so only one plug-in is included in it. However, it can also be reasonable to bundle all worker plug-ins that SMILA needs to handle a specific scenario into one feature, so that a single deployment includes all necessary plug-ins.

If you ever need to create your own feature project, you can use Eclipse's New... wizard:

  • New --> Plug-in Development --> Feature Project
    • Enter a Project name (e.g. org.eclipse.smila.integration.feature)
    • Fill in other feature properties to describe the new feature ('Version' should match the version of your new worker bundle)
  • Next
    • select your worker bundle
  • Finish

Deploy your worker feature

Now it's easy to export your custom bundles to files that can be easily deployed into SMILA:

  • Select your feature project
  • Right-click on it
  • Click on Export...
  • Select Plug-in Development --> Deployable features
  • Next
  • Select your new worker feature
  • Select a destination folder. If you are re-exporting after changes (especially after renames), you should first delete the destination folder.
  • Click Finish

After that you will find plugins and features directories in your destination directory that contain the deployable software. The export process produces two additional files, artifacts.jar and contents.jar, which are not needed for our purposes.

Install your worker feature in a SMILA installation

  • Copy the features and plugins folder to your SMILA installation.
  • merge your configuration changes (e.g. configuration/org.eclipse.smila.jobmanager) into the SMILA configuration
    • copy your configuration/config.ini file (see above) or edit the installed config.ini directly to start up your bundle
      • e.g. for the above bundle and version this would be (in the second last line): org.eclipse.smila.integration.worker@4:start, \
  • start your system
  • In the console you should see:
...
Added worker HelloWorldWorker to WorkerManager.
...

Additionally you should be able to see the worker descriptions you added via the JobManager REST API: http://localhost:8080/smila/jobmanager/workers

Advanced How To's

How to access another OSGi Service inside your Worker

SMILA comes with a lot of components that offer APIs for different purposes. Sometimes you may want to access such an API inside your worker. With the concept of OSGi Declarative Services (DS) this is just a matter of configuration.

Example: Reading all cluster nodes

Assume we want to know the names of all cluster nodes in our worker. This is possible via the ClusterConfigService API. Here are the steps to access this API in your worker:

  • Precondition: We assume you already configured your worker as OSGi Declarative Service as described before.
  • To use the ClusterConfigService you have to import the appropriate package org.eclipse.smila.clusterconfig in the MANIFEST.MF/Dependencies (see "Bundle Dependencies")
  • Configure ClusterConfigService as referenced service in the service description xml (OSGI-INF/...):
<?xml version="1.0" encoding="UTF-8"?>
<scr:component xmlns:scr="http://www.osgi.org/xmlns/scr/v1.1.0" name="MyWorker" immediate="true">
    <implementation class="mypackage.MyWorkerImpl" />
    <service>
       <provide interface="org.eclipse.smila.taskworker.Worker"/>
    </service>        
    <reference bind="setClusterConfigService"
               cardinality="1..1"
               interface="org.eclipse.smila.clusterconfig.ClusterConfigService"
               name="ClusterConfigService"
               policy="static"
               unbind="unsetClusterConfigService"/>
</scr:component>
  • Implement the specified methods setClusterConfigService and unsetClusterConfigService in your worker implementation. This may look like this:
  private ClusterConfigService _ccs;

  public void setClusterConfigService(final ClusterConfigService ccs) {
    _ccs = ccs;
  }

  public void unsetClusterConfigService(final ClusterConfigService ccs) {
    if (_ccs == ccs) {
      _ccs = null;
    }
  }
  • Now, the OSGi framework will automatically set the SimpleClusterConfigService (which implements the interface ClusterConfigService) in your worker at startup via the specified method. So the ClusterConfigService API will be accessible at runtime:
   ...
   List<String> clusterNodes = _ccs.getClusterNodes();
   ...
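The bind/unbind pattern above can be tried out without an OSGi runtime. The following sketch assumes a hypothetical stand-in interface ClusterConfigServiceSketch that is reduced to the single getClusterNodes() call used above; the holder class and its clusterNodes() helper are likewise only illustrative. It shows why the unset method compares instances before clearing the field, and adds a defensive null check for the time when no service is bound.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ClusterConfigService (assumption), reduced to one method.
interface ClusterConfigServiceSketch {
  List<String> getClusterNodes();
}

final class ServiceHolderSketch {
  // volatile is a defensive choice here: bind and unbind may be invoked from a
  // different thread than the one that later uses the service.
  private volatile ClusterConfigServiceSketch _ccs;

  public void setClusterConfigService(final ClusterConfigServiceSketch ccs) {
    _ccs = ccs;
  }

  public void unsetClusterConfigService(final ClusterConfigServiceSketch ccs) {
    // Only clear the field if the service being unbound is the one we hold.
    if (_ccs == ccs) {
      _ccs = null;
    }
  }

  List<String> clusterNodes() {
    // Read the field once so that the null check and the call see the same value.
    final ClusterConfigServiceSketch ccs = _ccs;
    return ccs == null ? Collections.<String>emptyList() : ccs.getClusterNodes();
  }
}

class ServiceHolderDemo {
  public static void main(final String[] args) {
    final ServiceHolderSketch holder = new ServiceHolderSketch();
    final ClusterConfigServiceSketch service = () -> Arrays.asList("node1", "node2");
    holder.setClusterConfigService(service);
    System.out.println(holder.clusterNodes());
    holder.unsetClusterConfigService(service);
    System.out.println(holder.clusterNodes());
  }
}
```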

How to add / access a configuration for your Worker

You can add a worker configuration, e.g. a property file, by adding it to the application configuration.

Example: Adding a property file "myWorker.properties" and access it in the worker

  • To add a worker configuration create an appropriate folder in the application configuration and place the property file there:
  SMILA.application/configuration/MY_BUNDLE_NAME/myWorker.properties 
  • The easiest way to access the configuration in your worker is via the org.eclipse.smila.utils.config.ConfigUtils class
  • To use this class you have to import the appropriate package org.eclipse.smila.utils.config in the MANIFEST.MF/Dependencies (see "Bundle Dependencies")
    • For the following example code you should also import org.apache.commons.io
  • Your code could look like this:
   
    InputStream configFileStream = null;
    try {
      configFileStream = ConfigUtils.getConfigStream(MY_BUNDLE_NAME, "myWorker.properties");
      Properties props = new Properties();
      props.load(configFileStream);
      ...      
    } finally {
      if (configFileStream != null) {
        IOUtils.closeQuietly(configFileStream);
      }
    }
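The load-and-close part of this snippet can also be written with try-with-resources and plain JDK classes. The sketch below is self-contained and therefore does not call ConfigUtils: an in-memory stream stands in for the stream that ConfigUtils.getConfigStream(...) would return, and the property key "greeting" with its default value is a hypothetical example, not something defined by SMILA.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class WorkerConfigSketch {
  // Hypothetical property key and default value, for illustration only.
  static final String KEY_GREETING = "greeting";
  static final String DEFAULT_GREETING = "hello";

  // Loads the properties and always closes the stream, even if loading fails.
  static Properties load(final InputStream configStream) {
    try (InputStream in = configStream) {
      final Properties props = new Properties();
      props.load(in);
      return props;
    } catch (final IOException e) {
      throw new UncheckedIOException("Could not read worker configuration", e);
    }
  }

  // Returns the configured greeting, or the default if the key is missing.
  static String greeting(final Properties props) {
    return props.getProperty(KEY_GREETING, DEFAULT_GREETING).trim();
  }

  public static void main(final String[] args) {
    // Stand-in for the content of myWorker.properties.
    final byte[] content = "greeting=Hi there\n".getBytes(StandardCharsets.ISO_8859_1);
    final Properties props = load(new ByteArrayInputStream(content));
    System.out.println(greeting(props));
  }
}
```

Providing a default for each key keeps the worker usable when the property file exists but does not define every setting.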

Add on: Read configuration at startup

  • If you want to initialize your worker from its configuration at startup, you can use the activate() method, which the OSGi framework calls automatically when the component is activated.
  • To use an activate method you have to import the package org.osgi.service.component in the MANIFEST.MF.
  • Then your code could look like this:
 protected void activate(final ComponentContext context) {
    try {
      readConfiguration();
      ...


Exception Handling and Logging

Exception Handling:

There are four possible outcomes of a worker:

  • SUCCESSFUL: The perform() method returns without exception. This is interpreted by the Worker Manager as a successful task execution so that it will finish the task with a SUCCESSFUL task completion status. All open output data objects will be committed. If this fails, it will continue as explained below, depending on the exception type. The task result includes all counters produced by the task execution so that they can be aggregated by the Job Manager in the job run data.
  • RECOVERABLE_ERROR: The perform() method aborts with org.eclipse.smila.taskworker.RecoverableTaskException. This will be interpreted as a temporary failure when accessing input data or writing output data to DOS. The result is that the task will be finished with a RECOVERABLE_ERROR task completion status and the Job Manager will usually reschedule the task for a later retry. Any produced counters will be ignored by the Job Manager in the job run data.
  • POSTPONE: The perform() method aborts with org.eclipse.smila.taskworker.PostponeTaskException. This means that the worker cannot yet perform the task for some reason but it should be resubmitted again later. The task will be re-added to the “todo queue” of this worker and it will be delivered again later (but quite soon, usually). Such tasks have the POSTPONE task completion status.
  • FATAL_ERROR: The perform() method aborts with any other exception (including all exceptions of type RuntimeException). This will be interpreted as an indicator that the input data cannot be processed at all, for example, because it is corrupted or contains invalid values. Such tasks will be finished with a FATAL_ERROR completion status and will not be rescheduled. Any produced counters will be ignored by the Job Manager in the job run data.
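The mapping from exceptions to task completion statuses described above can be summarized in a small sketch. This is not the Worker Manager's actual implementation: the two exception classes are hypothetical stand-ins that only borrow the names mentioned in the text, and TaskBody and OutcomeSketch exist solely for this illustration.

```java
// Hypothetical stand-ins for the SMILA exception types named above (assumption:
// only the names come from the text, not the real class hierarchy).
class RecoverableTaskExceptionSketch extends Exception {
  RecoverableTaskExceptionSketch(final String message) {
    super(message);
  }
}

class PostponeTaskExceptionSketch extends Exception {
  PostponeTaskExceptionSketch(final String message) {
    super(message);
  }
}

enum CompletionStatus {
  SUCCESSFUL, RECOVERABLE_ERROR, POSTPONE, FATAL_ERROR
}

// Stand-in for the body of a worker's perform() method.
interface TaskBody {
  void perform() throws Exception;
}

final class OutcomeSketch {
  // Runs the task body and derives the completion status from how it ends.
  static CompletionStatus run(final TaskBody body) {
    try {
      body.perform();
      return CompletionStatus.SUCCESSFUL; // returned without exception
    } catch (final RecoverableTaskExceptionSketch e) {
      return CompletionStatus.RECOVERABLE_ERROR; // temporary failure, retry later
    } catch (final PostponeTaskExceptionSketch e) {
      return CompletionStatus.POSTPONE; // task goes back to the todo queue
    } catch (final Exception e) {
      return CompletionStatus.FATAL_ERROR; // any other exception, no retry
    }
  }
}

class OutcomeDemo {
  public static void main(final String[] args) {
    System.out.println(OutcomeSketch.run(() -> { }));
    System.out.println(OutcomeSketch.run(() -> {
      throw new IllegalArgumentException("corrupt input");
    }));
  }
}
```

In a worker this means: throw the recoverable exception only for problems that may disappear on a retry, and let everything else propagate as a fatal error.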

Logging:

You can use the log4j logging that comes with SMILA in your worker too. Your logging output will be logged in the standard smila.log.

  • import the package org.apache.commons.logging in the MANIFEST.MF.

Then your code could look like this:

   private final Log _log = LogFactory.getLog(getClass());
   ...
   _log.debug("My worker was successful");
   ...

Create worker in new bundle or rename template bundle

For creating a new bundle:

  • Follow the description here to create a new bundle.

For renaming a bundle:

  • Right-click the bundle to rename in eclipse and select (Refactor/Rename).
  • Right-click java package and select (Refactor/Rename).
  • Open MANIFEST.MF and set a version property to the (renamed) exported package. Runtime/Exported Packages

Hint: If there are strange compile problems afterwards and refreshing or cleaning the projects does not help, try restarting your Eclipse IDE.

MANIFEST.MF / OSGI-INF / build.properties:

  • Adapt your OSGI-INF component description XML file to the changes
  • Please be sure that your OSGi component definition file is included in the MANIFEST.MF file in the Service-Component section! Otherwise the service component will not be recognized and thus not be started.
  • Please be sure that the OSGI-INF/ folder is included in your build.properties

test bundle:

  • Adapt the test bundle to the changes:
    • change name of test bundle and java package (Refactor/Rename, like described above for the worker bundle itself).
    • correct the imported packages in the code and the MANIFEST.MF (if not done correctly by refactoring)
    • adapt the test's run configuration, e.g. name, test bundle's java package, configuration file location (on tab "configuration")
    • adapt the config.ini file

Application launch:

  • Add the new/renamed bundle to the eclipse launcher and also to your application configuration/config.ini file with an appropriate start level.

feature project:

  • You have to add your new/renamed bundle to the feature project.
  • clear the destination folder for feature exports.
