This documentation is for WSO2 Stream Processor 4.4.0 (the latest version of WSO2 SP). View documentation for the Streaming Integrator, the successor of WSO2 SP.


Introduction

In the previous tutorials, we looked at the core Siddhi functionality, including event ingestion, publishing, and many forms of processing such as preprocessing, correlation, KPI analysis, and trend analysis.

In this tutorial, let's move on to the Machine Learning capabilities offered by WSO2 Stream Processor including real-time prediction, online machine learning, anomaly detection etc. Let's look at how static real-time predictions can be made via WSO2 SP using PMML serializations.

The factory foreman of the Sweet Factory needs to predict the nature of the next shipment of sugar syrup based on the shipments he has received so far. A predictive model has been trained with two inputs: the temperature and the density of the latest shipment. Using these, a prediction is required on whether the shipment meets his requirements before it is dispatched to the factory. Assume that this pre-trained model has been exported as a PMML file and is available on the system. To build and train a model, you can use this PMML sample.

Before you begin:

PMML (Predictive Model Markup Language) is a standardized serialization that is used for exporting predictive solutions (machine learning models). PMML works by defining the model in one system and transferring the model to a different system via an XML file. This allows predictions to be made using events from the new system. This XML file can contain various data transformation and preprocessing steps in addition to one or more predictive models.

In addition to PMML, TensorFlow serialization is also supported by Stream Processor for static real-time predictions. For more details, see Siddhi Extensions Documentation - Tensorflow.
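Because a PMML model is an XML file, you can inspect it before wiring it into a Siddhi query. The following Python sketch lists the fields declared in a model's DataDictionary, which can help you decide which stream attributes to pass to the model. The embedded PMML fragment, its field names, and the helper name list_data_fields are hypothetical and used only for illustration; a real exported model also contains at least one model element.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical PMML fragment used only for illustration.
# A real export also contains a model element such as a TreeModel.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <Header description="illustrative sample"/>
  <DataDictionary numberOfFields="3">
    <DataField name="temperature" optype="continuous" dataType="double"/>
    <DataField name="density" optype="continuous" dataType="double"/>
    <DataField name="decision" optype="categorical" dataType="boolean"/>
  </DataDictionary>
</PMML>
"""


def list_data_fields(pmml_text):
    """Return a (name, dataType) tuple for every DataField element."""
    root = ET.fromstring(pmml_text)
    fields = []
    for element in root.iter():
        # Strip the "{namespace}" prefix so any PMML version is accepted.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "DataField":
            fields.append((element.get("name"), element.get("dataType")))
    return fields


for name, data_type in list_data_fields(SAMPLE_PMML):
    print(name, data_type)
```

To inspect a model on disk, read the file contents and pass them to list_data_fields in the same way.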

Tutorial steps

Let's get started!

  1. Let's add an input event stream definition to capture the events generated by the supplier before shipping the sugar syrup to the sweet factory.

    define stream SugarSyrupDataStream (temperature double, density double);
  2. Now let's define the output stream. To include a prediction on whether the shipment will be acceptable or not in the output, this definition must include an attribute for the prediction as shown below.

    define stream PredictedSugarSyrupDataStream (nextTemperature double, nextDensity double, decision bool);

  3. As you have learnt from previous tutorials, a reading from an input stream looks similar to the following.

    from SugarSyrupDataStream

    For this scenario, you need to update it as follows.

    1. To enable the PMML extension, append the #pmml:predict() stream processor to the input stream as shown below.

      from SugarSyrupDataStream#pmml:predict()

    2. To access the pre-trained PMML model via which the predictions are made, specify the path as follows.

      from SugarSyrupDataStream#pmml:predict( "/home/user/decision-tree.pmml" )

    3. Let's also add the attributes that are needed by the model for prediction.

      from SugarSyrupDataStream#pmml:predict("/home/user/decision-tree.pmml", temperature, density)

      The output attributes depend on the model definition. Here, the model is defined so that it returns a prediction on whether the shipment can be accepted, based on the given temperature and density.

  4. Let's route this output to the output stream as shown below.

    from SugarSyrupDataStream#pmml:predict("/home/user/decision-tree.pmml", temperature, density)
    select *
    insert into PredictedSugarSyrupDataStream;

The completed Siddhi application is as follows.

@App:name('SugarSyrupPredictionApp')

@source(type='http', receiver.url='http://localhost:5006/SugarSyrupEP', @map(type = 'json'))
define stream SugarSyrupDataStream (temperature double, density double);

@sink(type='log', prefix='Predicted next sugar syrup shipment:')
define stream PredictedSugarSyrupDataStream (nextTemperature double, nextDensity double, decision bool);

from SugarSyrupDataStream#pmml:predict("/home/user/decision-tree.pmml", temperature, density)
select *
insert into PredictedSugarSyrupDataStream;
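
To try out the application, you can post a test event to the HTTP source. The following Python sketch is one way to do that, not an official client. It assumes that the JSON source mapper uses its default mapping, which wraps the stream attributes in an "event" object, and that the application is deployed and listening on the receiver URL given in the @source annotation. The function names are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the @source annotation of the Siddhi application.
RECEIVER_URL = "http://localhost:5006/SugarSyrupEP"


def build_event_payload(temperature, density):
    """Wrap the two stream attributes in an "event" object.

    Assumption: the JSON source mapper runs with its default mapping,
    which expects {"event": {<attribute>: <value>, ...}}.
    """
    event = {"event": {"temperature": temperature, "density": density}}
    return json.dumps(event).encode("utf-8")


def send_event(temperature, density, url=RECEIVER_URL):
    """POST one event to the HTTP source and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=build_event_payload(temperature, density),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

For example, calling send_event(30.5, 1.32) with arbitrary sample values while the application is running should result in a line with the prefix 'Predicted next sugar syrup shipment:' being written by the log sink.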