The latest version for DAS is WSO2 Data Analytics Server 3.1.0. View documentation for the latest release.
WSO2 Data Analytics Server is succeeded by WSO2 Stream Processor. To view the latest documentation for WSO2 SP, see WSO2 Stream Processor Documentation.


Siddhi allows you to forecast future events using linear regression on real-time data streams. The forecast function takes a dependent event stream (Y), an independent event stream (X), and a user-specified next X value, and returns the forecast Y value computed from the regression equation fitted to the historical data.
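To make the idea concrete, the following is a minimal Python sketch of simple linear regression followed by a forecast at a given next X value. It is an illustration of the general technique only, not Siddhi's implementation; the function name `forecast_y` and the sample data are hypothetical.

```python
def forecast_y(xs, ys, next_x):
    """Fit y = beta0 + beta1 * x by ordinary least squares, then predict y at next_x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired (x, y) observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta1 = sxy / sxx                 # slope
    beta0 = mean_y - beta1 * mean_x   # intercept
    return beta0, beta1, beta0 + beta1 * next_x


# Hypothetical history in which y = 1 + 2x holds exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# The next X value is given as an expression on the latest x (here x + 5).
beta0, beta1, forecast = forecast_y(xs, ys, next_x=xs[-1] + 5)
print(beta0, beta1, forecast)
```

With this data the fitted line is y = 1 + 2x, so the forecast at x = 9 is 19.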

The two implementations of the forecast function can be distinguished as follows.

  • forecast: This allows you to specify a batch size (optional) that defines the number of events to be considered for the regression calculation when forecasting the Y value.
  • lengthTimeForecast: This allows you to restrict the number of events considered for the regression calculation when forecasting the Y value based on a specified time window and/or batch size.
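The difference between the two implementations comes down to which events are retained for the regression. The Python sketch below shows one way such a bounded window could work. It is an assumption-laden illustration, not Siddhi's code: the class name `RegressionWindow`, the millisecond timestamps, and the choice to keep events that fall exactly on the time cutoff are all hypothetical.

```python
from collections import deque


class RegressionWindow:
    """Keeps the events eligible for the next regression calculation."""

    def __init__(self, batch_size, time_window_ms=None):
        self.batch_size = batch_size
        self.time_window_ms = time_window_ms  # None means a batch-size limit only
        self.events = deque()                 # items are (timestamp_ms, x, y)

    def add(self, timestamp_ms, x, y):
        self.events.append((timestamp_ms, x, y))
        # Batch size limit: drop the oldest events beyond the maximum count.
        while len(self.events) > self.batch_size:
            self.events.popleft()
        # Time window limit: drop events older than the window (assumed semantics).
        if self.time_window_ms is not None:
            cutoff = timestamp_ms - self.time_window_ms
            while self.events and self.events[0][0] < cutoff:
                self.events.popleft()
        return list(self.events)


# A window of at most 3 events that are no older than 2 seconds.
window = RegressionWindow(batch_size=3, time_window_ms=2000)
for ts in (0, 500, 1000, 1500):
    retained = window.add(ts, x=ts / 1000.0, y=2.0 * ts)
after_batch_limit = [event[0] for event in retained]

retained = window.add(3200, x=3.2, y=6400.0)
after_time_limit = [event[0] for event in retained]
print(after_batch_limit, after_time_limit)
```

In this sketch, leaving `time_window_ms` unset corresponds to the batch-size-only behavior described for forecast, while setting both limits corresponds to lengthTimeForecast.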

Input parameters for the forecast function

The following table describes the input parameters available for the forecast function.

| Parameter | Description | Required/Optional | Default Value |
|---|---|---|---|
| Calculation Interval | The frequency with which the regression calculation should be carried out. | Optional | 1 (i.e., for every event) |
| Batch Size | The maximum number of events that should be used for a regression calculation. | Optional | 1,000,000,000 |
| Confidence Interval | The confidence interval to be used for a regression calculation. | Optional | 0.95 |
| Next X Value | The value to be used to forecast the Y value. This can be a constant or an expression (e.g., x+5). | Required | |
| Y Stream | The data stream of the dependent variable. | Required | |
| X Stream | The data stream of the independent variable. | Required | |

Format: forecast(nextX, Y, X) or forecast(calculation interval, batch size, confidence interval, nextX, Y, X)

Input parameters for the lengthTimeForecast function

The following table describes the input parameters available for the lengthTimeForecast function.

| Parameter | Description | Required/Optional | Default Value |
|---|---|---|---|
| Time Window | The maximum time duration that should be considered for a regression calculation. | Required | |
| Batch Size | The maximum number of events that should be used for a regression calculation. | Required | |
| Next X Value | The value to be used to forecast the Y value. This can be a constant or an expression (e.g., x+5). | Required | |
| Calculation Interval | The frequency with which the regression calculation should be carried out. | Optional | 1 (i.e., for every event) |
| Confidence Interval | The confidence interval to be used for a regression calculation. | Optional | 0.95 |
| Y Stream | The data stream of the dependent variable. | Required | |
| X Stream | The data stream of the independent variable. | Required | |

Format: lengthTimeForecast(time window, batch size, nextX, Y, X) or lengthTimeForecast(time window, batch size, nextX, calculation interval, confidence interval, Y, X)

Output parameters

The following table describes the output parameters.

The same output parameters are available for each implementation.

| Parameter | Name | Description |
|---|---|---|
| Forecast Y | forecastY | The forecast Y value based on next X and regression equation. |
| Standard Error | stdError | The standard error of the regression equation. |
| β coefficients | beta0, beta1 | β coefficients of the simple linear regression. |
| Input Stream Data | The name given in the input stream. | All the items sent in the input stream. |
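The relationship between these outputs can be sketched in Python as follows. This is a hedged illustration rather than Siddhi's implementation: the function name `regression_outputs` and the sample data are hypothetical, and the standard error is computed here with the conventional residual formula sqrt(SSE / (n - 2)), which is an assumption about how stdError is defined.

```python
import math


def regression_outputs(xs, ys, next_x):
    """Return the regression outputs keyed by the names used in the output table."""
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least three paired (x, y) observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))
    std_error = math.sqrt(sse / (n - 2))  # assumed definition of stdError
    return {
        "forecastY": beta0 + beta1 * next_x,
        "stdError": std_error,
        "beta0": beta0,
        "beta1": beta1,
    }


# Hypothetical sample data.
out = regression_outputs([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], next_x=6)
print(out)
```

For this data the fitted line is y = 2.2 + 0.6x, so the forecast at x = 6 is 5.8 and the residual standard error is approximately 0.894.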

Examples

The queries given in the examples below return the following when executed.

  • Y value based on the regression equation established using the Y stream and the X stream
  • The standard error of the regression equation (ε)
  • β coefficients
  • All the items available in the input stream

Example 1

The following query submits an expression to be used as the next X value (X+5), a dependent input stream (Y), and an independent input stream (X). These are used to perform linear regression between the Y and X streams and to compute the forecast Y value based on the specified next X value.

from StockExchangeStream#timeseries:forecast(X+5, Y, X)
select *
insert into StockForecaster

Example 2

The following query submits a time window (2 seconds), a batch size (100 events), a constant to be used as the next X value (10), a dependent input stream (Y) and an independent input stream (X) that are used to perform linear regression between Y and X streams, and compute the forecast Y value based on the next X value specified by you.

from StockExchangeStream#timeseries:lengthTimeForecast(2 sec, 100, 10, Y, X)
select *
insert into StockForecaster