This documentation is for WSO2 Complex Event Processor 4.0.0. View documentation for the latest release.
WSO2 Complex Event Processor is succeeded by WSO2 Stream Processor. To view the latest documentation for WSO2 SP, see WSO2 Stream Processor Documentation.

...

  • Extension Type: Stream Processor
  • Description: Runs the R script loaded from a file for each event and produces aggregated outputs based on the provided input variable parameters and expected output attributes.
  • Parameter filePath: The path of the R script file. The script uses the input variable parameters and produces the expected output attributes.
  • Parameter outputAttributes: All output attributes as a single comma-separated string, where each attribute is denoted as <name><space><type>, e.g., 'output1 string, output2 long'.
  • Parameter input1, input2, ...: Input parameters must be variable attributes of the input stream; the function does not accept constant values as input parameters.
  • Return Parameter output1, output2, ...: Output parameters are generated according to the provided outputAttributes.

  • Examples: eval('/home/user/test/script1.R', 'totalItems int, totalTemp double', items, temp), where items is an int and temp is a double, returns [ totalItems, totalTemp ], where totalItems is an int and totalTemp is a double.
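The <name><space><type> convention of outputAttributes can be illustrated with a small Python sketch. The helper name parse_output_attributes is hypothetical and is not part of the extension; it only shows how such a string breaks down into name/type pairs.

```python
def parse_output_attributes(spec):
    """Split 'name type, name type' into a list of (name, type) pairs."""
    pairs = []
    for item in spec.split(","):
        # Each item is expected to hold exactly one name and one type.
        name, attr_type = item.split()
        pairs.append((name, attr_type))
    return pairs


print(parse_output_attributes("totalItems int, totalTemp double"))
# [('totalItems', 'int'), ('totalTemp', 'double')]
```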

regex

Following are the functions of the RegEx extension.

find

<bool> find(<string> regex, <string> inputSequence)

  • Extension Type: Function
  • Description: Attempts to find the next sub-sequence of 'inputSequence' that matches the 'regex' pattern. Returns true if such a sub-sequence exists, and false otherwise.
  • Examples: find("\d\d(.*)WSO2", "21 products are produced by WSO2 currently") returns true while find("\d\d(.*)WSO2", "21 products are produced currently") returns false.

<bool> find(<string> regex, <string> inputSequence, <int> startingIndex)

  • Extension Type: Function
  • Description: Attempts to find the next sub-sequence of 'inputSequence' that matches the 'regex' pattern, starting from the given index in 'inputSequence'. Returns true if such a sub-sequence exists, and false otherwise.
  • Examples: find("\d\d(.*)WSO2", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 30) returns true while find("\d\d(.*)WSO2", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 35) returns false.
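The behaviour of both find variants can be approximated outside Siddhi. The following is a minimal Python sketch, assuming that Python's re module treats these particular patterns the same way as the extension; the Python function find is only an illustration, not the extension's implementation.

```python
import re


def find(regex, input_sequence, starting_index=0):
    """Return True if regex matches anywhere at or after starting_index."""
    return re.compile(regex).search(input_sequence, starting_index) is not None


text = "21 products are produced within 10 years by WSO2 currently by WSO2 employees"

print(find(r"\d\d(.*)WSO2", "21 products are produced by WSO2 currently"))  # True
print(find(r"\d\d(.*)WSO2", "21 products are produced currently"))          # False
print(find(r"\d\d(.*)WSO2", text, 30))  # True: "10 ... WSO2" starts after index 30
print(find(r"\d\d(.*)WSO2", text, 35))  # False: no two-digit run remains after index 35
```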

group

<string> group(<string> regex, <string> inputSequence, <int> groupId)

  • Extension Type: Function
  • Description: Returns the input sub-sequence captured by the given group during the previous match operation; otherwise returns null.
  • Examples: group("(\d\d)(.*)(WSO2.*)", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 3) returns "WSO2 employees".
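The example result follows from greedy matching: the second group consumes as much text as it can, so the third group starts at the last occurrence of "WSO2". A short Python sketch, assuming Python's re module behaves like the extension for this pattern (the function group below is illustrative only):

```python
import re


def group(regex, input_sequence, group_id):
    """Return the text captured by group_id, or None when there is no match."""
    match = re.search(regex, input_sequence)
    return match.group(group_id) if match else None


text = "21 products are produced within 10 years by WSO2 currently by WSO2 employees"

print(group(r"(\d\d)(.*)(WSO2.*)", text, 1))  # 21
print(group(r"(\d\d)(.*)(WSO2.*)", text, 3))  # WSO2 employees
```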

lookingAt

<bool> lookingAt(<string> regex, <string> inputSequence)

  • Extension Type: Function
  • Description: Attempts to match 'inputSequence' against the 'regex' pattern, starting at the beginning of the sequence. The match does not need to cover the entire sequence.
  • Examples: lookingAt("\d\d(.*)WSO2", "21 products are produced by WSO2 currently in Sri Lanka") returns true while lookingAt("WSO2(.*)middleware(.*)", "sample test string and WSO2 is situated in trace and its a middleware company") returns false.

matches

<bool> matches(<string> regex, <string> inputSequence)

  • Extension Type: Function
  • Description: Attempts to match the entire 'inputSequence' against the 'regex' pattern.
  • Examples: matches("WSO2(.*)middleware(.*)", "WSO2 is situated in trace and its a middleware company") returns true while matches("WSO2(.*)middleware", "WSO2 is situated in trace and its a middleware company") returns false.
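The difference between lookingAt and matches is only where the match must end. The Python sketch below uses re.match (anchored at the start) and re.fullmatch (anchored at both ends) as stand-ins; the names looking_at and matches are illustrative and the equivalence with the extension is an assumption for these patterns.

```python
import re


def looking_at(regex, input_sequence):
    """Match must begin at index 0 but may stop before the end."""
    return re.match(regex, input_sequence) is not None


def matches(regex, input_sequence):
    """Match must cover the whole input sequence."""
    return re.fullmatch(regex, input_sequence) is not None


sentence = "WSO2 is situated in trace and its a middleware company"

print(looking_at(r"\d\d(.*)WSO2", "21 products are produced by WSO2 currently in Sri Lanka"))  # True
print(looking_at(r"WSO2(.*)middleware(.*)", "sample test string and " + sentence))              # False
print(matches(r"WSO2(.*)middleware(.*)", sentence))  # True
print(matches(r"WSO2(.*)middleware", sentence))      # False: " company" is left over
```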

time

Following are the functions of the time extension.

currentDate

<string> currentDate()

  • Extension Type: Function
  • Description: Returns the current system date in yyyy-MM-dd format.
  • Examples: currentDate() returns 2015-08-20.

currentTime

<string> currentTime()

  • Extension Type: Function
  • Description: Returns the current system time in HH:mm:ss format.
  • Examples: currentTime() returns 13:15:10.

currentTimestamp

<string> currentTimestamp()

  • Extension Type: Function
  • Description: Returns the current system timestamp in yyyy-MM-dd HH:mm:ss format.
  • Examples: currentTimestamp() returns 2015-08-20 13:15:10.

dateAdd

  • dateValue - value of the date, e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • expr - amount by which the selected part of the date is incremented, e.g., 2, 5, 10
  • unit - part of the date to manipulate, e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
  • dateFormat - date format of the provided date value, e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds - date value in milliseconds from the epoch, e.g., 1415712224000L

<string> dateAdd(<string> dateValue, <long> expr, <string> unit, <string> dateFormat)

  • Extension Type: Function
  • Description: Adds the specified time interval to a date and returns the result.
  • Examples: dateAdd("2014-11-11 13:23:44", 2, 'year', "yyyy-MM-dd HH:mm:ss") will return "2016-11-11 13:23:44"

<string> dateAdd(<string> dateValue, <long> expr, <string> unit)

  • Extension Type: Function
  • Description: Adds the specified time interval to a date and returns the result.
  • Examples: dateAdd("2014-11-11 13:23:44", 2, 'year') will return "2016-11-11 13:23:44"

<string> dateAdd(<long> timestampInMilliseconds, <long> expr, <string> unit)

  • Extension Type: Function
  • Description: Adds the specified time interval to a timestamp in milliseconds and returns the result.
  • Examples: dateAdd(1415692424000L, 2, 'year') will return "2016-11-11 13:23:44"
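The string-based dateAdd arithmetic can be sketched in Python. This is an illustration under stated assumptions, not the extension's code: the Python directive string "%Y-%m-%d %H:%M:%S" stands in for yyyy-MM-dd HH:mm:ss, the function date_add is hypothetical, and only a subset of units is covered.

```python
from datetime import datetime, timedelta

# Units with a fixed length map directly onto timedelta keyword arguments.
FIXED_UNITS = {"SECOND": "seconds", "MINUTE": "minutes", "HOUR": "hours",
               "DAY": "days", "WEEK": "weeks"}


def date_add(date_value, expr, unit, date_format="%Y-%m-%d %H:%M:%S"):
    """Shift date_value by expr units; a negative expr shifts backwards."""
    parsed = datetime.strptime(date_value, date_format)
    unit = unit.upper()
    if unit in FIXED_UNITS:
        shifted = parsed + timedelta(**{FIXED_UNITS[unit]: expr})
    elif unit == "YEAR":
        # Raises ValueError for 29 February when the target year is not a leap year.
        shifted = parsed.replace(year=parsed.year + expr)
    else:
        raise ValueError("unit not covered by this sketch: " + unit)
    return shifted.strftime(date_format)


print(date_add("2014-11-11 13:23:44", 2, "year"))  # 2016-11-11 13:23:44
print(date_add("2014-11-11 13:23:44", 2, "hour"))  # 2014-11-11 15:23:44
```

Passing a negative expr gives the subtraction performed by dateSub on date strings.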

dateSub

  • dateValue - value of the date, e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • unit - part of the date to manipulate, e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
  • expr - amount by which the selected part of the date is decremented, e.g., 2, 5, 10
  • dateFormat - date format of the provided date value, e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds - date value in milliseconds from the epoch, e.g., 1415712224000L

<string> dateSub(<string> dateValue, <long> expr, <string> unit, <string> dateFormat)

  • Extension Type: Function
  • Description: Subtracts the specified time interval from a date and returns the result.
  • Examples: dateSub("2014-11-11 13:23:44", 2, 'year', "yyyy-MM-dd HH:mm:ss") will return "2012-11-11 13:23:44"

<string> dateSub(<string> dateValue, <long> expr, <string> unit)

  • Extension Type: Function
  • Description: Subtracts the specified time interval from a date and returns the result.
  • Examples: dateSub("2014-11-11 13:23:44", 2, 'year') will return "2012-11-11 13:23:44"

<string> dateSub(<long> timestampInMilliseconds, <long> expr, <string> unit)

  • Extension Type: Function
  • Description: Subtracts the specified time interval from a timestamp in milliseconds and returns the result.
  • Examples: dateSub(1415692424000L, 2, 'year') will return 1352620424000
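The timestamp example can be reproduced with a few lines of Python. This is a sketch that assumes the calendar arithmetic is done in UTC and covers years only; the function date_sub_years_ms is hypothetical.

```python
from datetime import datetime, timezone


def date_sub_years_ms(timestamp_ms, years):
    """Subtract whole years from an epoch timestamp given in milliseconds."""
    seconds, millis = divmod(timestamp_ms, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # Raises ValueError for 29 February when the target year is not a leap year.
    shifted = moment.replace(year=moment.year - years)
    return int(shifted.timestamp()) * 1000 + millis


print(date_sub_years_ms(1415692424000, 2))  # 1352620424000
```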

dateDiff

  • dateValue1 - value of the first date, e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • dateValue2 - value of the second date, e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • dateFormat1 - date format of the provided dateValue1, e.g., yyyy-MM-dd HH:mm:ss.SSS
  • dateFormat2 - date format of the provided dateValue2, e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds1 - first date value in milliseconds from the epoch, e.g., 1415712224000L
  • timestampInMilliseconds2 - second date value in milliseconds from the epoch, e.g., 1423456224000L

<int> dateDiff(<string> dateValue1, <string> dateValue2, <string> dateFormat1, <string> dateFormat2)

  • Extension Type: Function
  • Description: Returns the number of days between two dates.
  • Examples: dateDiff('2014-11-11 13:23:44', '2014-11-9 13:23:44', 'yyyy-MM-dd HH:mm:ss', 'yyyy-MM-dd HH:mm:ss') will return 2

<int> dateDiff(<string> dateValue1, <string> dateValue2)

  • Extension Type: Function
  • Description: Returns the number of days between two dates.
  • Examples: dateDiff('2014-11-11 13:23:44.000', '2014-11-9 13:23:44.000') will return 2

<int> dateDiff(<long> timestampInMilliseconds1, <long> timestampInMilliseconds2)

  • Extension Type: Function
  • Description: Returns the number of days between two timestamps given in milliseconds.
  • Examples: dateDiff(1415692424000, 1415519624000) will return 2
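Both dateDiff examples reduce to the same arithmetic: the two instants are 172,800,000 ms apart, which is exactly two days. A minimal Python sketch follows; the function names are hypothetical, and how the extension rounds partial days is not stated in this page, so floor division is an assumption.

```python
from datetime import datetime

MILLIS_PER_DAY = 24 * 60 * 60 * 1000


def date_diff_ms(timestamp_ms_1, timestamp_ms_2):
    """Whole days between two epoch timestamps (floor division; an assumption)."""
    return (timestamp_ms_1 - timestamp_ms_2) // MILLIS_PER_DAY


def date_diff(date_value_1, date_value_2, date_format="%Y-%m-%d %H:%M:%S"):
    """Whole days between two date strings that share one format."""
    first = datetime.strptime(date_value_1, date_format)
    second = datetime.strptime(date_value_2, date_format)
    return (first - second).days


print(date_diff_ms(1415692424000, 1415519624000))               # 2
print(date_diff("2014-11-11 13:23:44", "2014-11-9 13:23:44"))   # 2
```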

dateFormat

  • dateValue - value of the date, e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • dateTargetFormat - date format to convert to, e.g., yyyy/MM/dd HH:mm:ss
  • dateSourceFormat - date format of the provided date value, e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds - date value in milliseconds from the epoch, e.g., 1415712224000L

<string> dateFormat(<string> dateValue, <string> dateTargetFormat, <string> dateSourceFormat)

  • Extension Type: Function
  • Description: Returns the date formatted as a string in the target format.
  • Examples: dateFormat('2014-11-11 13:23:55', 'ss', 'yyyy-MM-dd HH:mm:ss') will return 55

<string> dateFormat(<string> dateValue, <string> dateTargetFormat)

  • Extension Type: Function
  • Description: Returns the date formatted as a string in the target format.
  • Examples: dateFormat('2014-11-11 13:23:55.657', 'ss') will return 55

<string> dateFormat(<long> timestampInMilliseconds, <string> dateTargetFormat)

  • Extension Type: Function
  • Description: Returns the timestamp formatted as a string in the target format.
  • Examples: dateFormat(1415692424000, 'yyyy-MM-dd') will return 2014-11-11
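Reformatting is a parse step followed by a format step. The Python sketch below mirrors the examples using Python's own directives ("%S" in place of ss, "%Y-%m-%d" in place of yyyy-MM-dd); the function names are hypothetical, and the timestamp variant assumes UTC.

```python
from datetime import datetime, timezone


def date_format(date_value, target_format, source_format):
    """Parse date_value with source_format, then render it with target_format."""
    return datetime.strptime(date_value, source_format).strftime(target_format)


def date_format_ms(timestamp_ms, target_format):
    """Render an epoch timestamp in milliseconds, interpreted in UTC."""
    moment = datetime.fromtimestamp(timestamp_ms // 1000, tz=timezone.utc)
    return moment.strftime(target_format)


print(date_format("2014-11-11 13:23:55", "%S", "%Y-%m-%d %H:%M:%S"))  # 55
print(date_format_ms(1415692424000, "%Y-%m-%d"))                      # 2014-11-11
```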

extract

  • dateValue - value of the date, e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • unit - part of the date to extract, e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
  • dateFormat - date format of the provided date value, e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds - date value in milliseconds from the epoch, e.g., 1415712224000L

<int> extract(<string> unit, <string> dateValue, <string> dateFormat)

  • Extension Type: Function
  • Description: Returns the component of dateValue that is specified by the unit parameter.
  • Examples: extract('year', '2014-3-11 02:23:44', 'yyyy-MM-dd hh:mm:ss') will return 2014

<int> extract(<string> unit, <string> dateValue)

  • Extension Type: Function
  • Description: Returns the component of dateValue that is specified by the unit parameter.
  • Examples: extract('year', '2014-3-11 02:23:44.234') will return 2014

<int> extract(<long> timestampInMilliseconds, <string> unit)

  • Extension Type: Function
  • Description: Returns the component of the timestamp that is specified by the unit parameter.
  • Examples: extract(1394484824000, 'year') will return 2014
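As an illustration, extracting a component amounts to parsing the value and reading one field. The Python sketch below is hypothetical and covers only the units that map directly onto a datetime attribute; "QUARTER" and "WEEK" are left out because their exact definition is not given here.

```python
from datetime import datetime

# Units that correspond one-to-one to datetime attributes.
UNIT_ATTRIBUTES = {"YEAR": "year", "MONTH": "month", "DAY": "day",
                   "HOUR": "hour", "MINUTE": "minute", "SECOND": "second"}


def extract(unit, date_value, date_format="%Y-%m-%d %H:%M:%S"):
    """Return the requested component of date_value as an integer."""
    parsed = datetime.strptime(date_value, date_format)
    return getattr(parsed, UNIT_ATTRIBUTES[unit.upper()])


print(extract("year", "2014-3-11 02:23:44"))   # 2014
print(extract("month", "2014-3-11 02:23:44"))  # 3
```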

date

<string> date(<string> dateValue, <string> dateFormat)

  • Extension Type: Function
  • Description: Returns the date component of dateValue.
  • Examples: date('2014-11-11 13:23:44', 'yyyy-MM-dd HH:mm:ss') will return 2014-11-11

timestampInMilliseconds

<long> timestampInMilliseconds()

  • Extension Type: Function
  • Description: Returns the current timestamp in milliseconds.
  • Examples: timestampInMilliseconds() will return 1440160328693

<long> timestampInMilliseconds(<string> dateValue)

  • Extension Type: Function
  • Description: Returns the timestamp, in milliseconds, of the value specified in the dateValue parameter. The value must be in 'yyyy-MM-dd HH:mm:ss.SSS' format.
  • Examples: timestampInMilliseconds('2007-11-30 10:30:19.000') will return 1196398819000

<long> timestampInMilliseconds(<string> dateValue, <string> dateFormat)

  • Extension Type: Function
  • Description: Returns the timestamp, in milliseconds, of the value specified in the dateValue parameter. The date format is given in the dateFormat parameter.
  • Examples: timestampInMilliseconds('2007-11-30 10:30:19', 'yyyy-MM-dd HH:mm:ss') will return 1196398819000
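A date string without a zone only maps to an epoch timestamp once a time zone is chosen. The example value 1196398819000 is what results when '2007-11-30 10:30:19' is read at UTC+05:30; that the extension uses the server's local zone is an assumption here, and the Python function below is a hypothetical sketch that makes the offset explicit.

```python
from datetime import datetime, timedelta, timezone


def timestamp_in_milliseconds(date_value, date_format, utc_offset):
    """Interpret date_value at the given UTC offset and return epoch milliseconds."""
    naive = datetime.strptime(date_value, date_format)
    aware = naive.replace(tzinfo=timezone(utc_offset))
    return int(aware.timestamp()) * 1000 + aware.microsecond // 1000


value = "2007-11-30 10:30:19"
fmt = "%Y-%m-%d %H:%M:%S"

print(timestamp_in_milliseconds(value, fmt, timedelta(hours=5, minutes=30)))  # 1196398819000
print(timestamp_in_milliseconds(value, fmt, timedelta(0)))                    # 1196418619000
```

The two results differ by 19,800,000 ms, which is the 5 hours 30 minutes between the two offsets.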

utcTimestamp

<string> utcTimestamp()

  • Extension Type: Function
  • Description: Returns the current system time in UTC, in yyyy-MM-dd HH:mm:ss format.
  • Examples: utcTimestamp() will return 2015-08-21 12:16:13

nlp

findNameEntityType

<string> findNameEntityType(<string> entityType, <bool> groupSuccessiveMatch, <string> string-variable)

  • Extension Type: Function
  • Description
    • The findNameEntityType function takes in
      • entityType: a user-given string constant that specifies the entity type: PERSON, LOCATION, ORGANIZATION, MONEY, PERCENT, DATE or TIME
      • groupSuccessiveMatch: a user-given boolean constant that specifies whether to group successive matches of the given entity type
      • string-variable: a string or the stream attribute in which the text stream resides
    • It returns the entities in the text. If groupSuccessiveMatch is true, the result aggregates successive words of the same entity type.
  • Examples: findNameEntityType("PERSON", true, text): if the text attribute contains "Bill Gates donates £31million to fight Ebola", the result is "Bill Gates". If groupSuccessiveMatch is false, two events are generated, "Bill" and "Gates".

findNameEntityTypeViaDictionary

<string> findNameEntityTypeViaDictionary(<string> entityType, <string> dictionaryFilePath, <string> string-variable)

  • Extension Type: Function
  • Description
    • The findNameEntityTypeViaDictionary function takes in
      • entityType: a user-given string constant that specifies the entity type: PERSON, LOCATION, ORGANIZATION, MONEY, PERCENT, DATE or TIME
      • dictionaryFilePath: path to the dictionary file, which lists the expected entities for each entity type
      • string-variable: a string or the stream attribute in which the text stream resides
    • It returns the entities in the text that match the dictionary entries for the given entity type.
  • Examples: findNameEntityTypeViaDictionary("PERSON", "dictionary.xml", text): if the text attribute contains "Bill Gates donates £31million to fight Ebola" and the dictionary lists "Bill" as a PERSON entity, the result is "Bill".

findRelationshipByVerb

<string> text, <string> subject, <string> object, <string> verb findRelationshipByVerb(<string> verb, <string> string-variable)

  • Extension Type: Function
  • Description: Takes a user-given string constant as a verb, and a text stream. It returns the whole text together with the subject, object, and verb of each relationship that can be extracted from the text stream for any form of that verb.
    • The findRelationshipByVerb function takes in
      • verb: a user-given string constant
      • string-variable: a string or the stream attribute in which the text stream resides
    • It returns the complete string, the subject, the object, and the verb if the given verb occurs in the input text.
  • Examples: findRelationshipByVerb("say", "Information just reaching us says another Liberian With Ebola Arrested At Lagos Airport") returns 4 parameters: the whole text, subject as "Information", object as "Liberian", and verb as "says".

findRelationshipByRegex

<string> text, <string> subject, <string> object, <string> verb findRelationshipByRegex(<string> regex, <string> string-variable)

  • Extension Type: Function
  • Description: Returns the whole text, subject, object, and verb from the text stream that match the named nodes of the Semgrex pattern.
    • The findRelationshipByRegex function takes in
      • regex: a user-given regular expression that follows the Semgrex pattern syntax
      • string-variable: a string or the stream attribute in which the text stream resides
    • It returns the whole text, the subject, the object, and the verb for each match of the pattern in the input text.
  • Examples: findRelationshipByRegex('{}=verb >/nsubj|agent/ {}=subject >/dobj/ {}=object', "gates foundation donates $50M in support of #Ebola relief") returns 4 parameters: the whole text, subject as "foundation", object as "$", and verb as "donates".

findSemgrexPattern

<string> text, <string> match, <string> object, <string> verb findSemgrexPattern(<string> regex, <string> string-variable)

  • Extension Type: Function
  • Description: Returns the whole text, the matched word(s), and the word(s)/relation(s) that match each named node and named relation of the Semgrex pattern.
    • The findSemgrexPattern function takes in
      • regex: a user-given regular expression that follows the Semgrex pattern syntax
      • string-variable: a string or the stream attribute in which the text stream resides
    • It returns the word(s)/phrase(s) from the text stream that match the Semgrex pattern, and the word(s)/relation(s) that match each named node and each named relation defined in the regular expression.
  • Examples: findSemgrexPattern('{lemma:die} >/.*subj|num.*/=reln {}=diedsubject', "Sierra Leone doctor dies of Ebola after failed evacuation.") returns 4 parameters: the whole text, match as "dies", reln as "nsubj", and diedsubject as "doctor".
    • This looks for words with the lemma "die" that are governors of any subject or numeric relation. The dependent is marked as diedsubject and the relationship is marked as reln. The query therefore returns an output stream that contains the full match of this expression, i.e. the governing word with the lemma "die". In addition, it outputs the named node diedsubject and the named relation reln for each match it finds.

findTokensRegexPattern

<string> text, <string> match, <string> group_1, etc. findTokensRegexPattern(<string> regex, <string> string-variable)

  • Extension Type: Function
  • Description: Returns the whole text, the matched phrase, and the text captured by each group of the TokensRegex pattern.
    • The findTokensRegexPattern function takes in
      • regex: a user-given regular expression that follows the TokensRegex pattern syntax
      • string-variable: a string or the stream attribute in which the text stream resides
    • It returns the word(s)/phrase(s) from the text stream that match the TokensRegex pattern, and the word(s) that match each capturing group defined in the regular expression.
  • Examples: findTokensRegexPattern('([ner:/PERSON|ORGANIZATION|LOCATION/]+) (?:[]* [lemma:donate]) ([ner:MONEY]+)', text) returns 4 parameters: the whole text, match as "Paul Allen donates $ 9million", group_1 as "Paul Allen", and group_2 as "$ 9million".
    • The pattern defines three groups, of which the middle one is a non-capturing group. The first group looks for entities of type PERSON, ORGANIZATION or LOCATION, with one or more successive words matching the same type. The second group represents any number of words followed by a word with the lemma "donate", such as donates, donated or donating. The third group looks for one or more successive entities of type MONEY.

pmml 

predict

<double|float|long|int|string|boolean> predict(<string> pathToPmmlFile)

  • Extension Type: Stream Processor
  • Description: Processes the input stream attributes according to the defined PMML standard model and outputs the processed results along with the input stream attributes.
    • The predict function takes in
      • pathToPmmlFile: path to the PMML model file
    • Returns the outputs defined in the output fields. The number of outputs can vary.
  • Examples: predict('<CEP HOME>/samples/artifacts/0301/decision-tree.pmml')
    • This model is implemented to detect network intruders. The input event stream is processed by the execution plan, which uses the PMML predictive model to detect whether a particular user is an intruder to the network. The output stream contains the processed query results, which include the predicted responses along with the feature values extracted from the input event stream.

<double|float|long|int|string|boolean> predict(<string> pathToPmmlFile, <double|float|long|int|string|boolean> input)

  • Extension Type: Stream Processor
  • Description: Processes the input stream attributes according to the defined PMML standard model and outputs the processed results.
    • The predict function takes in
      • pathToPmmlFile: path to the PMML model file
      • input: an attribute of the input stream that is sent to the PMML model as a value for predictions. The function does not accept constant values as input parameters. Multiple input parameters can be given, according to the input stream definition.
    • Returns the processed outputs defined in the query. The number of outputs can vary according to the query definition.
  • Examples: predict('<CEP HOME>/samples/artifacts/0301/decision-tree.pmml', root_shell double, su_attempted double, num_root double, num_file_creations double, num_shells double, num_access_files double, num_outbound_cmds double, is_host_login double, is_guest_login double, count double, srv_count double, serror_rate double, srv_serror_rate double)
    • This model is implemented to detect network intruders. The input event stream is processed by the execution plan, which uses the PMML predictive model to detect whether a particular user is an intruder to the network. The output stream contains the processed query results, which include the predicted responses.

...