The latest version for DAS is WSO2 Data Analytics Server 3.2.0. View documentation for the latest release.
WSO2 Data Analytics Server is succeeded by WSO2 Stream Processor. To view the latest documentation for WSO2 SP, see WSO2 Stream Processor Documentation.

Facets and score parameters are used to categorize data based on attributes of an event stream in WSO2 DAS.

Facets

A facet is an attribute of indexed records that is used to classify the records by the attribute value. Facet attributes allow you to carry out an extended, faceted search within the defined categories. There is no data type called FACET in event streams. Any STRING type field whose attribute value you can define as a JSON array can be indexed as a facet. Facets are defined when the table schema is created while persisting event streams.

Facets are used in the implementations of the following REST APIs

Facet usage types

Facets are used in the Analytics REST API and Data Explorer in WSO2 DAS. The different usage types of facets are described below.

Searching by an attribute

This denotes implementing an attribute of a set of records as a facet. For example, in a record that represents a book, you can define the AUTHOR field as a facet when you are persisting the event stream as shown below.

categorizing records by one facet attribute

Click the Simulate option of the event stream to simulate sending events to the created event stream. 

simulate one facet attribute

For the attribute that you defined as a facet, you need to send its values as a JSON string array, as shown in the example below.

send event for one attribute facet
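
As a minimal sketch of what such a value looks like, the Python snippet below serializes a list of path elements into a JSON string array. The helper name `facet_value` is hypothetical and is not part of WSO2 DAS; it only illustrates the expected shape of the STRING field.

```python
import json

def facet_value(path):
    """Serialize facet path elements into a JSON string array (hypothetical helper)."""
    # The facet is sent as a JSON string array, so coerce each element to str.
    return json.dumps([str(element) for element in path])

# A single-level facet, such as an author name.
print(facet_value(["C.Dickens"]))
# A hierarchical facet: year, month, day.
print(facet_value([1866, "05", "03"]))
```

The resulting string is what you would enter as the attribute value when simulating an event.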

You can use this facet to retrieve the records of the books which are written by a particular author using the Data Explorer as shown in the example below.

retrieve records using one facet

You can perform the above search in the Analytics REST API using the following request. For more information, see  Drilling Down Through Categories via REST API.

POST https://localhost:9443/analytics/drillDown
 
{
    "tableName": "BOOK_STORE",
    "categories": [
        {
            "fieldName": "AUTHOR",
            "path": ["C.Dickens"]
        }
    ],
    "query": "timestamp : [1243214324532 TO 4654365223]",
    "recordStart": 0,
    "recordCount": 100
}
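
The request above could be assembled programmatically along the following lines. This is only a sketch: the function name `build_drilldown_request` is hypothetical, the `Content-Type` header is an assumption, and authentication and TLS settings are deliberately left out. The request is built but not sent.

```python
import json
import urllib.request

# Endpoint taken from the sample request above.
DRILLDOWN_URL = "https://localhost:9443/analytics/drillDown"

def build_drilldown_request(table, field, path, query, start=0, count=100):
    """Build (but do not send) a POST request for the drillDown endpoint."""
    body = {
        "tableName": table,
        "categories": [{"fieldName": field, "path": list(path)}],
        "query": query,
        "recordStart": start,
        "recordCount": count,
    }
    return urllib.request.Request(
        DRILLDOWN_URL,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_drilldown_request("BOOK_STORE", "AUTHOR", ["C.Dickens"], "*:*")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it, once credentials and certificate handling suitable for your deployment are added.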
Extracting sub categories  

Another use of facets is to extract the sub categories of a category. Using the relevant API, you can retrieve the immediate sub categories of a given category (which is represented as a JSON array) in the corresponding table.

For example, in a record that represents a book, you can define the PUBLISHED_DATE field as a facet when you are persisting the event stream as shown below.

persist events for facet hierarchy 

Click the Simulate option of the event stream to simulate sending events to the created event stream.

simulate one facet attribute  

For the attribute that you defined as a facet, send its values as a JSON string array, as shown in the example below.

send events for the facet hierarchy example

For example, if the BOOK_STORE table contains the following four records with the corresponding values for the PUBLISHED_DATE attribute, the REST API returns '08' and '04' as the sub categories of ['1926'], and '09' and '10' as the sub categories of ['1926', '08'].

  • Record 1 -  PUBLISHED_DATE: ['1926', '08', '09']
  • Record 2 -  PUBLISHED_DATE: ['1926', '04', '02']
  • Record 3 -  PUBLISHED_DATE: ['1816', '09', '01']
  • Record 4 -  PUBLISHED_DATE: ['1926', '08', '10']
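
The way these sub categories follow from the four records can be reproduced with a short Python sketch. It runs locally on plain lists and does not call WSO2 DAS; the function name `immediate_sub_categories` is hypothetical.

```python
def immediate_sub_categories(paths, category_path):
    """Return the distinct next-level values under category_path, in first-seen order."""
    depth = len(category_path)
    found = []
    for path in paths:
        # A record contributes only if its path extends the requested category.
        if len(path) > depth and path[:depth] == category_path:
            child = path[depth]
            if child not in found:
                found.append(child)
    return found

published_dates = [
    ["1926", "08", "09"],  # Record 1
    ["1926", "04", "02"],  # Record 2
    ["1816", "09", "01"],  # Record 3
    ["1926", "08", "10"],  # Record 4
]

print(immediate_sub_categories(published_dates, ["1926"]))        # ['08', '04']
print(immediate_sub_categories(published_dates, ["1926", "08"]))  # ['09', '10']
```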

You can retrieve PUBLISHED_DATE as a specific category and its sub categories using the Data Explorer as shown below.

retrieving sub categories of 1826

retrieving sub categories 

You can perform the above search in the Analytics REST API using the following requests. For more information, see Drilling Down Through Categories via REST API.

POST https://localhost:9443/analytics/facets
{
        "tableName" : "BOOK_STORE",
        "fieldName" : "PUBLISHED_DATE",
        "categoryPath" : ["1926"],
        "query" : "timestamp : [1213343534535 TO 465464564644]"
}

POST https://localhost:9443/analytics/facets
{
        "tableName" : "BOOK_STORE",
        "fieldName" : "PUBLISHED_DATE",
        "categoryPath" : ["1926","08"],
        "query" : "timestamp : [1213343534535 TO 465464564644]"
}
Performing a drill down search

This denotes a hierarchical arrangement of several categories within one attribute. Record values that you can use to classify the records can be indexed as facets. The fields that are indexed as facets are then used to implement faceted search and drill-down.

For example, in a record that represents a book, you can define the PUBLISHED_DATE field as a facet when you are persisting the event stream as shown below.

persist events for facet hierarchy 

Click the Simulate option of the event stream to simulate sending events to the created event stream.

simulate one facet attribute

For the attribute that you defined as a facet, you need to send its values as a JSON string array, as shown in the example below.

send events for the facet hierarchy example 

You can use a facet to filter the books by the published date as a specific category, with the published year, month, and day as its sub categories, using the Data Explorer as shown below.

search results of the facet hierarchy 

You can perform the above search in the Analytics REST API using the following request. It will return the records which match the drill down search. For more information, see Retrieving Specific Records through a Drill Down Search via REST API.

POST https://localhost:9443/analytics/drillDown
 
{
    "tableName": "BOOK_STORE",
    "categories": [
        {
            "fieldName": "PUBLISHED_DATE",
            "path": ["1866", "05", "03"]
        }
    ],
    "query": "timestamp : [1243214324532 TO 4654365223]",
    "recordStart": 0,
    "recordCount": 100
}

In the above example, PUBLISHED_DATE is a facet whose values are defined in a three-element JSON array. Here, "05" is a sub-category of "1866", and "03" is a sub-category of "05". This hierarchy is what makes drill down search operations possible. To retrieve records whose PUBLISHED_DATE starts with "1866", provide only ["1866"] as the value of the facet in the REST API request. Similarly, to retrieve records whose PUBLISHED_DATE is "1866/05/ANY_DAY", provide ["1866", "05"] as the value of the facet in the REST API request.
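
The prefix behaviour described above can be illustrated with a small local sketch in Python. The book titles and dates are hypothetical sample data, and the matching rule (the requested path must be a prefix of the record's path) is a simplified model of the drill down search rather than the server's implementation.

```python
def matches_drill_down(record_path, requested_path):
    """True if requested_path is a prefix of the record's facet path."""
    return record_path[:len(requested_path)] == requested_path

# Hypothetical records: title -> PUBLISHED_DATE facet value.
books = {
    "Book A": ["1866", "05", "03"],
    "Book B": ["1866", "05", "17"],
    "Book C": ["1866", "11", "02"],
    "Book D": ["1926", "08", "09"],
}

def drill_down(requested_path):
    """Return the titles whose facet path starts with requested_path."""
    return [title for title, path in books.items()
            if matches_drill_down(path, requested_path)]

print(drill_down(["1866"]))              # ['Book A', 'Book B', 'Book C']
print(drill_down(["1866", "05"]))        # ['Book A', 'Book B']
print(drill_down(["1866", "05", "03"]))  # ['Book A']
```

Each additional path element narrows the result set, which is the essence of drilling down.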

You can also perform the above search in the Analytics REST API using the following request. It returns the number of records that match the drill down search. For more information, see Retrieving the Number of Records Matching the Drill Down Criteria via REST API.

POST https://localhost:9443/analytics/drillDownCount
 
{
    "tableName": "BOOK_STORE",
    "categories": [
        {
            "fieldName": "PUBLISHED_DATE",
            "path": ["1866", "05", "03"]
        }
    ],
    "query": "timestamp : [1243214324532 TO 4654365223]",
    "recordStart": 0,
    "recordCount": 100
}
Searching data within a value range

This denotes using facets to filter data based on a value range of an attribute that is defined as a facet.

For example, in a record that represents a book, you can define the PRICE field as a facet when you are persisting the event stream as shown below.

Define the field based on which you want to search data as numeric (INTEGER, FLOAT etc.), and as an Index Column.

defining price field as a facet

Click the Simulate option of the event stream to simulate sending events to the created event stream.

simulate one facet attribute

You can perform the above search in the Analytics REST API using the following request. For more information, see Retrieving the Event Count of Range Facets.

In this WSO2 DAS version, the Data Explorer does not support performing search operations on range facets.

POST https://localhost:9443/analytics/rangecount

{
        "tableName": "BOOK_STORE",
        "rangeField" : "PRICE",
        "ranges" : [
                {
                        "label" : "20USD - 30USD",
                        "from" : 20,
                        "to" : 30
                },
                {
                        "label" : "30USD - 40USD",
                        "from" : 30,
                        "to" : 40
                }
        ],
        "query" : "*:*"
}
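
To make the expected result of a range count concrete, the following Python sketch counts hypothetical PRICE values per labelled range. It is a local model only. Whether the server treats the `from` and `to` bounds as inclusive or exclusive is not stated here, so the sketch assumes `from` is inclusive and `to` is exclusive.

```python
def range_counts(values, ranges):
    """Count how many values fall into each labelled range.

    Assumption: 'from' is inclusive and 'to' is exclusive.
    """
    counts = {}
    for r in ranges:
        counts[r["label"]] = sum(1 for v in values if r["from"] <= v < r["to"])
    return counts

# Same ranges as in the request above.
ranges = [
    {"label": "20USD - 30USD", "from": 20, "to": 30},
    {"label": "30USD - 40USD", "from": 30, "to": 40},
]

# Hypothetical PRICE values of four records.
prices = [22.5, 30.0, 35.0, 50.0]

print(range_counts(prices, ranges))
# {'20USD - 30USD': 1, '30USD - 40USD': 2}
```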

Score parameters

Score parameters are used as function parameters of score functions. You can define only INTEGER, DOUBLE, FLOAT or LONG type fields as score parameters. You can define score parameters when you persist event stream definitions along with indices.  Score parameters are used in the implementations of the following REST APIs

Score functions

Score functions are used to override the default score of a record that has facet fields. The default score is 1, and it is used when retrieving the drill down record count and the sub categories of a category. If you override the default value, the score of a record is the result of evaluating the score function for that record.

Retrieving the record score matching a drill down search

For example, in a record that represents a book, you can define PRICE and DISCOUNT as score parameters to be used in a score function such as "PRICE - DISCOUNT". You need to define these fields as score parameters and index columns when persisting the event stream, as shown below.

defining score parameters when persisting events

Click the Simulate option of the event stream to simulate sending events to the created event stream.

simulate one facet attribute

For example, consider an event stream with the following two records.

record1 (Book1) : 
	TITLE : Oliver Twist,
	AUTHOR : C.Dickens,
	PUBLISHED_DATE : ["1866","08","03"],
	COUNT : 22,
	PRICE : 30.00,
	DISCOUNT : 10.00

record2 (Book2) :
	TITLE : Great Expectations,
	AUTHOR : C.Dickens,
	PUBLISHED_DATE : ["1826","09","14"],
	COUNT : 22,
	PRICE : 50.00,
	DISCOUNT : 12.00

Score parameters are useful when you want to use the drill down count API to get the sum of the scores of records. If you invoke the API to retrieve the drill down record count without a score function, the score of each record is 1 (i.e., the default value), so the API returns the number of records. You can define a score function as "PRICE - DISCOUNT" as shown in the REST API request below. For more information, see Drilling Down Through Categories via REST API.

POST https://localhost:9443/analytics/drillDownCount
 
{
    "tableName": "BOOK_STORE",
    "categories": [
        {
            "fieldName": "AUTHOR",
            "path": ["C.Dickens"]
        }
    ],
    "scoreFunction": "PRICE - DISCOUNT"
}

Now, the score of each record is the output of the score function. Therefore, the API returns 58 as the sum of the effective prices after applying the discount: (30 - 10) + (50 - 12) = 58.
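
The arithmetic can be checked with a short Python sketch that models the two sample records locally. The function `drill_down_score` is hypothetical; it simply sums a per-record score over the records that match the facet path, using 1 as the score when no score function is supplied.

```python
records = [
    {"TITLE": "Oliver Twist", "AUTHOR": ["C.Dickens"],
     "PRICE": 30.00, "DISCOUNT": 10.00},
    {"TITLE": "Great Expectations", "AUTHOR": ["C.Dickens"],
     "PRICE": 50.00, "DISCOUNT": 12.00},
]

def drill_down_score(records, field, path, score=None):
    """Sum the scores of the records whose facet path starts with `path`."""
    total = 0
    for record in records:
        if record[field][:len(path)] == path:
            # Without a score function, every matching record scores 1.
            total += score(record) if score else 1
    return total

# Default score: the result is the number of matching records.
print(drill_down_score(records, "AUTHOR", ["C.Dickens"]))  # 2

# Score function PRICE - DISCOUNT: the result is the sum of effective prices.
print(drill_down_score(records, "AUTHOR", ["C.Dickens"],
                       lambda r: r["PRICE"] - r["DISCOUNT"]))  # 58.0
```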

Retrieving the record score based on specific categories

You can use the REST API to retrieve the score of each record (after applying the score function) based on specific categories as shown below. For more information, see Retrieving the Number of Records Matching the Drill Down Criteria via REST API.

POST https://localhost:9443/analytics/facets
{
        "tableName" : "BOOK_STORE",
        "fieldName" : "PUBLISHED_DATE",
        "categoryPath" : ["1866", "08"],
        "query" : "timestamp : [1213343534535 TO 465464564644]",
        "scoreFunction" : "PRICE-DISCOUNT"
}

A sample output of the above request is given below. It denotes the following.

  • Output of the score function of the records with the PUBLISHED_DATE as ["1866", "08", "23"] is 15.
  • Output of the score function of the records with the PUBLISHED_DATE as ["1866", "08", "12"] is 25.
{
    "categoryPath" : ["1866", "08"],
    "categories" : {"23" : 15, "12" : 25}
}
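
A response of this shape can be modelled locally with the Python sketch below. The records are hypothetical values chosen to reproduce the sample output, and the sketch assumes that the score reported for a sub category is the sum of the score function over the records in that sub category.

```python
def sub_category_scores(records, field, category_path, score):
    """Aggregate the score of each immediate sub category under category_path."""
    depth = len(category_path)
    totals = {}
    for record in records:
        path = record[field]
        if len(path) > depth and path[:depth] == category_path:
            child = path[depth]
            # Assumption: scores of records in the same sub category are summed.
            totals[child] = totals.get(child, 0) + score(record)
    return totals

# Hypothetical records, not taken from the BOOK_STORE examples above.
records = [
    {"PUBLISHED_DATE": ["1866", "08", "23"], "PRICE": 20, "DISCOUNT": 5},
    {"PUBLISHED_DATE": ["1866", "08", "12"], "PRICE": 30, "DISCOUNT": 5},
    {"PUBLISHED_DATE": ["1926", "04", "02"], "PRICE": 40, "DISCOUNT": 0},
]

result = {
    "categoryPath": ["1866", "08"],
    "categories": sub_category_scores(
        records, "PUBLISHED_DATE", ["1866", "08"],
        lambda r: r["PRICE"] - r["DISCOUNT"]),
}
print(result)
# {'categoryPath': ['1866', '08'], 'categories': {'23': 15, '12': 25}}
```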