Stream processing and analytics refer to collecting, analyzing and acting on events generated during business activities. This definition is paramount when designing a solution to address a real-time streaming use case. Collecting refers to the collection of data from various data sources. Analysis refers to the manipulation of data to identify interesting patterns and to extract information. Acting refers to notifying the results to other systems and personals and representing the analyzed data visually. Streaming data that needs to be processed or analyzed, sequentially passes through these sections.
The WSO2 SP architecture reflects this natural flow in its design as illustrated below.
WSO2 SP contains Siddhi as its core to collect, analyze and act on the incoming events. The following are the major components of SP.
Siddhi is the major component of SP which has the capability of running the stream processing and complex event processing logic. Stream processing logic can be scripted using a Streaming SQL language as Siddhi Application and deployed into stream processor for processing. It handles collection, analysis and performs actions based on the events which it receives and Siddhi Apps deployed.
Siddhi contains the following core elements:
|Siddhi functions||Siddhi core element and description|
Source: Sources receive events via multiple transport protocols such as HTTP, TCP, Kafka, JMS, etc., and in different data formats such as XML, JSON, Text, Binary, etc. The events received via sources are mapped into streams for processing.
Stream: Streams represent a continuous stream of events which adheres to a defined schema.
Table: Tables represent a static set of events which adheres to a defined schema. Table can be manipulated with "insert", "update", "delete", and "update or insert" operations. Events in tables can be retrieved by joining them with steams or using REST API.
Window: Windows represent a set of events which adheres to a defined schema which gets emitted based on the given window condition. Events in windows can be retrieved by joining them with steams or using REST API.
Aggregation: Aggregations consume events from a stream and perform predefined aggregations. The results of the aggregations can be retrieved by joining them with steams or using REST API.
Query: Queries help you process events. Queries process streams, tables, and windows and produce new streams, or update tables or windows. A query can contain filters, windows, aggregations, joins, patterns and/or sequence operations.
Store: Stores are mapped to tables and it allows you to store events in various databases and systems such as RDBMS, Apache Cassandra, MongoDB, Apache Solr, Apache HBase, Hazelcast and many more.
Trigger: Triggers produce periodic events to a stream to achieve periodic execution of query logic.
Sink: Sinks publish events arriving at streams via multiple transport protocols such as HTTP, Email, TCP, Kafka, JMS, etc., by mapping the events to different data formats such as XML, JSON, Text, Binary, etc.
For more information about Siddhi, see the Siddhi Query Guide.
Stream Processor Studio/Editor
The Stream Processor Studio provides an environment for developers to build Siddhi applications with the support of syntax highlighting, auto-completion, and with integrated documentation support. It also allows them to test the application using simulations and debug the application to verify the processing logic.
For more information, see Understanding the Development Environment.
This is used for data visualization in WSO2 SP. The data from real-time streams and stored tables can be visualized via the portal. The portal can contain several dashboards and widgets that can be generated and customized by users based on their requirements.
For more information, see Visualizing Data.
Business rules provide a mechanism for business users to manage the rules themselves. Here business users can create/edit/delete simple filters using a form-based interface and or predefined parameterized rules created by the developers.
For more information, see Working with Business Rules.
This lets you monitor the system in operation by getting to fine-grain details about its throughput, latency, and how much load it is handling to better understand and manage the environment.
For more information, see Monitoring Stream Processor.
The worker provides a lightweight stream processing server that lets you deploy and run Siddhi applications in production.
This is used only on fully distributed deployments, to automatically deploy and manage Siddhi applications on multiple Stream Processor worker nodes.