The latest version for DAS is WSO2 Data Analytics Server 3.1.0. View documentation for the latest release.
WSO2 Data Analytics Server is succeeded by WSO2 Stream Processor. To view the latest documentation for WSO2 SP, see WSO2 Stream Processor Documentation.


Index data in WSO2 DAS is stored in a local file system. All index data are partitioned into units known as shards. These shards can be viewed in the <DAS_HOME>/repository/data/index_data directory where there is a sub directory for each available shard.
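To check which shards a node currently holds, you can list the sub directories of the index data directory. The following is a minimal sketch and not part of WSO2 DAS; the function name list_shard_dirs is hypothetical, and the sketch makes no assumption about how the shard sub directories are named.

```python
from pathlib import Path


def list_shard_dirs(index_data_dir):
    """Return the names of the sub directories found under index_data_dir.

    Each sub directory is expected to correspond to one shard.
    Returns an empty list if the directory does not exist.
    """
    root = Path(index_data_dir)
    if not root.is_dir():
        return []
    # Plain files are ignored; only sub directories are reported.
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Call it with the resolved path of <DAS_HOME>/repository/data/index_data on the node you want to inspect.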

Configuring shards

Shards that exist in the local file system can be managed by configuring the following parameters in the <DAS_HOME>/repository/conf/analytics/analytics-config.xml file.

indexReplicationFactor: The number of index data replicas that should be saved in the system. In a high availability deployment, at least one replica should be saved. Default value: 1

The number of index shards that are allowed to exist in the local file system at a given time. The number specified should be higher than the number of indexing nodes in the DAS cluster. The ideal number can be calculated as follows.

number of indexing nodes * [CPU cores used for indexing per node]
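The calculation above can be written as a small helper. This is only an illustrative sketch; the function name and its arguments are hypothetical and are not DAS configuration properties.

```python
def ideal_shard_count(indexing_nodes, cores_per_node):
    """Number of indexing nodes multiplied by the CPU cores used for indexing per node."""
    if indexing_nodes < 1 or cores_per_node < 1:
        raise ValueError("both values must be at least 1")
    return indexing_nodes * cores_per_node


# Example: 3 indexing nodes, each using 2 CPU cores for indexing.
shards = ideal_shard_count(3, 2)  # 3 * 2 = 6
```

Note that with a single indexing core per node the result equals the number of indexing nodes, so it would need to be raised to satisfy the requirement that the shard count is higher than the node count.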

shardIndexRecordBatchSize: The amount of index data to be processed by a shard index worker at a given time, expressed in bytes. The minimum value is 1000. Default value: 20971520

The time interval during which a shard index processing worker can be inactive while processing operations are taking place, expressed in milliseconds. This parameter, together with the shardIndexRecordBatchSize parameter, can be used to increase the final amount of batched index data that an index worker processes at a given time. A larger batch usually results in a higher throughput. However, it can also increase the latency between record insertion and indexing. The minimum value is 10, and the maximum value is 60000 (1 minute).
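The limits stated for the batch size and the worker interval can be checked before editing the configuration file. The sketch below is an assumption-laden illustration: the function and constant names are hypothetical, and it only encodes the bounds given in the text (a minimum of 1000 bytes, and an interval between 10 and 60000 milliseconds).

```python
MIN_BATCH_SIZE_BYTES = 1000
MIN_WORKER_INTERVAL_MS = 10
MAX_WORKER_INTERVAL_MS = 60000


def check_index_worker_settings(batch_size_bytes, worker_interval_ms):
    """Return a list of problems found; an empty list means both values are in range."""
    problems = []
    if batch_size_bytes < MIN_BATCH_SIZE_BYTES:
        problems.append(
            f"batch size {batch_size_bytes} is below the minimum of {MIN_BATCH_SIZE_BYTES} bytes"
        )
    if not MIN_WORKER_INTERVAL_MS <= worker_interval_ms <= MAX_WORKER_INTERVAL_MS:
        problems.append(
            f"worker interval {worker_interval_ms} ms is outside "
            f"{MIN_WORKER_INTERVAL_MS}-{MAX_WORKER_INTERVAL_MS} ms"
        )
    return problems
```

For example, a batch size of 20971520 bytes with an illustrative interval of 2000 milliseconds passes both checks, whereas an interval of 5 milliseconds is reported as out of range.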


Allocating shards in a clustered deployment

In a WSO2 DAS cluster, the available shards are equally distributed among all the indexing nodes (i.e., nodes for which indexing is enabled). For example, if the cluster has 3 indexing nodes and 6 shards, each indexing node is assigned two shards (unless replication is enabled). When a new indexing node joins a cluster, the existing shard allocations change in order to assign some of the shards to the new node.
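The effect of equal distribution can be illustrated with a short sketch. This is not the allocation algorithm used by DAS, which is not described here; the round-robin strategy, the node names, and the numbering of shards from 0 are all assumptions made for illustration, and replication is ignored.

```python
def distribute_shards(shard_count, nodes):
    """Assign shards 0..shard_count-1 to the given nodes in round-robin order."""
    if not nodes:
        raise ValueError("at least one indexing node is required")
    allocation = {node: [] for node in nodes}
    for shard in range(shard_count):
        # Cycle through the nodes so that shard counts differ by at most one.
        allocation[nodes[shard % len(nodes)]].append(shard)
    return allocation


# 6 shards over 3 indexing nodes: each node is assigned two shards.
three_nodes = distribute_shards(6, ["node1", "node2", "node3"])

# When a fourth indexing node joins, some shards move to the new node.
four_nodes = distribute_shards(6, ["node1", "node2", "node3", "node4"])
```

With three nodes every node holds two shards; with four nodes, two nodes hold two shards and two nodes hold one.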

If you do not want a new node to operate as an indexing node, you should disable indexing at the time the node is started by setting the disableIndexing=true property.


If you want to stop an existing node operating as an indexing node, you should restart it with the same setting. As a result, the existing shard allocation in the indexing cluster changes in order to reallocate the shards of the quitting node to other indexing nodes.

Mistakenly started indexing nodes

If you start a node as an indexing node by mistake, it changes the global configurations, and these changes need to be reverted manually. If the replication factor is equal to or greater than 1, you can still query and retrieve the required data even if this node is inactive, by following the procedure below.

  1. Restart the node as a non-indexing node (i.e. by setting the disableIndexing=true property at the time the node is restarted).
  2. If you want to clear the index data stored in the node, delete it from the <DAS_HOME>/repository/data/indexing_data directory.
  3. If you want to use the node in another server profile, restart the node in the required profile.
  • When you restart an indexing node as a non-indexing node, you should also restart the other indexing servers for them to get the indexing updates of the node that stopped operating as an indexing node.
  • If you start an indexing server by mistake, it changes the global configurations. You need to make sure that the shard allocations are correct before proceeding.

When an indexing node is restarted as a non-indexing node, the indexing data stored in it is not automatically removed. If required, you can remove it from the <DAS_HOME>/repository/data/indexing_data directory.

Allocating shards manually

Shards can be configured manually in the <DAS_HOME>/repository/conf/analytics/local-shard-allocation-config.conf file.

There are three modes when configuring the local shard allocations of a node.

NORMAL: The indexing data for a shard is stored in the node to which the shard is assigned.

INIT: If you restart the server after adding a shard in the INIT mode, that shard is re-indexed in that node.

e.g., if you add the line 4, INIT to the existing shard allocations and restart the server, the data for shard 4 is reindexed. After the data is reindexed, the mode is changed to NORMAL.


RESTORE: This mode allows you to copy index data to a local node in order to let that node use it.

e.g., if you copy the index data for shard 5, add the line 5, RESTORE to the shard allocation, and then restart the server, the 5th shard is allocated to that node (which then uses it to search and index).
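Before restarting a node, it can help to check that every line of the allocation file names a valid mode. The sketch below assumes one allocation per line in the form "shard number, MODE", which is inferred from the examples 4, INIT and 5, RESTORE; the function name and the sample contents are hypothetical, and the real file may allow syntax that this sketch does not handle.

```python
VALID_MODES = {"NORMAL", "INIT", "RESTORE"}


def parse_shard_allocations(text):
    """Parse lines such as '4, INIT' into a {shard_number: mode} dictionary."""
    allocations = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        shard_part, separator, mode_part = line.partition(",")
        if not separator:
            raise ValueError(f"expected 'shard, MODE' but found: {line!r}")
        shard = int(shard_part.strip())
        mode = mode_part.strip().upper()
        if mode not in VALID_MODES:
            raise ValueError(f"unknown mode {mode!r} for shard {shard}")
        allocations[shard] = mode
    return allocations


# Hypothetical file contents: two shards in NORMAL mode, shard 4 queued for reindexing.
sample = "1, NORMAL\n2, NORMAL\n4, INIT\n"
parsed = parse_shard_allocations(sample)
```

A line with any mode other than NORMAL, INIT, or RESTORE raises a ValueError, so a typing mistake is caught before the server is restarted.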
