WSO2 BAM uses MapReduce jobs to delete Cassandra data. As a result, you can delete a large amount of data using a cluster of Hadoop nodes. There are two ways to configure data deletion.
Using Management Console
You can use the BAM Management Console to run the deletion process either manually or by scheduling it using a cron expression.
For instructions on using the Management Console for the deletion process, see Archiving Cassandra Data. Follow those instructions, but select Deletion instead of Archival.
Using a Toolbox
You can schedule the deletion process for a particular event stream using a toolbox. Define the deletion process by adding the following properties to the stream.properties file, which can be found inside the toolbox directory.
The configuration properties are explained below:
Enables or disables purging.
Keeps only the last 'n' days of data in the Column Family of the particular stream. For example, according to the above configuration, the system retains only data from the last 90 days and deletes older data.
A cron expression used to schedule the deletion process. For example, according to the above configuration, the deletion job runs every day at midnight.
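The retention and scheduling behaviour described above can be illustrated with a small Python sketch. This is not BAM's implementation (BAM performs the deletion with MapReduce jobs); it only shows the arithmetic implied by a 90-day retention period and a daily midnight schedule. The function names are hypothetical, and treating the cutoff as exclusive (an event exactly 90 days old is kept) is an assumption.

```python
from datetime import datetime, timedelta


def retention_cutoff(now: datetime, retain_days: int) -> datetime:
    """Return the instant before which data is considered older than the retention period."""
    if retain_days < 0:
        raise ValueError("retain_days must be non-negative")
    return now - timedelta(days=retain_days)


def is_purgeable(event_time: datetime, now: datetime, retain_days: int) -> bool:
    """True if the event is strictly older than the retention cutoff (assumed boundary rule)."""
    return event_time < retention_cutoff(now, retain_days)


def next_midnight(now: datetime) -> datetime:
    """Return the next midnight strictly after `now`, i.e. the next run of a daily midnight job."""
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_of_today + timedelta(days=1)


if __name__ == "__main__":
    now = datetime(2024, 4, 1, 13, 30)
    print("Cutoff:", retention_cutoff(now, 90))
    print("Next run:", next_midnight(now))
```

With a retention period of 90 days, an event stamped before the cutoff would be removed on the next scheduled run, while anything newer stays in the Column Family.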
If custom indexes have been created for a particular stream, data is removed from the respective index Column Families as well.