
WSO2 BAM uses MapReduce jobs to delete Cassandra data. As a result, you can delete large amounts of data using a cluster of Hadoop nodes. There are two ways to configure data deletion.

Using Management Console

You can use the BAM Management Console to run the deletion process either manually or by scheduling it using a cron expression. 

For instructions on using the Management Console for the deletion process, see Archiving Cassandra Data. After following those instructions, select Deletion instead of Archival, as shown below.

 Cassandra Data Deletion 

Using Toolboxes

You can schedule the deletion process for a particular event stream using a toolbox. To define the deletion process, add the following properties to the stream.properties file, which is located inside the toolbox directory.

streams.definitions.defn1.enablePurge=true
streams.definitions.defn1.purgeAfterDays=90
streams.definitions.defn1.purgeCron=0 0 0 * * ?

The configuration properties are explained below:

Property
    Description

streams.definitions.defn1.enablePurge
    Set to true to enable purging for the stream, or false to disable it.

streams.definitions.defn1.purgeAfterDays
    Keeps only the last n days of data in the Column Family of the particular stream. For example, with the above configuration, the system retains only the data from the last 90 days and deletes older data.

streams.definitions.defn1.purgeCron
    Cron expression used to schedule the deletion process. For example, with the above configuration, the deletion job runs every day at midnight.

If custom indexes have been created for a particular stream, data is removed from the respective index Column Families as well.
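
To make the three properties concrete, the following is a minimal Python sketch that reads the fragment shown above, labels the fields of the cron expression, and works out which date marks the edge of the retention window. It is an illustration only and not part of BAM: the helper names (parse_properties, purge_settings, purge_cutoff) are hypothetical, the field order assumes a six-field Quartz-style expression (suggested by the trailing ?), and the cutoff is assumed to be simply the current time minus purgeAfterDays. How BAM computes the cutoff internally is not described here.

```python
from datetime import datetime, timedelta, timezone

# Same fragment as the stream.properties example in this section.
PROPERTIES_TEXT = """\
streams.definitions.defn1.enablePurge=true
streams.definitions.defn1.purgeAfterDays=90
streams.definitions.defn1.purgeCron=0 0 0 * * ?
"""

# Assumed field order of a six-field Quartz-style cron expression.
CRON_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def purge_settings(props, stream="defn1"):
    """Collect the purge settings of one stream definition (hypothetical helper)."""
    prefix = f"streams.definitions.{stream}."
    cron_parts = props.get(prefix + "purgeCron", "").split()
    return {
        # Assumption for this sketch: a missing key means purging is disabled.
        "enabled": props.get(prefix + "enablePurge", "false").lower() == "true",
        "after_days": int(props.get(prefix + "purgeAfterDays", "0")),
        "cron": dict(zip(CRON_FIELDS, cron_parts)),
    }


def purge_cutoff(now, after_days):
    """Return the timestamp before which data falls outside the retention window."""
    return now - timedelta(days=after_days)


settings = purge_settings(parse_properties(PROPERTIES_TEXT))
now = datetime(2014, 6, 30, tzinfo=timezone.utc)  # fixed date so the result is repeatable
cutoff = purge_cutoff(now, settings["after_days"])

print("purge enabled:", settings["enabled"])
print("cron fields:", settings["cron"])
print("data older than", cutoff.date(), "would be eligible for deletion")
```

With purgeAfterDays=90 and a run on 30 June 2014, the sketch reports 1 April 2014 as the cutoff. The cron fields show second, minute, and hour all set to 0, which is what makes the job fire at midnight each day.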
