This section explains how to start a DAS instance and connect it to an Apache Spark cluster that runs in a location separate from that DAS instance.
You are required to have access to an existing external Apache Spark cluster. For more information about configuring an Apache Spark cluster, see Apache Spark Documentation - Cluster Mode Overview.
Configuring DAS to connect to an external Apache Spark cluster
Download WSO2 DAS from here. Unzip the downloaded file in your node, as well as in each node of the external Apache Spark cluster.
- Set up MySQL as follows in your node, as well as in each node of the external Apache Spark cluster. WSO2 DAS is configured with MySQL in this scenario because the datasource used needs to be of a type that can be accessed by the external Apache Spark cluster.
Download and install MySQL Server.
Download the MySQL JDBC driver.
Unzip the downloaded MySQL driver archive, and copy the MySQL JDBC driver JAR (mysql-connector-java-x.x.xx-bin.jar) into the <DAS_HOME>/repository/components/lib directory of all the nodes in the cluster.
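As a rough sketch of this step on a single node, the commands below illustrate the copy. All paths and the connector version used here are stand-ins for demonstration, not values from this guide; substitute your actual download location and <DAS_HOME>.

```shell
# Illustrative stand-ins only: replace with your real <DAS_HOME> and the
# actual mysql-connector-java-x.x.xx-bin.jar you downloaded.
DAS_HOME=/tmp/demo-das
mkdir -p "$DAS_HOME/repository/components/lib"
touch /tmp/mysql-connector-java-5.1.49-bin.jar   # stand-in for the real driver JAR
cp /tmp/mysql-connector-java-5.1.49-bin.jar "$DAS_HOME/repository/components/lib/"
ls "$DAS_HOME/repository/components/lib"
```

Repeat the copy on every node in the cluster so that all nodes load the same driver.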
- Enter the following command in a terminal/command window, where username is the username you want to use to access the databases.
mysql -u username -p
- When prompted, specify the password that will be used to access the databases with the username you specified.
Create two databases.
About using MySQL in different operating systems
For users of Microsoft Windows, when creating the database in MySQL, it is important to specify the character set as latin1. Failure to do this may result in an error (error code: 1709) when starting your cluster. This error occurs in certain versions of MySQL (5.6.x) and is related to the UTF-8 encoding. MySQL originally used the latin1 character set by default, which stores each character in a single byte. However, in recent versions, MySQL defaults to UTF-8, which can use multiple bytes per character, to be friendlier to international users. Hence, you must use latin1 as the character set as indicated below in the database creation commands to avoid this problem. Note that this may result in issues with non-latin characters (like Hebrew, Japanese, etc.). The following is how your database creation command should look.
mysql> create database <DATABASE_NAME> character set latin1;
For users of other operating systems, the standard database creation commands will suffice. For these operating systems, the following is how your database creation command should look.
mysql> create database <DATABASE_NAME>;
Execute the following script for the two databases you created in the previous step.
mysql> source <DAS_HOME>/dbscripts/mysql.sql;
- Update the <DAS_HOME>/repository/conf/datasources/analytics-datasources.xml file so that its datasources point to the MySQL databases you created. This configuration should be done in your node, as well as in each node of the external Apache Spark cluster.
For more information, see Datasources.
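As an illustrative sketch, a MySQL-backed datasource entry in analytics-datasources.xml generally follows the pattern below. The host, database name, and credentials shown are hypothetical placeholders, and the exact element names should be checked against the analytics-datasources.xml file shipped with your DAS version.

```xml
<datasource>
    <name>WSO2_ANALYTICS_EVENT_STORE_DB</name>
    <description>MySQL event store (illustrative values)</description>
    <definition type="RDBMS">
        <configuration>
            <!-- db-host, database name, and credentials are placeholders -->
            <url>jdbc:mysql://db-host:3306/EVENT_STORE_DB</url>
            <username>das_user</username>
            <password>das_password</password>
            <driverClassName>com.mysql.jdbc.Driver</driverClassName>
        </configuration>
    </definition>
</datasource>
```

Each node must use the same database endpoints so that the external Spark cluster and the DAS node read and write the same stores.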
Create a symbolic link pointing to the <DAS_HOME> in your DAS node as well as in each node of the external Apache Spark cluster. The path in which this symbolic link is located should be the same for each node. You can use a command similar to the following in order to create the symbolic link (change the location specified as required).
sudo ln -s /home/centos/das/wso2das-3.0.0-SNAPSHOT das_symlink
This creates a symbolic link named das_symlink that points to the <DAS_HOME>.
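To sanity-check the link on each node, a sketch like the following can help. The /tmp paths here are stand-ins for your real <DAS_HOME> and the common link location used across all nodes.

```shell
# Stand-in paths for demonstration; replace with your real <DAS_HOME>
# and the shared symbolic-link path.
mkdir -p /tmp/das-demo/wso2das-3.0.0-SNAPSHOT
ln -sfn /tmp/das-demo/wso2das-3.0.0-SNAPSHOT /tmp/das-demo/das_symlink
readlink /tmp/das-demo/das_symlink   # shows the link target
```

Running the same check on every node confirms that the identical link path resolves to a valid <DAS_HOME> everywhere.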
In a multi-node DAS cluster that runs in a RedHat Linux environment, you also need to update the <DAS_HOME>/bin/wso2server.sh file with the following entry so that the <DAS_HOME> is exported. This is required because the symbolic link may not be resolved correctly in this operating system.
export CARBON_HOME=<symbolic link>
- Do the following in the <DAS_HOME>/repository/conf/analytics/spark/spark-defaults.conf file in your DAS node as well as in all the nodes of the external Apache Spark cluster.
- Add the carbon.das.symbolic.link property and enter the symbolic link you created in the previous step as the value.
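As a hedged sketch, the carbon.das.symbolic.link entry in spark-defaults.conf would look something like the following. The link path shown is the hypothetical one used in the earlier symbolic-link example, not a value prescribed by this guide.

```
# spark-defaults.conf uses the "key value" format, one property per line.
carbon.das.symbolic.link  /home/centos/das/das_symlink
```

Because every node uses the same link path, this one line can be identical across the whole cluster.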
- Start the external Apache Spark cluster if it is not already started.
- Start the DAS instance. It will connect to the external Apache Spark cluster at startup.
- Go to http://<external-spark-node-IP>:4040 and check whether the Spark application web UI is displayed.
The class paths listed in the spark.driver.extraClassPath property should include the following symbolic link location.
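For illustration, assuming the symbolic link path from the earlier example, each class-path entry would be rooted at the link rather than at a node-specific <DAS_HOME>. The JAR directory shown is an assumption for the sketch; the exact entries depend on your DAS version.

```
# Class-path entries rooted at the shared symbolic link (illustrative).
spark.driver.extraClassPath  /home/centos/das/das_symlink/repository/components/plugins/*
```

Rooting the class path at the link means the same configuration line works on every node, even if each node's real <DAS_HOME> path differs.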