This documentation is for WSO2 API Manager 2.1.0.

This topic provides instructions on how to set up an active deployment of WSO2 API Manager across multiple datacenters.

Before you begin...

Make sure that you have replicated databases and file systems. For instructions, see Changing the Default API-M Databases.

Deployment Architecture 

The following diagram shows the deployment architecture of WSO2 API Manager with multiple datacenters.

 
Traffic Management

Runtime traffic

A global load balancer handles API traffic. In this deployment, the proposal is to partition traffic based on geography or IP ranges. Based on the load balancer rules, traffic flows simultaneously to both active datacenters through their datacenter-local load balancers. If one datacenter fails, traffic is routed to the second datacenter, where the gateways have datacenter-local high availability.
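
The routing rule described above can be illustrated with a minimal sketch. It is not a load balancer configuration; the IP ranges, the datacenter names, and the pick_datacenter function are assumptions made up for this example.

```python
import ipaddress

# Hypothetical IP ranges assigned to each datacenter (documentation-only address blocks).
PARTITIONS = {
    "DC1": [ipaddress.ip_network("192.0.2.0/24")],
    "DC2": [ipaddress.ip_network("198.51.100.0/24")],
}
DEFAULT_DC = "DC1"  # assumed fallback for clients outside every listed range


def pick_datacenter(client_ip, healthy):
    """Return the datacenter that should serve client_ip.

    healthy is the set of datacenter names that currently pass health checks.
    """
    address = ipaddress.ip_address(client_ip)
    preferred = DEFAULT_DC
    for dc, networks in PARTITIONS.items():
        if any(address in network for network in networks):
            preferred = dc
            break
    if preferred in healthy:
        return preferred
    # Failover: route the traffic to a datacenter that is still healthy.
    for dc in PARTITIONS:
        if dc in healthy:
            return dc
    raise RuntimeError("no healthy datacenter available")
```

With both datacenters healthy, a client in the second range is served by DC2. If DC2 is removed from the healthy set, the same client is routed to DC1.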

Management traffic

Management activities, such as API creation and throttling policy creation, are routed to the designated Active-Master datacenter. Management traffic has only datacenter-local high availability.

Throttling

Throttling data is published to the traffic managers of both datacenters. Each datacenter has a local traffic manager for throttling decisions. However, for higher accuracy, the gateways publish events to both traffic managers: the local one and the one in the other datacenter. Throttle-out notifications do not occur in both datacenters at the same moment, because there is no shared traffic manager topology in place (for efficiency reasons). However, the deployment is eventually consistent, because the throttle data is cross-published.
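
The following toy model illustrates why cross-publishing makes the two traffic managers eventually consistent. It is only a sketch and does not reflect how the WSO2 traffic manager is implemented; the TrafficManager and Gateway classes, the limit parameter, and the in-memory queue are all hypothetical.

```python
from collections import deque


class TrafficManager:
    """Counts request events per key and decides when a key is throttled out."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = {}

    def receive(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1

    def is_throttled(self, key):
        return self.counts.get(key, 0) >= self.limit


class Gateway:
    """Publishes each event locally at once and queues it for the remote datacenter."""

    def __init__(self, local_tm, remote_tm):
        self.local_tm = local_tm
        self.remote_tm = remote_tm
        self.in_flight = deque()  # events not yet delivered across datacenters

    def handle_request(self, key):
        self.local_tm.receive(key)
        self.in_flight.append(key)

    def deliver_remote(self):
        """Simulate the cross-datacenter events arriving at the remote traffic manager."""
        while self.in_flight:
            self.remote_tm.receive(self.in_flight.popleft())
```

If a gateway in one datacenter handles enough requests to reach the limit, its local traffic manager throttles the key immediately, while the remote traffic manager does so only after deliver_remote() has run.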

Analytics

Raw data is accumulated only within each datacenter and is not replicated. The summarized data (STATS_DB) in each datacenter is replicated bi-directionally.

The exception is the API-M alerting use case, which does not work in such a deployment because of its file-based indexing storage.

Configure the datacenters

This section explains how to configure the datacenters with separate databases.

The following diagram shows the deployment.

Step 1 - Configure PostgreSQL Databases

In this setup, the two Analytics nodes in one datacenter share the Event Store DB, Processed Data Store DB, and Stats DB. The AM_DB, UM_DB, and REG_DB are also shared between the API-M node and the two Analytics nodes in that datacenter.

Step 2 - Configure APIM-Analytics 2.1.0 clustered setup

  1. Configure the two API-M Analytics nodes as a clustered setup. Use API-M Analytics 2.1.0 instead of DAS 3.1.0.
    When configuring databases, use the PostgreSQL databases configured in the previous step.

If the two API-M Analytics nodes run on the same virtual machine, a port offset is mandatory. Use port offsets 1 and 2 for the two Analytics servers.

Step 3 - Configure APIM 2.1.0 with APIM-Analytics 2.1.0 clustered setup

  1. Configure APIM 2.1.0 and the two APIM-Analytics 2.1.0 nodes. For instructions on how to configure these nodes, see Configuring APIM Analytics.
  2. When configuring databases, use the same set of databases used in Step 2.
  3. After enabling Analytics, open the <API-M_HOME>/repository/conf/api-manager.xml file.
  4. Add the URLs of both Analytics servers under the DASServerURL section as a comma-separated list, as shown below.

    <DASServerURL>{tcp://localhost:7612,tcp://localhost:7613}</DASServerURL>
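
The ports in the example above follow from the port offsets used in Step 2. The sketch below shows the arithmetic, assuming that the Thrift receiver of an Analytics node listens on 7611 before any offset is applied; the das_server_url helper is hypothetical and only formats the value.

```python
DEFAULT_THRIFT_PORT = 7611  # assumed Thrift receiver port before any offset


def das_server_url(host, offsets, base_port=DEFAULT_THRIFT_PORT):
    """Build the DASServerURL value for Analytics nodes that run on one host."""
    urls = [f"tcp://{host}:{base_port + offset}" for offset in offsets]
    return "{" + ",".join(urls) + "}"
```

Calling das_server_url("localhost", [1, 2]) yields the value used in the configuration above.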

Apply the solution to add the data center ID

Before you begin...

Make sure that you have configured the databases according to the instructions in the previous section.

To ensure that no primary key violation takes place, you have to change the database schema by adding the data center ID as an extra column to the tables in STATS_DB and including it in each primary key. This ensures that, when the databases are synchronized, both Analytics clusters can write to their respective databases without conflicts. A custom Spark User Defined Function (UDF) reads the data center name from a system property, and the Spark script uses it whenever it inserts data into the STATS_DB.
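
The effect of the extra key column can be demonstrated with a small, self-contained sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the two simplified tables, the API name, and the insert_all helper are hypothetical; the real summary tables have more key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for a STATS_DB summary table, without and with dataCenter in the key.
conn.execute(
    "CREATE TABLE summary_without_dc ("
    " api TEXT, year INTEGER, month INTEGER, day INTEGER, total INTEGER,"
    " PRIMARY KEY (api, year, month, day))"
)
conn.execute(
    "CREATE TABLE summary_with_dc ("
    " api TEXT, year INTEGER, month INTEGER, day INTEGER, total INTEGER,"
    " dataCenter TEXT NOT NULL DEFAULT 'DefaultDC',"
    " PRIMARY KEY (api, year, month, day, dataCenter))"
)


def insert_all(connection, sql, rows):
    """Insert rows one by one and return how many the primary key rejected."""
    rejected = 0
    for row in rows:
        try:
            connection.execute(sql, row)
        except sqlite3.IntegrityError:
            rejected += 1
    return rejected


# The same API on the same day, summarized independently in each datacenter.
dc1_row = ("PizzaShackAPI", 2017, 10, 5, 120, "DC1")
dc2_row = ("PizzaShackAPI", 2017, 10, 5, 80, "DC2")

conflicts_without = insert_all(
    conn, "INSERT INTO summary_without_dc VALUES (?, ?, ?, ?, ?)",
    [dc1_row[:5], dc2_row[:5]],
)
conflicts_with = insert_all(
    conn, "INSERT INTO summary_with_dc VALUES (?, ?, ?, ?, ?, ?)",
    [dc1_row, dc2_row],
)
```

Without the dataCenter column, the second row collides with the first one. With it, both rows are stored side by side.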

Follow the steps below to apply the changes in each datacenter.

  1. Shut down the APIM 2.1.0 server and the API-M Analytics 2.1.0 servers in the clustered setup.
  2. Add the following parameter to the <Analytics_Home>/repository/conf/analytics/spark/spark-defaults.conf file, in each Analytics server node.

    spark.executor.extraJavaOptions -Dcarbon.data.center=DC1 
  3. Copy and replace the analytics-apim.xml file in the <Analytics_Home>/repository/conf/template-manager/domain-template/ directory in each Analytics server node.
  4. Add org.wso2.analytics.apim.spark_2.1.0.jar as a patch to each of the API-M Analytics server nodes. This file contains the newly written UDF that reads the data center ID from a system property.
  5. Copy and replace the <Analytics_Home>/repository/deployment/server/carbonapps/org_wso2_carbon_analytics_apim-1.0.0.car file with this CApp in each Analytics server node.
  6. Run the following PostgreSQL script against the WSO2AM_STATS_DB.

    Alter table API_REQUEST_SUMMARY add column dataCenter varchar(256) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_REQUEST_SUMMARY DROP CONSTRAINT API_REQUEST_SUMMARY_pkey;
    Alter table API_REQUEST_SUMMARY ADD PRIMARY KEY (api,api_version,version,apiPublisher,consumerKey,userId,context,hostName,year,month,day,dataCenter);
    
    Alter table API_VERSION_USAGE_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_VERSION_USAGE_SUMMARY DROP CONSTRAINT API_VERSION_USAGE_SUMMARY_pkey;
    Alter table API_VERSION_USAGE_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,context,hostName,year,month,day,dataCenter);
    
    Alter table API_Resource_USAGE_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_Resource_USAGE_SUMMARY DROP CONSTRAINT API_Resource_USAGE_SUMMARY_pkey;
    Alter table API_Resource_USAGE_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,consumerKey,context,resourcePath,method,hostName,year,month,day,dataCenter);
    
    Alter table API_RESPONSE_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_RESPONSE_SUMMARY DROP CONSTRAINT API_RESPONSE_SUMMARY_pkey;
    Alter table API_RESPONSE_SUMMARY ADD PRIMARY KEY (api_version,apiPublisher,context,hostName,year,month,day,dataCenter);
    
    Alter table API_FAULT_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_FAULT_SUMMARY DROP CONSTRAINT API_FAULT_SUMMARY_pkey;
    Alter table API_FAULT_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,consumerKey,context,hostName,year,month,day,dataCenter);
    
    Alter table API_DESTINATION_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_DESTINATION_SUMMARY DROP CONSTRAINT API_DESTINATION_SUMMARY_pkey;
    Alter table API_DESTINATION_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,context,destination,hostName,year,month,day,dataCenter);
    
    Alter table API_LAST_ACCESS_TIME_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_LAST_ACCESS_TIME_SUMMARY DROP CONSTRAINT API_LAST_ACCESS_TIME_SUMMARY_pkey;
    Alter table API_LAST_ACCESS_TIME_SUMMARY ADD PRIMARY KEY (tenantDomain,apiPublisher,api,dataCenter);
    
    Alter table API_EXE_TME_DAY_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_EXE_TME_DAY_SUMMARY DROP CONSTRAINT API_EXE_TME_DAY_SUMMARY_pkey;
    Alter table API_EXE_TME_DAY_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,context,year,month,day,tenantDomain,dataCenter);
    
    Alter table API_EXE_TIME_HOUR_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_EXE_TIME_HOUR_SUMMARY DROP CONSTRAINT API_EXE_TIME_HOUR_SUMMARY_pkey;
    Alter table API_EXE_TIME_HOUR_SUMMARY ADD PRIMARY KEY (api,version,tenantDomain,apiPublisher,context,year,month,day,hour,dataCenter);
    
    Alter table API_EXE_TIME_MIN_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_EXE_TIME_MIN_SUMMARY DROP CONSTRAINT API_EXE_TIME_MIN_SUMMARY_pkey;
    Alter table API_EXE_TIME_MIN_SUMMARY ADD PRIMARY KEY (api,version,tenantDomain,apiPublisher,context,year,month,day,hour,minutes,dataCenter);
    
    Alter table API_THROTTLED_OUT_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_THROTTLED_OUT_SUMMARY DROP CONSTRAINT API_THROTTLED_OUT_SUMMARY_pkey;
    Alter table API_THROTTLED_OUT_SUMMARY ADD PRIMARY KEY (api,api_version,context,apiPublisher,applicationName,tenantDomain,year,month,day,throttledOutReason,dataCenter);
    
    Alter table API_REQ_USER_BROW_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_REQ_USER_BROW_SUMMARY DROP CONSTRAINT API_REQ_USER_BROW_SUMMARY_pkey;
    Alter table API_REQ_USER_BROW_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,year,month,day,os,browser,tenantDomain,dataCenter);
    
    /* Uncomment and execute the following queries only if "APIM_GEO_LOCATION_STATS" is enabled from the admin app.
    Alter table API_REQ_GEO_LOC_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
    Alter table API_REQ_GEO_LOC_SUMMARY DROP CONSTRAINT API_REQ_GEO_LOC_SUMMARY_pkey;
    Alter table API_REQ_GEO_LOC_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,year,month,day,country,city,tenantDomain,dataCenter);
    */
    
  7. Restart the APIM 2.1.0 server and the API-M Analytics 2.1.0 servers in the clustered setup.
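
The script in step 6 repeats one pattern for every table: add the column, drop the existing primary key, and recreate the key with dataCenter appended. The sketch below generates that pattern for one table, which can help when reviewing the script or extending it to a custom table. The add_datacenter_statements helper is hypothetical, and it assumes that the existing key uses the default PostgreSQL constraint name (<table>_pkey).

```python
def add_datacenter_statements(table, key_columns, default="DefaultDC"):
    """Return the three ALTER TABLE statements that add dataCenter to a table's key."""
    new_key = ",".join(list(key_columns) + ["dataCenter"])
    return [
        f"ALTER TABLE {table} ADD COLUMN dataCenter varchar(254) "
        f"NOT NULL DEFAULT '{default}';",
        # Assumes the primary key constraint has the default name <table>_pkey.
        f"ALTER TABLE {table} DROP CONSTRAINT {table}_pkey;",
        f"ALTER TABLE {table} ADD PRIMARY KEY ({new_key});",
    ]
```

For example, add_datacenter_statements("API_LAST_ACCESS_TIME_SUMMARY", ["tenantDomain", "apiPublisher", "api"]) produces statements equivalent to the ones in the script for that table.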

Synchronize the databases

Why do we need to maintain the same data in the STATS_DB of both data centers?

In the active-active data center architecture, a request may arrive at either datacenter and is fulfilled by that datacenter. The analytics-related details of that request are stored in the STATS_DB of the same datacenter. Therefore, when analytics-related details are requested, the two datacenters may return different details, each according to its own STATS_DB. To avoid this, the STATS_DBs of both datacenters must hold the same set of data.

You can synchronize the databases by sharing the STATS_DB or by using a replication mechanism. Once the data center ID has been added to all the tables in the STATS_DB and included in the composite primary key, you can synchronize the databases using one of the following two methods.

  1. Bi-directional replication mechanism - This is master-master replication, where changes made in one node are replicated to the other nodes.
  2. Master-slave mechanism - The STATS_DB is shared among all the nodes. When the master node becomes unavailable, a slave node functions as the master node.
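
The following sketch models the outcome of bi-directional replication with plain dictionaries. It is not how BDR works internally (BDR replicates individual changes and handles conflicts); the row layout, the API name, and both helper functions are hypothetical.

```python
def replicate_bidirectional(node_a, node_b):
    """Copy the rows missing on either node to the other one.

    Keys are (api, dataCenter) pairs, so rows written in different
    datacenters never overwrite each other.
    """
    merged = {**node_a, **node_b}
    node_a.clear()
    node_a.update(merged)
    node_b.clear()
    node_b.update(merged)


def total_requests(node, api):
    """Sum the request count of an API across the rows of all datacenters."""
    return sum(count for (row_api, _dc), count in node.items() if row_api == api)


stats_dc1 = {("PizzaShackAPI", "DC1"): 120}
stats_dc2 = {("PizzaShackAPI", "DC2"): 80}
```

Before replicate_bidirectional(stats_dc1, stats_dc2) is called, each datacenter reports only its own requests. Afterwards, both hold the same two rows and report the same total.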

Follow the steps below to synchronize the databases using the bi-directional replication (BDR) mechanism.

Note that these instructions have been tested with Ubuntu and PostgreSQL.

Before you begin...

Install and enable the PostgreSQL apt repository for PGDG. This repository is required by the BDR packages.

  • Create a file named pgdg.list in /etc/apt/sources.list.d/ and add the following line.

    deb http://apt.postgresql.org/pub/repos/apt/ codename-pgdg main
  • Replace codename according to your OS version, e.g., trusty for Ubuntu 14.04, xenial for 16.04, or zesty for 17.04.

    Example - for Ubuntu 16.04
    deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main
  1. Create a 2ndquadrant.list file in the /etc/apt/sources.list.d/ directory with the repository URL given below. Change codename according to your OS version.

    deb http://packages.2ndquadrant.com/bdr/apt/ codename-2ndquadrant main
  2. Import the repository key and update the package lists.

    wget --quiet -O - http://packages.2ndquadrant.com/bdr/apt/AA7A6805.asc | sudo apt-key add -
    sudo apt-get update
  3. Remove the postgresql-9.4 packages, if you have them installed already.

    BDR requires a patched version of PostgreSQL 9.4 that conflicts with the official packages. If you already have PostgreSQL 9.4 installed, either from apt.postgresql.org or from your official distribution repository, you need to make a dump of all your databases and then uninstall the official PostgreSQL 9.4 packages before you install BDR.

    To get the DB dump:
    pg_dump database1 -f backup_stat_db.sql
    To remove the postgresql-9.4 packages:
    sudo apt-get remove postgresql-9.4
  4. Install the BDR packages. Sample commands are given below.

    sudo apt-get update
    sudo apt-get install postgresql-bdr-9.4 postgresql-bdr-9.4-bdr-plugin
  5. Make the following changes to the files in the /etc/postgresql/9.4/main/ directory in both nodes.

    postgresql.conf
    listen_addresses = '*'
    shared_preload_libraries = 'bdr'
    wal_level = 'logical'
    track_commit_timestamp = on
    max_connections = 100
    max_wal_senders = 10
    max_replication_slots = 10
    max_worker_processes = 10
    pg_hba.conf
    #Add the following configs
    hostssl all all x.x.x.x/32 trust    # Own IP address
    hostssl all all z.z.z.z/32 trust      # Second node IP address
    hostssl replication postgres x.x.x.x/32 trust   # Own IP address
    hostssl replication postgres z.z.z.z/32 trust    # Second node IP address
  6. Restart PostgreSQL in both nodes. Sample commands are given below.

    systemctl unmask postgresql
    systemctl restart postgresql
  7. Create the STATS_DB database and users.

    CREATE DATABASE stat_db;
    CREATE ROLE stat_db_user WITH SUPERUSER LOGIN PASSWORD 'SuperPass';
    GRANT ALL PRIVILEGES ON DATABASE stat_db TO stat_db_user;
  8. Create the BDR extension on the STATS_DB in both nodes. Sample commands are given below.

    \c stat_db;
    create extension pgcrypto;
    create extension btree_gist;
    create extension bdr;

    You can check the BDR extension as follows:

    stat_db=# SELECT bdr.bdr_variant();
     bdr_variant
    -------------
     BDR
    (1 row)

    stat_db=# SELECT bdr.bdr_version();
        bdr_version
    -------------------
     1.0.2-2016-11-11-
    (1 row)
  9. Create the first master node.

    Do this step only in Node 1.

    Creating the first master node
    SELECT bdr.bdr_group_create(local_node_name := 'node1', node_external_dsn := 'host=<OWN EXTERNAL IP> port=5432 dbname=stat_db');

    You can verify this as shown below.

    stat_db=# SELECT bdr.bdr_node_join_wait_for_ready();
     bdr_node_join_wait_for_ready
    ------------------------------

    (1 row)

  10. Create the second master node.

    Do this step only in Node 2.

    Creating the second master node
    SELECT bdr.bdr_group_join(local_node_name := 'node2', node_external_dsn := 'host=<OWN EXTERNAL IP> port=5432 dbname=stat_db', join_using_dsn := 'host=<NODE1 EXTERNAL IP> port=5432 dbname=stat_db');

    You can verify this with the same command given in the previous step.

  11. Restore database data.

    Do this step in only one of the two nodes.

    psql stat_db < backup_stat_db.sql
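
Several of the steps above differ between the two nodes only in the IP addresses used. The sketch below renders the node-specific pg_hba.conf lines and the BDR statements from those addresses, which may help avoid copy-and-paste mistakes. The helper functions are hypothetical; the BDR function names, the parameters, the stat_db database, and port 5432 are taken from the steps above.

```python
def bdr_dsn(host, dbname="stat_db", port=5432):
    """Build the connection string that other nodes use to reach this node."""
    return f"host={host} port={port} dbname={dbname}"


def group_create_sql(node_name, own_ip):
    """Statement to run on the first node only."""
    return (
        f"SELECT bdr.bdr_group_create(local_node_name := '{node_name}', "
        f"node_external_dsn := '{bdr_dsn(own_ip)}');"
    )


def group_join_sql(node_name, own_ip, first_node_ip):
    """Statement to run on the second node only."""
    return (
        f"SELECT bdr.bdr_group_join(local_node_name := '{node_name}', "
        f"node_external_dsn := '{bdr_dsn(own_ip)}', "
        f"join_using_dsn := '{bdr_dsn(first_node_ip)}');"
    )


def pg_hba_lines(own_ip, peer_ip):
    """Entries to add to pg_hba.conf on the node that owns own_ip."""
    lines = []
    for database, user in (("all", "all"), ("replication", "postgres")):
        for ip in (own_ip, peer_ip):
            lines.append(f"hostssl {database} {user} {ip}/32 trust")
    return lines
```

For the second node, swap the two arguments of pg_hba_lines so that its own address comes first.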

You have now successfully set up an active multi data center deployment.
