This documentation is for WSO2 API Manager 1.9.0.
Tuning Performance - API Manager 1.9.0 - WSO2 Documentation

This section describes recommended performance tuning configurations for the API Manager. It assumes that you have set up the API Manager on Unix/Linux, which is recommended for a production deployment. We also recommend a distributed API Manager setup for most production systems. Of all the components in a distributed setup, the API Gateway is the most critical because it handles all inbound calls to APIs. Therefore, we recommend a cluster of at least two API Gateway nodes in a distributed setup.

Important:

  • Performance tuning requires you to modify important system files, which affect all programs running on the server. We recommend that you familiarize yourself with these files using the Unix/Linux documentation before editing them.
  • The values discussed here are general recommendations. They might not be optimal for the specific hardware configuration in your environment. We recommend that you carry out load tests in your environment and tune the API Manager accordingly.

OS-level settings

  1. To optimize network and OS performance, configure the following settings in the /etc/sysctl.conf file of Linux. These settings specify a larger port range, a more effective TCP connection timeout value, and a number of other important parameters at the OS level.

    Do not set net.ipv4.tcp_tw_recycle = 1 when working with network address translation (NAT), for example when deploying products in EC2 or any other environment configured with NAT.

    net.ipv4.tcp_fin_timeout = 30
    fs.file-max = 2097152
    net.ipv4.tcp_tw_recycle = 1
    net.ipv4.tcp_tw_reuse = 1
    net.core.rmem_default = 524288
    net.core.wmem_default = 524288
    net.core.rmem_max = 67108864
    net.core.wmem_max = 67108864
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    net.ipv4.ip_local_port_range = 1024 65535      
  2. To alter the number of allowed open files for system users, configure the following settings in the /etc/security/limits.conf file of Linux (be sure to include the leading * character).

    * soft nofile 4096
    * hard nofile 65535

    Optimal values for these parameters depend on the environment.

  3. To alter the maximum number of processes your user is allowed to run at a given time, configure the following settings in the /etc/security/limits.conf file of Linux (be sure to include the leading * character). Each Carbon server instance you run requires up to 1024 threads (with the default thread pool configuration). Therefore, increase both the soft and hard nproc values by 1024 for each Carbon server.

    * soft nproc 20000
    * hard nproc 20000
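The sizing rule in step 3 is simple arithmetic, sketched below in Python. The helper names and the baseline of 4096 are illustrative assumptions, not values from the product. Only the 1024-threads-per-instance figure comes from the text above.

```python
# Hypothetical helpers: the names and the baseline value are illustrative assumptions.
THREADS_PER_CARBON_SERVER = 1024  # per-instance figure quoted in step 3


def required_nproc(baseline: int, carbon_servers: int) -> int:
    """Return an nproc limit that adds 1024 for each Carbon server instance."""
    if baseline < 0 or carbon_servers < 0:
        raise ValueError("baseline and carbon_servers must be non-negative")
    return baseline + THREADS_PER_CARBON_SERVER * carbon_servers


def limits_conf_lines(nproc: int) -> list:
    """Format the soft and hard nproc entries for /etc/security/limits.conf."""
    return [f"* soft nproc {nproc}", f"* hard nproc {nproc}"]


if __name__ == "__main__":
    # Assumed baseline of 4096 processes for everything else the user runs.
    for line in limits_conf_lines(required_nproc(4096, carbon_servers=2)):
        print(line)
```

Treat the result as a starting point and confirm it with load tests, as recommended at the top of this page.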

JVM-level settings

When an XML element has a large number of sub-elements and the system tries to process all of them, the system can become unstable due to memory overhead. This is a security risk.

To avoid this issue, you can define a maximum level of entity substitutions that the XML parser allows in the system. You do this using the entity expansion limit as follows in the <APIM_HOME>/bin/wso2server.bat file (for Windows) or the <APIM_HOME>/bin/wso2server.sh file (for Linux/Solaris). The default entity expansion limit is 64000.

-DentityExpansionLimit=10000

In a clustered environment, the entity expansion limit has no dependency on the number of worker nodes.

WSO2 Carbon platform-level settings

In multitenant mode, the WSO2 Carbon runtime limits the thread execution time. That is, if a thread is stuck or is taking a long time to process, Carbon detects it, interrupts it, and stops it. Note that Carbon prints the current stack trace before interrupting the thread. This mechanism is implemented as an Apache Tomcat valve. Therefore, configure it in the <APIM_HOME>/repository/conf/tomcat/catalina-server.xml file as shown below.

<Valve className="org.wso2.carbon.tomcat.ext.valves.CarbonStuckThreadDetectionValve" threshold="600"/>
  • The className is the Java class used for the implementation. Set it to org.wso2.carbon.tomcat.ext.valves.CarbonStuckThreadDetectionValve.
  • The threshold gives the minimum duration in seconds after which a thread is considered stuck. The default value is 600 seconds.
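The threshold check described above can be illustrated with a short Python sketch. It is a conceptual model only, not the valve's actual implementation. The function name and the mapping of thread names to start times are hypothetical.

```python
import time

STUCK_THRESHOLD_SECONDS = 600  # same value as the threshold attribute shown above


def find_stuck_threads(started_at, now=None, threshold=STUCK_THRESHOLD_SECONDS):
    """Return the names of threads that have been running for at least `threshold` seconds.

    `started_at` maps a (hypothetical) thread name to its start time in seconds.
    """
    if now is None:
        now = time.monotonic()
    return sorted(name for name, start in started_at.items() if now - start >= threshold)


if __name__ == "__main__":
    # Two made-up threads: one started 700 seconds ago, one 200 seconds ago.
    print(find_stuck_threads({"http-exec-1": 0, "http-exec-2": 500}, now=700))
```

A lower threshold flags slow requests sooner but also risks interrupting legitimate long-running calls.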

APIM-level settings

Timeout configurations for an API call

The following diagram shows the communication/network paths that occur when an API is called. The timeout configurations for each network call are explained below.
 

  • Get throttle policy
    The throttle policy is taken from the registry, and how it is retrieved depends on your registry configuration:
    • Local Registry DB in the API Gateway: no network call occurs.
    • Registry DB mounted directly through the <APIM_HOME>/repository/conf/registry.xml file: a DB connection timeout applies. You can configure it in the <APIM_HOME>/repository/conf/datasources/master-datasources.xml file, and the syntax depends on the JDBC driver. For example:

      jdbc:mysql://localhost:3306/database?connectTimeout=60000&socketTimeout=60000
      jdbc:jtds:sqlserver://server:port/database;loginTimeout=60;socketTimeout=60
  • Key validation
    Key validation occurs through an HTTP call to a servlet. You can configure the connection timeout by changing the following parameters in the <APIM_HOME>/repository/conf/axis2/axis2_client.xml file:

    <transportSender name="https" class="org.apache.axis2.transport.http.CommonsHTTPTransportSender">
    <parameter name="SO_TIMEOUT">60000</parameter>
    <parameter name="CONNECTION_TIMEOUT">60000</parameter>
    </transportSender>

     

  • Client call to the API Gateway and API Gateway call to the backend
    For backend communication, the API Manager uses the PassThrough transport. This is configured in the <APIM_HOME>/repository/conf/passthru-http.properties file. For more information, see Configuring passthru-http.properties in the ESB documentation.

General APIM-level recommendations

Some general APIM-level recommendations are listed below, grouped by improvement area.

API Gateway nodes

Increase the allocated memory by modifying the <APIM_HOME>/bin/wso2server.sh file with the following setting:

  • -Xms2048m -Xmx2048m -XX:MaxPermSize=1024m

NHTTP transport of API Gateway

Recommended values for the <APIM_HOME>/repository/conf/nhttp.properties file are given below. Note that the commented-out values in this file are the defaults, which apply if you do not change anything.

Property descriptions:

  • snd_t_core: Transport sender worker pool's initial thread count
  • snd_t_max: Transport sender worker pool's maximum thread count
  • snd_io_threads: Number of sender-side I/O workers, which should equal the number of CPU cores. I/O reactors usually employ a small number of dispatch threads (often as few as one) to dispatch I/O event notifications to a much larger number (often several thousand) of I/O sessions or connections. Generally, one dispatch thread is maintained per CPU core.
  • snd_alive_sec: Sender-side keep-alive time in seconds
  • snd_qlen: Sender queue length, which is infinite by default

Recommended values:

# HTTP Sender thread pool parameters

  • snd_t_core=200
  • snd_t_max=250
  • snd_alive_sec=5
  • snd_qlen=-1
  • snd_io_threads=16

# HTTP Listener thread pool parameters

  • lst_t_core=200
  • lst_t_max=250
  • lst_alive_sec=5
  • lst_qlen=-1
  • lst_io_threads=16

# Timeout parameters

  • http.socket.timeout.receiver: The recommended socket timeout for the listener is 120000 (milliseconds).

  • http.socket.timeout.sender: The recommended socket timeout for the sender is 120000 (milliseconds).

PassThrough transport of API Gateway

Recommended values for the <APIM_HOME>/repository/conf/passthru-http.properties file are given below. Note that the commented-out values in this file are the defaults, which apply if you do not change anything.

Property descriptions:

  • worker_thread_keepalive_sec: Defines the keep-alive time for extra threads in the worker pool
  • worker_pool_queue_length: Defines the length of the queue that the PassThrough transport thread pool uses to hold pending runnable tasks
  • io_threads_per_reactor: Defines the number of I/O dispatcher threads used per reactor
  • http.max.connection.per.host.port: Defines the maximum number of connections per host port

Recommended values

  • worker_thread_keepalive_sec: Default value is 60s. This should be less than the socket timeout.

  • worker_pool_queue_length: Set to -1 to use an unbounded queue. If a bound queue is used and the queue gets filled to its capacity, any further attempts to submit jobs will fail, causing some messages to be dropped by Synapse. The thread pool starts queuing jobs when all the existing threads are busy and the pool has reached the maximum number of threads. So, the recommended queue length is -1.

  • io_threads_per_reactor: Value is based on the number of processor cores in the system. (Runtime.getRuntime().availableProcessors())

  • http.max.connection.per.host.port : Default value is 32767, which works for most systems but you can tune it based on your operating system (for example, Linux supports 65K connections).

  • worker_pool_size_core: 400
  • worker_pool_size_max: 500
  • io_buffer_size: 16384
  • http.socket.timeout: 60000
  • snd_t_core: 200  
  • snd_t_max: 250  
  • snd_io_threads: 16  
  • lst_t_core: 200  
  • lst_t_max: 250  
  • lst_io_threads: 16

Make the number of threads equal to the number of processor cores.
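As a rough illustration of the recommendation above, the following Python sketch derives the I/O thread values from the processor core count. It uses os.cpu_count() as a stand-in for the Java call Runtime.getRuntime().availableProcessors() mentioned earlier. The function name is hypothetical, and the choice to apply the same value to all three properties is an assumption.

```python
import os


def io_thread_settings(cores=None):
    """Map the I/O thread properties named above to the processor core count."""
    if cores is None:
        cores = os.cpu_count() or 1  # os.cpu_count() can return None
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return {
        "io_threads_per_reactor": cores,
        "snd_io_threads": cores,
        "lst_io_threads": cores,
    }


if __name__ == "__main__":
    # Print the values in properties-file form for the current machine.
    for name, value in io_thread_settings().items():
        print(f"{name}={value}")
```

On a 16-core machine this yields the value 16 used in the recommended settings above.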

Time-out configurations

The API Gateway routes requests from your client to the appropriate endpoint. The most common reason for a client timeout is that the Gateway's timeout is larger than the client's timeout. You can resolve this either by increasing the timeout on the client side or by decreasing it on the API Gateway side.
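The comparison described above can be sketched as a small Python check that lists the Gateway-side timeouts that exceed a client's timeout. The function name and the 90-second client timeout are assumptions for illustration. The property names and values are taken from this page.

```python
def gateway_timeouts_exceeding_client(client_timeout_ms, gateway_timeouts_ms):
    """Return the Gateway timeout settings that are larger than the client timeout."""
    return {
        name: value
        for name, value in gateway_timeouts_ms.items()
        if value > client_timeout_ms
    }


if __name__ == "__main__":
    gateway = {
        "http.socket.timeout": 60000,
        "synapse.global_timeout_interval": 120000,
    }
    # Assumed client-side timeout of 90 seconds.
    print(gateway_timeouts_exceeding_client(90000, gateway))
```

Any setting this check reports is a candidate for lowering on the Gateway, or a reason to raise the client's timeout.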

Here are a few parameters, in addition to the timeout parameters discussed in the previous sections.

synapse.global_timeout_interval

Defines the maximum time that a callback waits in the Gateway for a response from the backend. If no response is received within this time, the Gateway drops the message and clears the callback. This is a global parameter that affects all the endpoints configured in the Gateway.

The global timeout is defined in the <APIM_HOME>/repository/conf/synapse.properties file. The recommended value is 120000 (milliseconds).

Endpoint-level timeout

You can define timeouts per endpoint for different backend services, along with the action to be taken in case of a timeout.

The example below sets the endpoint timeout to 30 seconds (30000 milliseconds) and executes the fault handler in case of a timeout.

<timeout>
   <duration>30000</duration>
   <responseAction>fault</responseAction>
</timeout>

Key Manager nodes

Set the following in the <APIM_HOME>/repository/conf/axis2/axis2_client.xml file:

<parameter name="defaultMaxConnPerHost">1000</parameter> 
<parameter name="maxTotalConnections">30000</parameter> 

Set the MySQL maximum connections. In the example below, the value was 151 and is raised to 250:

mysql> SHOW VARIABLES LIKE 'max_connections';
mysql> SET GLOBAL max_connections = 250;

Set the open files limit to 200000 by editing the fs.file-max parameter in the /etc/sysctl.conf file, and then reload the settings:

sudo sysctl -p

Set the following in the <APIM_HOME>/repository/conf/tomcat/catalina-server.xml file.

maxThreads="750" 
minSpareThreads="150" 
disableUploadTimeout="false" 
enableLookups="false" 
connectionUploadTimeout="120000" 
maxKeepAliveRequests="600" 
acceptCount="600" 

Set the following connection pool elements in the <APIM_HOME>/repository/conf/datasources/master-datasources.xml file:

<maxActive>50</maxActive>
<maxWait>60000</maxWait>
<testOnBorrow>true</testOnBorrow>
<validationQuery>SELECT 1</validationQuery>
<validationInterval>30000</validationInterval>

Set the <testOnBorrow> element to true and provide a validation query (for example, SELECT 1 FROM DUAL in Oracle). The query is run to detect and refresh stale connections in the connection pool. Also set a suitable value for the <validationInterval> element, which defaults to 30000 milliseconds. It determines the minimum time that must pass before the validation query is run again on a particular connection, which avoids excess validations and improves performance.
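The effect of the validation interval can be illustrated with a minimal Python sketch. It models only the decision of whether a borrowed connection is due for validation, and it is not the connection pool's actual code. The function name and the use of None for a never-validated connection are assumptions.

```python
VALIDATION_INTERVAL_MS = 30000  # default value quoted above


def needs_validation(last_validated_ms, now_ms, interval_ms=VALIDATION_INTERVAL_MS):
    """Return True when a borrowed connection should be validated again.

    A connection that has never been validated (None) is always checked.
    """
    if last_validated_ms is None:
        return True
    return now_ms - last_validated_ms > interval_ms


if __name__ == "__main__":
    print(needs_validation(None, now_ms=0))    # never validated
    print(needs_validation(0, now_ms=10000))   # validated 10 seconds ago
    print(needs_validation(0, now_ms=45000))   # validated 45 seconds ago
```

A shorter interval catches stale connections sooner at the cost of running the validation query more often.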
