BATTLE OF THE NODES: RAC PERFORMANCE MYTHS

Below are some findings related to RAC architecture.

Myth: Run the same batch process concurrently from both nodes.

Let me tell you a story about a customer performance issue. A batch process was designed to accept organization_id as an argument and spawn multiple processes to complete the business functionality in an E-Business Suite 11i environment. The client decided to schedule these batch programs to execute from 2 nodes, in order to use the CPUs of both nodes effectively. The programs would process one set of organization_ids on one node and another set of organization_ids on the second node. While I can understand the need to use all CPUs effectively to improve performance, this design arises from the fallacy that logical isolation of the batch process is good enough.

Since each disjoint set of organization_ids is processed by one node, there is sufficient logical isolation, but no physical isolation. In a nutshell, even though these processes operate on two disjoint sets of logical rows, physically those rows sit within the same blocks, i.e. there is no effective use of partitioning.

A few batch processes were started from node 1 with one set of organization_ids and a few were started from node 2 with another set. Each of these batch processes in turn started at least 5 processes on its node. Performance of the whole application and of the batch process was unacceptable. These batch processes were concurrently accessing the same set of blocks from both nodes. As we discussed already, transfer of CR blocks needs a log flush, and this increases LGWR activity. Further, only one instance can hold a buffer in CURRENT mode, and of course CURRENT mode transfers also need a log flush. Essentially, both CR and CUR mode global cache transfers increased drastically, and LGWR activity increased with them.

Typical global cache waits are 1-5ms on a decent hardware setup, and typical single-block or multiblock reads are in the order of 2-10ms. But a typical access to an SGA buffer in the local instance takes nanoseconds, in the range of 10-100ns. In a non-RAC setup, these batch processes accessed buffers in nanoseconds and were working fine in the development environment. With RAC and this design of logical-isolation-but-no-physical-isolation, waits increased from 10-100ns to 1-3ms, which is at least 10,000 times worse (1ms / 100ns = 10,000). In addition, these transfers introduced hot spots for LGWR processing and interconnect transfer, leading to a performance impact for the whole application.
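To make the logical-versus-physical distinction concrete, here is a minimal sketch in Python. It is a toy model, not a measurement: the rows-per-block figure, the organization_id values and the two row layouts are all assumptions chosen for illustration. It counts what fraction of blocks would be touched by both nodes when rows of different organizations are interleaved, compared with a layout where each organization's rows are stored contiguously (as partitioning by organization_id would roughly achieve).

```python
# Toy model: a table is a list of organization_ids in physical row order,
# and block number = row position // ROWS_PER_BLOCK. All values are assumed.
ROWS_PER_BLOCK = 50            # assumed number of rows per data block
NODE1_ORGS = {101, 102, 103}   # hypothetical organization_ids run from node 1
NODE2_ORGS = {201, 202, 203}   # hypothetical organization_ids run from node 2


def blocks_touched(row_orgs, orgs, rows_per_block=ROWS_PER_BLOCK):
    """Return the block numbers holding at least one row for any org in `orgs`."""
    return {pos // rows_per_block
            for pos, org in enumerate(row_orgs) if org in orgs}


def shared_fraction(row_orgs):
    """Fraction of all touched blocks that BOTH nodes need to access."""
    node1 = blocks_touched(row_orgs, NODE1_ORGS)
    node2 = blocks_touched(row_orgs, NODE2_ORGS)
    return len(node1 & node2) / len(node1 | node2)


all_orgs = sorted(NODE1_ORGS | NODE2_ORGS)

# Interleaved layout: rows arrive round-robin, so every block mixes all orgs.
interleaved = [all_orgs[i % len(all_orgs)] for i in range(6000)]

# Clustered layout: each organization's 1000 rows are stored contiguously.
clustered = [org for org in all_orgs for _ in range(1000)]

print("interleaved:", shared_fraction(interleaved))
print("clustered:  ", shared_fraction(clustered))
```

In the interleaved layout every block is needed by both nodes, so every block is a candidate for global cache transfer. In the clustered layout the two sets of blocks do not overlap, which is the physical isolation the batch design was missing.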

 

Workaround-I (TWO LOCAL MACHINES REQUIRED FOR RAC SETUP AND TESTING)

We can create separate database services and configure each service in the database URL of the datasource or of a specific module, such as:

  1. Session management
  2. usage metering/subscriber authorization

Each service is configured with a preferred instance and an available instance, so failover is managed internally.
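The idea can be sketched as follows. This is only an illustrative model in Python, not the actual clusterware logic: the service names, instance names and host name are hypothetical, and the `running_on` method simply mimics the preferred-then-available rule described above. The URL shape is the usual thin-driver form that connects by service name rather than by instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    preferred: str   # instance that normally runs the service
    available: str   # instance that takes over if the preferred one is down

    def running_on(self, up_instances):
        """Return the instance serving this service, or None if both are down."""
        for instance in (self.preferred, self.available):
            if instance in up_instances:
                return instance
        return None


def jdbc_url(host, port, service):
    # Connect by service name, so the module never hard-codes an instance.
    return f"jdbc:oracle:thin:@//{host}:{port}/{service.name}"


# Hypothetical services, one per module, pinned to different instances.
SESSION = Service("session_mgmt_svc", preferred="inst1", available="inst2")
METERING = Service("usage_metering_svc", preferred="inst2", available="inst1")

for svc in (SESSION, METERING):
    print(jdbc_url("rac.example.com", 1521, svc),
          "normally on", svc.running_on({"inst1", "inst2"}),
          "| after inst1 failure on", svc.running_on({"inst2"}))
```

Because each module connects through its own service, its blocks stay on one instance in normal operation, and the module moves to the other instance only when its preferred instance is unavailable.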

R&D Case Study

Benefits of using RAC Service in Ebusiness suite   

http://blog.veebrij.com/?p=199

Workaround-II (WE CAN TEST AND CHECK)

We could have a primary driver and a secondary driver at the module level in NetVertex.

We will specify the instance-1 URL on node-1 and the instance-2 URL on node-2.

The driver will take care of switching to the available instance during failover.
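A rough sketch of that primary/secondary behaviour, in Python, is shown below. It does not describe how the NetVertex driver is actually implemented, which we still need to test. The URLs are placeholders, and `connect_with_failover` and `fake_connect` are hypothetical helpers that only illustrate the ordering: each node tries its own instance first and falls back to the other one.

```python
INSTANCE_1 = "instance-1-url"   # placeholder for the real instance-1 URL
INSTANCE_2 = "instance-2-url"   # placeholder for the real instance-2 URL

# Primary driver first, secondary driver second, per application node.
DRIVER_URLS = {
    "node-1": [INSTANCE_1, INSTANCE_2],
    "node-2": [INSTANCE_2, INSTANCE_1],
}


def connect_with_failover(urls, connect):
    """Try each URL in order; return (url, connection) for the first success."""
    last_error = None
    for url in urls:
        try:
            return url, connect(url)
        except ConnectionError as exc:
            last_error = exc          # remember the failure, try the next URL
    raise ConnectionError(f"all instances unavailable: {urls}") from last_error


def fake_connect(down):
    """Build a stand-in connect function that fails for URLs listed in `down`."""
    def connect(url):
        if url in down:
            raise ConnectionError(url)
        return f"connection to {url}"
    return connect


# Normal operation: node-1 stays on instance 1.
print(connect_with_failover(DRIVER_URLS["node-1"], fake_connect(set())))
# Instance 1 is down: node-1 falls back to instance 2.
print(connect_with_failover(DRIVER_URLS["node-1"], fake_connect({INSTANCE_1})))
```

With this ordering the two nodes touch different instances in normal operation, and cross-instance traffic appears only during a failover.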
