Creating an Oracle RAC (Real Application Cluster) using ASMLIB or Udev

February 11, 2021

In the previous post in this series we discussed the two most common methods for mapping shared storage for an Oracle RAC. In this post, we are going to discuss creating RAC using asmlib. The changes for creation using udev are covered at the end of this post. As a review, below is the section of the previous post that discussed configuring shared storage using asmlib:

CONFIGURING SHARED STORAGE USING ORACLE ASMLIB

Now, if you are going to map the shared storage using asmlib, there are more steps involved than using udev. The base oracleasm package is part of the kernel. Add the oracleasm support package: ‘yum -y install oracleasm-support’ on all nodes. Then run the command ‘oracleasm configure -i’ to set up the asmlib packages. Answer the questions as shown below:

[root@racnode2 ~]# oracleasm configure -i

Configuring the Oracle ASM library driver.

 
 

This will configure the on-boot properties of the Oracle ASM library

driver. The following questions will determine whether the driver is

loaded on boot and what permissions it will have. The current values

will be shown in brackets (‘[]’). Hitting <ENTER> without typing an

answer will keep that current value. Ctrl-C will abort.

 
 

Default user to own the driver interface []: oracle

Default group to own the driver interface []: dba

Start Oracle ASM library driver on boot (y/n) [n]: y

Scan for Oracle ASM disks on boot (y/n) [y]: y

Writing Oracle ASM library driver configuration: done

[root@racnode2 ~]#reboot

 
 

Ensure that the above commands are done on all nodes in the cluster. As noted, reboot after configuring asm.

 
 

The next set of commands are done on only the first cluster node.

First, create partitions on the shared disks. Repeat for each disk:

[root@racnode1 ~]# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.23.2).

 
 

Changes will remain in memory only, until you decide to write them.

Be careful before using the write command.

 
 

Device does not contain a recognized partition table

Building a new DOS disklabel with disk identifier 0x619c61f3.

 
 

Command (m for help): n

Partition type:

p primary (0 primary, 0 extended, 4 free)

e extended

Select (default p): p

Partition number (1-4, default 1):

First sector (2048-41943039, default 2048):

Using default value 2048

Last sector, +sectors or +size{K,M,G} (2048-41943039, default 41943039):

Using default value 41943039

Partition 1 of type Linux and of size 20 GiB is set

 
 

Command (m for help): w

The partition table has been altered!

 
 

Calling ioctl() to re-read partition table.

Syncing disks.

[root@racnode1 ~]#

 
 

Note that on the above, the first sector is 2048 to ensure that the partition starts on a 1 MiB boundary (2048 sectors of 512 bytes). This partition alignment is the default in OEL7, but make sure the alignment is there. It makes a substantial difference in performance.
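If you have several shared disks to partition, the interactive fdisk session above can also be scripted. Below is a minimal sketch, assuming the shared disks are /dev/sdb through /dev/sdf and each gets a single primary partition with the default start and end sectors; verify the device names with fdisk -l before running anything like this:

for d in /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
do
# answers match the interactive session: new (n), primary (p), partition 1, default sectors, write (w)
echo -e "n\np\n1\n\n\nw" | fdisk $d
done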

The next step is to mark each disk for oracle asm: ‘oracleasm createdisk <name> <device>’. For example, ‘oracleasm createdisk ASMDISK1 /dev/sdb1’. Do this for each disk, something like this:

[root@racnode1 ~]# oracleasm createdisk asmdisk1 /dev/sdb1

Writing disk header: done

Instantiating disk: done

[root@racnode1 ~]# oracleasm createdisk asmdisk2 /dev/sdc1

Writing disk header: done

Instantiating disk: done

[root@racnode1 ~]# oracleasm createdisk asmdisk3 /dev/sdd1

Writing disk header: done

Instantiating disk: done

[root@racnode1 ~]# oracleasm createdisk asmdisk4 /dev/sde1

Writing disk header: done

Instantiating disk: done

[root@racnode1 ~]# oracleasm createdisk asmdisk5 /dev/sdf1

Writing disk header: done

Instantiating disk: done
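The five createdisk commands above can also be wrapped in a small loop. A sketch, assuming the partitions are /dev/sdb1 through /dev/sdf1 and the disk names ASMDISK1 through ASMDISK5:

COUNTER=0
for part in /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
do
COUNTER=$((COUNTER+1))
# mark the partition for ASM with a sequentially numbered name
oracleasm createdisk ASMDISK$COUNTER $part
done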

 
 

Next, on each additional host run the command ‘oracleasm scandisks’, then ‘oracleasm listdisks’. The output should look something like this:

[root@racnode2 ~]# oracleasm scandisks

Reloading disk partitions: done

Cleaning any stale ASM disks...

Scanning system for ASM disks...

Instantiating disk "ASMDISK1"

Instantiating disk "ASMDISK2"

Instantiating disk "ASMDISK4"

Instantiating disk "ASMDISK3"

Instantiating disk "ASMDISK5"

[root@racnode2 ~]# oracleasm listdisks

ASMDISK1

ASMDISK2

ASMDISK3

ASMDISK4

ASMDISK5

[root@racnode2 ~]#

 
 

This completes the shared device configuration using oracle asmlib.

See this link for shared device configuration using udev:

https://dbakerber.wordpress.com/2019/10/16/udev-rules-for-oracle-storage/

The cluster build step by step that follows uses asmlib. At the end of this post, the changes required when using udev rules are noted. The changes are fairly minor.

Personally, I prefer to use udev rules because it allows us to skip the partitioning step and reduces the number of oracle packages running on the host. Also, patching can be problematic when using asmlib, since the driver is tied to the OEL kernel.

Now download the Oracle GI (Grid Infrastructure) software from here. You should download this onto the first cluster node as the oracle user:

https://www.oracle.com/database/technologies/oracle19c-linux-downloads.html

Note that the download is an entire Oracle home; it is unzipped directly into the Grid home location. Once downloaded, create a directory for the files. Most people create a /u01 mountpoint and file system for the Oracle binaries. Note that initially this must be owned by the oracle user. Run the following commands (as root) on all the cluster nodes:

#mkdir -p /u01

#chown oracle:dba /u01

 

Next, log back in as Oracle, and create the full path for the oracle GI home on the first node. The installation process will create the required directories on the other nodes as long as /u01 has been created and is owned by oracle:

[oracle@racnode1 ~]$ mkdir -p /u01/app/19.3.0/grid

Next, move or copy the installation file to that location, and unzip the GI home.

[oracle@racnode1 grid]$ mv LINUX.X64_193000_grid_home.zip /u01/app/19.3.0/grid/

[oracle@racnode1 grid]$ cd /u01/app/19.3.0/grid/

[oracle@racnode1 grid]$ unzip LINUX.X64_193000_grid_home.zip

 

Note that we will be using xwindows to complete the installation. That is the reason we chose the GUI installation option when we first created the VMware guests. If you have questions on how to configure xwindows with putty, see this post (section 2): https://dbakerber.wordpress.com/2015/09/23/installing-oracle-software-part-1-prerequisites/.

 

For further reference, below is the hosts file for this installation. I have configured dnsmasq for names resolution:

 

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

10.12.1.107 racnode1 racnode1.localdomain

10.12.1.136 racnode2 racnode2.localdomain

 

10.12.1.108 racnode1-vip racnode1-vip.localdomain

10.12.1.137 racnode2-vip racnode2-vip.localdomain

 

10.12.1.146 racnode1-scan racnode1-scan.localdomain

10.12.1.147 racnode1-scan racnode1-scan.localdomain

10.12.1.148 racnode1-scan racnode1-scan.localdomain

 

192.168.63.101 racnode1-priv1 racnode1-priv1.localdomain

192.168.44.101 racnode1-priv2 racnode1-priv2.localdomain

192.168.63.102 racnode2-priv1 racnode2-priv1.localdomain

192.168.44.102 racnode2-priv2 racnode2-priv2.localdomain

 

Start the installation process by running the gridSetup.sh program.

[oracle@racnode1 grid]$ ./gridSetup.sh

 

We are building a new cluster.

 

This is a standalone cluster.

The name of the cluster isn’t really important, though the name must be 15 characters or fewer in length, and contain only letters, numbers, and hyphens (no underscores). The requirements are the same for the scan name, and the length is not validated here, so make sure you check it yourself. If the names are too long, the root script will fail.
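If you want a quick sanity check of a cluster or scan name before starting, something like the line below works; the pattern simply encodes the rules stated above (15 characters or fewer; letters, numbers, and hyphens only) and is not an Oracle-supplied validation. The name shown is just an example:

echo "racnode-cluster" | grep -Eq '^[A-Za-z0-9-]{1,15}$' && echo "name ok" || echo "name too long or has invalid characters"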

 

This automatically includes the name of the host you are running on. Add additional hosts.

 

Next, click on the SSH connectivity button, shown below.

Enter the oracle password and click Setup. This sets up passwordless ssh across all the nodes in the cluster.
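If the setup button gives you trouble, passwordless ssh can also be configured by hand before launching the installer. A minimal sketch for this two-node build, run as the oracle user on each node (accept the defaults and an empty passphrase for ssh-keygen):

ssh-keygen -t rsa
ssh-copy-id oracle@racnode1
ssh-copy-id oracle@racnode2
ssh racnode2 date     # should return the date with no password prompt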

 

 

Click ok, then next.

Assign the networks as shown below. We will use the ens35 network in the database setup, so make sure it stays set to ‘Do Not Use’.

 

 

While Oracle wants you to create the GIMR, I have never seen any purpose in it at all. As far as I can tell, it just takes valuable resources. And in any case, this is not a production build.

On the next screen, the /dev/sd* devices will show up. First, try changing the disk discovery string to ORCL:*. Sometimes this works. Most often, I have had to change it to /dev/oracleasm/disks/*
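Before changing the discovery string, it can help to confirm which path actually exists on the node. A quick check, assuming the asmlib configuration above:

ls -l /dev/oracleasm/disks/
oracleasm listdisks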

 

The shared disks should now show up. Three will be needed for normal redundancy, and since this is the data for the cluster, the group should be either normal or high redundancy. Five disks would be needed for high redundancy. Enable the ASM filter driver if you desire.

Next, enter the passwords as appropriate. I typically use the same password for everything, then change and/or lock accounts as required.

Do not enable IPMI.

 

Register the cluster with OEM if you want. You need to have the OEM agent already installed, and OEM set up and running in order to do this.

I normally use dba for all groups. Sometimes oinstall is also used.

Don’t worry about any errors on the groups.

Take the defaults for oracle base and the inventory location. For a GI RAC home, the location must be outside of the oracle base.

 

I normally run the root scripts myself so I can better monitor for errors.

 

Check for the prereqs, and run the fixup script if necessary.

The items below can be safely ignored.

Continue.

Click install, and let it run.

When all files are set up and copied, you will be prompted to run the root scripts. Run orainstRoot.sh on each node, then run root.sh on each node. Note that root.sh must be run on node 1 first, then on the other nodes in any order.

Below is the output of the root scripts.

[root@racnode1 ~]# /tmp/GridSetupActions2021-02-11_10-43-18AM/CVU_19.0.0.0.0_oracle/runfixup.sh

All Fix-up operations were completed successfully.

[root@racnode1 ~]# /u01/app/oraInventory/orainstRoot.sh

Changing permissions of /u01/app/oraInventory.

Adding read,write permissions for group.

Removing read,write,execute permissions for world.

 

Changing groupname of /u01/app/oraInventory to oinstall.

The execution of the script is complete.

[root@racnode1 ~]# /u01/app/19.3.0/grid/root.sh

Performing root user operation.

 

The following environment variables are set as:

ORACLE_OWNER= oracle

ORACLE_HOME= /u01/app/19.3.0/grid

 

Enter the full pathname of the local bin directory: [/usr/local/bin]:

Copying dbhome to /usr/local/bin …

Copying oraenv to /usr/local/bin …

Copying coraenv to /usr/local/bin …

 

 

Creating /etc/oratab file…

Entries will be added to the /etc/oratab file as needed by

Database Configuration Assistant when a database is created

Finished running generic part of root script.

Now product-specific root actions will be performed.

Relinking oracle with rac_on option

Using configuration parameter file: /u01/app/19.3.0/grid/crs/install/crsconfig_params

The log of current session can be found at:

/u01/app/oracle/crsdata/racnode1/crsconfig/rootcrs_racnode1_2021-02-11_11-30-32AM.log

2021/02/11 11:30:44 CLSRSC-594: Executing installation step 1 of 19: ‘SetupTFA’.

2021/02/11 11:30:44 CLSRSC-594: Executing installation step 2 of 19: ‘ValidateEnv’.

2021/02/11 11:30:44 CLSRSC-363: User ignored prerequisites during installation

2021/02/11 11:30:44 CLSRSC-594: Executing installation step 3 of 19: ‘CheckFirstNode’.

2021/02/11 11:30:47 CLSRSC-594: Executing installation step 4 of 19: ‘GenSiteGUIDs’.

2021/02/11 11:30:48 CLSRSC-594: Executing installation step 5 of 19: ‘SetupOSD’.

2021/02/11 11:30:48 CLSRSC-594: Executing installation step 6 of 19: ‘CheckCRSConfig’.

2021/02/11 11:30:49 CLSRSC-594: Executing installation step 7 of 19: ‘SetupLocalGPNP’.

2021/02/11 11:32:02 CLSRSC-594: Executing installation step 8 of 19: ‘CreateRootCert’.

2021/02/11 11:32:15 CLSRSC-4002: Successfully installed Oracle Trace File Analyzer (TFA) Collector.

2021/02/11 11:32:16 CLSRSC-594: Executing installation step 9 of 19: ‘ConfigOLR’.

2021/02/11 11:32:33 CLSRSC-594: Executing installation step 10 of 19: ‘ConfigCHMOS’.

2021/02/11 11:32:33 CLSRSC-594: Executing installation step 11 of 19: ‘CreateOHASD’.

2021/02/11 11:32:40 CLSRSC-594: Executing installation step 12 of 19: ‘ConfigOHASD’.

2021/02/11 11:32:41 CLSRSC-330: Adding Clusterware entries to file ‘oracle-ohasd.service’

2021/02/11 11:33:09 CLSRSC-594: Executing installation step 13 of 19: ‘InstallAFD’.

2021/02/11 11:33:17 CLSRSC-594: Executing installation step 14 of 19: ‘InstallACFS’.

2021/02/11 11:33:29 CLSRSC-594: Executing installation step 15 of 19: ‘InstallKA’.

2021/02/11 11:33:38 CLSRSC-594: Executing installation step 16 of 19: ‘InitConfig’.

 

ASM has been created and started successfully.

 

[DBT-30001] Disk groups created successfully. Check /u01/app/oracle/cfgtoollogs/asmca/asmca-210211AM113420.log for details.

 

2021/02/11 11:35:27 CLSRSC-482: Running command: ‘/u01/app/19.3.0/grid/bin/ocrconfig -upgrade oracle oinstall’

CRS-4256: Updating the profile

Successful addition of voting disk b349b830f8644fc2bf0433b684627a8b.

Successful addition of voting disk 39ef9e8dd1674f06bfdc2b727c5b0583.

Successful addition of voting disk 9b708235ed7e4f1cbfd211a72b8facfa.

Successfully replaced voting disk group with +DATA.

CRS-4256: Updating the profile

CRS-4266: Voting file(s) successfully replaced

## STATE File Universal Id File Name Disk group

— —– —————– ——— ———

1. ONLINE b349b830f8644fc2bf0433b684627a8b (/dev/oracleasm/disks/ASMDISK1) [DATA]

2. ONLINE 39ef9e8dd1674f06bfdc2b727c5b0583 (/dev/oracleasm/disks/ASMDISK2) [DATA]

3. ONLINE 9b708235ed7e4f1cbfd211a72b8facfa (/dev/oracleasm/disks/ASMDISK3) [DATA]

Located 3 voting disk(s).

2021/02/11 11:37:30 CLSRSC-594: Executing installation step 17 of 19: ‘StartCluster’.

2021/02/11 11:39:12 CLSRSC-343: Successfully started Oracle Clusterware stack

2021/02/11 11:39:12 CLSRSC-594: Executing installation step 18 of 19: ‘ConfigNode’.

2021/02/11 11:42:21 CLSRSC-594: Executing installation step 19 of 19: ‘PostConfig’.

2021/02/11 11:43:18 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster … succeeded

[root@racnode1 ~]#

 

After clicking ok that the scripts have been run, it will run through several more steps and then continue to the finish screen. Sometimes the cluster integrity check fails, but if ASM is running and you can see the shared space in ASM, everything should be fine.

 

Click close.

At this point, the cluster is built.
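As a quick sanity check once the installer closes, the standard clusterware commands below (run from the new GI home) should show the cluster and ASM online on both nodes. This is just a verification step, not part of the installer:

/u01/app/19.3.0/grid/bin/crsctl check cluster -all
/u01/app/19.3.0/grid/bin/crsctl stat res -t
/u01/app/19.3.0/grid/bin/srvctl status asm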

 

Notes for using udev rules instead of ASMLIB.

If you choose to use udev rules, the differences are minor. Where we entered the disk discovery path as /dev/oracleasm/disks/* above, change it to /dev/oracleasm/*, as below:

You will notice that the names of the disks are the names defined in the file /etc/udev/rules.d/99-oracleasm.rules.

 

Once again, to create a disk group with normal redundancy, we need three disks.

 

The remainder of the process is the same as above.

In this post, we have completed the creation of an Oracle RAC (Real Application Cluster) on VMware workstation guests using Oracle Enterprise Linux 7.9 (OEL 7.9).

Oracle Software Installation on a VMware Workstation Guest – Phase 1, Shared Storage

February 10, 2021

In my previous posts, I covered the creation of a VMware workstation guest running Oracle Enterprise Linux to be used in an Oracle RAC (Real Application Cluster). In this post, we will discuss the next steps in creation of the Oracle RAC using ASM. We will cover using both asmlib and udev rules for identifying the Oracle ASM storage.

An Oracle database RAC is always installed using storage that is shared on all nodes in the cluster. When we created the VMware guests, we also created a group of VMDKS (shared disks) that are mapped to all the VMware guests. These will be used for the ASM storage.

You may recall that when the workstations were set up, these lines were added to the VMX file:

disk.locking = “FALSE”

diskLib.dataCacheMaxSize = “0”

disk.enableuuid = “TRUE”

bios.bootdelay = “8000”

The first two settings, disk.locking and diskLib.dataCacheMaxSize, are required for the disks to be shared properly. The disk.enableuuid setting is required in order to use udev rules for shared disks. This is not required if oracle asmlib is used, but does not prevent asmlib usage either.

The first step is to install the oracle preinstall package. On each node of your cluster install the package oracle-database-preinstall-19c. ‘yum -y install oracle-database-preinstall-19c’. This will install the required packages for the oracle database, set up certain operating system parameters in sysctl.conf, and create the Oracle user with appropriate groups and privileges.
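To confirm what the preinstall package did, a quick check like this on each node is enough (purely optional):

rpm -q oracle-database-preinstall-19c     # package is installed
id oracle                                 # oracle user and its groups were created
tail /etc/sysctl.conf                     # kernel parameters were set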

Next, on each node, set the password for the oracle user. The password should be the same on all nodes. We are going to use 0Racle@ (zero, not o) for the purpose of this exercise.

As root:

#passwd oracle

Changing password for user oracle

New password:

Retype new password:

passwd: all authentication tokens updated successfully.

 

Do this on all the cluster nodes. It sets the oracle password and allows the user to log in as the oracle user.

Now, in order to configure the shared storage, you only use one of the two options below. Either use udev rules or asmlib. Do not do both.

CONFIGURING SHARED STORAGE USING UDEV RULES.

Now we prepare the shared storage. First, let’s discuss doing this with udev rules, which is the method I prefer. In order to use udev rules, the disk.enableuuid option must be set to TRUE. This causes VMware to generate a UUID for each vmdk. Identifying the UUID used to be a fairly involved process, but since RHEL/OEL 7 it has become much simpler. There is a discussion here on how to do it: https://dbakerber.wordpress.com/2019/10/16/udev-rules-for-oracle-storage/, and a short version of that information follows.

This short script generates the required lines which you will use:

#!/bin/sh
COUNTER=0
for disk in `ls /dev/sd* -c1 | grep -v sda`
do
shortserial=`udevadm info --query=property --name $disk | grep ID_SERIAL_SHORT | awk -F "=" '{print $2}'`
COUNTER=$((COUNTER+1))
echo KERNEL==\"sd?\", ENV{ID_SERIAL_SHORT}==\"$shortserial\" SYMLINK+=\"oracleasm/asm-disk$COUNTER\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\"
done

 

Output:

KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c29db4db52b2e2bee9c23aa56fa5" SYMLINK+="oracleasm/asm-disk1", OWNER="oracle", GROUP="dba", MODE="0660"

KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c29d94c12a97079919b62a7b03ad" SYMLINK+="oracleasm/asm-disk2", OWNER="oracle", GROUP="dba", MODE="0660"

KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c297d9addf5b1cdb04390a28709c" SYMLINK+="oracleasm/asm-disk3", OWNER="oracle", GROUP="dba", MODE="0660"

KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c29e9e2a48db3724ce436216e2bc" SYMLINK+="oracleasm/asm-disk4", OWNER="oracle", GROUP="dba", MODE="0660"

KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c2991798200359672b5ee090813b" SYMLINK+="oracleasm/asm-disk5", OWNER="oracle", GROUP="dba", MODE="0660"

 

Create a file called /etc/udev/rules.d/99-oracleasm.rules and copy the output into this file. There should be one line for each shared vmdk. Now copy this file to each node in your cluster. Note that the same file should be on every server; this ensures that each vmdk will have the same name on each node. After copying the file, reboot each node.
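As a convenience, the rules file can be distributed and activated without waiting for a reboot. A sketch, assuming racnode2 is the only other node and the file name used above:

scp /etc/udev/rules.d/99-oracleasm.rules root@racnode2:/etc/udev/rules.d/
# then, on each node, reload and re-trigger udev (or simply reboot as described above)
udevadm control --reload-rules
udevadm trigger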

After reboot, check to make sure the shared storage is available:

[oracle@racnode1 ~]$ ls -la /dev/oracleasm

total 0

drwxr-xr-x 2 root root 140 Feb 10 14:38 .

drwxr-xr-x 21 root root 3660 Feb 10 14:38 ..

lrwxrwxrwx 1 root root 6 Feb 10 14:38 asm-disk1 -> ../sdf

lrwxrwxrwx 1 root root 6 Feb 10 14:38 asm-disk2 -> ../sdc

lrwxrwxrwx 1 root root 6 Feb 10 14:38 asm-disk3 -> ../sdb

lrwxrwxrwx 1 root root 6 Feb 10 14:38 asm-disk4 -> ../sde

lrwxrwxrwx 1 root root 6 Feb 10 14:38 asm-disk5 -> ../sdd

[oracle@racnode1 ~]$

 

On each node, you should see the above output. Note that the /dev/sdX letters may be different for each host, but the asm-diskX mapping will be the same, which is the reason for the udev rules. When we map these in ASM, we will use the asm-diskX name.

 

CONFIGURING SHARED STORAGE USING ORACLE ASMLIB

Now, if you are going to map the shared storage using asmlib, there are more steps involved than using udev. The base oracleasm package is part of the kernel. Add the oracleasm support package: ‘yum -y install oracleasm-support’ on all nodes. Then run the command ‘oracleasm configure -i’ to set up the asmlib packages. Answer the questions as shown below:

[root@racnode2 ~]# oracleasm configure -i

Configuring the Oracle ASM library driver.

 

This will configure the on-boot properties of the Oracle ASM library

driver. The following questions will determine whether the driver is

loaded on boot and what permissions it will have. The current values

will be shown in brackets (‘[]’). Hitting <ENTER> without typing an

answer will keep that current value. Ctrl-C will abort.

 

Default user to own the driver interface []: oracle

Default group to own the driver interface []: dba

Start Oracle ASM library driver on boot (y/n) [n]: y

Scan for Oracle ASM disks on boot (y/n) [y]: y

Writing Oracle ASM library driver configuration: done

[root@racnode2 ~]#reboot

 

Ensure that the above commands are done on all nodes in the cluster. As noted, reboot after configuring asm.

 

The next set of commands are done on only the first cluster node.

First, create partitions on the shared disks. Repeat for each disk. The first disk to be partitioned should be /dev/sdb, but run fdisk -l first to verify that it is one of the shared disks.
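A quick way to confirm which devices are the shared disks before partitioning anything; the sizes shown should match the shared vmdks created earlier:

lsblk -o NAME,SIZE,TYPE
fdisk -l 2>/dev/null | grep '^Disk /dev/sd'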

[root@racnode1 ~]# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.23.2).

 

Changes will remain in memory only, until you decide to write them.

Be careful before using the write command.

 

Device does not contain a recognized partition table

Building a new DOS disklabel with disk identifier 0x619c61f3.

 

Command (m for help): n

Partition type:

p primary (0 primary, 0 extended, 4 free)

e extended

Select (default p): p

Partition number (1-4, default 1):

First sector (2048-41943039, default 2048):

Using default value 2048

Last sector, +sectors or +size{K,M,G} (2048-41943039, default 41943039):

Using default value 41943039

Partition 1 of type Linux and of size 20 GiB is set

 

Command (m for help): w

The partition table has been altered!

 

Calling ioctl() to re-read partition table.

Syncing disks.

[root@racnode1 ~]#

 

Note that on the above, the first sector is 2048 to ensure that the partition starts on a 1 MiB boundary (2048 sectors of 512 bytes). This partition alignment is the default in OEL7, but make sure the alignment is there. It makes a substantial difference in performance.

The next step is to mark each disk for oracle asm: ‘oracleasm createdisk <name> <device>’. For example, ‘oracleasm createdisk ASMDISK1 /dev/sdb1’. Do this for each disk, something like this:

[root@racnode1 ~]# oracleasm createdisk asmdisk1 /dev/sdb1

Writing disk header: done

Instantiating disk: done

[root@racnode1 ~]# oracleasm createdisk asmdisk2 /dev/sdc1

Writing disk header: done

Instantiating disk: done

[root@racnode1 ~]# oracleasm createdisk asmdisk3 /dev/sdd1

Writing disk header: done

Instantiating disk: done

[root@racnode1 ~]# oracleasm createdisk asmdisk4 /dev/sde1

Writing disk header: done

Instantiating disk: done

[root@racnode1 ~]# oracleasm createdisk asmdisk5 /dev/sdf1

Writing disk header: done

Instantiating disk: done

 

Next, on each additional host run the command ‘oracleasm scandisks’, then ‘oracleasm listdisks’. The output should look something like this:

[root@racnode2 ~]# oracleasm scandisks

Reloading disk partitions: done

Cleaning any stale ASM disks...

Scanning system for ASM disks...

Instantiating disk "ASMDISK1"

Instantiating disk "ASMDISK2"

Instantiating disk "ASMDISK4"

Instantiating disk "ASMDISK3"

Instantiating disk "ASMDISK5"

[root@racnode2 ~]# oracleasm listdisks

ASMDISK1

ASMDISK2

ASMDISK3

ASMDISK4

ASMDISK5

[root@racnode2 ~]#

 

This completes the shared device configuration using oracle asmlib.

 

Next, we install the Oracle Grid Infrastructure software. This will be done in my next post.

VMware Workstation Creation for Oracle RAC (part 3)

February 8, 2021

In this post, we will complete the buildout of the VMware workstation guests for an Oracle RAC (real application cluster).

At this point, we have built a single VMware guest with the prerequisites needed to create an Oracle RAC. Next, we will create the additional host or hosts that are required, and make the changes to those hosts needed to complete the build of an Oracle RAC.

The first step is to shut down your existing workstation if you have not already. Just use the command ‘shutdown -h now’.

Next, we clone the VM. From VMware workstation, choose Manage=>clone.

Clone the current state of the machine.

You will be making a full clone. Linked clones do not work for this.

Choose the name and location, and choose finish. The new workstation will be built. Since the full disk space must be preallocated, it will take some time.

When completed, click close. Do not start up the new host yet. There are some settings that need to be changed. Specifically, we want to assign new MAC addresses to the NICs in the VM. Any router that uses reserved IPs will use the MAC address to assign them, so we want to make sure this workstation has different MAC addresses than the original.

On the settings screen, click on a network card and choose ‘advanced’.

Where it says MAC address, choose generate, then OK.

Do the same for each NIC.

Now that you have assigned the MAC addresses, power on the VM.

Log into the VM as the root user, and go to applications=>system tools=>settings=>network

Then, just as done in the initial set up, set up the IP addresses. The first NIC should have a public IP assigned, and you will want to reserve that in your network.

Assign appropriate IP addresses to the other NICs. Example below:

Next, save the changes and exit the configuration screen.

Open a command prompt window and set the host name. The command is: ‘hostnamectl set-hostname <new host name>’ , in our case: ‘hostnamectl set-hostname racnode2’.

Do the same for each cluster node. I have run as many as a four-node cluster on my desktop hardware, but the performance was not very good at that point.

Our next step is to set up the hosts file for the cluster. In an Oracle RAC, each host needs the following addresses:

1 physical address on the public network

1 virtual address on the public network

1 private address on a network

A second private address on a private network is optional.

In addition, each cluster needs a scan name that resolves to 3 different IP addresses on the public network. That is a single name that resolves to three addresses.

Below is an example of a typical hosts file for an Oracle RAC:

10.12.1.106    racnode1    racnode1.localdomain

10.12.1.117    racnode2    racnode2.localdomain

10.12.1.107    racnode1-vip    racnode1-vip.localdomain

10.12.1.118    racnode2-vip    racnode2-vip.localdomain

192.168.63.101 racnode1-priv    racnode1-priv.localdomain

192.168.44.101 racnode1-priv2    racnode1-priv2.localdomain

192.168.63.102    racnode2-priv    racnode2-priv.localdomain

192.168.44.102    racnode2-priv2    racnode2-priv2.localdomain

10.12.1.120    racnode-scan    racnode-scan.localdomain

10.12.1.121    racnode-scan    racnode-scan.localdomain

10.12.1.122    racnode-scan    racnode-scan.localdomain

In this example, the 10.12.1.x is the public network, and the addresses that start with 192.168.x.x are the private network. Note that the virtual IPs and scan addresses are on the public network. Also note how the scan address has 3 IPs, but the name is same on all three.

Next, restart dnsmasq (systemctl restart dnsmasq), and verify that the names resolve properly. Once this is verified, copy the hosts file to all the RAC nodes and restart dnsmasq there.
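A quick check that dnsmasq is answering for the cluster names (run on each node; nslookup requires the bind-utils package, and racnode-scan is the scan name from the example above):

systemctl status dnsmasq | grep Active
nslookup racnode-scan 127.0.0.1     # should return all three scan addresses
ping -c 1 racnode2-priv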

Now, shut down all other nodes except the first so we can build and add the shared storage.

On the screen for the VMware workstation, choose settings. We are choosing to do the creation from racnode2, but they can be created on any of the workstations. Sadly you can only create one disk at a time.

Click on add, and choose hard disk, then next:

Take the default, SCSI.

Create a new virtual disk.

On the next screen, choose the options below. These disks are going to be used as ASM storage, so you will want all disks to be the same size. We are choosing 40G. The pre-allocation and single-file options are chosen for best performance.

Choose next, and pick the location for the shared disk. I have multiple HDDs attached to my desktop, so normally I spread these shared disks around multiple drives. The first one goes here:

Typically I will put the second disk on F or G drive. But that is entirely up to you.

Click finish.

The disk will be allocated. Make sure that you note the full path of the disk. You will need it for the next steps.

Once the disk is created, we need to add it to the other nodes in the cluster.

So, the first part of the process is the same.

On the screen for the VMware workstation, choose settings. Then click on add, and choose hard disk, then next:

This time, we choose ‘Use an existing virtual disk’.

Next, choose the file.

Note that there is a racnode2.vmdk and a racnode2-flat.vmdk. The file we want is the first one. The second is the actual data, but VMware is looking for the first file that contains information about the disk.

Click on finish, and the disk is added to the VM.

Now, we repeat the process for as many shared vmdks as we will need. I usually have at least 5, and as many as 10 if I want multiple asm disk groups. Below is an example of a completed VMware workstation guest with a normal complement of shared disks.

Note that we have two different sizes of shared disks, as this RAC has two disk groups.

Now we have completed all the requirements for setting up VMware workstations for building our home RAC. Remember, this is not a production system and never will be. It however works very well for blogging and prototyping. The final step is installing the Oracle software and building the Real Application Cluster, which I will do in my next post.

VMware Workstation Creation for Oracle RAC (part 2)

February 7, 2021

In this post, we will complete the configuration of an Oracle Enterprise Linux VMware workstation guest for use in an Oracle RAC.

In our previous post, we completed the initial build of the VMware workstation, including creation of the guest, installation of software, and the configuration of the file systems. In this brief entry, we will complete the operating system configuration for the guest.

The first step is to disable Secure Linux (selinux).

To do this, simply edit the file /etc/selinux/config. Where it says SELINUX=enforcing, change the line to read SELINUX=disabled.
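A minimal sketch of making that change from the command line, assuming the stock /etc/selinux/config layout:

sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0     # takes effect immediately; the file change applies at the next boot
grep '^SELINUX=' /etc/selinux/config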

Next, disable the firewall. While it is possible just to open the appropriate ports, it’s easier to disable it entirely:

systemctl stop firewalld

systemctl disable firewalld

Now stop and disable libvirt. It interferes with the dnsmasq software that we will use for name resolution:

systemctl stop libvirtd.service

systemctl disable libvirtd.service

Now update the software to the latest releases. This will run for some time, usually the software changes are extensive.

yum -y update

Next, edit /etc/dnsmasq.conf and add the following line to the bottom of the file:

local=/localdomain/

Modify /etc/resolv.conf to read as follows:

search localdomain

nameserver 127.0.0.1

nameserver 10.12.1.1 # this would be your local nameserver, often 192.168.1.1

Next, make /etc/resolv.conf immutable so it is not modified on reboot: ‘chattr +i /etc/resolv.conf’

At this point, your operating system updates are complete. Next, shutdown the guest ‘shutdown -h now’.

Now, modify the settings in the vmx file for the guest to accommodate shared storage. I also add a line to give a boot up delay so I can enter the bios on startup if I desire. The file is normally in a directory named for the guest in the documents and settings directory under virtual machines. Eg, C:\Users\Andrew Kerber\Documents\Virtual Machines\racnode1\racnode1.vmx

disk.locking = “FALSE”

diskLib.dataCacheMaxSize = “0”

disk.enableuuid = “TRUE”

bios.bootdelay = “8000”

Now, start the VM back up and ensure it is working properly. At this point the VM and operating system are configured properly for use in an Oracle RAC.

VMware Workstation Creation for Oracle RAC (part 1)

February 5, 2021

Building an Oracle Enterprise Linux VMware Workstation Guest for Oracle RAC

In this article, we will discuss building an Oracle Real Application Cluster using a Linux VMware workstation guests on a Windows 10 host. While not usable for a production environment, VMware workstation can be used for blogging and for doing basic setup validations.

The basic steps are as follows:

  1. Purchase and download VMware workstation. You can also do this with Oracle’s VirtualBox (which is free), but the steps are slightly different.
  2. Download Oracle Linux (OEL). You can download VMware templates, but I prefer to download OEL itself from the Oracle E-delivery Cloud.
  3. Use VMware virtual network editor to build the internal network.
  4. Start VMware workstation and build your VM.
  5. Clone the VM as many times as needed for your cluster.

I am going to skip over the steps for downloading the software from Oracle E-Delivery. I downloaded OEL version 7.9.

Start VMware workstation and choose File>New Virtual Machine. Choose custom for the configuration.

 

 

 

 

 

 

 

On the next screen, choose Workstation 16.x, the newest version.

 

 

Under the guest operating system installation, choose the location of the OEL 7.9 ISO that we downloaded earlier.

 

 

 

Next, choose an appropriate name and location for your virtual machine.

Choose an appropriate processor configuration. Remember that this VM is running off of the hardware on the machine that you are using to build from. How much horsepower do you have to spare? In my case, I specifically configured my desktop with extra cores and RAM so I could build VMs on it. Even so, a 3 node RAC will be resource intensive.

 

Next, choose 8G of RAM. This is sufficient to run Oracle and still leave memory for the desktop machine.

 

 

Next, choose an appropriate network type for the local network connection. This will normally be NAT or bridged.

 

 

Then choose an appropriate IO controller type. The default is normally fine.

 

Choose SCSI for the disk type.

 

 

We are going to create a new virtual disk.

 

On the next step, we do not take the defaults. We want to use a single file for the disk, and we want to pre-allocate the space. This gives the best performance. 150G is sufficient for a DB home, GI home, and Agent Home.

On the next screen we will customize the hardware, in this case the network interfaces.

Earlier we defined the following networks with the virtual network editor that comes with VMware workstation:

 

Each VM will have the first NIC on VMnet0, the second on VMnet1, and the third on VMnet2. VMnet0 is the public network, and VMnet1 and VMnet2 are private networks. So, on the next screen choose customize hardware, create two new NICs, and reconfigure the first NIC as shown below:

 

When done with the NICs, choose finish, and it will build your VM. Since it allocates 150G, it will take some time to do this.

When done building the VM it will boot up on the ISO image you specified earlier. Choose install.

Choose the appropriate language options then continue.

This will take you to the installation summary page, which is where most of the work is done. The first step is to configure the ‘Network and host name’ within the VM. Make sure you know the IP ranges for the networks that you have defined. In this case, the two local networks are 192.168.63.x and 192.168.44.x. The public network is 10.12.1.x.

For the first NIC, just turn it on and tell it to start on power up. It should have been assigned an IP within your home network. Once you see the IP that was assigned, go to your home router and reserve that IP for this VM. This will make sure that the name stays assigned to the same IP address.

 

 

On each NIC, make sure you choose ‘automatically connect when available’ so that the NIC activates on boot up.

 

On the second and third NICs, assign a static IP within the network range of the local network that has been assigned to this NIC.

When done with each NIC, choose ‘save’. Ensure that the configuration is saved. At this point, it should say that the network is connected.

 

Next, choose installation destination. It will bring up the screen below. Choose the ‘I will configure partitioning’ option, then click done.

After clicking done, it will bring up the screen below. Tell it to create the partitions automatically; you will then edit the layout it produces.

We will modify the configuration below to meet the requirements of an Oracle configuration. Oracle requires 16G of swap, and normally you want a large / file system.

 

Change the layout above to what you see here. A large / mount point, and 16G Swap.

 

When prompted, accept the partitioning changes.

Next, set the time zone as appropriate.

Next go to software selection and choose server with GUI. For additional packages, choose what you would like. I generally choose system administration tools, large systems performance, and performance tools. Don’t worry about the Oracle software prerequisites, we will get those later.

 

When done selecting software, return to the main screen. Choose whether or not to implement kdump (I normally turn it off).

Then click on begin installation. During the installation process it will ask you to set the root password and create a user. Go ahead and do this.

Let the installation process complete. It will reboot and ask you to accept the OEL license. If you want to use the software you have to accept the license.

It will reboot again. Then log in as root and go through the various first time user screens. I always turn off the location sharing.

At this point, we are ready to complete the setup. There are four steps involved in this:

  1. Configure selinux and firewall for the oracle database.
  2. Update packages to the most recent version.
  3. Configure DNSMASQ.
  4. Install the required Oracle packages.

We will discuss those steps in the next blog.

 

 

 

Udev Rules for Oracle Storage

October 16, 2019

As I have discussed in the past, one of the problems that must be dealt with when configuring Oracle RAC is device naming persistence. In order for RAC to work properly, the storage device names must be the same across all servers.

There are two ways to deal with this, the Oracle supplied asmlib (now ASM filter driver, asmfd), and udev rules.  Asmlib is supplied with Oracle Enterprise Linux, and also now with RHEL.  However, many people prefer not to deal with the complications involved with asmlib or asmfd, and use udev rules.

Remember when using udev rules, you must put the same rules file on all the servers in the cluster, most often the file is something like this: /etc/udev/rules.d/99-oracleasm.rules.

With the latest releases of RHEL, udev naming has changed again, this time for the better. The only problem, of course, is that as per usual it is poorly documented. In the past, we have used lsscsi for device names, in conjunction with scsi_id. But now there is a better way. After substantial experimentation trying to get lsscsi to work, I did some extensive research and finally determined that the best way to set up udev rules for your oracle storage is to use udevadm. This method should work for almost every storage supplier, so it is very handy. In our case, the command is: udevadm info --query=property --name <your device name>.

This returns lots of useful information:

udevadm info --query=property --name /dev/sdb

DEVLINKS=/dev/disk/by-id/scsi-36000c2984bb594f77534b298654126f3 /dev/disk/by-id/wwn-0x6000c2984bb594f77534b298654126f3 /dev/disk/by-path/pci-0000:00:10.0-scsi-0:0:1:0 
DEVNAME=/dev/sdb 
DEVPATH=/devices/pci0000:00/0000:00:10.0/host0/target0:0:1/0:0:1:0/block/sdb 
DEVTYPE=disk 
FC_TARGET_LUN=0 
ID_BUS=scsi 
ID_MODEL=VMware_Virtual_S 
ID_MODEL_ENC=VMware\x20Virtual\x20S 
ID_PATH=pci-0000:00:10.0-scsi-0:0:1:0 
ID_PATH_TAG=pci-0000_00_10_0-scsi-0_0_1_0 
ID_REVISION=1.0 
ID_SCSI=1 
ID_SCSI_SERIAL=6000c2984bb594f77534b298654126f3 
ID_SERIAL=36000c2984bb594f77534b298654126f3 
ID_SERIAL_SHORT=6000c2984bb594f77534b298654126f3 
ID_TYPE=disk 
ID_VENDOR=VMware_ 
ID_VENDOR_ENC=VMware\x2c\x20 
ID_WWN=0x6000c2984bb594f7 
ID_WWN_VENDOR_EXTENSION=0x7534b298654126f3 
ID_WWN_WITH_EXTENSION=0x6000c2984bb594f77534b298654126f3 
MAJOR=8 
MINOR=16 
MPATH_SBIN_PATH=/sbin 
SUBSYSTEM=block 
TAGS=:systemd: 
USEC_INITIALIZED=66016

In our case, we are interested in any of the three serial number values: ID_SCSI_SERIAL, ID_SERIAL, or ID_SERIAL_SHORT. Any of these can be used; we have chosen ID_SERIAL_SHORT, which is matched in the rules file with ENV{ID_SERIAL_SHORT}. The line in the rules file is:

KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c2984bb594f77534b298654126f3" SYMLINK+="oracleasm/asm-disk1", OWNER="oracle", GROUP="dba", MODE="0660"

This will give a consistent name to the storage device for your oracle database.

After a little experimentation, I wrote this simple script to auto-generate the rules:

#!/bin/sh 
COUNTER=0 
for disk in `ls /dev/sd* -c1 | grep -v sda` 
do 
shortserial=`udevadm info --query=property --name $disk | grep ID_SERIAL_SHORT | awk -F "=" '{print $2}'` 
COUNTER=$((COUNTER+1)) 
echo KERNEL==\"sd?\", ENV{ID_SERIAL_SHORT}==\"$shortserial\" SYMLINK+=\"oracleasm/asm-disk$COUNTER\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\" 
done

Note that in the initialization of the ‘for’ loop we do grep -v sda; this is because sda is almost always going to be an internal device, not attached storage, and not used for the database storage. Thus the above script produces output like this:

KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c291476a173c2886cc425ea7050e" SYMLINK+="oracleasm/asm-disk1", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c29ef4eff651c416774c6172fe33" SYMLINK+="oracleasm/asm-disk2", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c2994337b652f937a6ffe64f9217" SYMLINK+="oracleasm/asm-disk3", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c29f2971757fc3fb8b34e5e9e35f" SYMLINK+="oracleasm/asm-disk4", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c2907f869d72b62d5774f933a97c" SYMLINK+="oracleasm/asm-disk5", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c29ae1d265baf6604af619b2f79a" SYMLINK+="oracleasm/asm-disk6", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c299e07c9ed590a2884cabfa855e" SYMLINK+="oracleasm/asm-disk7", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c29e1848051e29a84408a27d2522" SYMLINK+="oracleasm/asm-disk8", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c290af449698eb820b0e4b329471" SYMLINK+="oracleasm/asm-disk9", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c2997873b2d7c86a1d561bedb294" SYMLINK+="oracleasm/asm-disk10", OWNER="oracle", GROUP="dba", MODE="0660" 
KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="6000c2984bb594f77534b298654126f3" SYMLINK+="oracleasm/asm-disk11", OWNER="oracle", GROUP="dba", MODE="0660"

We copy the output above to the file /etc/udev/rules.d/99-oracleasm.rules on all the servers in the cluster, and run the command 'udevadm trigger' or reboot all nodes.  All the storage devices will then come up in the same location, and typically the Oracle installer will find them when setting up the Oracle Grid Infrastructure.
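A quick way to confirm the rules took effect on a node, assuming the file and symlink names above:

udevadm control --reload-rules && udevadm trigger
ls -l /dev/oracleasm/     # each asm-diskN symlink should point at a /dev/sd device and be owned by oracle:dba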

In summary, we have successfully generated udev rules for attached storage. The script provided is very simple, and should work for all of the attached storage devices.

SymmetricDS Bi-Directional Replication

August 12, 2019

This post we are going to discuss a lower cost alternative to Oracle GoldenGate.  SymmetricDS.

SymmetricDS is open source Java software designed to run SQL replication.  It works well, but since it uses triggers on the database for all of its processing, it is going to have a significant impact on your system, and is likely to run into problems at high volumes.  I have not load tested SymmetricDS against GoldenGate, but the tuning options for SymmetricDS are limited, so it is unlikely to be able to compete with GoldenGate on performance.  However, SymmetricDS is indeed a viable and low cost option.

In our example, we are using the same configuration as our previous blog, https://dbakerber.wordpress.com/2019/08/09/goldengate-multi-master-with-integrated-replicat-and-apply/. The tnsnames entries for each database are here. Note that we don’t actually use the TNS entries, but having them available makes the connection information easy to find:

ORA183T =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.28)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora183t)
    )
  )

ORA121T =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.8)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora121t)
    )
  )

The SymmetricDS program is running on an OEL 7.6 server named oralin, with an IP address of 10.12.1.229.  The Oracle client has been installed on this server.  We have the bin directory for the client in the PATH. 

SymmetricDS is installed by simply unzipping the distribution file.  In our case, we unzipped the files to /u01/app/symmetricds, so that it sits in the same structure as our Oracle client.

In SymmetricDS, your sources are defined by ‘engine’ files, which give the connection information, including the driver and the connection username and password.    The sample engine files are in the samples directory. In this case, we took the store-001.properties file, renamed it to ora121t-001.properties, and copied it to the engines subdirectory.  The store-002.properties file was copied to ora183t-002.properties.  SymmetricDS expects the engine files to be in the engines subdirectory, and will automatically connect to the instances defined there.

Below are our two engine files:

[oracle@oralin engines]$ cat ora121t-001.properties
#
# Licensed to JumpMind Inc under one or more contributor
# license agreements.  See the NOTICE file distributed
# with this work for additional information regarding
# copyright ownership.  JumpMind Inc licenses this file
# to you under the GNU General Public License, version 3.0 (GPLv3)
# (the "License"); you may not use this file except in compliance
# with the License.
#
# You should have received a copy of the GNU General Public License,
# version 3.0 (GPLv3) along with this library; if not, see
# <http://www.gnu.org/licenses/>.
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

# Friendly name to refer to this node from command line
engine.name=ora121t-001

# The class name for the JDBC Driver
db.driver=oracle.jdbc.driver.OracleDriver

# The JDBC URL used to connect to the database
db.url=jdbc:oracle:thin:@10.12.1.8:1521:ora121t

# The database user that SymmetricDS should use.
db.user=hrds

# The database password
db.password=tiger

# This node will contact the root node’s sync.url to register itself.
registration.url=http://10.12.1.229:31415/sync/ora121t-001
sync.url=http://10.12.1.229:31415/sync/ora121t-001

# Node group this node belongs to, which defines what it will sync with

# Must match the sym_node_group configuration in database.
group.id=ora121t

# External ID for this node, which is any unique identifier.
external.id=001

# How to run routing (in millis), which puts changes into batches.
job.routing.period.time.ms=5000

# How often to run push (in millis), which sends changes to other nodes.
job.push.period.time.ms=10000

# How often to run pull (in millis), which receives changes from other nodes.
job.pull.period.time.ms=10000

[oracle@oralin engines]$ cat ora183t-002.properties
#
# Licensed to JumpMind Inc under one or more contributor
# license agreements.  See the NOTICE file distributed
# with this work for additional information regarding
# copyright ownership.  JumpMind Inc licenses this file
# to you under the GNU General Public License, version 3.0 (GPLv3)
# (the "License"); you may not use this file except in compliance
# with the License.
#
# You should have received a copy of the GNU General Public License,
# version 3.0 (GPLv3) along with this library; if not, see
# <http://www.gnu.org/licenses/>.
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

# Friendly name to refer to this node from command line
engine.name=ora183t-002

# The class name for the JDBC Driver
db.driver=oracle.jdbc.driver.OracleDriver

# The JDBC URL used to connect to the database
db.url=jdbc:oracle:thin:@10.12.1.28:1521:ora183t

# The database user that SymmetricDS should use.
db.user=hrds

# The database password
db.password=tiger

# This node will contact the root node’s sync.url to register itself.
# registration.url=http://localhost:31415/sync/corp-000
registration.url=http://10.12.1.229:31415/sync/ora121t-001
sync.url=http://10.12.1.229:31415/sync/ora183t-002

# Node group this node belongs to, which defines what it will sync with who.
# Must match the sym_node_group configuration in database.
group.id=ora183t

# External ID for this node, which is any unique identifier you want to use.
external.id=002

# How to run routing (in millis), which puts changes into batches.
job.routing.period.time.ms=5000

# How often to run push (in millis), which sends changes to other nodes.
job.push.period.time.ms=10000

# How often to run pull (in millis), which receives changes from other nodes.
job.pull.period.time.ms=10000

Once the engine files are created, create the required symmetricDS tables by running symadmin to create the required tables in the 001 engine:

[oracle@oralin bin]$ ./bin/symadmin --engine ora121t-001 create-sym-tables

We are using the same schema, HRDS, both for replication and for running the SymmetricDS application.  This is not required, but if you use a different schema (e.g., symmetric), remember to grant it the appropriate privileges.  Below are the commands used to build the identity information for this configuration:

insert into hrds.sym_node (node_id,node_group_id,external_id,sync_enabled,sync_url,schema_version,symmetric_version, database_type,database_version,batch_to_send_count, batch_in_error_count,created_at_node_id)
 values ('000','ora121t','000',1,null,null,null,null,null,0,0,'000');
 insert into hrds.sym_node (node_id,node_group_id,external_id,sync_enabled,sync_url,schema_version,symmetric_version, database_type,database_version,batch_to_send_count, batch_in_error_count,created_at_node_id)
 values ('001','ora183t','001',1,null,null,null,null,null,0,0,'000');
 insert into hrds.sym_node_identity values ('001');
 insert into hrds.sym_node_security (node_id,node_password,registration_enabled,registration_time,initial_load_enabled,initial_load_time, created_at_node_id)
 values ('000','5d1c92bbacbe2edb9e1ca5dbb0e481',0,current_timestamp,0,current_timestamp,'000');
 insert into hrds.sym_node_security (node_id,node_password,registration_enabled,registration_time,initial_load_enabled,initial_load_time, created_at_node_id)
 values ('001','5d1c92bbacbe2edb9e1ca5dbb0e481',1,null,1,current_timestamp,'000');
 -- note:  the insert above sets initial_load_enabled to 1 to enable initial load.
 insert into sym_node_group (node_group_id) values ('ora121t');
 insert into sym_node_group (node_group_id) values ('ora183t');
 commit;
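As a quick sanity check of the identity data just inserted, run as the HRDS user (nothing more than a query against the tables populated above):

select node_id, node_group_id, external_id, sync_enabled from sym_node order by node_id;
select node_group_id from sym_node_group;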

Once the identity information is created as shown above, it’s time to define the replication.  Note that the tables are not particularly intuitive, so the summary below is from my previous blog to explain the table usage:

Node/Group Identification Concepts

The individual nodes are defined in the sym_node table.  This is just a list of the individual nodes involved in the replication process.

The sym_node_security table defines the connection and registration information among the nodes.

The sym_node_group table contains the names of the groups involved in replication. In this example, we only have one group.

The sym_node_group_link table defines how data is replicated between groups. If the data_event_action column is set to ‘P’ data is pushed on events. If it is set to ‘W’ the group waits for a pull event. Note that for this example, we only have one group (store) and thus both stores will push data.

Driver Table Concepts

At the lowest level, the sym_channel table allocates a channel for replication.  For large loads, multiple channels can be created. 

Next, the sym_trigger table defines what tables are replicated, what events trigger events, and what channel the event is sent to.

The sym_router table defines routers that route events between groups.  A router can have the same group as the source and destination, this would imply bi-directional replication is configured.

The sym_trigger_router table identifies what events are sent to which routers.

So, the basic concept is that you define a channel, define trigger event and assign it to a channel, define a router, and assign a trigger to a router.

Below are the commands we used to load data into the driver tables:

insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('ora121t', 'ora183t', 'W');
 insert into sym_channel
 (channel_id, processing_order, max_batch_size, enabled, description)
 values('ora121t_outbound', 1, 100000, 1, 'ora121t transaction');
 insert into sym_trigger (trigger_id, source_table_name, channel_id, last_update_time, create_time,sync_on_incoming_batch)
 values ('ora121t_outbound_tr', '*','ora121t_outbound',sysdate,sysdate,1);
 insert into sym_router
 (router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
 values('ora121t_2_ora183t', 'ora121t', 'ora183t', 'default',current_timestamp, current_timestamp);
 insert into sym_trigger_router
 (trigger_id,router_id,initial_load_order,last_update_time,create_time)
 values('ora121t_outbound_tr','ora121t_2_ora183t', 1, current_timestamp, current_timestamp);

commit;
 insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('ora183t', 'ora121t', 'P');
 insert into sym_channel
 (channel_id, processing_order, max_batch_size, enabled, description)
 values('ora183t_outbound', 1, 100000, 1, 'ora183t transaction');
 insert into sym_trigger (trigger_id, source_table_name, channel_id, last_update_time, create_time,sync_on_incoming_batch)
 values ('ora183t_outbound_tr', '*','ora183t_outbound',sysdate,sysdate,1);
 insert into sym_router
 (router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
 values('ora183t_2_ora121t', 'ora183t', 'ora121t', 'default',current_timestamp, current_timestamp);
 insert into sym_trigger_router
 (trigger_id,router_id,initial_load_order,last_update_time,create_time)
 values('ora183t_outbound_tr','ora183t_2_ora121t', 100, current_timestamp, current_timestamp);
 commit

Note that even though we use ‘*’, only the tables without a sym_ name prefix will be replicated.  SymmetricDS assumes that all table names beginning with sym_ are its own tables and does not replicate them.

Next, we start the program: ./bin/sym.  Watch the output of the program.  There may be errors.  Since these are Java errors, tracking down the cause and finding a fix may take time.

The error you are most likely to see is this:

[ora121t-001] – RegistrationUriHandler – ora183t:002:? was not allowed to register.

[ora183t-002] – RegistrationService – Waiting for registration to be accepted by the server. Registration is not open.

[ora183t-002] – RegistrationService – Could not register.  Sleeping for 20000ms before attempting again.

If you see the registration error, let it keep running, start a session on another terminal, and run this command:

./symadmin --engine ora121t-001 open-registration ora183t 002

The command above opens registration for the second engine.  If the command works, you will see this:

[ora183t-002] – DataLoaderService – Using registration URL of http://10.12.1.229:31415/sync/ora121t-001/registration?nodeGroupId=ora183t&externalId=002&syncURL=http%3A%2F%2F10.12.1.229%3A31415%2Fsync%2Fora183t-002&schemaVersion=%3F&databaseType=Oracle&databaseVersion=18.0&symmetricVersion=3.10.3&deploymentType=server&hostName=oralin.localdomain&ipAddress=10.12.1.229

[ora121t-001] – RegistrationService – Completed registration of node ora183t:002:002

[ora121t-001] – ConfigurationChangedDataRouter – About to refresh the cache of nodes because new configuration came through the data router

After successful registration, you will see output about creating a large number of triggers. As noted earlier, SymmetricDS creates triggers on the database to run replication.  The triggers you see being created are in response to the SQL statements we ran above to configure the replication.

Once again, follow the output for errors.

You will note that in the above process we only loaded the configuration into one database instance.  SymmetricDS then read the information in the configuration tables and wrote the required rows to the other instance.

At this time, the bi-directional replication is configured and running.  Run updates on individual tables and verify that the data is replicating properly.
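
A simple way to verify is to change a row on one instance and read it back on the other. The table and values below are placeholders only; use any replicated application table (anything without the sym_ prefix):

-- on ora121t (hypothetical application table)
update app.customers set status = 'TEST' where customer_id = 1;
commit;

-- a few seconds later, on ora183t
select status from app.customers where customer_id = 1;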

Goldengate Multi-Master with Integrated Replicat and Apply

August 9, 2019

In this article we are going to discuss the simplest bi-directional Goldengate configuration that we could build.

The environment is as follows:

Two Linux VMs running Oracle Enterprise Linux, x86_64.  One server is built in VirtualBox, using IP address 10.12.1.8, hostname gg121too.  The other is built in VMware Workstation, using IP address 10.12.1.28, hostname gg183too.

This sort of simple bi-directional setup is what might be built in preparation for upgrade to a more recent oracle version. In our case we are replicating from Oracle 12.1.0.2 to Oracle 18.3.0 and back.

Use Case:

Prior to the upgrade, all activity is on the 12.1.0.2 database, and all changes are replicated to the oracle 18.3.0 instance.  In the near future, we plan to upgrade to 18.3.0 by redirecting our application to the 18.3.0 instance.  Since Goldengate is set up as multi-master, after switching the apps to the 18.3.0 instance, all updates will be replicated to the 12.1.0.2 instance.  At a point when everyone is content that the 18.3 instance is running properly without problems, we will remove the Goldengate replication and remove the 12.1 database.

Install Oracle Goldengate on each server. In our case, we chose the Goldengate version appropriate to the oracle version, then chose /u01/gghome for the Goldengate home directory, and chose the option to start the manager after installation. Choosing the option to start the manager also means that the directories will be created.

On each server, the manager parameters are as follows:

PORT 7809
AUTOSTART ER *

The GLOBALS file has this line:

CHECKPOINTTABLE gguser.checkpoint

Next, we configured both database instances for Goldengate. First we create the Goldengate tablespace. Then we created the Goldengate users and granted them required privileges on the database:

SQL> create tablespace golden datafile '/u01/app/oradata/ORA121T/golden01.dbf' size 1024m autoextend on;

SQL> create user gguser identified by gguser;

SQL> grant dba, create session, select any table, unlimited tablespace, select any dictionary to gguser;

SQL> create user ggadmin identified by ggadmin;

SQL> grant dba, create session, select any table, unlimited tablespace, select any dictionary to ggadmin;

SQL> exec dbms_goldengate_auth.grant_admin_privilege('GGUSER');

SQL> exec dbms_goldengate_auth.grant_admin_privilege('GGADMIN');

SQL> alter user gguser default tablespace golden;

The GGUSER user will be used to run the replication process itself.  The GGADMIN user will be used to run administrative commands that do not get replicated between the two systems.  For example, should sequences get out of sync, manual intervention may be required to make sure the sequence is at the right value in each instance.
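
As an illustration of that kind of intervention, here is one way the GGADMIN user might bump a lagging sequence on one instance (the sequence name is hypothetical, and the increment needs to be large enough to clear the gap between the two instances):

-- connected as ggadmin on the instance whose sequence is behind
select last_number from dba_sequences
where sequence_owner = 'HR' and sequence_name = 'EMPLOYEES_SEQ';

alter sequence hr.employees_seq increment by 1000;
select hr.employees_seq.nextval from dual;
alter sequence hr.employees_seq increment by 1;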

In order for Goldengate to work properly, supplemental log data must be added to the database, force logging must be turned on, and flashback should be turned on.  In order to turn on flashback database, run these commands:

SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database flashback on;
SQL> alter database archivelog;  -- only if necessary
SQL> alter database open;
SQL> alter system set enable_goldengate_replication=TRUE scope=both;

Next, to turn on supplemental log data and force logging:

SQL> alter database add supplemental log data;
SQL> alter database force logging;
SQL> alter system switch logfile;

Note that the commands above should be run on both the source and destination databases.
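
To confirm the settings took effect on each database, a quick check against v$database should show everything turned on:

SQL> select log_mode, force_logging, flashback_on, supplemental_log_data_min from v$database;
SQL> show parameter enable_goldengate_replication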

Next we will set up our tnsnames.ora file.  While the entry names are the same on each server, the entries themselves are different.  Below are the tns entries for gg121too:

ORA121T =
  (DESCRIPTION =
    (SDU = 2097152)
    (ADDRESS =
      (PROTOCOL = BEQ)
      (PROGRAM = /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle)
      (ARGV0 = ora121t)
      (ARGS='(DESCRIPTION=(SDU=2097152)(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))')
      (ENVS='ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1,ORACLE_SID=ora121t')
    )
    (CONNECT_DATA =
      (SID = ora121t)
    )
  )

ORA183T =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.28)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora183t)
    )
  )

On gg183too, below are the TNS entries:

ORA121T =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.8)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora121t)
    )
  )

ORA183T =
  (DESCRIPTION =
    (SDU = 2097152)
    (ADDRESS =
      (PROTOCOL = BEQ)
      (PROGRAM = /u01/app/oracle/product/18.3.0/dbhome_1/bin/oracle)
      (ARGV0 = ora183t)
      (ARGS='(DESCRIPTION=(SDU=2097152)(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))')
      (ENVS='ORACLE_HOME=/u01/app/oracle/product/18.3.0/dbhome_1,ORACLE_SID=ora183t')
    )
    (CONNECT_DATA =
      (SID = ora183t)
    )
  )

Note that these entries are optimized to ensure the best performance for the local connection.

Next, connect to Goldengate and configure the credentials. 

On gg121too:

GGSCI (gg121too.localdomain) 1> add credentialstore
GGSCI (gg121too.localdomain) 2> alter credentialstore add user gguser@ora121t, password gguser alias ggulocal
GGSCI (gg121too.localdomain) 3> alter credentialstore add user ggadmin@ora121t, password ggadmin alias ggalocal
GGSCI (gg121too.localdomain) 4> alter credentialstore add user gguser@ora183t, password gguser alias gguremote
GGSCI (gg121too.localdomain) 5> alter credentialstore add user ggadmin@ora183t, password ggadmin alias ggaremote

Note what we are doing here.  The intent is that the local credentials always have the word ‘local’ in the name, and the remote system credentials will always have the word remote in the name.  Thus on gg183too, the corresponding commands are below:

GGSCI (gg183too.localdomain) 1> add credentialstore
GGSCI (gg183too.localdomain) 2> alter credentialstore add user gguser@ora183t, password gguser alias ggulocal
GGSCI (gg183too.localdomain) 3> alter credentialstore add user ggadmin@ora183t, password ggadmin alias ggalocal
GGSCI (gg183too.localdomain) 4> alter credentialstore add user gguser@ora121t, password gguser alias gguremote
GGSCI (gg183too.localdomain) 5> alter credentialstore add user ggadmin@ora121t, password ggadmin alias ggaremote

We will be copying the hr schema from the 12c to the 18c instance. Most often multiple schemas will need to be replicated, but we are keeping this to a simple case.

Next run these commands:

GGSCI (gg121too.localdomain)1> dblogin useridalias ggulocal
GGSCI (gg121too as gguser@ora121t) 2> add checkpointtable gguser.checkpoint
GGSCI (gg121too as gguser@ora121t) 6> add trandata hr.* cols *
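
If you want to verify that 'add trandata' created the supplemental log groups, they can be checked from SQL*Plus:

SQL> select table_name, log_group_name, log_group_type from dba_log_groups where owner = 'HR' order by table_name;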

At this point, Goldengate is configured on the source database.  Now create the checkpointtable on the destination database:

GGSCI (gg121too.localdomain)1> dblogin useridalias gguremote
GGSCI (gg121too as gguser@ora183t) 2> add checkpointtable gguser.checkpoint

Now we are ready to begin the configuration of the replication.

Connect back to the local instance and create the extract process:

GGSCI (gg121too) 4> dblogin useridalias ggulocal

GGSCI (gg121.localdomain as gguser@ora121t) 9> register extract capture1 database

2019-08-08 17:35:51  INFO    OGG-02003  Extract CAPTURE1 successfully registered with database at SCN 1804072. <== Make note of this SCN; it will be used later.

GGSCI (gg121.localdomain as gguser@ora121t) 10> add extract capture1, integrated tranlog, begin now
EXTRACT (Integrated) added.

GGSCI (gg121.localdomain as gguser@ora121t) 11> add rmttrail ./dirdat/rt, extract capture1
RMTTRAIL added.
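
Registering the extract creates an integrated capture process in the database. If you lose track of the SCN, it should be visible in dba_capture (the capture name is generated, so just select all rows):

SQL> select capture_name, status, first_scn, start_scn from dba_capture;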

Create the parameter file for the capture, and add these lines (edit params CAPTURE1):

extract capture1
useridalias ggulocal
rmthost 10.12.1.28, mgrport 7809
tranlogOptions IntegratedParams (max_sga_size 256), excludeuser ggadmin
DDL include mapped
rmttrail ./dirdat/rt
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
table hr.*;

Next, start the extract and monitor the file ggserr.log.  Fix any errors that occur:

GGSCI (gg121.localdomain as gguser@ora121t) 10> start extract capture1

Next, export the data in the schemas that are being replicated. In this case, we are using the HR schema.  The SCN from above is used as the flashback scn for the datapump export:

expdp directory=homedir dumpfile=hrscn.dmp flashback_scn=1804072 schemas=HR

Export: Release 12.1.0.2.0 – Production on Thu Aug 8 17:39:46 2019

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Username: system/password

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
FLASHBACK automatically enabled to preserve database integrity.
Starting “SYSTEM”.”SYS_EXPORT_SCHEMA_01″:  system/******** directory=homedir dumpfile=hrscn.dmp flashback_scn=1804071 schemas=HR
Estimate in progress using BLOCKS method…
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 512 KB
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
. . exported “HR”.”COUNTRIES”                            6.906 KB      25 rows
. . exported “HR”.”DEPARTMENTS”                          7.570 KB      27 rows
. . exported “HR”.”EMPLOYEES”                            17.53 KB     107 rows
. . exported “HR”.”JOBS”                                 7.554 KB      19 rows
. . exported “HR”.”JOB_HISTORY”                          7.640 KB      10 rows
. . exported “HR”.”LOCATIONS”                            8.882 KB      23 rows
. . exported “HR”.”REGIONS”                              5.992 KB       4 rows
. . exported “HR”.”TEST”                                 19.72 KB     123 rows
Master table “SYSTEM”.”SYS_EXPORT_SCHEMA_01″ successfully loaded/unloaded
******************************************************************************
Dump file set for SYSTEM.SYS_EXPORT_SCHEMA_01 is:
  /home/oracle/hrscn.dmp
Job “SYSTEM”.”SYS_EXPORT_SCHEMA_01″ successfully completed at Thu Aug 8 17:40:35 2019 elapsed 0 00:00:33

Next, copy the file hrscn.dmp to the destination system and import it into the database:

[oracle@gg183too gghome]$ impdp system/password directory=homedir dumpfile=hrscn.dmp

Import: Release 18.0.0.0.0 – Production on Thu Aug 8 16:55:49 2019
Version 18.7.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 – Production
Master table “SYSTEM”.”SYS_IMPORT_FULL_01″ successfully loaded/unloaded
Starting “SYSTEM”.”SYS_IMPORT_FULL_01″:  system/******** directory=homedir dumpfile=hrscn.dmp
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
. . imported “HR”.”COUNTRIES”                            6.906 KB      25 rows
. . imported “HR”.”DEPARTMENTS”                          7.570 KB      27 rows
. . imported “HR”.”EMPLOYEES”                            17.53 KB     107 rows
. . imported “HR”.”JOBS”                                 7.554 KB      19 rows
. . imported “HR”.”JOB_HISTORY”                          7.640 KB      10 rows
. . imported “HR”.”LOCATIONS”                            8.882 KB      23 rows
. . imported “HR”.”REGIONS”                              5.992 KB       4 rows
. . imported “HR”.”TEST”                                 19.72 KB     123 rows
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
Job “SYSTEM”.”SYS_IMPORT_FULL_01″ successfully completed at Thu Aug 8 16:56:42 2019 elapsed 0 00:00:44

Now create and start the replicat process.  Connect to ggsci on the destination system and create the process.

GGSCI (gg183too) 8> edit params apply1

replicat apply1
DBOPTIONS INTEGRATEDPARAMS(parallelism 2)
assumetargetdefs
useridalias ggulocal
handlecollisions
discardfile ./dirrpt/hrerr.dsc, purge
ddl include all
map hr.*, target hr.*;

GGSCI (gg183too) 2> add replicat apply1, integrated, exttrail ./dirdat/rt

GGSCI (gg183too) 2> start replicat apply1

Once again, monitor the file ggserr.log for any errors and fix.  At this point, one way replication from the 12.1 source to the 18.3 target should be working.  Run some updates to verify that everything is working.
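
For example, assuming the standard HR sample data is present, a change like the one below on the 12.1 source should show up on the 18.3 target within a few seconds:

-- on ora121t
SQL> update hr.employees set phone_number = '515.123.9999' where employee_id = 100;
SQL> commit;

-- on ora183t
SQL> select phone_number from hr.employees where employee_id = 100;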

Next, we configure the reverse replication in case we need it.  This is basically identical to the initial replication; the only difference is that we don't need to do an export/import.

GGSCI (gg183too) 2> dblogin useridalias ggulocal
GGSCI (gg183too as gguser@ora183t) 2> register extract capture2 database
GGSCI (gg183too as gguser@ora183t) 3> add extract capture2, integrated tranlog, begin now
GGSCI (gg183too as gguser@ora183t) 9> add rmttrail ./dirdat/xt, extract capture2

GGSCI (gg183too as gguser@ora183t) 3> edit params CAPTURE2

extract capture2
useridalias ggulocal
rmthost 10.12.1.8, mgrport 7809
tranlogOptions IntegratedParams (max_sga_size 256), excludeuser ggadmin
DDL include mapped
rmttrail ./dirdat/xt
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
table hr.*;

GGSCI (gg183too as gguser@ora183t) 3> start extract capture2

Monitor for errors in ggserr.log.

Then, go back to the source instance and configure the replicat process.

GGSCI (gg121too) 7> add replicat apply2, integrated, exttrail ./dirdat/xt
REPLICAT (Integrated) added.

GGSCI (gg121too) 7> edit params apply2

replicat apply2
  DBOPTIONS INTEGRATEDPARAMS(parallelism 2)
  AssumeTargetDefs
  DiscardFile ./dirrpt/rpdw.dsc, purge
  USERIDALIAS ggulocal
  DDLERROR DEFAULT DISCARD IGNOREMISSINGOBJECTS
  REPERROR (DEFAULT, EXCEPTION)
  MAP HR.*, target HR.*;

GGSCI (gg121too) 8> start replicat apply2

Once again, monitor the ggserr.log file for problems.  At this point, everything should be replicating.  Run an update to verify that changes are copying in both directions.
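
Besides watching ggserr.log, the health of the integrated capture and apply processes can also be checked from inside each database; both should report ENABLED:

SQL> select capture_name, status from dba_capture;
SQL> select apply_name, status from dba_apply;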

Multi-master replication is now configured and running between an Oracle 12.1 instance and an Oracle 18.3 instance.

Implementing the GI Standalone Agent for Oracle Goldengate

August 6, 2019

In my previous post, we discussed configuring multi-master replication in a clustered environment.  In this post, we will discuss configuring and using the standalone agent to manage the Goldengate processes using a virtual IP (VIP).

The first step is to remove the current agent configuration. The command to do this is:

ggclust11$ agctl remove goldengate GG_SOURCE

ggclust21$ agctl remove goldengate GG_TARGET

We will create a virtual IP that the Goldengate processes will use. The pump processes will be modified to connect to this VIP to transmit data to the replicat process.

In order to create the VIP, we first must determine the correct network.  It must be on the same network as the Virtual IPs and SCAN IPs for the database. The command below will show the correct network to use, though most likely as a DBA you would already know this:

crsctl status resource -p -attr NAME,USR_ORA_SUBNET -w "TYPE = ora.network.type" |sort | uniq

NAME=ora.net1.network
USR_ORA_SUBNET=10.12.1.0

For this process, we have chosen 10.12.1.201 for the VIP for the ggclust11/12 cluster.  The name of this VIP is ggclust1-vip.

The VIP for the ggclust21/22 cluster is ggclust2-vip, 10.12.1.202.

In order to configure a VIP, the agctl command must be run as root.  In our previous example, the commands were run as Oracle. The commands below were run to configure the VIPs on the two clusters:

ggclust11 # cd /u01/app/oracle/xag/bin

ggclust11 # ./agctl add goldengate GG_SOURCE \
--gg_home /u02/gghome \
--oracle_home /u01/app/oracle/product/19.3.0/dbhome_1 \
--db_services ora.demogg1.oggserv.svc --use_local_services \
--monitor_extracts hrext1,hrpump --monitor_replicats hrrep2 --filesystems ora.acfs.ggvol.acfs \
--network 1 \
--ip 10.12.1.201 --user oracle --group oinstall \
--nodes ggclust11,ggclust12

ggclust21 # cd /u01/app/oracle/xag/bin

ggclust21 # 

./agctl add goldengate GG_TARGET \
--gg_home /u02/gghome \
 --oracle_home /u01/app/oracle/product/19.3.0/dbhome_1 \
 --db_services ora.demogg2.oggserv.svc --use_local_services \
 --network 1 \
 --ip 10.12.1.202 \
 --monitor_extracts hr2ext1,hrpump2 --monitor_replicats hrrep1 --filesystems ora.acfs.ggdata.acfs \
 --user oracle --group oinstall \
 --nodes ggclust21,ggclust22

Next modify the parameter file for the pump (hrpump and hrpump2) process so that the remotehost line points to the newly created VIP.

For hrext1:

rmthost ggclust2-vip, mgrport 7809

For hr2ext1:

rmthost ggclust1-vip, mgrport 7809

Finally, start the goldengate replication on both clusters:

ggclust11$ agctl start goldengate GG_SOURCE

ggclust21$ agctl start goldengate GG_TARGET

At this point, if you want to relocate the replication, you can use the ‘agctl relocate goldengate’ command, for example:

agctl relocate goldengate GG_SOURCE --node ggclust12

Note that since you are moving the VIP each time you do this, the pump processes will abend.  You will either need to restart the pump processes manually, or configure them for autorestart. (AUTORESTART EXTRACT HRPUMP in the file mgr.prm).

To stop the processes, run the stop command:

agctl stop goldengate GG_SOURCE

To check the status of the Goldengate processes:

agctl status goldengate GG_SOURCE

At this point you have configured the most basic high availability for Goldengate in a cluster.  You will still need to add monitoring scripts and additional processes for handling the cases where Goldengate runs into problems.

Goldengate Multi-Master Replication on Oracle 19

August 6, 2019

In this article we are going to configure Oracle Goldengate for multi-master replication in a RAC environment.  Note that this is a complex environment, and your environment will have different problems and issues.  Here is my environment:

We have two 2-node RAC databases:

Cluster 1 has nodes ggclust11 and ggclust12.  Cluster 1 runs cluster database demogg1, with instances demogg11 and demogg12.

Cluster 2 has nodes ggclust21 and ggclust22 and runs cluster database demogg2, with instances demogg21 and demogg22.

We are going to use the hr schema from the sample schemas provided by Oracle for our replication. I have added a column to each table, last_updt_dt_tm, of data type timestamp, in order to allow us to effectively run multi-master replication. I also created a trigger for each table that fills in the last_updt_dt_tm column on update or insert.
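
Those triggers are not shown in this post; below is a minimal sketch of what one of them might look like, using hr.employees as the example and assuming the new column was added as a TIMESTAMP:

create or replace trigger hr.employees_upd_dt_trg
before insert or update on hr.employees
for each row
begin
  -- stamp the row so the USEMAX conflict resolution can pick the latest change
  :new.last_updt_dt_tm := systimestamp;
end;
/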

See my previous blog (https://dbakerber.wordpress.com/2019/07/31/oracle-software-installation-bugs/) for a couple of fun bugs I ran into while building the cluster and creating the databases.

On both databases the following commands were run as part of the initial setup:

Alter database archivelog;
Alter database flashback on;
Alter database add supplemental log data;
Alter database force logging;

I have configured dnsmasq for name resolution so that DNS works as it would in an enterprise-level environment. Below is the hosts file used for the two clusters:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.12.1.190 ggclust1-scan.localdomain ggclust1-scan 
10.12.1.191 ggclust1-scan.localdomain ggclust1-scan 
10.12.1.192 ggclust1-scan.localdomain ggclust1-scan 
10.12.1.193 ggclust11-vip.localdomain ggclust11-vip 
10.12.1.200 ggclust12-vip.localdomain ggclust12-vip 
10.12.1.195 ggclust2-scan.localdomain ggclust2-scan 
10.12.1.196 ggclust2-scan.localdomain ggclust2-scan 
10.12.1.197 ggclust2-scan.localdomain ggclust2-scan 
10.12.1.198 ggclust21-vip.localdomain ggclust21-vip 
10.12.1.199 ggclust22-vip.localdomain ggclust22-vip
10.12.1.238 ggclust11.localdomain ggclust11 
10.12.1.46 ggclust12.localdomain ggclust12 
10.12.1.37 ggclust21.localdomain ggclust21 
10.12.1.15 ggclust22.localdomain ggclust22
192.168.31.111 ggclust11-priv.localdomain ggclust11-priv 
192.168.31.112 ggclust12-priv.localdomain ggclust12-priv 
192.168.31.121 ggclust21-priv.localdomain ggclust21-priv 
192.168.31.122 ggclust22-priv.localdomain ggclust22-priv

After creating the cluster, the next step is to configure a service for Goldengate processes to connect to from the remote host. Since you typically only want Goldengate running on one node, this service is configured to be only available on one node at a time:
srvctl add service -db demogg1 -service oggserv -preferred demogg11 -available demogg12 
srvctl add service -db demogg2 -service oggserv -preferred demogg21 -available demogg22

Next, set up tnsnames entries for the Goldengate processes.  Samples are below, the local entry is the complex entry configured to optimize network performance.  The local entry was different for each server:

DEMOGG1 = 
(DESCRIPTION = 
(SDU=2097152) 
(ADDRESS = 
(PROTOCOL = BEQ) 
(PROGRAM = /u01/app/oracle/product/19.3.0/dbhome_1/bin/oracle) 
(ARGV0=demogg12) 
(ARGS='(DESCRIPTION=(SDU=2097152)(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))') 
(ENVS='ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1,ORACLE_SID=demogg12') 
) 
(CONNECT_DATA=(SID=demogg12)) 
)
DEMOGG2 = 
(DESCRIPTION = 
(SDU=2097152) 
(ADDRESS = (PROTOCOL = TCP)(HOST = ggclust2-scan)(PORT = 1521)) 
(CONNECT_DATA = 
(SERVER = DEDICATED) 
(SERVICE_NAME = oggserv) 
) 
) 

Next, I configured an ACFS share for the important data directories for Oracle Goldengate.  This was configured in ASMCA.

Next, Oracle Goldengate was installed on all 4 nodes.  Goldengate does not have a cluster installation option, and the home directory was set to /u02/gghome.  The ‘start manager’ option was chosen, then the following commands were run to put the key directories on the shared file system (/u02/ggdata):

cd /u02/gghome
rmdir dirchk 
rmdir dirprm 
rmdir dirdat 
rmdir dirtmp 
rmdir dircrd 
rmdir dirrpt 
cd /u02/ggdata
mkdir dirchk 
mkdir dirprm 
mkdir dirdat 
mkdir dirtmp 
mkdir dircrd 
mkdir dirrpt 
mkdir BR
cd /u02/gghome
ln -s /u02/ggdata/dirprm /u02/gghome/dirprm 
ln -s /u02/ggdata/dirchk /u02/gghome/dirchk 
ln -s /u02/ggdata/dirdat /u02/gghome/dirdat 
ln -s /u02/ggdata/dircrd /u02/gghome/dircrd 
ln -s /u02/ggdata/dirrpt /u02/gghome/dirrpt 

Note that you may want to have different parameter files based on the cluster node that is running the Goldengate process, in which case you would not put dirprm on shared storage.

At this point, the preparation is complete.  Now to configure the replication.  The schema we are using for this study is HR, password tiger.  The Goldengate user for the replication process is gguser.  We will also create a ggadmin user whose changes will not be replicated.

Next, create the individual users on the database:

1. Create the tablespace for goldengate, in our case golden:

create tablespace golden datafile '+DATA' size 1024m autoextend on;

2. Create the users.

create user gguser identified by gguser;
create user ggadmin identified by ggadmin;

3. Grant required privileges.

exec dbms_goldengate_auth.grant_admin_privilege('GGUSER');
exec dbms_goldengate_auth.grant_admin_privilege('GGADMIN');
grant connect, resource, alter any table to gguser, ggadmin;
@marker_setup.sql
@ddl_setup.sql
@role_setup.sql
@ddl_enable.sql
@sequence

Next, we configured the credentials we will be using.  All credentials are configured the same so that there is never any question which db to connect to.  The ggulocal useridalias is to the local database, and the gguremote user is always to the remote database:

add credentialstore
alter credentialstore add user gguser@demogg1, password gguser alias ggulocal 
alter credentialstore add user ggadmin@demogg1, password ggadmin alias ggalocal 
alter credentialstore add user gguser@demogg2, password gguser alias gguremote 
alter credentialstore add user ggadmin@demogg2, password ggadmin alias ggaremote 

Next, add the transaction data and create the checkpointtable.  Do this on both databases since we are going to do multi-master replication.

GGSCI (ggclust11.localdomain) dblogin useridalias ggulocal
Successfully logged into database.
GGSCI (ggclust11.localdomain as gguser@demogg11) 2> add trandata hr.* cols *

GGSCI (ggclust11.localdomain as gguser@demogg11) 3> add checkpointtable gguser.checkpoint

Next, ensure that the manager and GLOBALS are configured properly. These files are the same on both clusters:
edit params mgr


PORT 7809
Autostart ER *

Edit the GLOBALS file and configure it as follows:

checkpointtable gguser.checkpoint 
GGSCHEMA gguser 
ALLOWOUTPUTDIR /u02/ggdata 
ALLOWOUTPUTDIR /u02/gghome 

Exit ggsci and re-enter, then restart the mgr process.

Now we are ready to begin the actual building of the replication. On ggclust11 we will build the initial extract and pump processes.

First the extract. Edit the parameter file for the extract and add these lines (edit params hrext1):

EXTRACT hrext1 
setenv (ORACLE_SID='demogg1')  <== Note that this is just the entry in the oratab. 
useridalias ggulocal 
tranlogOptions IntegratedParams (max_sga_size 256), excludeuser ggadmin, excludetag 00 
CACHEMGR CACHEDIRECTORY /u02/ggdata/dirtmp 
BR BRDIR /u02/ggdata/BR 
DDL include all 
Exttrail /u02/ggdata/dirdat/hr 
LOGALLSUPCOLS 
UPDATERECORDFORMAT COMPACT 
table hr.*;

As we noted earlier, we will be using the ggadmin user for maintenance, which is why we have the 'excludeuser ggadmin' entry. All DML and DDL done by the ggadmin user will be ignored by Goldengate. The excludetag setting ensures that we don't run into problems with ping-pong updates: the replicat tags the changes it applies, and the extract skips anything carrying that tag. Note that the settings we are using for the excludetag are the default.

GGSCI (ggclust11.localdomain) Dblogin useridalias ggulocal 
GGSCI (ggclust11.localdomain) register extract hrext1 database # Note that this command will return an scn, eg 947747283. Take note of this scn for later use. 
GGSCI (ggclust11.localdomain) add extract hrext1 integrated tranlog, begin now 
GGSCI (ggclust11.localdomain) add exttrail /u02/ggdata/dirdat/hr, extract hrext1

Next we create the pump process to transmit the data.

Edit params hrpump:

extract hrpump 
useridalias gguremote 
CACHEMGR CACHEDIRECTORY /u02/ggdata/dirtmp 
ddl include all 
rmthost ggclust21-vip, mgrport 7809 
rmttrail /u02/ggdata/dirdat/ph 
table hr.*;

GGSCI (ggclust11.localdomain) 1> add extract hrpump, exttrailsource /u02/ggdata/dirdat/hr

GGSCI (ggclust11.localdomain) 1> add rmttrail /u02/ggdata/dirdat/ph, extract hrpump

At this point the extract and the pump process have been created. Now, run the datapump export using the scn from above:

ggclust11$ expdp system/password directory=dumpdir schemas=hr parallel=4 dumpfile=hrdump%u.dmp flashback_scn=947747283

Then, log into the target system, ggclust21 in our example, and run the import:

ggclust21$ impdp system/password directory=dumpdir parallel=4 dumpfile=hrdump%u.dmp

Now, go back to the source system, ggclust11, and start the extracts:

GGSCI (ggclust11.localdomain) start extract hrext1
GGSCI (ggclust11.localdomain) start extract hrpump

Ensure that you monitor the file ggserr.log for errors. It is not uncommon to have simple typos that need to be fixed.

Next connect to ggclust21 and create the replicat process.  Replicat is the term Oracle Goldengate uses for the process that writes the data to the target system.  Note the resolveconflict section. Since we plan to set up multi-master replication, basic conflict resolution is required.  The setup below will handle the simple problems.  There will be conflicts that require manual resolution in any true bi-directional environment.

GGSCI (ggclust21.localdomain) edit params hrrep1

replicat hrrep1
SETENV(ORACLE_SID='demogg2')
DBOPTIONS INTEGRATEDPARAMS(parallelism 2), settag 00
assumetargetdefs
useridalias ggulocal
handlecollisions
discardfile /u02/ggdata/dirrpt/hrerr.dsc, purge
ddl include all
map hr.*, target hr.*,
RESOLVECONFLICT (UPDATEROWEXISTS, (DEFAULT, USEMAX (last_updt_dt_tm))),
RESOLVECONFLICT (DELETEROWEXISTS, (DEFAULT, OVERWRITE)),
RESOLVECONFLICT (INSERTROWEXISTS, (DEFAULT, USEMAX (last_updt_dt_tm))),
RESOLVECONFLICT (UPDATEROWMISSING, (DEFAULT, OVERWRITE)),
RESOLVECONFLICT (DELETEROWMISSING, (DEFAULT, DISCARD)),
MAPEXCEPTION
(TARGET ggadmin.EXCEPTIONS, EXCEPTIONSONLY, INSERTALLRECORDS, COLMAP
(
excp_date = @DATENOW(),
rep_name = @GETENV ('GGENVIRONMENT', 'GROUPNAME'),
table_name = @GETENV ('GGHEADER', 'TABLENAME'),
errno = @GETENV ('LASTERR', 'DBERRNUM'),
errtype = @GETENV ('LASTERR', 'ERRTYPE'),
optype = @GETENV ('LASTERR', 'OPTYPE'),
transind = @GETENV ('GGHEADER', 'TRANSACTIONINDICATOR'),
transimgind = @GETENV ('GGHEADER', 'BEFOREAFTERINDICATOR'),
committimestamp = @GETENV ('GGHEADER', 'COMMITTIMESTAMP'),
reccsn = @GETENV ('TRANSACTION', 'CSN'),
recseqno = @GETENV ('RECORD', 'FILESEQNO'),
recrba = @GETENV ('RECORD', 'FILERBA'),
rectranspos = @GETENV ('RECORD', 'RSN'),
reclength = @GETENV ('GGHEADER', 'RECORDLENGTH'),
logrba = @GETENV ('GGHEADER', 'LOGRBA'),
logposition = @GETENV ('GGHEADER', 'LOGPOSITION'),
grouptype = @GETENV ('GGENVIRONMENT', 'GROUPTYPE'),
filename = @GETENV ('GGFILEHEADER', 'FILENAME'),
fileno = @GETENV ('GGFILEHEADER', 'FILESEQNO'),
srcrowid = @GETENV ('TRANSACTION', 'CSN'),
srcdbcharset = @GETENV ('GGFILEHEADER', 'DBCHARSET'),
replag = @GETENV ('LAG', 'SEC'),
cnt_cdr_conflicts = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_CONFLICTS'), cnt_cdr_resolutions = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_RESOLUTIONS_SUCCEEDED'),
cnt_cdr_failed = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_RESOLUTIONS_FAILED')
)
);
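
The MAPEXCEPTION clause above writes conflict details into ggadmin.EXCEPTIONS, so that table must exist on the target before the replicat starts. Its definition is not shown here; below is a minimal sketch that matches the column list in the COLMAP (the data types are assumptions, size them to suit your environment):

create table ggadmin.exceptions (
  excp_date            timestamp,
  rep_name             varchar2(30),
  table_name           varchar2(100),
  errno                number,
  errtype              varchar2(30),
  optype               varchar2(30),
  transind             varchar2(30),
  transimgind          varchar2(30),
  committimestamp      varchar2(50),
  reccsn               number,
  recseqno             number,
  recrba               number,
  rectranspos          number,
  reclength            number,
  logrba               number,
  logposition          number,
  grouptype            varchar2(30),
  filename             varchar2(200),
  fileno               number,
  srcrowid             varchar2(50),
  srcdbcharset         varchar2(50),
  replag               number,
  cnt_cdr_conflicts    number,
  cnt_cdr_resolutions  number,
  cnt_cdr_failed       number
);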

Now let's get the replicat going:

GGSCI (ggclust21.localdomain)dblogin useridalias ggulocal
GGSCI (ggclust21.localdomain ggrep21) add replicat hrrep1, integrated, exttrail /u02/ggdata/dirdat/ph # note that this exttrail corresponds to the rmttrail on the pump extract
GGSCI (ggclust21.localdomain ggrep21) register replicat hrrep1 database
GGSCI (ggclust21.localdomain ggrep21) start replicat hrrep1

Once again, monitor the file ggserr.log and correct any problems that appear.

Once you have verified that the extracts and replicat are running properly, insert a row into the source schema and verify that it gets replicated to the target. Once the row replicates, you know that the processes are working properly.
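
A concrete test, assuming the standard HR sample data: insert a throwaway row on the source, confirm it arrives on the target, then delete it from the source and confirm the delete replicates as well (the trigger created earlier fills in last_updt_dt_tm automatically):

-- on demogg1 (source)
SQL> insert into hr.regions (region_id, region_name) values (99, 'GG test');
SQL> commit;

-- on demogg2 (target)
SQL> select region_id, region_name from hr.regions where region_id = 99;

-- clean up from the source once verified
SQL> delete from hr.regions where region_id = 99;
SQL> commit;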

Now, it's time to set up the replication in the other direction. The setup is identical, with the exception that we do not need to run a datapump export/import, since the source database already holds the data.  Below are the commands to set up the extract and pump processes.  These are set up on the target database this time:

GGSCI (ggclust21.localdomain ggrep21) register extract hr2ext1 database
GGSCI (ggclust21.localdomain ggrep21) add extract hr2ext1, integrated tranlog, begin now
GGSCI (ggclust21.localdomain ggrep21) add exttrail /u02/ggdata/dirdat/xr, extract hr2ext1

The parameter file for the extract (edit params hr2ext1):

EXTRACT hr2ext1
setenv (ORACLE_SID='demogg2')
useridalias ggulocal
tranlogOptions IntegratedParams (max_sga_size 256), excludeuser ggadmin, excludetag 00
CACHEMGR CACHEDIRECTORY /u02/ggdata/dirtmp
BR BRDIR /u02/ggdata/BR
DDL include all
Exttrail /u02/ggdata/dirdat/xr
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
table hr.*;

The parameter file for the pump (edit params hrpump2):

extract hrpump2
useridalias gguremote
ddl include all
rmthost ggclust11-vip, mgrport 7809
rmttrail /u02/ggdata/dirdat/xh
table hr.*;

GGSCI (ggclust21.localdomain ggrep21) add extract hrpump2, exttrailsource /u02/ggdata/dirdat/xr
GGSCI (ggclust21.localdomain ggrep21) add rmttrail /u02/ggdata/dirdat/xh, extract hrpump2
GGSCI (ggclust21.localdomain ggrep21) start extract hr2ext1
GGSCI (ggclust21.localdomain ggrep21) start extract hrpump2

Now monitor the ggserr.log for errors on the extract process.  If there are no errors, return to the source system (ggclust11) and build the replicat process:

GGSCI (ggclust11.localdomain ggrep1) register replicat hrrep2 database
GGSCI (ggclust11.localdomain ggrep1) add replicat hrrep2, integrated, exttrail /u02/ggdata/dirdat/xh

The parameter file for the replicat (edit params hrrep2):

replicat hrrep2
 SETENV(ORACLE_SID='demogg1')
 DBOPTIONS INTEGRATEDPARAMS(parallelism 2)
 AssumeTargetDefs
 DiscardFile /u02/ggdata/dirrpt/rpdw.dsc, purge
 USERIDALIAS ggulocal
 DDLERROR DEFAULT DISCARD IGNOREMISSINGOBJECTS
 REPERROR (DEFAULT, EXCEPTION)
 MAP HR.*, target HR.*,
 RESOLVECONFLICT (UPDATEROWEXISTS, (DEFAULT, USEMAX (last_updt_dt_tm))),
 RESOLVECONFLICT (DELETEROWEXISTS, (DEFAULT, OVERWRITE)),
 RESOLVECONFLICT (INSERTROWEXISTS, (DEFAULT, USEMAX (last_updt_dt_tm))),
 RESOLVECONFLICT (UPDATEROWMISSING, (DEFAULT, OVERWRITE)),
 RESOLVECONFLICT (DELETEROWMISSING, (DEFAULT, DISCARD)),
 MAPEXCEPTION
 (TARGET ggadmin.EXCEPTIONS, EXCEPTIONSONLY, INSERTALLRECORDS, COLMAP
 (
 excp_date = @DATENOW(),
 rep_name = @GETENV ('GGENVIRONMENT', 'GROUPNAME'),
 table_name = @GETENV ('GGHEADER', 'TABLENAME'),
 errno = @GETENV ('LASTERR', 'DBERRNUM'),
 errtype = @GETENV ('LASTERR', 'ERRTYPE'),
 optype = @GETENV ('LASTERR', 'OPTYPE'),
 transind = @GETENV ('GGHEADER', 'TRANSACTIONINDICATOR'),
 transimgind = @GETENV ('GGHEADER', 'BEFOREAFTERINDICATOR'),
 committimestamp = @GETENV ('GGHEADER', 'COMMITTIMESTAMP'),
 reccsn = @GETENV ('TRANSACTION', 'CSN'),
 recseqno = @GETENV ('RECORD', 'FILESEQNO'),
 recrba = @GETENV ('RECORD', 'FILERBA'),
 rectranspos = @GETENV ('RECORD', 'RSN'),
 reclength = @GETENV ('GGHEADER', 'RECORDLENGTH'),
 logrba = @GETENV ('GGHEADER', 'LOGRBA'),
 logposition = @GETENV ('GGHEADER', 'LOGPOSITION'),
 grouptype = @GETENV ('GGENVIRONMENT', 'GROUPTYPE'),
 filename = @GETENV ('GGFILEHEADER', 'FILENAME'),
 fileno = @GETENV ('GGFILEHEADER', 'FILESEQNO'),
 srcrowid = @GETENV ('TRANSACTION', 'CSN'),
 srcdbcharset = @GETENV ('GGFILEHEADER', 'DBCHARSET'),
 replag = @GETENV ('LAG', 'SEC'),
 cnt_cdr_conflicts = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_CONFLICTS'), cnt_cdr_resolutions = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_RESOLUTIONS_SUCCEEDED'),
 cnt_cdr_failed = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_RESOLUTIONS_FAILED')
 )
 );
GGSCI (ggclust11.localdomain ggrep1) start replicat hrrep2

At this point, the bi-directional (multi-master) replication should be working properly.  Insert a row into each database and verify that it gets copied.  Then delete it and make sure it is deleted.

Some notes:

The exttrail for the replicat is always the same as the rmttrail for the pump process. This is /u02/ggdata/dirdat/ph on demogg1 (for hrpump/hrext1).

The exttrailsource for the pump process is the same as the exttrail for the extract process. This is /u02/ggdata/dirdat/hr on demogg1 (from hrext1).

The last step in this process is to install and configure the Grid Infrastructure standalone agent.  My experimentation with the agent shows a decent amount of potential, but substantial customization in the form of script writing is going to be required to make it really useful.  This will probably be the subject of another blog post, but for this post I will just cover the installation and configuration.

This agent is used to manage the Goldengate processes. The agent is downloaded from here: http://www.oracle.com/technetwork/database/database-technologies/clusterware/downloads/xag-agents-downloads-3636484.html

After downloading the agent, install it in a directory that is outside of both the database and GI homes. I chose /u01/oracle/xag. Just unzip the file and run the command below to install the agent. Run it once per cluster; the setup installs the agent on both cluster nodes:

$./xagsetup.sh --install --directory /u01/oracle/xag --all_nodes

Now, with the agent installed, we put Goldengate under the control of the agent. The first step of this process is to put the standalone agent bin directory in your path ahead of all other Oracle software. This ensures that the agent commands run against the correct agent. Then we put Goldengate under agent control (on ggclust11):

./agctl add goldengate GG_SOURCE \
--gg_home /u02/gghome \
--oracle_home /u01/app/oracle/product/19.3.0/dbhome_1 \
--db_services ora.demogg1.oggserv.svc --use_local_services \
--monitor_extracts hrext1,hrpump --monitor_replicats hrrep2 \
--filesystems ora.acfs.ggvol.acfs \
--user oracle --group oinstall

The command for ggclust21:

./agctl add goldengate GG_TARGET \
 --gg_home /u02/gghome \
  --oracle_home /u01/app/oracle/product/19.3.0/dbhome_1 \
  --db_services ora.demogg2.oggserv.svc --use_local_services \
  --monitor_extracts hr2ext1,hrpump2 --monitor_replicats hrrep1 --filesystems ora.acfs.ggdata.acfs \
  --user oracle --group oinstall

As the last step in this process, add this line to the GLOBALS file on all 4 nodes, then stop and restart all the processes:

XAGENABLE

With the Goldengate processes managed by the agent, you should see this:

GGSCI (ggclust21.localdomain) 2> info all

Program     Status      Group       Lag at Chkpt    Time Since Chkpt    XAG

MANAGER     RUNNING                                                     MANAGED /xag.GG_TARGET.goldengate/ggclust21
EXTRACT     RUNNING     HREXT1      00:00:00        00:00:02            NON-CRITICAL
EXTRACT     RUNNING     HRPUMP      00:00:00        00:00:04            NON-CRITICAL
REPLICAT    RUNNING     HRREP2      00:00:00        00:00:00            NON-CRITICAL

At this point, Multi-master goldengate is completely configured in a cluster. Remember that if you want to move the Goldengate process to run on the other node in the cluster, you will also need to move the service configured for Goldengate in this step:

srvctl add service -db demogg1 -service oggserv -preferred demogg11 -available demogg12 

Please note that manual intervention will be required should you want to run Goldengate on the second node of the cluster. I would recommend writing scripts to move the Goldengate processes to the other cluster node. For example, you will need to change the destination of the pump process on the remote cluster to point to ggclust12-vip instead of ggclust11-vip. For that reason, you may find it expedient to create a virtual IP on each cluster to be used by the Goldengate processes. The agent can be used for that purpose, but remember that the agent only works on one cluster; it will not affect the remote cluster.  You will need to run agent commands on both clusters in order to move the Goldengate processes.