SymmetricDS Bi-Directional Replication

August 12, 2019

In this post we are going to discuss a lower cost alternative to Oracle GoldenGate: SymmetricDS.

SymmetricDS is open source Java software designed to run SQL replication.  It works well, but since it uses triggers on the database for all of its processing, it has a significant impact on your system and is likely to run into problems at high volumes.  I have not load tested SymmetricDS against GoldenGate, but the tuning options for SymmetricDS are limited, so it is unlikely to compete with GoldenGate on performance.  However, SymmetricDS is indeed a viable and low cost option.

In our example, we are using the same configuration as our previous blog, https://dbakerber.wordpress.com/2019/08/09/goldengate-multi-master-with-integrated-replicat-and-apply/ .  The tnsnames entries for each database are below.  Note that we don't actually use the TNS entries, but having them available makes the connection information easy to find:

ORA183T =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.28)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora183t)
    )
  )

ORA121T =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.8)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora121t)
    )
  )

The SymmetricDS program is running on an OEL 7.6 server named oralin, with an IP address of 10.12.1.229.  The Oracle client has been installed on this server.  We have the bin directory for the client in the PATH. 

SymmetricDS is installed by simply unzipping the distribution file.  In our case, we unzipped the files to /u01/app/symmetricds, so that it sits in the same structure as our Oracle client.

In SymmetricDS, your sources are defined by 'engine' files, which give the connection information, including the driver and the connection username and password.  The sample engine files are in the samples directory.  In this case, we took the store-001.properties file, renamed it to ora121t-001.properties, and copied it to the engines subdirectory.  The store-002.properties file was copied to ora183t-002.properties.  SymmetricDS expects the engine files to be in the engines subdirectory, and will automatically connect to the instances defined there.

Below are our two engine files:

[oracle@oralin engines]$ cat ora121t-001.properties
#
# Licensed to JumpMind Inc under one or more contributor
# license agreements.  See the NOTICE file distributed
# with this work for additional information regarding
# copyright ownership.  JumpMind Inc licenses this file
# to you under the GNU General Public License, version 3.0 (GPLv3)
# (the "License"); you may not use this file except in compliance
# with the License.
#
# You should have received a copy of the GNU General Public License,
# version 3.0 (GPLv3) along with this library; if not, see
# <http://www.gnu.org/licenses/>.
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

# Friendly name to refer to this node from command line
engine.name=ora121t-001

# The class name for the JDBC Driver
db.driver=oracle.jdbc.driver.OracleDriver

# The JDBC URL used to connect to the database
db.url=jdbc:oracle:thin:@10.12.1.8:1521:ora121t

# The database user that SymmetricDS should use.
db.user=hrds

# The database password
db.password=tiger

# This node will contact the root node's sync.url to register itself.
registration.url=http://10.12.1.229:31415/sync/ora121t-001
sync.url=http://10.12.1.229:31415/sync/ora121t-001

# Node group this node belongs to, which defines what it will sync with

# Must match the sym_node_group configuration in database.
group.id=ora121t

# External ID for this node, which is any unique identifier.
external.id=001

# How to run routing (in millis), which puts changes into batches.
job.routing.period.time.ms=5000

# How often to run push (in millis), which sends changes to other nodes.
job.push.period.time.ms=10000

# How often to run pull (in millis), which receives changes from other nodes.
job.pull.period.time.ms=10000

[oracle@oralin engines]$ cat ora183t-002.properties
#
# Licensed to JumpMind Inc under one or more contributor
# license agreements.  See the NOTICE file distributed
# with this work for additional information regarding
# copyright ownership.  JumpMind Inc licenses this file
# to you under the GNU General Public License, version 3.0 (GPLv3)
# (the "License"); you may not use this file except in compliance
# with the License.
#
# You should have received a copy of the GNU General Public License,
# version 3.0 (GPLv3) along with this library; if not, see
# <http://www.gnu.org/licenses/>.
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

# Friendly name to refer to this node from command line
engine.name=ora183t-002

# The class name for the JDBC Driver
db.driver=oracle.jdbc.driver.OracleDriver

# The JDBC URL used to connect to the database
db.url=jdbc:oracle:thin:@10.12.1.28:1521:ora183t

# The database user that SymmetricDS should use.
db.user=hrds

# The database password
db.password=tiger

# This node will contact the root node's sync.url to register itself.
# registration.url=http://localhost:31415/sync/corp-000
registration.url=http://10.12.1.229:31415/sync/ora121t-001
sync.url=http://10.12.1.229:31415/sync/ora183t-002

# Node group this node belongs to, which defines what it will sync with who.
# Must match the sym_node_group configuration in database.
group.id=ora183t

# External ID for this node, which is any unique identifier you want to use.
external.id=002

# How to run routing (in millis), which puts changes into batches.
job.routing.period.time.ms=5000

# How often to run push (in millis), which sends changes to other nodes.
job.push.period.time.ms=10000

# How often to run pull (in millis), which receives changes from other nodes.
job.pull.period.time.ms=10000

Once the engine files are created, create the required SymmetricDS tables by running symadmin against the 001 engine:

[oracle@oralin bin]$ ./bin/symadmin --engine ora121t-001 create-sym-tables
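
To confirm the tables were built, an optional sanity check (assuming the default SymmetricDS table names) can be run as the HRDS user on ora121t:

SQL> select table_name from user_tables where table_name like 'SYM_%' order by table_name;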

We are using the same schema, HRDS, both for replication and for running the SymmetricDS application.  This is not required, but if you use a separate schema (e.g., symmetric), remember to grant it the appropriate privileges.  Below are the commands used to build the identity information for this configuration:

insert into hrds.sym_node (node_id,node_group_id,external_id,sync_enabled,sync_url,schema_version,symmetric_version, database_type,database_version,batch_to_send_count, batch_in_error_count,created_at_node_id)
values ('000','ora121t','000',1,null,null,null,null,null,0,0,'000');
insert into hrds.sym_node (node_id,node_group_id,external_id,sync_enabled,sync_url,schema_version,symmetric_version, database_type,database_version,batch_to_send_count, batch_in_error_count,created_at_node_id)
values ('001','ora183t','001',1,null,null,null,null,null,0,0,'000');
insert into hrds.sym_node_identity values ('001');
insert into hrds.sym_node_security (node_id,node_password,registration_enabled,registration_time,initial_load_enabled,initial_load_time, created_at_node_id)
values ('000','5d1c92bbacbe2edb9e1ca5dbb0e481',0,current_timestamp,0,current_timestamp,'000');
insert into hrds.sym_node_security (node_id,node_password,registration_enabled,registration_time,initial_load_enabled,initial_load_time, created_at_node_id)
values ('001','5d1c92bbacbe2edb9e1ca5dbb0e481',1,null,1,current_timestamp,'000');
-- note: the insert above sets initial_load_enabled to 1 to enable the initial load.
insert into sym_node_group (node_group_id) values ('ora121t');
insert into sym_node_group (node_group_id) values ('ora183t');
commit;

Once the identity information is created as shown above, it's time to define the replication.  Note that the tables are not particularly intuitive, so the summary below, taken from my previous blog, explains the table usage:

Node/Group Identification Concepts

The individual nodes are defined in the sym_node table.  This is just a list of the individual nodes involved in the replication process.

The sym_node_security table creates the connection information among the nodes.

The sym_node_group table contains the names of the groups involved in replication. In this example we have two groups, ora121t and ora183t.

The sym_node_group_link table defines how data is replicated between groups. If the data_event_action column is set to 'P', data is pushed on events. If it is set to 'W', the group waits for a pull event. In this example, the ora121t group waits ('W') and the ora183t group pushes ('P'), matching the sym_node_group_link inserts below.

Driver Table Concepts

At the lowest level, the sym_channel table allocates a channel for replication.  For large loads, multiple channels can be created. 

Next, the sym_trigger table defines which tables are replicated, which events fire the trigger, and which channel the event is sent to.

The sym_router table defines routers that route events between groups.  A router can have the same group as both source and destination, which would imply that bi-directional replication is configured within that group.

The sym_trigger_router table identifies what events are sent to which routers.

So the basic concept is that you define a channel, define a trigger event and assign it to a channel, define a router, and assign the trigger to the router.

Below are the commands we used to load data into the driver tables:

insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('ora121t', 'ora183t', 'W');
insert into sym_channel
(channel_id, processing_order, max_batch_size, enabled, description)
values('ora121t_outbound', 1, 100000, 1, 'ora121t transaction');
insert into sym_trigger (trigger_id, source_table_name, channel_id, last_update_time, create_time,sync_on_incoming_batch)
values ('ora121t_outbound_tr', '*','ora121t_outbound',sysdate,sysdate,1);
insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('ora121t_2_ora183t', 'ora121t', 'ora183t', 'default',current_timestamp, current_timestamp);
insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('ora121t_outbound_tr','ora121t_2_ora183t', 1, current_timestamp, current_timestamp);
commit;

insert into sym_node_group_link (source_node_group_id, target_node_group_id, data_event_action) values ('ora183t', 'ora121t', 'P');
insert into sym_channel
(channel_id, processing_order, max_batch_size, enabled, description)
values('ora183t_outbound', 1, 100000, 1, 'ora183t transaction');
insert into sym_trigger (trigger_id, source_table_name, channel_id, last_update_time, create_time,sync_on_incoming_batch)
values ('ora183t_outbound_tr', '*','ora183t_outbound',sysdate,sysdate,1);
insert into sym_router
(router_id,source_node_group_id,target_node_group_id,router_type,create_time,last_update_time)
values('ora183t_2_ora121t', 'ora183t', 'ora121t', 'default',current_timestamp, current_timestamp);
insert into sym_trigger_router
(trigger_id,router_id,initial_load_order,last_update_time,create_time)
values('ora183t_outbound_tr','ora183t_2_ora121t', 100, current_timestamp, current_timestamp);
commit;

Note that even though we use ‘*’, only the tables without a sym_ name prefix will be replicated.  SymmetricDS assumes that all table names beginning with sym_ are its own tables and does not replicate them.

Next, start the program with ./bin/sym and watch its output.  There may be errors, and since these are Java errors, tracking down the cause and finding a fix may take time.

The error you are most likely to see is this:

[ora121t-001] - RegistrationUriHandler - ora183t:002:? was not allowed to register.

[ora183t-002] - RegistrationService - Waiting for registration to be accepted by the server. Registration is not open.

[ora183t-002] - RegistrationService - Could not register.  Sleeping for 20000ms before attempting again.

If you see the registration error, just let it run, then start a session on another terminal and run this command:

./symadmin --engine ora121t-001 open-registration ora183t 002

The command above opens registration for the second engine.  If the command works, you will see this:

[ora183t-002] – DataLoaderService – Using registration URL of http://10.12.1.229:31415/sync/ora121t-001/registration?nodeGroupId=ora183t&externalId=002&syncURL=http%3A%2F%2F10.12.1.229%3A31415%2Fsync%2Fora183t-002&schemaVersion=%3F&databaseType=Oracle&databaseVersion=18.0&symmetricVersion=3.10.3&deploymentType=server&hostName=oralin.localdomain&ipAddress=10.12.1.229

[ora121t-001] - RegistrationService - Completed registration of node ora183t:002:002

[ora121t-001] - ConfigurationChangedDataRouter - About to refresh the cache of nodes because new configuration came through the data router

After successful registration, you will see output about creating a large number of triggers. As noted earlier, SymmetricDS creates triggers on the database to run replication.  The triggers you see being created are in response to the SQL statements we ran above to configure the replication.
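
If you want to see what SymmetricDS created on each database, a query along these lines will list the triggers (this assumes the default SYM prefix for generated trigger names):

SQL> select trigger_name, table_name, status from user_triggers where trigger_name like 'SYM%' order by table_name;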

Once again, follow the output for errors.

You will note that in the above process we only ran the configuration inserts on one database instance.  SymmetricDS then read the information in the configuration tables and wrote the required information to the other instance.

At this time, the bi-directional replication is configured and running.  Run updates on individual tables and verify that the data is replicating properly.
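
As a hedged example of such a test, using a stand-in table name (substitute any replicated table in the HRDS schema and its real columns):

-- On ora121t:
insert into hrds.regions (region_id, region_name) values (99, 'SymmetricDS test');
commit;
-- Wait for the routing/push/pull intervals (5-10 seconds in our engine files), then on ora183t:
select * from hrds.regions where region_id = 99;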


Goldengate Multi-Master with Integrated Replicat and Apply

August 9, 2019

In this article we are going to discuss the simplest bi-directional Goldengate configuration that we could build.

The environment is as follows:

Two Linux VMs running Oracle Enterprise Linux, x86_64.  One server is built in VirtualBox, using IP address 10.12.1.8, hostname gg121too.  The other is built in VMware Workstation, with IP address 10.12.1.28, hostname gg183too.

This sort of simple bi-directional setup is what might be built in preparation for upgrade to a more recent oracle version. In our case we are replicating from Oracle 12.1.0.2 to Oracle 18.3.0 and back.

Use Case:

Prior to the upgrade, all activity is on the 12.1.0.2 database, and all changes are replicated to the Oracle 18.3.0 instance.  In the near future, we plan to upgrade to 18.3.0 by redirecting our application to the 18.3.0 instance.  Since Goldengate is set up as multi-master, after switching the apps to the 18.3.0 instance, all updates will be replicated back to the 12.1.0.2 instance.  Once everyone is content that the 18.3 instance is running properly without problems, we will remove the Goldengate replication and remove the 12.1 database.

Install Oracle Goldengate on each server. In our case, we chose the Goldengate version appropriate to the Oracle version, used /u01/gghome for the Goldengate home directory, and selected the option to start the manager after installation. Choosing the option to start the manager also means that the working directories will be created.

On each server, the manager parameters are as follows:

PORT 7809
AUTOSTART ER *

The GLOBALS file has this line:

CHECKPOINTTABLE gguser.checkpoint

Next, we configured both database instances for Goldengate. First we created the Goldengate tablespace. Then we created the Goldengate users and granted them the required privileges on the database:

SQL> create tablespace golden datafile '/u01/app/oradata/ORA121T/golden01.dbf' size 1024m autoextend on;

SQL> create user gguser identified by gguser;

SQL> grant dba, create session, select any table, unlimited tablespace, select any dictionary to gguser;

SQL> create user ggadmin identified by ggadmin;

SQL> grant dba, create session, select any table, unlimited tablespace, select any dictionary to ggadmin;

SQL> exec dbms_goldengate_auth.grant_admin_privilege('GGUSER');

SQL> exec dbms_goldengate_auth.grant_admin_privilege('GGADMIN');

SQL> alter user gguser default tablespace golden;

The GGUSER user will be used to run the replication process itself.  The GGADMIN user will be used to run administrative commands that do not get replicated between the two systems. For example, should sequences get out of synch, manual intervention may be required to make sure the sequence is at the right value in each instance.

In order for Goldengate to work properly, supplemental log data must be added to the database, force logging must be turned on, and flashback should be turned on.  In order to turn on flashback database, run these commands:

SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database flashback on;
SQL> alter database archivelog;  -- Only if necessary
SQL> alter database open;
SQL> alter system set enable_goldengate_replication=TRUE scope=both;

Next, to turn on supplemental log data and force logging:

SQL> alter database add supplemental log data;
SQL> alter database force logging;
SQL> alter system switch logfile;

Note that the commands above should be run on both the source and destination databases.
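
A quick way to confirm the settings took effect on each database (these are standard v$database columns):

SQL> select log_mode, force_logging, supplemental_log_data_min, flashback_on from v$database;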

Next we will set up our tnsnames.ora file.  While the entry names are the same on each server, the entries themselves are different.  Below are the tns entries for gg121too:

ORA121T =
  (DESCRIPTION =
    (SDU = 2097152)
    (ADDRESS =
      (PROTOCOL = BEQ)
      (PROGRAM = /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle)
      (ARGV0 = ora121t)
      (ARGS='(DESCRIPTION=(SDU=2097152)(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))')
      (ENVS='ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1,ORACLE_SID=ora121t')
    )
    (CONNECT_DATA =
      (SID = ora121t)
    )
  )

ORA183T =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.28)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora183t)
    )
  )

On gg183too, below are the TNS entries:

ORA121T =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.8)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora121t)
    )
  )

ORA183T =
  (DESCRIPTION =
    (SDU = 2097152)
    (ADDRESS =
      (PROTOCOL = BEQ)
      (PROGRAM = /u01/app/oracle/product/18.3.0/dbhome_1/bin/oracle)
       (ARGV0 = ora183t)
      (ARGS='(DESCRIPTION=(SDU=2097152)(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))')
      (ENVS='ORACLE_HOME=/u01/app/oracle/product/18.3.0/dbhome_1,ORACLE_SID=ora183t')
    )
    (CONNECT_DATA =
      (SID = ora183t)
    )
  )

Note that these entries are optimized to ensure the best performance for the local connection.

Next, connect to Goldengate and configure the credentials. 

On gg121too:

GGSCI (gg121too.localdomain)1>add credentialstore
GGSCI (gg121too.localdomain)2> alter credentialstore add user gguser@ora121t, password gguser alias ggulocal
GGSCI (gg121too.localdomain)3>alter credentialstore add user ggadmin@ora121t, password ggadmin alias ggalocal
GGSCI (gg121too.localdomain)4>alter credentialstore add user gguser@ora183t, password gguser alias gguremote
GGSCI (gg121too.localdomain)5>alter credentialstore add user ggadmin@ora183t, password ggadmin alias ggaremote

Note what we are doing here.  The intent is that the local credentials always have the word 'local' in the alias, and the remote system credentials always have the word 'remote' in the alias.  Thus on gg183too, the corresponding commands are below:

GGSCI (gg183too.localdomain)1>add credentialstore
GGSCI (gg183too.localdomain)2> alter credentialstore add user gguser@ora183t, password gguser alias ggulocal
GGSCI (gg183too.localdomain)3>alter credentialstore add user ggadmin@ora183t, password ggadmin alias ggalocal
GGSCI (gg183too.localdomain)4>alter credentialstore add user gguser@ora121t, password gguser alias gguremote
GGSCI (gg183too.localdomain)5>alter credentialstore add user ggadmin@ora121t, password ggadmin alias ggaremote
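
If you want to verify the aliases on either server before proceeding, GGSCI can list the credential store (optional):

GGSCI> info credentialstore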

We will be copying the hr schema from the 12c to the 18c instance. Most often multiple schemas will need to be replicated, but we are keeping this to a simple case.

Next run these commands:

GGSCI (gg121too.localdomain)1> dblogin useridalias ggulocal
GGSCI (gg121too as gguser@ora121t) 2> add checkpointtable gguser.checkpoint
GGSCI (gg121too as gguser@ora121t) 6> add trandata hr.* cols *

At this point, Goldengate is configured on the source database.  Now create the checkpointtable on the destination database:

GGSCI (gg121too.localdomain)1> dblogin useridalias gguremote
GGSCI (gg121too as gguser@ora183t) 2> add checkpointtable gguser.checkpoint

Now we are ready to begin the configuration of the replication.

Connect back to the local instance and create the extract process:

GGSCI (gg121too) 4> dblogin useridalias ggulocal

GGSCI (gg121.localdomain as gguser@ora121t) 9> register extract capture1,  database

2019-08-08 17:35:51  INFO    OGG-02003  Extract CAPTURE1 successfully registered with database at SCN 1804072.  <== Make note of this SCN; it will be used later.

GGSCI (gg121.localdomain as gguser@ora121t) 10> add extract capture1, integrated tranlog, begin now
EXTRACT (Integrated) added.

GGSCI (gg121.localdomain as gguser@ora121t) 11> add rmttrail ./dirdat/rt, extract capture1
RMTTRAIL added.

Create the parameter file for the capture, and add these lines (edit params CAPTURE1):

extract capture1
useridalias ggulocal
rmthost 10.12.1.28, mgrport 7809
tranlogOptions IntegratedParams (max_sga_size 256), excludeuser ggadmin
DDL include mapped
rmttrail ./dirdat/rt
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
table hr.*;

Next, start the extract and monitor the file ggserr.log.  Fix any errors that occur:

GGSCI (gg121.localdomain as gguser@ora121t) 10> start extract capture1

Next, export the data in the schemas that are being replicated. In this case, we are using the HR schema.  The SCN from above is used as the flashback scn for the datapump export:

expdp directory=homedir dumpfile=hrscn.dmp flashback_scn=1804072 schemas=HR

Export: Release 12.1.0.2.0 - Production on Thu Aug 8 17:39:46 2019

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Username: system/password

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
FLASHBACK automatically enabled to preserve database integrity.
Starting "SYSTEM"."SYS_EXPORT_SCHEMA_01":  system/******** directory=homedir dumpfile=hrscn.dmp flashback_scn=1804071 schemas=HR
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 512 KB
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
. . exported "HR"."COUNTRIES"                            6.906 KB      25 rows
. . exported "HR"."DEPARTMENTS"                          7.570 KB      27 rows
. . exported "HR"."EMPLOYEES"                            17.53 KB     107 rows
. . exported "HR"."JOBS"                                 7.554 KB      19 rows
. . exported "HR"."JOB_HISTORY"                          7.640 KB      10 rows
. . exported "HR"."LOCATIONS"                            8.882 KB      23 rows
. . exported "HR"."REGIONS"                              5.992 KB       4 rows
. . exported "HR"."TEST"                                 19.72 KB     123 rows
Master table "SYSTEM"."SYS_EXPORT_SCHEMA_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYSTEM.SYS_EXPORT_SCHEMA_01 is:
  /home/oracle/hrscn.dmp
Job "SYSTEM"."SYS_EXPORT_SCHEMA_01" successfully completed at Thu Aug 8 17:40:35 2019 elapsed 0 00:00:33

Next, copy the file hrscn.dmp to the destination system and import it into the database:

[oracle@gg183too gghome]$ impdp system/password directory=homedir dumpfile=hrscn.dmp

Import: Release 18.0.0.0.0 - Production on Thu Aug 8 16:55:49 2019
Version 18.7.0.0.0

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_FULL_01":  system/******** directory=homedir dumpfile=hrscn.dmp
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
. . imported "HR"."COUNTRIES"                            6.906 KB      25 rows
. . imported "HR"."DEPARTMENTS"                          7.570 KB      27 rows
. . imported "HR"."EMPLOYEES"                            17.53 KB     107 rows
. . imported "HR"."JOBS"                                 7.554 KB      19 rows
. . imported "HR"."JOB_HISTORY"                          7.640 KB      10 rows
. . imported "HR"."LOCATIONS"                            8.882 KB      23 rows
. . imported "HR"."REGIONS"                              5.992 KB       4 rows
. . imported "HR"."TEST"                                 19.72 KB     123 rows
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
Job "SYSTEM"."SYS_IMPORT_FULL_01" successfully completed at Thu Aug 8 16:56:42 2019 elapsed 0 00:00:44

Now create and start the replicat process.  Connect to ggsci on the destination system and create the process.

GGSCI (gg183too) 8> edit params apply1

replicat apply1
DBOPTIONS INTEGRATEDPARAMS(parallelism 2)
assumetargetdefs
useridalias ggulocal
handlecollisions
discardfile ./dirrpt/hrerr.dsc, purge
ddl include all
map hr.*, target hr.*;

GGSCI (gg183too) 2> add replicat apply1, integrated, exttrail ./dirdat/rt

GGSCI (gg183too) 2> start replicat apply1

Once again, monitor the file ggserr.log for any errors and fix.  At this point, one way replication from the 12.1 source to the 18.3 target should be working.  Run some updates to verify that everything is working.
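
For example, a simple test using the standard HR sample tables (a sketch; any replicated table will do):

-- On ora121t:
SQL> insert into hr.regions (region_id, region_name) values (5, 'Oceania');
SQL> commit;
-- On ora183t, the new row should appear within a few seconds:
SQL> select region_name from hr.regions where region_id = 5;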

Next, we configure the reverse replication in case we need it.  This is basically identical to the initial replication; the only difference is that we don't need to do an export/import.

GGSCI (gg183too) 2> dblogin useridalias ggulocal
GGSCI (gg183too as gguser@ora183t) 2> register extract capture2, database
GGSCI (gg183too as gguser@ora183t) 3> add extract capture2, integrated tranlog, begin now
GGSCI (gg183too as gguser@ora183t) 9> add rmttrail ./dirdat/xt, extract capture2

GGSCI (gg183too as gguser@ora183t) 3> edit params CAPTURE2

extract capture2
useridalias ggulocal
rmthost 10.12.1.8, mgrport 7809
tranlogOptions IntegratedParams (max_sga_size 256), excludeuser ggadmin
DDL include mapped
rmttrail ./dirdat/xt
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
table hr.*;

GGSCI (gg183too as gguser@ora183t) 3> start extract capture2

Monitor for errors in ggserr.log.

Then, go back to the source instance and configure the replicat process.

GGSCI (gg121too) 7> add replicat apply2 integrated exttrail ./dirdat/xt

GGSCI (gg121too) 7> edit params apply2

replicat apply2
  DBOPTIONS INTEGRATEDPARAMS(parallelism 2)
  AssumeTargetDefs
  DiscardFile ./dirrpt/rpdw.dsc, purge
  USERIDALIAS ggulocal
  DDLERROR DEFAULT DISCARD IGNOREMISSINGOBJECTS
  REPERROR (DEFAULT, EXCEPTION)
  MAP HR.*, target HR.*;

GGSCI (gg121too) 8> start replicat apply2

Once again, monitor the ggserr.log file for problems.  At this point, everything should be replicating.  Run an update to verify that changes are copying in both directions.

Multi-master replication is now configured and running between an Oracle 12.1 instance and an Oracle 18.3 instance.

Implementing the GI Standalone Agent for Oracle Goldengate

August 6, 2019

In my previous post, we discussed configuring multi-master replication in a clustered environment.  In this post, we will discuss configuring and using the standalone agent to manage the Goldengate processes using a virtual IP (VIP).

The first step is to remove the current agent configuration. The command to do this is:

ggclust11$ agctl remove goldengate GG_SOURCE

ggclust21$ agctl remove goldengate GG_TARGET

We will create a virtual IP that the Goldengate processes will use. The pump processes will be modified to connect to this VIP to transmit data to the replicat process.

In order to create the VIP, we first must determine the correct network.  It must be on the same network as the Virtual IPs and SCAN IPs for the database. The command below will show the correct network to use, though most likely as a DBA you would already know this:

crsctl status resource -p -attr NAME,USR_ORA_SUBNET -w "TYPE = ora.network.type" |sort | uniq

NAME=ora.net1.network
USR_ORA_SUBNET=10.12.1.0

For this process, we have chosen 10.12.1.201 for the VIP for the ggclust11/12 cluster.  The name of this VIP is ggclust1-vip.

The VIP for the ggclust21/22 cluster is ggclust2-vip, 10.12.1.202.
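
If the VIP names are not already resolvable in DNS (dnsmasq in this environment), entries along these lines can be added to /etc/hosts on all four nodes:

10.12.1.201 ggclust1-vip.localdomain ggclust1-vip
10.12.1.202 ggclust2-vip.localdomain ggclust2-vip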

In order to configure a VIP, the agctl command must be run as root.  In our previous example, the commands were run as the oracle user. The commands below were run to configure the VIPs on the two clusters:

ggclust11 # cd /u01/app/oracle/xag/bin

ggclust11 # ./agctl add goldengate GG_SOURCE \
--gg_home /u02/gghome \
--oracle_home /u01/app/oracle/product/19.3.0/dbhome_1 \
--db_services ora.demogg1.oggserv.svc --use_local_services \
--monitor_extracts hrext1,hrpump --monitor_replicats hrrep2 --filesystems ora.acfs.ggvol.acfs \
--network 1 \
--ip 10.12.1.201 --user oracle --group oinstall \
--nodes ggclust11,ggclust12

ggclust21 # cd /u01/app/oracle/xag/bin

ggclust21 # ./agctl add goldengate GG_TARGET \
--gg_home /u02/gghome \
 --oracle_home /u01/app/oracle/product/19.3.0/dbhome_1 \
 --db_services ora.demogg2.oggserv.svc --use_local_services \
 --network 1 \
 --ip 10.12.1.202 \
 --monitor_extracts hr2ext1,hrpump2 --monitor_replicats hrrep1 --filesystems ora.acfs.ggdata.acfs \
 --user oracle --group oinstall \
 --nodes ggclust21,ggclust22

Next modify the parameter file for each pump process (hrpump and hrpump2) so that the rmthost line points to the newly created VIP.

For hrpump:

rmthost ggclust2-vip, mgrport 7809

For hrpump2:

rmthost ggclust1-vip, mgrport 7809

Finally, start the Goldengate replication on both clusters:

ggclust11$ agctl start goldengate GG_SOURCE

ggclust21$ agctl start goldengate GG_TARGET

At this point, if you want to relocate the replication, you can use the ‘agctl relocate goldengate’ command, for example:

agctl relocate goldengate GG_SOURCE --node ggclust12

Note that since you are moving the VIP each time you do this, the pump processes will abend.  You will either need to restart the pump processes manually, or configure them for autorestart. (AUTORESTART EXTRACT HRPUMP in the file mgr.prm).
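
As a sketch, the manager parameter file with autorestart might look like this (HRPUMP2 on the other cluster; the retry counts are arbitrary choices, not values from this build):

PORT 7809
AUTOSTART ER *
AUTORESTART EXTRACT HRPUMP, RETRIES 5, WAITMINUTES 2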

To stop the processes, run the stop command:

agctl stop goldengate GG_SOURCE

To check the status of the Goldengate processes:

agctl status goldengate GG_SOURCE

At this point you have configured the most basic high availability for Goldengate in a cluster.  You will still need to add monitoring scripts and operational procedures for when Goldengate runs into problems.

Goldengate Multi-Master Replication on Oracle 19

August 6, 2019

In this article we are going to configure Oracle Goldengate for multi-master replication in a RAC environment.  Note that this is a complex environment, and your environment will have different problems and issues.  Here is my environment:

We have two 2-node RAC databases:

Cluster 1 has nodes ggclust11 and ggclust12.  Cluster 1 runs cluster database demogg1, with instances demogg11 and demogg12.

Cluster 2 has nodes ggclust21 and ggclust22 and runs cluster database demogg2, with instances demogg21 and demogg22.

We are going to use the HR schema from the sample schemas provided by Oracle for our replication. I have added a column to each table, last_updt_dt_tm of data type timestamp, in order to allow us to effectively run multi-master replication. I also created a trigger on each table that fills in the last_updt_dt_tm column on update or insert.
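
As a sketch for one table, the column and trigger looked something like the following (the trigger name here is illustrative; the same pattern was applied to every replicated table):

alter table hr.employees add (last_updt_dt_tm timestamp);

create or replace trigger hr.employees_updt_trg
before insert or update on hr.employees
for each row
begin
  -- stamp the row so USEMAX conflict resolution has a comparison column
  :new.last_updt_dt_tm := systimestamp;
end;
/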

See my previous blog (https://dbakerber.wordpress.com/2019/07/31/oracle-software-installation-bugs/) for a couple of fun bugs I ran into while building the cluster and creating the databases.

On both databases the following commands were run as part of the initial setup:

Alter database archivelog;
Alter database flashback on;
Alter database add supplemental log data;
Alter database force logging;

I have configured dnsmasq for names resolution so that the DNS works as it would in an enterprise level environment. Below is the hosts file used for the two clusters:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.12.1.190 ggclust1-scan.localdomain ggclust1-scan 
10.12.1.191 ggclust1-scan.localdomain ggclust1-scan 
10.12.1.192 ggclust1-scan.localdomain ggclust1-scan 
10.12.1.193 ggclust11-vip.localdomain ggclust11-vip 
10.12.1.200 ggclust12-vip.localdomain ggclust12-vip 
10.12.1.195 ggclust2-scan.localdomain ggclust2-scan 
10.12.1.196 ggclust2-scan.localdomain ggclust2-scan 
10.12.1.197 ggclust2-scan.localdomain ggclust2-scan 
10.12.1.198 ggclust21-vip.localdomain ggclust21-vip 
10.12.1.199 ggclust22-vip.localdomain ggclust22-vip
10.12.1.238 ggclust11.localdomain ggclust11 
10.12.1.46 ggclust12.localdomain ggclust12 
10.12.1.37 ggclust21.localdomain ggclust21 
10.12.1.15 ggclust22.localdomain ggclust22
192.168.31.111 ggclust11-priv.localdomain ggclust11-priv 
192.168.31.112 ggclust12-priv.localdomain ggclust12-priv 
192.168.31.121 ggclust21-priv.localdomain ggclust21-priv 
192.168.31.122 ggclust22-priv.localdomain ggclust22-priv

After creating the cluster, the next step is to configure a service for Goldengate processes to connect to from the remote host. Since you typically only want Goldengate running on one node, this service is configured to be only available on one node at a time:
srvctl add service -db demogg1 -service oggserv -preferred demogg11 -available demogg12 
srvctl add service -db demogg2 -service oggserv -preferred demogg21 -available demogg22
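
The service then needs to be started (and can be checked) on each cluster; for example, on cluster 1:

srvctl start service -db demogg1 -service oggserv
srvctl status service -db demogg1 -service oggserv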

Next, set up tnsnames entries for the Goldengate processes.  Samples are below; the local entry is the more complex one, configured to optimize network performance.  The local entry was different for each server:

DEMOGG1 = 
(DESCRIPTION = 
(SDU=2097152) 
(ADDRESS = 
(PROTOCOL = BEQ) 
(PROGRAM = /u01/app/oracle/product/19.3.0/dbhome_1/bin/oracle) 
(ARGV0=demogg11) 
(ARGS='(DESCRIPTION=(SDU=2097152)(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))') 
(ENVS='ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1,ORACLE_SID=demogg12') 
) 
(CONNECT_DATA=(SID=demogg12)) 
)
DEMOGG2 = 
(DESCRIPTION = 
(SDU=2097152) 
(ADDRESS = (PROTOCOL = TCP)(HOST = ggclust2-scan)(PORT = 1521)) 
(CONNECT_DATA = 
(SERVER = DEDICATED) 
(SERVICE_NAME = oggserv) 
) 
) 

Next, I configured an ACFS share for the important data directories for Oracle Goldengate.  This was configured in ASMCA, as shown below:

[Two ASMCA screenshots in the original post show the ACFS volume and file system being created for /u02/ggdata.]

Next, Oracle Goldengate was installed on all 4 nodes.  Goldengate does not have a cluster installation option, and the home directory was set to /u02/gghome.  The ‘start manager’ option was chosen, then the following commands were run to put the key directories on the shared file system (/u02/ggdata):

cd /u02/gghome
rmdir dirchk 
rmdir dirprm 
rmdir dirdat 
rmdir dirtmp 
rmdir dircrd 
rmdir dirrpt 
cd /u02/ggdata
mkdir dirchk 
mkdir dirprm 
mkdir dirdat 
mkdir dirtmp 
mkdir dircrd 
mkdir dirrpt 
mkdir BR
cd /u02/gghome
ln -s /u02/ggdata/dirprm /u02/gghome/dirprm 
ln -s /u02/ggdata/dirchk /u02/gghome/dirchk 
ln -s /u02/ggdata/dirdat /u02/gghome/dirdat 
ln -s /u02/ggdata/dircrd /u02/gghome/dircrd 
ln -s /u02/ggdata/dirrpt /u02/gghome/dirrpt 

Note that you may want to have different parameter files based on the cluster node that is running the Goldengate process, in which case you would not put dirprm on shared storage.

At this point, the preparation is complete.  Now to configure the replication.  The schema we are using for this study is HR, password tiger.  The Goldengate user for the replication process is gguser.  We will also create a ggadmin user whose changes will not be replicated.

Next, create the individual users on the database:

1. Create the tablespace for goldengate, in our case golden:

create tablespace golden datafile '+DATA' size 1024m autoextend on;

2. Create the users.

create user gguser identified by gguser;
create user ggadmin identified by ggadmin;

3. Grant required privileges.

exec dbms_goldengate_auth.grant_admin_privilege('GGUSER');
exec dbms_goldengate_auth.grant_admin_privilege('GGADMIN');
grant connect, resource, alter any table to gguser, ggadmin;
@marker_setup.sql
@ddl_setup.sql
@role_setup.sql
@ddl_enable.sql
@sequence

Next, we configured the credentials we will be using.  All credentials are configured the same so that there is never any question which db to connect to.  The ggulocal useridalias is to the local database, and the gguremote user is always to the remote database:

add credentialstore
alter credentialstore add user gguser@demogg1, password gguser alias ggulocal 
alter credentialstore add user ggadmin@demogg1, password ggadmin alias ggalocal 
alter credentialstore add user gguser@demogg2, password gguser alias gguremote 
alter credentialstore add user ggadmin@demogg2, password ggadmin alias ggaremote 

Next, add the transaction data and create the checkpointtable.  Do this on both databases since we are going to do multi-master replication.

GGSCI (ggclust11.localdomain) dblogin useridalias ggulocal
Successfully logged into database.
GGSCI (ggclust11.localdomain as gguser@demogg11) 2> add trandata hr.* cols *

GGSCI (ggclust11.localdomain as gguser@demogg11) 3> add checkpointtable gguser.checkpoint

Next, ensure that the manager and GLOBALS are configured properly.  These files are the same on both clusters:
edit params mgr


PORT 7809
Autostart ER *

Edit the GLOBALS file and configure it as follows:

checkpointtable gguser.checkpoint 
GGSCHEMA gguser 
ALLOWOUTPUTDIR /u02/ggdata 
ALLOWOUTPUTDIR /u02/gghome 

Exit ggsci and re-enter, then restart the mgr process.
Now we are ready to begin the actual building of the replication. On ggclust11 we will build the initial extract and pump processes.

First the extract. Edit the parameter file for the extract and add these lines (edit params hrext1):

EXTRACT hrext1 
setenv (ORACLE_SID='demogg1')  <== Note that this is actually just the entry in the oratab. 
useridalias ggulocal 
tranlogOptions IntegratedParams (max_sga_size 256), excludeuser ggadmin, excludetag 00 
CACHEMGR CACHEDIRECTORY /u02/ggdata/dirtmp 
BR BRDIR /u02/ggdata/BR 
DDL include all 
Exttrail /u02/ggdata/dirdat/hr 
LOGALLSUPCOLS 
UPDATERECORDFORMAT COMPACT 
table hr.*;

As we noted earlier, we will be using the ggadmin user for maintenance, which is why we have the 'excludeuser ggadmin' entry. All DML and DDL done by the ggadmin user will be ignored by Goldengate. The excludetag will ensure that we don't run into problems with ping-pong updates. Note that the setting we are using for excludetag is the default.

GGSCI (ggclust11.localdomain) Dblogin useridalias ggulocal 
GGSCI (ggclust11.localdomain) register extract hrext1 database # Note that this command will return an SCN, e.g. 947747283. Take note of this SCN for later use. 
GGSCI (ggclust11.localdomain) add extract hrext1 integrated tranlog, begin now 
GGSCI (ggclust11.localdomain) add exttrail /u02/ggdata/dirdat/hr, extract hrext1

Next we create the pump process to transmit the data.

Edit params hrpump:

extract hrpump 
useridalias gguremote 
CACHEMGR CACHEDIRECTORY /u02/ggdata/dirtmp 
ddl include all 
rmthost ggclust21-vip, mgrport 7809 
rmttrail /u02/ggdata/dirdat/ph 
table hr.*;
GGSCI (ggclust11.localdomain) 1> add extract hrpump, exttrailsource /u02/ggdata/dirdat/hr

GGSCI (ggclust11.localdomain) 1> add rmttrail /u02/ggdata/dirdat/ph, extract hrpump


At this point the extract and the pump process have been created. Now, run the datapump export using the scn from above:

ggclust11$ expdp system/password directory=dumpdir schemas=hr parallel=4 dumpfile=hrdump%u.dmp flashback_scn=947747283

Then, log into the target system, ggclust21 in our example, and run the import:

ggclust21$ impdp system/password directory=dumpdir parallel=4 dumpfile=hrdump%u.dmp

Now, go back to the source system, ggclust11, and start the extracts:

GGSCI (ggclust11.localdomain) start extract hrext1
GGSCI (ggclust11.localdomain) start extract hrpump

Ensure that you monitor the file ggserr.log for errors. It is not uncommon to have simple typos that need to be fixed.
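
Two easy ways to keep an eye on things while the processes start (paths assume the Goldengate home used in this build):

[oracle@ggclust11 gghome]$ tail -f /u02/gghome/ggserr.log
GGSCI (ggclust11.localdomain) > info all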

Next connect to ggclust21 and create the replicat process.  Replicat is the term Oracle Goldengate uses for the process that writes the data to the target system.  Note the resolveconflict section. Since we plan to set up multi-master replication, basic conflict resolution is required.  The setup below will handle the simple problems.  There will be conflicts that require manual resolution in any true bi-directional environment.

GGSCI (ggclust21.localdomain) edit params hrrep1
replicat hrrep1
SETENV(ORACLE_SID='demogg2')
DBOPTIONS INTEGRATEDPARAMS(parallelism 2), settag 00
assumetargetdefs
useridalias ggulocal
handlecollisions
discardfile /u02/ggdata/dirrpt/hrerr.dsc, purge
ddl include all
map hr.*, target hr.*
RESOLVECONFLICT (UPDATEROWEXISTS, (DEFAULT, USEMAX (last_updt_dt_tm))),
RESOLVECONFLICT (DELETEROWEXISTS, (DEFAULT, OVERWRITE)),
RESOLVECONFLICT (INSERTROWEXISTS, (DEFAULT, USEMAX (last_updt_dt_tm))),
RESOLVECONFLICT (UPDATEROWMISSING, (DEFAULT, OVERWRITE)),
RESOLVECONFLICT (DELETEROWMISSING, (DEFAULT, DISCARD)),
MAPEXCEPTION
(TARGET ggadmin.EXCEPTIONS, EXCEPTIONSONLY, INSERTALLRECORDS, COLMAP
(
excp_date = @DATENOW(),
rep_name = @GETENV ('GGENVIRONMENT', 'GROUPNAME'),
table_name = @GETENV ('GGHEADER', 'TABLENAME'),
errno = @GETENV ('LASTERR', 'DBERRNUM'),
errtype = @GETENV ('LASTERR', 'ERRTYPE'),
optype = @GETENV ('LASTERR', 'OPTYPE'),
transind = @GETENV ('GGHEADER', 'TRANSACTIONINDICATOR'),
transimgind = @GETENV ('GGHEADER', 'BEFOREAFTERINDICATOR'),
committimestamp = @GETENV ('GGHEADER', 'COMMITTIMESTAMP'),
reccsn = @GETENV ('TRANSACTION', 'CSN'),
recseqno = @GETENV ('RECORD', 'FILESEQNO'),
recrba = @GETENV ('RECORD', 'FILERBA'),
rectranspos = @GETENV ('RECORD', 'RSN'),
reclength = @GETENV ('GGHEADER', 'RECORDLENGTH'),
logrba = @GETENV ('GGHEADER', 'LOGRBA'),
logposition = @GETENV ('GGHEADER', 'LOGPOSITION'),
grouptype = @GETENV ('GGENVIRONMENT', 'GROUPTYPE'),
filename = @GETENV ('GGFILEHEADER', 'FILENAME'),
fileno = @GETENV ('GGFILEHEADER', 'FILESEQNO'),
srcrowid = @GETENV ('TRANSACTION', 'CSN'),
srcdbcharset = @GETENV ('GGFILEHEADER', 'DBCHARSET'),
replag = @GETENV ('LAG', 'SEC'),
cnt_cdr_conflicts = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_CONFLICTS'), cnt_cdr_resolutions = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_RESOLUTIONS_SUCCEEDED'),
cnt_cdr_failed = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_RESOLUTIONS_FAILED')
)
);

Now let's get the replicat going:

GGSCI (ggclust21.localdomain)dblogin useridalias ggulocal
GGSCI (ggclust21.localdomain ggrep21) add replicat hrrep1 integrated exttrail /u02/ggdata/dirdat/ph # note that this exttrail corresponds to the remotetrail on the pump extract
GGSCI (ggclust21.localdomain ggrep21) register replicat hrrep1 database
GGSCI (ggclust21.localdomain ggrep21) start replicat hrrep1

Once again, monitor the file ggserr.log and correct any problems that appear.

Once you have verified that the extracts and replicat are running properly, insert a row into the source schema and verify that it gets replicated to the target. Once the row replicates, you know that the processes are working properly.
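
GGSCI can also confirm that operations are flowing; for example:

GGSCI (ggclust21.localdomain) > stats replicat hrrep1, latest
GGSCI (ggclust11.localdomain) > stats extract hrext1, latest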

Now, it's time to set up the replication in the other direction. The setup is identical, with the exception that we do not need to run a datapump export/import, since the initial source of the data was our source database.  Below are the commands to set up the extract and pump processes.  These are set up on the target database this time:

GGSCI (ggclust21.localdomain ggrep21)  register extract hr2ext1 database;
GGSCI (ggclust21.localdomain ggrep21) add extract hr2ext1, integrated tranlog, begin now
GGSCI (ggclust21.localdomain ggrep21) add exttrail /u02/ggdata/dirdat/xr, extract hr2ext1
EXTRACT hr2ext1
setenv (ORACLE_SID='demogg2')
useridalias ggulocal
tranlogOptions IntegratedParams (max_sga_size 256), excludeuser ggadmin, excludetag 00
CACHEMGR CACHEDIRECTORY /u02/ggdata/dirtmp
BR BRDIR /u02/ggdata/BR
DDL include all
Exttrail /u02/ggdata/dirdat/xr
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
table hr.*;

extract hrpump2
useridalias gguremote
ddl include all
rmthost ggclust11-vip, mgrport 7809
rmttrail /u02/ggdata/dirdat/xh
table hr.*;

GGSCI (ggclust21.localdomain ggrep21) add extract hrpump2, exttrailsource /u02/ggdata/dirdat/xr
GGSCI (ggclust21.localdomain ggrep21) add rmttrail /u02/ggdata/dirdat/xh, extract hrpump2
GGSCI (ggclust21.localdomain ggrep21) start extract hr2ext1
GGSCI (ggclust21.localdomain ggrep21) start extract hrpump2

Now monitor the ggserr.log for errors on the extract process.  If there are no errors, return to the source system (ggclust11) and build the replicat process:

GGSCI (ggclust11.localdomain ggrep1) register replicat hrrep2 database
GGSCI (ggclust11.localdomain ggrep1) add replicat hrrep2 integrated exttrail /u02/ggdata/dirdat/xh
replicat hrrep2
 SETENV(ORACLE_SID='demogg1')
 DBOPTIONS INTEGRATEDPARAMS(parallelism 2)
 AssumeTargetDefs
 DiscardFile /u02/ggdata/dirrpt/rpdw.dsc, purge
 USERIDALIAS ggulocal
 DDLERROR DEFAULT DISCARD IGNOREMISSINGOBJECTS
 REPERROR (DEFAULT, EXCEPTION)
 MAP HR.*, target HR.*
 RESOLVECONFLICT (UPDATEROWEXISTS, (DEFAULT, USEMAX (last_updt_dt_tm))),
 RESOLVECONFLICT (DELETEROWEXISTS, (DEFAULT, OVERWRITE)),
 RESOLVECONFLICT (INSERTROWEXISTS, (DEFAULT, USEMAX (last_updt_dt_tm))),
 RESOLVECONFLICT (UPDATEROWMISSING, (DEFAULT, OVERWRITE)),
 RESOLVECONFLICT (DELETEROWMISSING, (DEFAULT, DISCARD)),
 MAPEXCEPTION
 (TARGET ggadmin.EXCEPTIONS, EXCEPTIONSONLY, INSERTALLRECORDS, COLMAP
 (
 excp_date = @DATENOW(),
 rep_name = @GETENV ('GGENVIRONMENT', 'GROUPNAME'),
 table_name = @GETENV ('GGHEADER', 'TABLENAME'),
 errno = @GETENV ('LASTERR', 'DBERRNUM'),
 errtype = @GETENV ('LASTERR', 'ERRTYPE'),
 optype = @GETENV ('LASTERR', 'OPTYPE'),
 transind = @GETENV ('GGHEADER', 'TRANSACTIONINDICATOR'),
 transimgind = @GETENV ('GGHEADER', 'BEFOREAFTERINDICATOR'),
 committimestamp = @GETENV ('GGHEADER', 'COMMITTIMESTAMP'),
 reccsn = @GETENV ('TRANSACTION', 'CSN'),
 recseqno = @GETENV ('RECORD', 'FILESEQNO'),
 recrba = @GETENV ('RECORD', 'FILERBA'),
 rectranspos = @GETENV ('RECORD', 'RSN'),
 reclength = @GETENV ('GGHEADER', 'RECORDLENGTH'),
 logrba = @GETENV ('GGHEADER', 'LOGRBA'),
 logposition = @GETENV ('GGHEADER', 'LOGPOSITION'),
 grouptype = @GETENV ('GGENVIRONMENT', 'GROUPTYPE'),
 filename = @GETENV ('GGFILEHEADER', 'FILENAME'),
 fileno = @GETENV ('GGFILEHEADER', 'FILESEQNO'),
 srcrowid = @GETENV ('TRANSACTION', 'CSN'),
 srcdbcharset = @GETENV ('GGFILEHEADER', 'DBCHARSET'),
 replag = @GETENV ('LAG', 'SEC'),
 cnt_cdr_conflicts = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_CONFLICTS'), cnt_cdr_resolutions = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_RESOLUTIONS_SUCCEEDED'),
 cnt_cdr_failed = @GETENV ('DELTASTATS','TABLE', @GETENV ('GGHEADER', 'TABLENAME'),'CDR_RESOLUTIONS_FAILED')
 )
 );
GGSCI (ggclust11.localdomain ggrep1) start replicat hrrep2

At this point, the bi-directional (multi-master) replication should be working properly.  Insert a row into each database and verify that it gets copied.  Then delete it and make sure it is deleted.

Some notes:

The exttrail for the replicat is always the same as the remotetrail for the pump process. This is /u02/ggdata/dirdat/ph on demogg1 (for hrpump/hrext1).

The exttrailsource for the pump process is the same as the exttrail for the extract process. This is /u02/ggdata/dirdat/hr on demogg1 (from hrext1).

The last step in this process is to install and configure the Grid Infrastructure standalone agent.  My experimentation with the agent shows a decent amount of potential, but substantial customization in the form of script writing is going to be required to make it really useful.  This will probably be the subject of another blog post, but for this post I will just cover the installation and configuration.

This agent is used to manage the Goldengate processes. The agent is downloaded from here: http://www.oracle.com/technetwork/database/database-technologies/clusterware/downloads/xag-agents-downloads-3636484.html

After downloading the agent, install it in a directory that is outside both the database and GI homes. I chose /u01/oracle/xag. Just unzip the file and run this command to install the agent on both nodes of each cluster. Run the command once per cluster; the setup installs the agent on both cluster nodes:

$./xagsetup.sh --install --directory /u01/oracle/xag --all_nodes

Now with the agent installed, we put Goldengate under the control of the agent. The first step of this process is to put the standalone agent bin directory in your PATH ahead of all other Oracle software, to ensure that the agctl commands run against the correct agent. Then we put Goldengate under agent control (ggclust11):
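
For example, in the oracle user's profile on each node:

export PATH=/u01/oracle/xag/bin:$PATH
which agctl    # should return /u01/oracle/xag/bin/agctl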

./agctl add goldengate GG_SOURCE \
--gg_home /u02/gghome \
--oracle_home /u01/app/oracle/product/19.3.0/dbhome_1 \
--db_services ora.demogg1.oggserv.svc --use_local_services \
--monitor_extracts hrext1,hrpump --monitor_replicats hrrep2 \
--filesystems ora.acfs.ggvol.acfs \
--user oracle --group oinstall

The command for ggclust21:

./agctl add goldengate GG_TARGET \
 --gg_home /u02/gghome \
  --oracle_home /u01/app/oracle/product/19.3.0/dbhome_1 \
  --db_services ora.demogg2.oggserv.svc --use_local_services \
  --monitor_extracts hr2ext1,hrpump2 --monitor_replicats hrrep1 --filesystems ora.acfs.ggdata.acfs \
  --user oracle --group oinstall

As the last step in this process, add this line to the GLOBALS file on all 4 nodes, then stop and restart all the processes:

XAGENABLE

With the Goldengate processes managed by the agent you should see this:

GGSCI (ggclust21.localdomain) 2> info all

Program     Status      Group     Lag at Chkpt    Time Since Chkpt    XAG

MANAGER     RUNNING                                                   MANAGED /xag.GG_TARGET.goldengate/ggclust21
EXTRACT     RUNNING     HREXT1    00:00:00        00:00:02            NON-CRITICAL
EXTRACT     RUNNING     HRPUMP    00:00:00        00:00:04            NON-CRITICAL
REPLICAT    RUNNING     HRREP2    00:00:00        00:00:00            NON-CRITICAL

At this point, multi-master Goldengate is completely configured in a cluster. Remember that if you want to move the Goldengate processes to run on the other node in the cluster, you will also need to relocate the service that was configured for Goldengate in this step:

srvctl add service -db demogg1 -service oggserv -preferred demogg11 -available demogg12 
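
Because oggserv was created with a preferred and an available instance, it can be relocated with srvctl, along these lines (a sketch using this build's names; check the srvctl relocate service syntax for your version):

srvctl relocate service -db demogg1 -service oggserv -oldinst demogg11 -newinst demogg12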

Please note that manual intervention will be required should you want to run Goldengate on the second node of the cluster. I would recommend writing scripts to move the Goldengate processes to the other cluster node. For example, you will need to change the destination of the pump process on the target node to point to ggclust12-vip instead of ggclust11-vip. For that reason, you may find it expedient to create a virtual IP on each cluster to be used by the Goldengate processes. The agent can be used for that purpose, but remember that the agent only works on one cluster; it will not affect the remote cluster.  You will need to run agent commands on both clusters in order to move the Goldengate processes.

Oracle Software Installation Bugs

July 31, 2019

For many years, Oracle software quality improved with every release.  Bugs were resolved and did not reappear, and new bugs were rare and hard to find.  However, it seems to me that quality has regressed in the last few versions of Oracle.  Some very simple errors have appeared in the software, in both the grid infrastructure and the database.

I have been in the process of putting together an Oracle 19c Cluster on VMware workstation for the purpose of a blog on Golden Gate.  It has been a struggle.  The first problem I ran into was in running the root.sh script:

2019/07/29 15:31:31 CLSRSC-4002: Successfully installed Oracle Trace File Analyzer (TFA) Collector.
2019/07/29 15:31:35 CLSRSC-594: Executing installation step 9 of 19: 'ConfigOLR'.
2019/07/29 15:31:57 CLSRSC-594: Executing installation step 10 of 19: 'ConfigCHMOS'.
2019/07/29 15:31:57 CLSRSC-594: Executing installation step 11 of 19: 'CreateOHASD'.
2019/07/29 15:32:07 CLSRSC-594: Executing installation step 12 of 19: 'ConfigOHASD'.
2019/07/29 15:32:08 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.service'
2019/07/29 15:32:57 CLSRSC-594: Executing installation step 13 of 19: 'InstallAFD'.
2019/07/29 15:33:10 CLSRSC-594: Executing installation step 14 of 19: 'InstallACFS'.
2019/07/29 15:33:23 CLSRSC-594: Executing installation step 15 of 19: 'InstallKA'.
2019/07/29 15:33:35 CLSRSC-594: Executing installation step 16 of 19: 'InitConfig'.
2019/07/29 15:33:40 CLSRSC-119: Start of the exclusive mode cluster failed
Died at /u01/app/19.3.0/grid/crs/install/crsinstall.pm line 2439.

Died at line 2439.  Really?  No error message, nothing.  It just died.  This one took quite a bit of research before I finally discovered the problem.  My cluster name was ggclust1-cluster.  See the problem?  It took me a while too.  The cluster name, which I entered at the installer prompt, is 16 characters, but the maximum length of a cluster name is 15 characters.
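
A trivial check would have caught it:

$ echo -n "ggclust1-cluster" | wc -c
16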

Next, while running DBCA to create a new database, the DBCA process terminated with ‘Recovery manager failed to restore datafiles’.  In the alert log of the database that was supposed to be created, there was this:

WARNING: failed to register ASMB with ASM instance
2019-07-30T18:42:30.735803-05:00
Errors in file /u01/app/oracle/diag/rdbms/ggdemo1/ggdemo11/trace/ggdemo11_asmb_98417.trc:
ORA-01017: invalid username/password; logon denied

DBCA had never prompted me for a password.  This one took some research, but finally I found it.  The problem was that oracle was not in the oinstall group.  If you installed the Oracle preinstall package, it should have put oracle in the oinstall group, and if you run ‘id oracle’, it will even show oracle in the oinstall group.

However, if you run the command ‘grep oracle /etc/group’:

oinstall:x:54321:
dba:x:54322:oracle
racdba:x:54330:oracle

oracle does not show up as a member of the oinstall group (the oinstall line has no members listed).  This looks like a problem with both DBCA and the Oracle preinstall package.  The fix is to manually add oracle to the oinstall group.  DBCA could also be modified to prompt for the ASM username and password, but that would require a change from Oracle.

usermod -g oinstall -G oinstall,dba oracle

These are two problems you are likely to run into on a fresh installation of Oracle 19c. 

Oracle Cluster with DRBD, Pacemaker, and Corosync

March 15, 2019

In this post, we are going to build an Oracle active-passive cluster using pacemaker, corosync, and DRBD.  Please note that any Oracle licensing comments made in this post are purely my personal opinion, they are not binding on my employer, or Oracle, nor do they have any legal standing.

This was originally intended simply as a thought exercise: could I put together a fairly resilient, fairly highly available Oracle configuration over shared storage without using Oracle RAC?  The answer is that the shared storage piece is impossible without Oracle RAC; however, using DRBD, we can get something close.

DRBD stands for Distributed Replicated Block Device.  It works, as the name implies, by replicating blocks.  One DRBD device is designated as the primary device, additional devices are designated as secondary devices, and blocks are replicated from the primary device to the secondary device.

Pacemaker and Corosync are Linux clustering software that provide communication between the cluster nodes, keep cluster resources synchronized, and monitor the resources for availability.  When a resource becomes unavailable, they also manage the failover.

So, let's begin.

The servers: VMware Workstation VMs running OEL 7.6 with 8 GB RAM.  In addition to a 20 GB root device, each server has a 20 GB vmdk for the DRBD device holding the Oracle database binaries, and another 20 GB vmdk for the Oracle database data files.

The DRBD devices are configured as logical volumes (LVM) in order to make adding space easier. 

The server names are linclust1 and linclust2.  Below is the hosts file; note that the NICs for cluster management are named linclust1-hb and linclust2-hb, and the storage NICs are named linclust1-priv and linclust2-priv.  It is definitely recommended that different NICs be used for storage and inter-node communications.

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.12.1.58 linclust1 linclust1.localdomain
10.12.1.59 linclust2 linclust2.localdomain

192.168.31.101 linclust1-priv linclust1-priv.localdomain
192.168.31.102 linclust2-priv linclust2-priv.localdomain

192.168.57.101 linclust1-hb linclust1-hb.localdomain
192.168.57.102 linclust2-hb linclust2-hb.localdomain

10.12.1.61 linclust-vip  linclust-vip.localdomain

Before doing any other cluster configuration, we have to configure LVM and DRBD.  The disks that will back the DRBD devices show up on each node as /dev/sdb and /dev/sdc.

Create partitions using fdisk.

  • Verify the partitions available on the server: fdisk -l
  • Run fdisk /dev/sdb
  • Type ‘n’ to create a new partition.
  • Specify where you would like the partition to start and end.  You can specify the size of the partition instead of the end sector, for example: +1000M.
  • Type ‘p’ to view the partition, then type ‘w’ to save it and exit.
  • Repeat for /dev/sdc

    Since this is RHEL/OEL 7, fdisk automatically aligns the partition on an appropriate boundary.  In earlier versions, we would need to align it manually.
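
    Below is one way to script the partitioning instead of answering the fdisk prompts; this sketch assumes the whole disk becomes a single partition (parted aligns partitions on 1 MiB boundaries by default):

    # Non-interactive equivalent of the fdisk session above; run on both nodes.
    parted -s /dev/sdb mklabel msdos mkpart primary 1MiB 100%
    parted -s /dev/sdc mklabel msdos mkpart primary 1MiB 100%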

    Next, we create the logical volumes for each device. 

    pvcreate /dev/sdb1
    vgcreate shared1 /dev/sdb1
    lvcreate --name shared1 -l 100%FREE shared1

    pvcreate /dev/sdc1
    vgcreate shared2 /dev/sdc1
    lvcreate --name shared2 -l 100%FREE shared2
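
    As a quick sanity check, the standard LVM reporting commands can be run on both nodes before layering DRBD on top of the volumes:

    # Confirm the physical volumes, volume groups, and logical volumes exist.
    pvs
    vgs
    lvs shared1 shared2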

    Next we install the required software on both cluster servers:

    yum -y install drbd pacemaker corosync pcs pcsd

    The above command will install all the required clustering software.

    Install the required oracle packages,

    yum -y install oracle-database-server-12cR2-preinstall

    Now we are ready to configure DRBD. DRBD continuously replicates data from the primary to the secondary device.

    1. Edit /etc/drbd.d/global_common.conf; it should read as follows:

    global {
    usage-count no;
    }
    common {
    net {
      protocol C;
    }
    }

    2. Create the resource definition file for our configuration.  In our case, the file is named drbd00.res and has the following lines:

    resource drbd00 {
            device /dev/drbd0;
            disk /dev/shared1/shared1;
            meta-disk internal;
            net {
                      allow-two-primaries;
            }
            syncer {
               verify-alg sha1;
            }
            on linclust1.localdomain {
                    address 192.168.31.101:7789;
            }
            on linclust2.localdomain {
                    address 192.168.31.102:7789;
            }
          }
    resource drbd01 {
            device /dev/drbd1;
            disk /dev/shared2/shared2;
            meta-disk internal;
            net {
                      allow-two-primaries;
            }
            syncer {
               verify-alg sha1;
            }
            on linclust1.localdomain {
                    address 192.168.31.101:7790;
            }
            on linclust2.localdomain {
                    address 192.168.31.102:7790;
            }
          }

    3. Copy drbd00.res and global_common.conf to /etc/drbd.d on the second cluster node. 

    4. At this point we are ready to start DRBD.  Note that we are using the ‘allow-two-primaries’ option.  This is because PCS will manage mounting the file systems for the software and data.

    Run these commands to initialize drbd:

    drbdadm create-md drbd00

    drbdadm create-md drbd01

    The above commands initialize the drbd data.

    5. Start drbd

    systemctl start drbd.service

    systemctl enable drbd.service

    6. Run the commands below to make this node the primary for both resources.  Run these only on node 1:

    drbdadm primary drbd00 --force

    drbdadm primary drbd01 --force

    7. The primary command designates the current node as the primary node so that we can make changes to the drbd device attached to this node.  At this point DRBD is running.  We can see the DRBD devices at /dev/drbd0 and /dev/drbd1.

    8. Next, we need to create a file system on the drbd devices.  Since we are going to use oracle and active-passive, we will create an xfs file system on each device.  The commands below are run on the primary node:

    mkfs -t xfs /dev/drbd0

    mkfs -t xfs /dev/drbd1

    The mkfs command will not work on the node where the devices are secondary.

    9. At this point the replication should be working.  Run the command below to check:

    [root@linclust1 ~]# cat /proc/drbd
    version: 8.4.5 (api:1/proto:86-101)
    srcversion: 1AEFF755B8BD61B81A0AF27
    0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
        ns:406265 nr:383676 dw:790464 dr:488877 al:35 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
        ns:4318663 nr:5086791 dw:9409141 dr:470229 al:211 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

    Note that if the status does not show UpToDate, wait for a bit, check again, and verify that everything is up to date.

    10. DRBD is now fully configured and working.  As part of the cluster configuration we are going to mount these as file systems, so the next step is to create the mount points. I am going to use /u01 and /u02.

    mkdir -p /u01

    mkdir -p /u02

    chown oracle:oinstall /u01

    chown oracle:oinstall /u02

    Next, we configure the cluster.  The cluster software we installed earlier includes pcs, pacemaker, and corosync.

    When we installed pcs, the user hacluster was created.  Modify the /etc/passwd entry for the hacluster user to allow login, as shown below:

    hacluster:x:189:189:cluster user:/home/hacluster:/bin/bash

    Then create the directory for hacluster, and set hacluster:hacluster as the owner:

    mkdir -p /home/hacluster

    chown hacluster:hacluster /home/hacluster

    Next, set the password for the hacluster user using the passwd command:

    passwd hacluster

    Now, start the cluster services:

    systemctl start pcsd.service

    systemctl enable pcsd.service

    In order for the cluster to manage itself, we have to authorize the access for the cluster manager on each node.  The ‘pcs cluster auth’ command does that:

    pcs cluster auth linclust1 linclust2

    Next we create the cluster. Note that we are using the -hb interfaces for cluster management.  This should be an internal-only network:

    pcs cluster setup --name DRBD_CLUSTER linclust1-hb linclust2-hb

    Disable STONITH.  We do not want to fence a node that is not working; DRBD and PCS should be able to handle it properly:

    pcs property set stonith-enabled=FALSE

    pcs cluster start --all

    At this point, the basic cluster services are configured and the cluster is running. Now it's time to configure the required services.

    First, create the virtual IP address (VIP) that users will use to connect to the database.  Users should not have to figure out which node is active, so a virtual IP is created that will always run wherever Oracle is running.

    pcs resource create ClusterVIP ocf:heartbeat:IPaddr2 ip=10.12.1.61 cidr_netmask=32 op monitor interval=30s

    Next, the resources to manage DRBD.  Since we have two DRBD devices, we will need four resources.  The ‘raw’ resource (a name I chose) simply tells PCS to keep track of the DRBD device.  The ‘master’ resource tells PCS that it has to manage (master) the DRBD resource it created:

    pcs resource create binraw ocf:linbit:drbd drbd_resource="drbd00" op monitor interval=10s
    pcs resource master binmaster binraw master-max=1 master-node-max=1 clone-max=2 clone-node-max=2 notify=true
    pcs resource create dataraw ocf:linbit:drbd drbd_resource="drbd01" op monitor interval=10s
    pcs resource master datamaster dataraw master-max=1 master-node-max=1 clone-max=2 clone-node-max=2 notify=true

    Note this entry: ocf:linbit:drbd.  That is the resource agent (the type of service) that PCS monitors.  PCS comes with a large collection of resource agents already defined, and the user can also create their own. For a complete list of available agents, run the command ‘pcs resource list’.
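
    For example, to see the parameters accepted by the DRBD agent used above:

    # Show the metadata and parameters of the ocf:linbit:drbd resource agent.
    pcs resource describe ocf:linbit:drbd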

    Next, we define the file systems for the Oracle binaries and data files and tell PCS to manage them:

    pcs resource create BINARIES ocf:heartbeat:Filesystem device="/dev/drbd0" directory="/u01" fstype="xfs"
    pcs resource create DATAFILES ocf:heartbeat:Filesystem device="/dev/drbd1" directory="/u02" fstype="xfs"

    Next, we configure some colocation and startup rules:

    pcs constraint colocation add binmaster with ClusterVIP INFINITY
    pcs constraint colocation add datamaster with ClusterVIP INFINITY
    pcs constraint order ClusterVIP then binmaster then datamaster

    pcs constraint colocation add BINARIES with binmaster INFINITY;
    pcs constraint colocation add DATAFILES with datamaster INFINITY;
    pcs constraint order promote binmaster then start BINARIES
    pcs constraint order promote datamaster then start DATAFILES
    pcs resource defaults migration-threshold=1
    pcs resource group add oracle ClusterVIP BINARIES  DATAFILES

    Now it's time to install the Oracle software and create the database.  Just use the regular Oracle installer, and DBCA to create the database. Remember that the binaries go on /u01 and the data files go on /u02.  You install on only one node.

    Once the Oracle installation is complete and the database is up and running, copy the files coraenv, dbhome, and oraenv from /usr/local/bin on node 1 to the same directory on node 2.  Make sure the permissions are copied correctly.

    Copy the /etc/oratab and /etc/oraInst.loc files to node 2.  These five files are the only files that oracle requires that are not in /u01 or /u02.

    Edit the file $ORACLE_HOME/network/admin/listener.ora.  Change the IP address of the listener so that it listens on the VIP for the cluster, in this case 10.12.1.61.  This allows database connections to fail over with the cluster.
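
    A minimal listener.ora after that edit might look like this; the listener name and port are the defaults assumed for this example, so adjust them as needed:

    LISTENER =
      (DESCRIPTION_LIST =
        (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = 10.12.1.61)(PORT = 1521))
        )
      )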

    Create an Oracle user that the cluster can use to monitor the state of the database; the default name is ocfmon:

    SQL> create user ocfmon identified by mypassword;

    SQL> grant create session to ocfmon;

    Create the resources to monitor oracle:

    pcs resource create oracleDB ocf:heartbeat:oracle sid="drbddb" --group oracle

    pcs resource update oracleDB monuser="ocfmon" monpassword="mypassword" monprofile="default"

    pcs resource create listenerdrbddb ocf:heartbeat:oralsnr sid="drbddb" listener="listener" --group oracle

    At this point, the cluster is created and services are running.

    As I understand Oracle licensing, this configuration would require just one node to be licensed for Oracle.  The passive node cannot physically run Oracle unless Oracle is shut down on the active node, and the Oracle software is not even visible on the passive node until it becomes the active node. In effect, Oracle is only installed on the active node; when failover occurs, the file systems holding Oracle are unmounted from the old active node and mounted on the new one.

    This is what you should see when you check the status:

    [root@linclust1 ~]# pcs status
    Cluster name: DRBD_CLUSTER
    Stack: corosync
    Current DC: linclust1-hb (version 1.1.19-8.el7_6.4-c3c624ea3d) – partition with quorum
    Last updated: Thu Feb 28 13:56:36 2019
    Last change: Wed Feb 20 17:18:09 2019 by hacluster via crmd on linclust1-hb

    2 nodes configured
    9 resources configured

    Online: [ linclust1-hb linclust2-hb ]

    Full list of resources:

    Master/Slave Set: binmaster [binraw]
         Masters: [ linclust1-hb ]
         Stopped: [ linclust2-hb ]
    Master/Slave Set: datamaster [dataraw]
         Masters: [ linclust1-hb ]
         Stopped: [ linclust2-hb ]
    Resource Group: oracle
         ClusterVIP (ocf::heartbeat:IPaddr2):       Started linclust1-hb
         BINARIES   (ocf::heartbeat:Filesystem):    Started linclust1-hb
         DATAFILES  (ocf::heartbeat:Filesystem):    Started linclust1-hb
         oracleDB   (ocf::heartbeat:oracle):        Started linclust1-hb
         listenerdrbddb     (ocf::heartbeat:oralsnr):       Started linclust1-hb

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

    Note that this is not true HA.  It takes about 10 minutes for failover to occur and for Oracle to start on the second node.  If you need true HA, you will need to spend the money required for a commercial solution.
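
    One way to exercise a failover manually, assuming the resource group name used above, is to move the group to the other node and then clear the location constraint that the move creates:

    # Move the 'oracle' resource group to the second node, watch it come up,
    # then remove the constraint left behind by the move.
    pcs resource move oracle linclust2-hb
    pcs status
    cat /proc/drbd
    pcs resource clear oracle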

    Whenever you fail over, be sure to check the status of both DRBD (cat /proc/drbd) and PCS (pcs status).  Because of slowness I saw in starting DRBD, I added the following script to run after boot to make sure DRBD was up and running:

    [root@linclust2 startup]# cat postboot.sh
    #!/bin/bash
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
    export PATH
    sleep 3m
    drbdadm up drbd00
    drbdadm up drbd01
    cat /proc/drbd > /tmp/drbd.out

    To ensure the script above runs each time the system starts, I created it as a systemd service.  To do this, I created a file called postboot.service in /etc/systemd/system/ with the following contents:

    [root@linclust2 system]# cat postboot.service

    [Unit]
    Description=Script to run things after everything else starts
    After=network.target

    [Service]
    Type=simple
    ExecStart=/root/startup/postboot.sh
    TimeoutStartSec=0

    [Install]
    WantedBy=default.target

    Note the file name after ExecStart=.  That is the file that gets executed.  To enable and start the service, run these commands:

    systemctl enable postboot.service

    systemctl start postboot.service

    You can also modify the script at this point to include any other commands you need.

    In my next blog, I will discuss how to add space to DRBD in lvm in this cluster configuration.

BI-Directional Replication in SymmetricDS and Oracle

    January 31, 2019

    SymmetricDS is a very nice replication tool that is fairly easy to use.  It replicates by applying SQL, similar to what Oracle GoldenGate does.  Oracle GoldenGate is substantially easier to configure, but SymmetricDS also gets the job done and is available as open source software.

    For the purposes of this exercise, I am using SymmetricDS version 3.8.41, and Oracle version 18.3.0, both plain vanilla versions installed with no additional patches.

    SymmetricDS uses a group concept to manage replication.  Each node is assigned a group, and each group has a set of replication rules assigned to it.  The rules are defined in the tables created to manage the replication.  The name of each management table begins with sym_.

    Environment Summary

    Both servers run Oracle Enterprise Linux 7.5 and Oracle database version 18.3.  The database SID for store 1 is symdb, and the database SID for store 2 is osymdb.  The server names are oelafd1 and oelafd2, with IP addresses 10.12.1.220 and 10.12.1.235 respectively. dnsmasq is used for name resolution.

    Tnsnames entries:

    STORE1 =
       (DESCRIPTION=
         (ADDRESS=(PROTOCOL=TCP)(HOST=10.12.1.220)(PORT=1521))
             (CONNECT_DATA=
               (SERVER=DEDICATED)
               (SERVICE_NAME=symdb)
             )
            )

    STORE2 =
       (DESCRIPTION=
         (ADDRESS=(PROTOCOL=TCP)(HOST=10.12.1.235)(PORT=1521))
             (CONNECT_DATA=
               (SERVER=DEDICATED)
               (SERVICE_NAME=osymdb)
             )
            )

    Each store schema has 3 tables, test1, test2, and test3 that should be identical.

    Installation:

    Installing SymmetricDS is very simple.  Download the zip file from the web site, https://www.symmetricds.org/download.

    Put the file in an appropriate directory on your environment (in my case, it is owned by Oracle on the database server), and unzip the file.  Since SymmetricDS uses Java, it is ready to run once unzipped.

    Setup

    To set up SymmetricDS, three areas must be configured: the engine file that defines the connection information for the Java engine, the database tables that identify the nodes and groups, and the database tables that define the replication rules.

    Node/Group Identification Concepts

    The individual nodes are defined in the sym_node table.  This is just a list of the individual nodes involved in the replication process.

    The sym_node_security table holds the registration and connection security information for the nodes.

    The sym_node_group table contains the names of the groups involved in replication. In this example, we only have one group.

    The sym_node_group_link table defines how data is replicated between groups. If the data_event_action column is set to ‘P’, data is pushed on events; if it is set to ‘W’, the group waits for a pull event. Note that for this example we only have one group (store), and thus both stores will push data.

    Driver Table Concepts

    At the lowest level, the sym_channel table allocates a channel for replication.  For large loads, multiple channels can be created. 

    Next, the sym_trigger table defines which tables are replicated, which operations generate data events, and which channel the events are sent to.

    The sym_router table defines routers that route events between groups.  A router can have the same group as both source and destination, which is how bi-directional replication is configured.

    The sym_trigger_router table identifies what events are sent to which routers.

    So, the basic flow is: define a channel, define a trigger and assign it to the channel, define a router, and assign the trigger to the router.

    Engine Setup

    We have two stores configured for this two way replication, both in group store.  The engine files are store-001.properties and store-002.properties.

    store-001.properties:

    engine.name=store-001

    # The class name for the JDBC Driver
    db.driver=oracle.jdbc.driver.OracleDriver

    # The JDBC URL used to connect to the database
    db.url=jdbc:oracle:thin:@10.12.1.220:1521:symdb

    # The user to login as who can create and update tables
    db.user=store

    # The password for the user to login as
    db.password=store4321

    # The HTTP URL of the root node to contact for registration
    registration.url=http://10.12.1.220:31415/sync/store-001
    sync.url=http://10.12.1.220:31415/sync/store-001

    # Do not change these for running the demo
    group.id=store
    external.id=001

    # This is how often the routing job will be run in milliseconds
    job.routing.period.time.ms=500
    # This is how often the push job will be run.
    job.push.period.time.ms=1000
    # This is how often the pull job will be run.
    job.pull.period.time.ms=1000

    store-002.properties:

    engine.name=store-002

    # The class name for the JDBC Driver
    #db.driver=com.mysql.jdbc.Driver
    db.driver=oracle.jdbc.driver.OracleDriver

    # The JDBC URL used to connect to the database
    db.url=jdbc:oracle:thin:@10.12.1.235:1521:osymdb

    # The user to login as who can create and update tables
    db.user=store

    # The password for the user to login as
    db.password=store4321

    # The HTTP URL of the root node to contact for registration
    registration.url=http://10.12.1.220:31415/sync/store-001
    sync.url=http://10.12.1.220:31415/sync/store-002

    # Do not change these for running the demo
    group.id=store
    external.id=002

    # This is how often the routing job will be run in milliseconds
    job.routing.period.time.ms=500
    # This is how often the push job will be run.
    job.push.period.time.ms=1000
    # This is how often the pull job will be run.
    job.pull.period.time.ms=1000

    Please take note of the registration URL and sync URL.  They are poorly documented, but they are critical for the replication to work.  Both URLs use the IP address where the SymmetricDS engine is running, not the IP address where the database is running.  SymmetricDS expects one master engine, and that is where the registration URL points.
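
    A quick way to confirm the registration URL is reachable (from the machine where the registering engine runs) is a simple HTTP probe; this assumes curl is available:

    # Expect an HTTP status code back rather than a connection failure.
    curl -s -o /dev/null -w "%{http_code}\n" http://10.12.1.220:31415/sync/store-001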

    Creating the driver tables

    The required tables are created by the SymmetricDS engine, with the command create-sym-tables. The syntax is shown below:

    ./bin/symadmin --engine store-001 create-sym-tables

    The output should show the creation of the tables for replication.  This command will also initialize the identification of the source node.  For reference, the commands below will also load the identification tables:

    insert into store.sym_node 
    (node_id,node_group_id,external_id,sync_enabled,sync_url,schema_version,symmetric_version, 
    database_type,database_version,heartbeat_time,timezone_offset,batch_to_send_count, 
    batch_in_error_count,created_at_node_id) values 
    ('001','store','001',1,null,null,null,null,null,current_timestamp,null,0,0,'000');
    insert 
    into store.sym_node 
    (node_id,node_group_id,external_id,sync_enabled,sync_url,schema_version,symmetric_version, 
    database_type,database_version,heartbeat_time,timezone_offset,batch_to_send_count, 
    batch_in_error_count,created_at_node_id) values 
    ('002','store','002',1,null,null,null,null,null,current_timestamp,null,0,0,'000');
    insert into store.sym_node_identity values ('001'); 
    insert into 
    store.sym_node_security 
    (node_id,node_password,registration_enabled,registration_time,initial_load_enabled,initial_load_time, 
    created_at_node_id) values 
    ('001','5d1c92bbacbe2edb9e1ca5dbb0e481',0,current_timestamp,0,current_timestamp,'000'); 
    insert 
    into store.sym_node_security 
    (node_id,node_password,registration_enabled,registration_time,initial_load_enabled,initial_load_time, 
    created_at_node_id) values 
    ('002','5d1c92bbacbe2edb9e1ca5dbb0e481',1,null,1,current_timestamp,'000');
    -- note: the insert above sets initial_load_enabled to 1 to enable initial load
    insert into sym_node_group (node_group_id) values 
    ('store'); 
    commit;

    Be sure to verify that all the data has been properly loaded into the tables above. All tables whose names begin with sym_ are required for replication to run properly.  Review the output and resolve any errors that occur.
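
    A minimal verification is to query the node tables in the store schema and confirm that both nodes are present and enabled:

    -- Run against each database as a user with access to the store schema.
    select node_id, node_group_id, external_id, sync_enabled from store.sym_node;
    select node_id, registration_enabled, initial_load_enabled from store.sym_node_security;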

    Next, set up the replication rules.  Below is the configuration we use:

    -- Configures the type of sync action; in this case we configure a push ('P')
    -- for store to store replication.
    insert into sym_node_group_link (source_node_group_id, target_node_group_id,
      data_event_action) values ('store', 'store', 'P');
    -- Configure a replication channel; this just says we will use the
    -- store_outpound_ch channel for replication.
    insert into sym_channel (channel_id, processing_order, max_batch_size,
      enabled, description) values ('store_outpound_ch', 1, 100000, 1, 'outbound channel');
    -- Configure one trigger for each table; wildcards may also be used.
    insert into sym_trigger (trigger_id, source_table_name, channel_id,
      last_update_time, create_time, sync_on_incoming_batch) values
      ('store_outbound_tr_t1', 'TEST1', 'store_outpound_ch', sysdate, sysdate, 1);
    insert into sym_trigger (trigger_id, source_table_name, channel_id,
      last_update_time, create_time, sync_on_incoming_batch) values
      ('store_outbound_tr_t2', 'TEST2', 'store_outpound_ch', sysdate, sysdate, 1);
    insert into sym_trigger (trigger_id, source_table_name, channel_id,
      last_update_time, create_time, sync_on_incoming_batch) values
      ('store_outbound_tr_t3', 'TEST3', 'store_outpound_ch', sysdate, sysdate, 1);
    -- The router routes data from group to group.
    insert into sym_router (router_id, source_node_group_id, target_node_group_id,
      router_type, create_time, last_update_time) values
      ('store_2_store', 'store', 'store', 'default', current_timestamp, current_timestamp);
    -- Identify which triggers use which routers.
    insert into sym_trigger_router (trigger_id, router_id, initial_load_order,
      last_update_time, create_time) values
      ('store_outbound_tr_t1', 'store_2_store', 1, current_timestamp, current_timestamp);
    insert into sym_trigger_router (trigger_id, router_id, initial_load_order,
      last_update_time, create_time) values
      ('store_outbound_tr_t2', 'store_2_store', 1, current_timestamp, current_timestamp);
    insert into sym_trigger_router (trigger_id, router_id, initial_load_order,
      last_update_time, create_time) values
      ('store_outbound_tr_t3', 'store_2_store', 1, current_timestamp, current_timestamp);
    commit;

    Note that the sym_trigger table defaults to syncing on insert, update, and delete.  If you do not want that (e.g., you may only want to track inserts and updates), set the corresponding sync_on_insert, sync_on_update, or sync_on_delete column to 0.
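
    For example, to stop replicating deletes for TEST1 (using the trigger id defined above), something like the following should work; updating last_update_time is what prompts SymmetricDS to rebuild the database trigger:

    -- Keep inserts and updates, but stop capturing deletes for TEST1.
    update sym_trigger
       set sync_on_delete = 0,
           last_update_time = sysdate
     where trigger_id = 'store_outbound_tr_t1';
    commit;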

    Next, start the replication.

    Run ../bin/sym (from the engines subdirectory).  The initialization output will look something like this; watch for errors:

    [oracle@oelafd1 bin]$ ./sym
    Log output will be written to /home/oracle/symmetric-server-3.8.41/logs/symmetric.log
    [startup] - SymmetricWebServer - About to start SymmetricDS web server on host:port 0.0.0.0:31415
    [startup] - / - Initializing Spring root WebApplicationContext
    [store-002] - AbstractSymmetricEngine - Initializing connection to database
    [store-001] - AbstractSymmetricEngine - Initializing connection to database
    [store-001] - JdbcDatabasePlatformFactory - Detected database 'Oracle', version '18', protocol 'oracle'
    [store-001] - JdbcDatabasePlatformFactory - The IDatabasePlatform being used is org.jumpmind.db.platform.oracle.OracleDatabasePlatform
    [store-002] - JdbcDatabasePlatformFactory - Detected database 'Oracle', version '18', protocol 'oracle'
    [store-002] - JdbcDatabasePlatformFactory - The IDatabasePlatform being used is org.jumpmind.db.platform.oracle.OracleDatabasePlatform
    [store-002] - OracleSymmetricDialect - The DbDialect being used is org.jumpmind.symmetric.db.oracle.OracleSymmetricDialect
    [store-002] - StagingManager - The staging directory was initialized at the following location: /home/oracle/symmetric-server-3.8.41/tmp/store-002
    push request from store:001:001.
    [store-001] - PushService - Pushed data to node store:002:002. 1 data and 1 batches were processed
    [store-001] - StagingManager - Cleaning staging...
    [store-002] - StagingManager - Cleaning staging...
    [store-002] - StagingManager - Finished cleaning staging in 0 seconds.
    [store-001] - StagingManager - Finished cleaning staging in 0 seconds.
    [store-002] - RouterService - Routed 1 data events in 264 ms
    [store-002] - PushService - Push data sent to store:001:001
    [store-001] - DataLoaderService - 1 data and 1 batches loaded during push request from store:002:002.
    [store-002] - PushService - Pushed data to node store:001:001. 1 data and 1 batches were processed
    [store-001] - RouterService - Routed 1 data events in 330 ms
    [store-001] - PushService - Push data sent to store:002:002
    [store-002] - DataLoaderService - 1 data and 1 batches loaded during push request from store:001:001.
    [store-001] - PushService - Pushed data to node store:002:002. 1 data and 1 batches were processed

    The data events are data changes.

    At this point, the replication should be working.  Run a query on each instance to verify.

    On store1:

    select * from test1;

    1     NAME     ANDY
    10    NAME     ANDREW
    11    NAME     ANDREW

    On store2:

    select * from test1;

    1     NAME     ANDY
    10    NAME     ANDREW
    11    NAME     ANDREW

    On store1:

    insert into test1 (id,name,value) values (12,'LNAME','SMITH');
    commit;

    select * from test1;

    1     NAME     ANDY
    10    NAME     ANDREW
    11    NAME     ANDREW
    12    LNAME    SMITH

    On store2:

    select * from test1;

    1     NAME     ANDY
    11    NAME     ANDREW
    10    NAME     ANDREW
    12    LNAME    SMITH

    insert into test1 (id,name,value) values (14,'LNAME','JONES');
    commit;

    select * from test1;

    1     NAME     ANDY
    11    NAME     ANDREW
    10    NAME     ANDREW
    12    LNAME    SMITH
    14    LNAME    JONES

    On store1:

    select * from test1;

    1     NAME     ANDY
    10    NAME     ANDREW
    11    NAME     ANDREW
    12    LNAME    SMITH
    14    LNAME    JONES


    Two-way replication is working.  Test further with updates and deletes.

    In this blog we went over the creation of basic two-way replication in SymmetricDS and Oracle.  Note that we have not discussed any of the classic conflict resolution problems that arise when running bi-directional replication.

    Installing OEM CC Patches (OEM Cloud Control 13.2)

    December 4, 2018

    Recently I was asked to patch an OEM Cloud Control 13.2 environment.  So, per the instructions, I updated OPatch to the latest version for OEM, and I updated OMSPatcher to the latest version.  Both of those steps are well documented in MOS.

    The patch I was installing was generic patch 28628403, which is the plugin patch for OEM Cloud Control.

    I followed all the instructions, extracted the patch, and of course the first thing I discovered is that the readme is incorrect.

    The readme tells us to shut down Cloud Control.  However, if you do that, you will discover that the OMS and the repository database both must be up when you run omspatcher.

    Next, the instructions for applying the patch tell you to run the command:

    omspatcher apply [-property_file <location of property file>]

    Of course it doesn't tell you what the property file is. The property file simply contains login information for WebLogic.  It's not really necessary, since omspatcher will prompt you for the username and password, so just use the command below:

    omspatcher apply

    When I tried running the command above, I received this output:

    $ omspatcher apply
    OMSPatcher Automation Tool
    Copyright (c) 2017, Oracle Corporation.  All rights reserved.

    OMSPatcher version : 13.8.0.0.3
    OUI version        : 13.9.1.0.0
    Running from       : /u01/app/oracle/middleware
    Log file location  : /u01/app/oracle/middleware/cfgtoollogs/omspatcher/opatch2018-12-04_12-36-08PM_1.log

    OMSPatcher failed: Location "/export/home/oracle/patch/28628403" is not a valid System patch location.
    Log file location: /u01/app/oracle/middleware/cfgtoollogs/omspatcher/opatch2018-12-04_12-36-08PM_1.log

    OMSPatcher failed with error code 73

    So that is really strange; why would we get that error message?  Research on the Metalink site finally gave me the answer: it was a problem with the locale setting on the server. To fix it, run these commands:

    export LC_ALL=en_GB.UTF-8
    export LANG=en_GB.UTF-8

    To see the full details, see metalink note 2321057.1.

    While the issue with the locale setting was the major problem, it appears that Oracle documentation for omspatcher, and for Oracle Cloud Control in general, is going downhill.  Oracle should direct some resources at updating their documentation.

    I chose to use Oracle OEM CC 13.2 for this installation because the latest and greatest version, 13.3, seems to have multiple backward compatibility issues that have never been looked at or fixed.  In particular, we ran into issues trying to deploy the agent over SSH to Sun servers running older versions of Solaris, which is inexcusable since Oracle now owns that software.

    Oracle Restart On File System

    November 7, 2017

     

    Many people may not be aware that Oracle Restart (OHAS) can be configured for databases that store their data on a file system; ASM is not required.

    The installation and configuration process took some work to figure out, since Oracle does not document it well, but I did track it down eventually.

    In my current environment, I have installed the Oracle database software and created a database using the silent installation method. See the blog here at houseofbrick.com: http://houseofbrick.com/oracle-12-2-0-1-silent-installation/ for instructions on how to do silent installation and database creation.

    Next, go to the Oracle website and download the Grid Infrastructure software.  I noted that for some reason Oracle has chosen to force you to download what amounts to an entire GI home, so you should place the zip file in the location of your grid directory; e.g., if you intend to put the GI home in /u01/app/grid, place the zip file in /u01/app/grid.  Then unzip the file, e.g.:

    mkdir -p /u01/app/grid

    cp linuxx64_12201_grid_home.zip /u01/app/grid

    cd /u01/app/grid

    unzip linuxx64_12201_grid_home.zip

    Next, install the software.

    In the directory /u01/app/grid, run gridSetup.sh.  This requires X Windows.

    Choose the ‘Set Up Software Only’ option.

     


    On the next screen, select the server.  The server name should be configured in the hosts file.


    On the next screen accept the defaults. 


    When prompted, choose Yes.


    Next, verify the installation directory.  If it's wrong, move the files and start over; you cannot change it here.


    Next, choose the appropriate option for the root scripts.


    Verify the prerequisites, correct or ignore as necessary.


    Choose install on the next screen.


    Once completed, run the root scripts:

     

     

    [root@rhel7 grid]# ./root.sh
    Performing root user operation.

    The following environment variables are set as:
        ORACLE_OWNER= oracle
        ORACLE_HOME=  /u01/app/grid/grid

    Enter the full pathname of the local bin directory: [/usr/local/bin]:
    The contents of "dbhome" have not changed. No need to overwrite.
    The contents of "oraenv" have not changed. No need to overwrite.
    The contents of "coraenv" have not changed. No need to overwrite.

    Entries will be added to the /etc/oratab file as needed by
    Database Configuration Assistant when a database is created
    Finished running generic part of root script.
    Now product-specific root actions will be performed.

    To configure Grid Infrastructure for a Cluster or Grid Infrastructure for a Stand-Alone Server execute the following command as oracle user:
    /u01/app/grid/grid/gridSetup.sh
    This command launches the Grid Infrastructure Setup Wizard. The wizard also supports silent operation, and the parameters can be passed through the response file that is available in the installation media.

    At this point, the GI software has been installed.  Note that we chose the software-only install because otherwise the installer assumes that you want to configure an ASM instance, and there is no way around that through the installer.

    So, our next step is to enable oracle HAS. 

    su - root, then set your environment to the newly installed GI home.  In my case, to make things easier, I put this entry in the oratab:

    grid:/u01/app/grid/grid:N

    So as root, I just run oraenv and choose grid:

    [root@rhel7 grid]# . oraenv
    ORACLE_SID = [root] ? grid
    The Oracle base has been set to /u01/app/oracle

    Next, run this command to configure and enable has for a file system:

    $ORACLE_HOME/perl/bin/perl -I$ORACLE_HOME/perl/lib -I$ORACLE_HOME/crs/install $ORACLE_HOME/crs/install/roothas.pl

    The output should look like this:

    [root@rhel7 ~]# $ORACLE_HOME/perl/bin/perl -I$ORACLE_HOME/perl/lib -I$ORACLE_HOME/crs/install $ORACLE_HOME/crs/install/roothas.pl
    Using configuration parameter file: /u01/app/grid/grid/crs/install/crsconfig_params
    The log of current session can be found at:
      /u01/app/oracle/crsdata/rhel7/crsconfig/roothas_2017-11-06_02-32-44PM.log
    2017/11/06 14:32:49 CLSRSC-363: User ignored prerequisites during installation
    LOCAL ADD MODE
    Creating OCR keys for user ‘oracle’, privgrp ‘oinstall’..
    Operation successful.
    LOCAL ONLY MODE
    Successfully accumulated necessary OCR keys.
    Creating OCR keys for user ‘root’, privgrp ‘root’..
    Operation successful.
    CRS-4664: Node rhel7 successfully pinned.
    2017/11/06 14:33:31 CLSRSC-330: Adding Clusterware entries to file ‘oracle-ohasd.service’
    2017/11/06 14:35:08 CLSRSC-400: A system reboot is required to continue installing.

    Next, per the instructions above, reboot the server.

    Once the server has been restarted, as root, start the high availability service:

    [root@rhel7 grid]# . oraenv
    ORACLE_SID = [root] ? grid
    The Oracle base has been set to /u01/app/oracle

    crsctl start has

    Now, as oracle, enable HAS:

    crsctl enable has

    Finally, add the database and listener to OHAS:

    srvctl add db -db silent -startoption open -oraclehome /u01/app/oracle/product/12.2.0.1/dbhome_1
    srvctl start db -d silent
    srvctl add listener -l listener -o $ORACLE_HOME
    srvctl start listener
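
    A quick check (shown here for illustration) that OHAS now manages the database and listener:

    srvctl status database -db silent
    srvctl status listener -l listener
    crsctl status resource -t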

    At this point, you have successfully installed Oracle HAS (High Availability Services) for a database running on a file system.

    DBA Security Frustration

    February 3, 2016

    One of the most common frustrations for the Oracle DBA is dealing with security.  There are many common DBA tasks that require root or administrator access, and acquiring that access can be a frustrating and time consuming task, especially in larger organizations.

    In the world of virtualization (sometimes called the private cloud), those security concerns are much easier to handle, even though the security teams in large organizations have mostly not caught up with that fact.  Even the primary provider of the private cloud, VMware, seems to be unaware that a major side effect of its product is a real reduction in security concerns.

    A connection to a Virtual Machine (VM) looks exactly like a connection to a physical machine, whether Unix or Windows based.  And when a user is logged into a VM, as with a physical machine, the user can only see and access what is on that particular VM.  This seems obvious, so what is my point, right? 

    My point is this: virtualization allows an organization to create a VM for each individual application.  The native networking components of virtualization products allow for even more isolation.  Typically, in a large organization, the intent of security is to prevent an individual from accessing data outside his or her particular area of responsibility.  Meaning, for example, that the DBA in charge of a medical records database should not be able to access a financial database.

    Virtualization allows the database for health care records to reside on one VM, while financial data can reside on another VM, even on another network, without increasing the Oracle licensing costs. 

    The use of VMs allows the DBA in charge of the health care records database to be restricted to the VM or VMs on which that data resides.  He or she can be granted full root and Oracle access on that server, without any risk of the DBA being able to access the financial database, which he or she is not authorized to see.

    In order to access the financial data, the health care records DBA would need to acquire a completely different set of passwords and access, which is easy to restrict at the server level.  This means that the security team can grant the DBA full access to every server that his or her data is on, without risking access to data to which he is not entitled. Although the same strategy could be used in the physical world, the cost would soon become prohibitive and extremely inefficient in terms of hardware resources.

    Of course this does not alleviate all security concerns. There is always the possibility of a fully compromised server. This is most often the result of an external attack rather than internal. However, a fully compromised server inside the company firewall can be a nightmare. And as the number of servers increase, whether physical or virtual, the chance of a compromised server increases. On the other hand, the chance that a virtual server becomes fully compromised is roughly the same as that of a physical server, regardless of the role based security being used.

    In addition, the risks of a compromised server can be minimized by the appropriate use of physical and virtual firewall appliances, and appropriate rules to segregate servers by function. VMware NSX is specifically designed for this sort of network micro-segmentation and isolation, and is the logical method for achieving the required network isolation within a VMware environment.

    The use of virtualization as a security method in and of itself is a completely new paradigm for the typical security team.  Normally separation of duties would require that the DBA either have no root access at all, or be restricted to specific commands in order to ensure that they cannot access information to which they are not entitled.  But with the appropriate use of VM’s, the DBA can have full and unfettered access to the server with their data, without putting any other data at risk.

    This does not absolve the DBA of basic responsibilities of course.  When installing Oracle patches that require root access, they must still take the basic precautions, backing up the database, snapshotting the operating system drive and perhaps the oracle binaries installation, and taking any other basic precautions that are required.  The DBA should also take the time to learn some operating system basics, and perhaps chat with a system administrator if they are not sure of what they are doing.  At the same time, this high level of access should allow the DBA to get their job done more quickly, and with much less frustration.

    In summary, VMware and other suppliers for the private cloud have unintentionally provided an easy and effective method of improving organizational security, with very little additional effort required.