Oracle Trace/Log File Cleanup using ADRCI

Due to time constraints, I have not completed my intended entry on the ASM Filter Driver (ASMFD). In the interim, I have written a short post on trace/log file cleanup using the Oracle utility ADRCI.

I recently posted a blog entry about trace/log file cleanup in Oracle. The methodology I gave there works for all Oracle versions. However, starting with Oracle 11gR1, Oracle has shipped a program called adrci that makes cleaning up log and trace files much easier. Its usage is similar to the OS-supplied log file rotation utility logrotate, except that it also cleans up trace files. However, adrci does not clean up audit files or listener log files.

ADRCI stands for the Automatic Diagnostic Repository Command Interpreter. As the name implies, there is evidently much more to this utility than just the ability to clean up old trace files and trim log files. That will be the subject of another post. This post discusses cleaning up log and trace files, which is a very common problem on Oracle servers.

Starting in Oracle 11gR1, the Oracle-approved method of removing trace files and cleaning database alert logs is to use adrci, but OS commands must still be used for the listener logs and the audit files. I wrote a shell script that combines these methods to clean up 11gR2 environments.

The assumptions: the code assumes that the instances are up and running, and that each has an entry in the oratab. In a clustered environment, the oratab is expected to contain an entry for the instance name (orcl1, orcl2, etc.) as well as for the database name. For example, with two cluster nodes for a database named orcl, the oratab on node one would contain entries for both orcl and orcl1, and the oratab on node two would contain orcl and orcl2.
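The oratab assumption can be checked before the cleanup runs. The following is a minimal sketch rather than part of the original script: oratab_home is a hypothetical helper, and the ORACLE_HOME path in the sample file is made up for illustration.

```shell
# Sketch only: oratab_home is a hypothetical helper, not part of the original script.
# Print the ORACLE_HOME recorded for a SID; return non-zero if the SID has no entry.
oratab_home() {
  # $1 = SID, $2 = oratab path (defaults to /etc/oratab)
  # Entries look like SID:ORACLE_HOME:Y|N; comment lines start with "#".
  awk -F: -v sid="$1" '
    $1 == sid { print $2; found = 1; exit }
    END       { exit !found }
  ' "${2:-/etc/oratab}"
}

# Demonstration against a throwaway file laid out like node one of a two-node cluster.
sample=$(mktemp)
printf '%s\n' \
  '# sample oratab' \
  'orcl:/u01/app/oracle/product/11.2.0/dbhome_1:N' \
  'orcl1:/u01/app/oracle/product/11.2.0/dbhome_1:N' > "$sample"

for sid in orcl orcl1 orcl2; do
  if home=$(oratab_home "$sid" "$sample"); then
    echo "$sid -> $home"
  else
    echo "$sid has no oratab entry"
  fi
done
rm -f "$sample"
```

On node one of the example cluster, orcl and orcl1 resolve to a home and orcl2 is reported as missing, which is the layout the script expects.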

Next, we write the script. The first segment is basically bookkeeping, but note that the number of days supplied in the first command line argument has to be translated to minutes, because adrci expects the purge age in minutes. The second command line argument is the retention time for the listener log files. The script could easily be modified to accept additional retention requirements based on your specific needs:

# Description
# Shell script to clean up oracle diagnostic files using ADRCI, and to remove log data.
# Allows 2 arguments, $1 – retention time for trace and audit data, $2 – retention time for listener
# log files. Could be enhanced for multiple retention periods. 
###
#set -x
PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/oracle/bin
export PATH
# $1: retention in days for trace and audit data (default 7)
if [[ -n "$1" ]]; then
 case $1 in
   *[!0-9]*) echo invalid input; exit 1 ;;  # only non-negative integers are accepted
 esac
 days=$1
else
 days=7
fi
# adrci expects the purge age in minutes
minutes=$((1440 * $days))
echo days=$days
echo minutes=$minutes
# $2: retention in days for listener log files (default 30)
if [[ -n "$2" ]]; then
 case $2 in
   *[!0-9]*) echo invalid input; exit 1 ;;
 esac
 log_days=$2
else
 log_days=30
fi
echo log_days=$log_days
SERVER=`hostname -s`
FDATE=`date +%d_%m_%y`
# Check user is oracle
USERID=`/usr/bin/id -u -nr`
if [ $? -ne 0 ]
then
       echo "ERROR: unable to determine uid"
       exit 99
fi
if [ "${USERID}" != "oracle" ]
then
       echo "ERROR: This script must be run as oracle"
       exit 98
fi
echo "INFO: Purge started at `date`"
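The days-to-minutes conversion is simple arithmetic (1 day = 24 * 60 = 1440 minutes), and it can be isolated and tried on its own. The sketch below is an illustration, not part of the original script: days_to_minutes is a hypothetical helper, and rejecting values with leading zeros is my own assumption, made so that the shell does not read them as octal.

```shell
# Sketch only: days_to_minutes is a hypothetical helper, not part of the original script.
# Validate a retention value given in days and print it in minutes.
days_to_minutes() {
  case $1 in
    # reject empty input, anything containing a non-digit, and leading zeros
    ''|*[!0-9]*|0?*) echo "invalid retention: '$1'" >&2; return 1 ;;
  esac
  echo $(( $1 * 1440 ))   # 24 hours * 60 minutes per day
}

days_to_minutes 7     # prints 10080
days_to_minutes 30    # prints 43200
```

A non-numeric or negative argument, such as abc or -3, prints an error on stderr and returns 1 instead of leaving the retention undefined.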

Next, we use the ps command to identify the running oracle instances and loop through them. The smon process of each running instance is named ora_smon_<SID>, so we grep for smon and take the third underscore-delimited field of the process name to get the database sid. Note that a sid which itself contains an underscore would be truncated by this awk expression. After extracting the sid, we set the environment for that ORACLE_SID using oraenv:

for ORACLE_SID in `ps -e -o "cmd" | grep smon|grep -v grep| awk -F "_" '{print$3}'`
do
   # uncomment 2 lines below if RAC environment and individual sids are not in oratab
   #   SID=`echo $ORACLE_SID | sed -e 's/1//g'`
   #   ORACLE_HOME=`cat /etc/oratab|grep ^$SID:| head -n 1 | cut -f2 -d':'`
 ORAENV_ASK=NO
 export ORAENV_ASK
 export ORACLE_SID
 echo $ORACLE_SID
 . /usr/local/bin/oraenv
 echo SID=$ORACLE_SID
 echo "INFO: purge started at `date`"
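The SID extraction can also be done with shell parameter expansion instead of awk. This is a minimal sketch under the assumption that the process name has the form ora_smon_<SID>; sid_from_smon is a hypothetical helper and is not what the script itself uses.

```shell
# Sketch only: sid_from_smon is a hypothetical helper, not part of the original script.
# Derive the SID from an smon background process name.
sid_from_smon() {
  # Strip the shortest prefix ending in "smon_", so that any
  # underscores inside the SID itself are preserved.
  echo "${1#*smon_}"
}

sid_from_smon ora_smon_orcl1    # prints orcl1
sid_from_smon ora_smon_my_db    # prints my_db
```

For the second name, splitting on underscores and taking the third field would return only my, so the prefix-stripping form is worth considering if your SIDs contain underscores.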

Next, we identify the location of the audit files and clean them up. This section is identical to my previous cleanup script, as adrci will not clean up audit files. Note that your environment may require that you move these files elsewhere for auditing purposes rather than remove them. These files are not the standard Oracle database auditing; they are the records of everything done on the instance using the sys privilege:

 
AUDIT_DEST=`$ORACLE_HOME/bin/sqlplus -s "/ as sysdba" <<!
 set feedback off heading off verify off
 select value from v\\$parameter where name='audit_file_dest';
!`
 finaud=`echo $AUDIT_DEST | sed -e 's|\?|'"$ORACLE_HOME"'|'`
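 # quote the directory and the pattern so that an empty $finaud or a stray
 # .aud file in the current directory cannot change what find searches for
 /usr/bin/find "$finaud" -name '*.aud' -mtime +$days
 /usr/bin/find "$finaud" -name '*.aud' -mtime +$days | xargs -I {} ksh -c "echo deleting {}; rm {}"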
 echo $finaud
 # Purge ADR contents
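The two steps in this segment, expanding the ? shorthand for ORACLE_HOME and selecting audit files by age, can be tried safely outside the database. The sketch below is an illustration under stated assumptions: expand_oracle_home and list_old_audit_files are hypothetical helpers, the paths are examples, and the listing function only prints candidates so nothing is deleted.

```shell
# Sketch only: both helpers are hypothetical and not part of the original script.

# Expand a leading "?" (shorthand for ORACLE_HOME) in a parameter value.
# $1 = parameter value, $2 = ORACLE_HOME
expand_oracle_home() {
  case $1 in
    \?*) printf '%s\n' "$2${1#\?}" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}

# List (but do not delete) audit files older than N days.
# $1 = directory, $2 = age in days
# The pattern is quoted so the shell passes it to find unexpanded.
list_old_audit_files() {
  find "$1" -type f -name '*.aud' -mtime +"$2"
}

expand_oracle_home '?/rdbms/audit' /u01/app/oracle/product/11.2.0/dbhome_1
# prints /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/audit
```

Running list_old_audit_files against the expanded directory before enabling the rm step shows exactly which files a given retention would remove.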

Next we run the adrci command to delete the diagnostic information. In the current script, the retention is the same for all file types. It would be fairly easy to modify the script to accept multiple retention levels based on the type of data.

 echo "INFO: adrci purge started at `date`"
 adrci exec="show homes"|grep -v : | while read file_line
 do
   echo "INFO: adrci purging diagnostic destination" $file_line
   echo "INFO: purging ALERT older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type ALERT"
   echo "INFO: purging INCIDENT older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type INCIDENT"
   echo "INFO: purging TRACE older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type TRACE"
   echo "INFO: purging CDUMP older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type CDUMP"
   echo "INFO: purging HM older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type HM"
   echo ""
   echo ""
 done
done
echo
echo "INFO: adrci purge finished at `date`"
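Before letting the purge run for real, it can help to print the adrci commands that the loop would issue for one ADR home. The following is a minimal dry-run sketch: purge_commands is a hypothetical helper, and the home path diag/rdbms/orcl/orcl1 is only an example of what "show homes" might return.

```shell
# Sketch only: purge_commands is a hypothetical helper, not part of the original script.
# Print (without executing) one adrci purge command per content type.
# $1 = ADR home as listed by "show homes", $2 = retention in minutes
purge_commands() {
  for type in ALERT INCIDENT TRACE CDUMP HM; do
    printf 'adrci exec="set homepath %s;purge -age %s -type %s"\n' "$1" "$2" "$type"
  done
}

purge_commands diag/rdbms/orcl/orcl1 10080
```

Because each type gets its own command, this is also the natural place to pass a different retention per type if you extend the script in that direction.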

Finally, we trim the listener log, since this is not done properly through adrci:

# for whatever reason, adrci doesn't like to remove the listener trace
# log, so we need to get it manually
for listener_log in `find $ORACLE_BASE/diag/tnslsnr -name "listener.log"`
do
 listener_file=`echo "$listener_log" | awk -Ftrace/ '{print $2}'`
 echo $listener_log
 echo $listener_file
 fname="${listener_log}_`date '+%Y%m%d'`.gz"
 fname1="${listener_log}_`date '+%Y%m%d'`"
 echo $fname
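 if [ -e "$fname" ]
 then
   echo "Already cleared $listener_log today"
 else
   cp "$listener_log" "$fname1"
   gzip "$fname1"
   # quote the pattern so the shell does not expand it before find sees it
   /usr/bin/find $ORACLE_BASE/diag/tnslsnr -name "${listener_file}*.gz" -mtime +$log_days | xargs -I {} ksh -c "echo deleting {}; rm {}"
   # truncate the log in place rather than removing it
   : > "$listener_log"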
 fi
done
echo "SUCC: Purge completed successfully at `date`"
exit 0
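The listener log handling above is a copy-then-truncate rotation: the log is copied to a dated archive, the archive is compressed, and the original is emptied in place. A minimal sketch of that pattern for a single file follows; rotate_log is a hypothetical helper, and it can be tried on any scratch file.

```shell
# Sketch only: rotate_log is a hypothetical helper, not part of the original script.
# Copy-then-truncate rotation for one log file; at most one archive per day.
# $1 = path to the log file
rotate_log() {
  archive="${1}_$(date '+%Y%m%d')"
  if [ -e "${archive}.gz" ]; then
    echo "Already rotated $1 today"
    return 0
  fi
  # Copy and compress first; empty the original only if both steps succeeded.
  cp "$1" "$archive" && gzip "$archive" && : > "$1"
}

# Example, on a scratch file:
#   rotate_log /tmp/listener.log
```

Truncating instead of moving means the file that the writing process has open stays in place, and chaining the steps with && avoids emptying the log when the copy or the compression fails.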

Thus we have purged oracle log, trace, and audit files. Below is the complete script with the commentary removed:


# Description
# Shell script to clean up oracle diagnostic files using ADRCI, and to remove log data.
# Allows 2 arguments, $1 – retention time for trace and audit data, $2 – retention time for listener
# log files. Could be enhanced for multiple retention periods.
###
#set -x
PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/oracle/bin
export PATH
# $1: retention in days for trace and audit data (default 7)
if [[ -n "$1" ]]; then
 case $1 in
   *[!0-9]*) echo invalid input; exit 1 ;;  # only non-negative integers are accepted
 esac
 days=$1
else
 days=7
fi
# adrci expects the purge age in minutes
minutes=$((1440 * $days))
echo days=$days
echo minutes=$minutes

# $2: retention in days for listener log files (default 30)
if [[ -n "$2" ]]; then
 case $2 in
   *[!0-9]*) echo invalid input; exit 1 ;;
 esac
 log_days=$2
else
 log_days=30
fi
echo log_days=$log_days
SERVER=`hostname -s`
FDATE=`date +%d_%m_%y`

# Check user is oracle
USERID=`/usr/bin/id -u -nr`
if [ $? -ne 0 ]
then
       echo "ERROR: unable to determine uid"
       exit 99
fi
if [ "${USERID}" != "oracle" ]
then
       echo "ERROR: This script must be run as oracle"
       exit 98
fi
echo "INFO: Purge started at `date`"
# Establish some oracle environment
for ORACLE_SID in `ps -e -o "cmd" | grep smon|grep -v grep| awk -F "_" '{print$3}'`
do
   # uncomment 2 lines below if RAC environment and individual sids are not in oratab
   #   SID=`echo $ORACLE_SID | sed -e 's/1//g'`
   #   ORACLE_HOME=`cat /etc/oratab|grep ^$SID:| head -n 1 | cut -f2 -d':'`
 ORAENV_ASK=NO
 export ORAENV_ASK
 export ORACLE_SID
 echo $ORACLE_SID
 . /usr/local/bin/oraenv
 echo SID=$ORACLE_SID
 AUDIT_DEST=`$ORACLE_HOME/bin/sqlplus -s "/ as sysdba" <<!
 set feedback off heading off verify off
 select value from v\\$parameter where name='audit_file_dest';
!`
 finaud=`echo $AUDIT_DEST | sed -e 's|\?|'"$ORACLE_HOME"'|'`
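 # quote the directory and the pattern so that an empty $finaud or a stray
 # .aud file in the current directory cannot change what find searches for
 /usr/bin/find "$finaud" -name '*.aud' -mtime +$days
 /usr/bin/find "$finaud" -name '*.aud' -mtime +$days | xargs -I {} ksh -c "echo deleting {}; rm {}"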

 echo $finaud
 # Purge ADR contents
 echo "INFO: adrci purge started at `date`"
 adrci exec="show homes"|grep -v : | while read file_line
 do
   echo "INFO: adrci purging diagnostic destination" $file_line
   echo "INFO: purging ALERT older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type ALERT"
   echo "INFO: purging INCIDENT older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type INCIDENT"
   echo "INFO: purging TRACE older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type TRACE"
   echo "INFO: purging CDUMP older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type CDUMP"
   echo "INFO: purging HM older than $days days."
   adrci exec="set homepath $file_line;purge -age $minutes -type HM"
   echo ""
   echo ""
 done
done
echo
echo "INFO: adrci purge finished at `date`"
# All completed
# for whatever reason, adrci doesn't like to remove the listener trace
# log, so we need to get it manually
for listener_log in `find $ORACLE_BASE/diag/tnslsnr -name "listener.log"`
do
 listener_file=`echo "$listener_log" | awk -Ftrace/ '{print $2}'`
 echo $listener_log
 echo $listener_file
 fname="${listener_log}_`date '+%Y%m%d'`.gz"
 fname1="${listener_log}_`date '+%Y%m%d'`"
 echo $fname
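 if [ -e "$fname" ]
 then
   echo "Already cleared $listener_log today"
 else
   cp "$listener_log" "$fname1"
   gzip "$fname1"
   # quote the pattern so the shell does not expand it before find sees it
   /usr/bin/find $ORACLE_BASE/diag/tnslsnr -name "${listener_file}*.gz" -mtime +$log_days | xargs -I {} ksh -c "echo deleting {}; rm {}"
   # truncate the log in place rather than removing it
   : > "$listener_log"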
 fi
done
echo "SUCC: Purge completed successfully at `date`"
exit 0
