Friday, October 31, 2014

Core Dumps explained

Introduction
~~~~~~~~~~~~
This short article aims to explain how to get a stack trace from a
core dump produced by any of the Oracle products. By following 
the steps below you can provide Oracle Support with vital 
information to help identify the cause of a problem.

Please note that it is important to include information about the
tool being used, any code involved, the operation being performed,
the environment, etc., in addition to the details below.


What is a 'core dump' ?
~~~~~~~~~~~~~~~~~~~~~~~ 
A core dump is an image copy of a process's state at the instant
it 'aborted'. It is produced in the form of a file called 'core',
usually located in the current directory.


What causes a core dump ?
~~~~~~~~~~~~~~~~~~~~~~~~~
There are many situations which can cause a core dump to be produced, 
but it is usually because the process has attempted to do something
which the operating system does not permit. The most common causes
of this are:

The program tried to access memory outside its allowed range.

The program tried to obtain a resource which was either
exhausted or unavailable.

An attempt was made to execute illegal instructions.

An attempt was made to read unaligned data.

On Unix systems the offending process is sent one of a number of
signals which force a core dump to be produced. It is also possible
for a user to produce a core dump by sending one of these signals
to a process manually. 
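
As a minimal sketch of sending such a signal manually, the shell
function below starts a throwaway 'sleep' process and aborts it with
'kill'. The function name 'send_core_signal' and the default choice
of SIGABRT are assumptions made for illustration only. Whether a
'core' file is actually written depends on the core file size limit
and on how the system is configured to store core files.

```shell
# Hypothetical helper (not part of any Oracle tool): abort a throwaway
# process with a core-dumping signal and report how it terminated.
send_core_signal() {
    sig=${1:-ABRT}            # signal name without the SIG prefix
    sleep 60 &                # harmless victim process
    pid=$!
    kill -s "$sig" "$pid"     # the same as sending the signal by hand
    rc=0
    wait "$pid" || rc=$?      # death by signal is reported as a status above 128
    echo "process $pid terminated with status $rc"
}
```

For example, 'send_core_signal SEGV' mimics an out-of-range memory
access, 'send_core_signal ILL' an illegal instruction and
'send_core_signal BUS' a bus error such as an unaligned read.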


What should I do if I get a core dump ?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As with any problem you should first note down the FULL version
numbers of the product, the RDBMS, PL/SQL (if used) and any 
related products.

You should also note the EXACT command you were running when
this occurred. For example, if it was a SQL*Forms problem and you
were using 'mrunform30', write this down. This command will be
referred to as 'program' below.

Now follow the instructions below in order:

1) Check that you have a 'core' file. It should be in the directory
where the command was issued, or in $ORACLE_HOME/dbs or
$ORACLE_HOME/dbs/core_NNNNN if it is the 'oracle' executable.
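
The search in step 1 could be sketched as the shell function below.
The function name 'find_core_files' and the name pattern 'core*' are
assumptions for illustration. The pattern may also match unrelated
files whose names happen to start with 'core', so check each result.

```shell
# Hypothetical helper: list regular files whose names start with 'core'
# under each directory given as an argument. Missing directories are skipped.
find_core_files() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            find "$dir" -type f -name 'core*' -print
        fi
    done
}

# Example (commented out so the block has no side effects):
# find_core_files . "$ORACLE_HOME/dbs"
```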


2) Log in as ORACLE and change into the $ORACLE_HOME/bin
directory. Enter the command:

file program

and write the result down letter for letter. If the word 'dynamic' 
or 'dynamically linked' appears in the output of this command
then please make a note of this as there are a few platforms on
which Oracle does NOT support dynamic linking and this may be
the cause of your problem.
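
The check for the word 'dynamic' can be sketched as a small shell
function that takes the text printed by 'file' as its argument. The
function name 'describe_linkage' and the wording of its messages are
assumptions for illustration. The exact output of 'file' differs
between platforms, so the full output should still be recorded.

```shell
# Hypothetical helper: classify the text printed by 'file' for an executable.
describe_linkage() {
    case "$1" in
        *dynamic*) echo "dynamic linking reported" ;;
        *)         echo "dynamic linking not reported" ;;
    esac
}

# Example (commented out so the block has no side effects):
# describe_linkage "$(file "$ORACLE_HOME/bin/program")"
```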


3) Now enter:

chmod +r program

to add read permission to the program. 


4) Log out, then log in as the user who encountered the error.
The next step will vary slightly depending on which version of
Unix you are using. One of the following commands should exist
on your machine - try each in order until you find one that exists:

Command   NB           Exit command     Stack Trace command
-------   --           ------------     -------------------
dbx                    quit             where
xdb       (HPUX 10)    quit             t
gdb       (HPUX 11)    q                bt
sdb                    q                t
adb                    $q (or Ctrl-D)   $c
debug     (PTX only)   quit             stack
gdb       (Linux)      quit             bt
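
Trying each command in order can be sketched with the shell's
'command -v' builtin. The function name 'first_debugger' is an
assumption for illustration. A command with one of these names may
be an unrelated tool on some systems, so check what was found before
relying on it.

```shell
# Hypothetical helper: print the first debugger from the table that is
# found on the current PATH, or return non-zero if none is found.
first_debugger() {
    for dbg in dbx xdb gdb sdb adb debug; do
        if command -v "$dbg" >/dev/null 2>&1; then
            echo "$dbg"
            return 0
        fi
    done
    echo "no debugger from the list was found" >&2
    return 1
}
```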

Change to the directory where the core dump is located and enter
the commands as in the relevant example below. If you are not
sure which program produced the 'core' file then on some Unix 
platforms the command 'file core' will tell you the executable
name that the core file is from (this does not work on ALL 
Unix platforms).

Example commands:

DBX:    $ script /tmp/mystack
        $ dbx $ORACLE_HOME/bin/program core
        (dbx) where
        ...                   << Stack should appear here
        (dbx) quit
        $ exit

XDB:    $ script /tmp/mystack
        $ xdb $ORACLE_HOME/bin/program core
        (xdb) t
        ...                   << Stack should appear here
        (xdb) quit
        $ exit

SDB:    $ script /tmp/mystack
        $ sdb $ORACLE_HOME/bin/program core
        (sdb) t
        ...                   << Stack should appear here
        (sdb) q
        $ exit

(NOTE: In the 'adb' commands below literally type the $c & $q)
ADB:    $ script /tmp/mystack
        $ adb $ORACLE_HOME/bin/program core
        $c                    << NB: adb has no prompt so just enter $c
        ...
        $q
        $ exit

DEBUG:  $ script /tmp/mystack
        $ debug -c core $ORACLE_HOME/bin/program
        debug> stack
        ...                   << Stack should appear here
        debug> quit
        $ exit

GDB:    $ script /tmp/mystack
        $ gdb $ORACLE_HOME/bin/program core
        (gdb) bt
        ...                   << Stack should appear here
        (gdb) quit
        $ exit

If this worked, the stack trace will be recorded in the file
'/tmp/mystack'. Either FAX or EMAIL this to Oracle Support
with any other details collected above, making sure you include the
problem log number.
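
Where GNU gdb is the available debugger, the same stack trace can be
captured without an interactive session or 'script'. The sketch
below is an alternative to the GDB example above, not a replacement
for it. The function name 'gdb_stack' and its argument order are
assumptions for illustration; '-batch' and '-ex' are standard gdb
options.

```shell
# Hypothetical helper: write a gdb backtrace for PROGRAM and CORE to
# OUTFILE without an interactive session. Assumes GNU gdb is installed.
gdb_stack() {
    program=$1
    core=$2
    outfile=${3:-/tmp/mystack}
    # -batch exits once the commands have run; -ex runs one gdb command.
    gdb -batch -ex bt "$program" "$core" > "$outfile" 2>&1
}

# Example (commented out so the block has no side effects):
# gdb_stack "$ORACLE_HOME/bin/program" core /tmp/mystack
```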

5) If the debugger failed to give a stack trace then try using
a different debugger from the list above (if available).
If all debuggers fail then there is probably a problem with
either the permissions or the file type - see the section below
and then contact Oracle Support with all the details you have so far.


Common reasons for not getting a sensible stack
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Filesize Limits:
Note that on some machines there may be a kernel parameter or
user limit which controls the maximum size of core file that 
can be produced - you can usually check this by typing:

   limit        in the C shell
OR ulimit -a    in the Bourne / Korn shells.

If this limit is too small the core file will be truncated or not
written at all, which makes it useless. Raise the limit and
reproduce the problem.
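
Raising the limit in a Bourne-type shell can be sketched as below.
The function name 'raise_core_limit' is an assumption for
illustration, and the '-c' option of 'ulimit' is assumed to be
supported by the shell in use (it is in bash, ksh and dash). The
change applies only to the current shell and to programs started
from it, so the problem must be reproduced from that same shell.

```shell
# Hypothetical helper: show the core file size limit and raise the soft
# limit as far as the hard limit allows, for the current shell only.
raise_core_limit() {
    echo "soft limit before: $(ulimit -S -c)"
    echo "hard limit:        $(ulimit -H -c)"
    ulimit -S -c "$(ulimit -H -c)"
    echo "soft limit after:  $(ulimit -S -c)"
}
```

In the C shell the corresponding command is 'limit coredumpsize'
followed by the new value.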

Stripped Executable:
Some program executables are stripped of symbol information.
This makes the stack trace useless. If 'file program' shows
the word 'stripped' or 'nm program' shows no output then it
is likely that the executable is stripped of symbolic information.
In this case the problem tool must be relinked without being
stripped - on most Unix platforms this involves ensuring there is
no '-s' option on the link line. Contact Oracle Support with
details of the link line used to link the tool.
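
When checking the output of 'file', note that some versions print
'not stripped' for executables that still carry their symbols, so a
plain search for the word 'stripped' can mislead. The sketch below
tests for 'not stripped' first. The function name 'describe_symbols'
and the wording of its messages are assumptions for illustration.

```shell
# Hypothetical helper: classify the text printed by 'file' for an executable.
# 'not stripped' must be tested first because it contains 'stripped'.
describe_symbols() {
    case "$1" in
        *"not stripped"*) echo "symbols present" ;;
        *stripped*)       echo "stripped" ;;
        *)                echo "unknown" ;;
    esac
}

# Example (commented out so the block has no side effects):
# describe_symbols "$(file "$ORACLE_HOME/bin/program")"
```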

HP Unix:
Some platforms like HP Unix need a special object file linked
in at link time to ensure symbols in shared objects can be
reported by the debug tool. Typically this involves relinking the
tool including /usr/lib/end.o on the link line. The location of
this special file may be different depending on your HPUX 
version. 'xdb' generally tells you the location of this file
if it was not linked into the executable.
