4 DEBUGGING KERNEL TRAPS
Sometimes things go wrong and your machine panics. We often see messages on Usenet or help aliases about getting a "Bad Trap" panic and a load of hex with a plea for help.
In most cases, you will need to raise the issue with your support channel so that we at Sun can look at the problem. This is simply because we have the source code and that, combined with a crash dump from your system, helps us to diagnose the problem.
However, if you have the source code, or you just want to know more about how to decipher these panic messages, then read on.
4.1 Finding Out What Went Wrong
The first objective in understanding the problem is to make sure we have all the information to hand. A crash dump is a prerequisite if you are to do any real debugging. When the kernel panics, it writes the core image of the kernel (including modules and the active pages of physical memory) to the primary swap device, and we retrieve that image using the savecore command. So make sure that savecore is enabled in the file /etc/init.d/sysetup; simply uncomment the line containing the "savecore" command. Alternatively, if a panic has occurred and you haven't had a chance to enable it, you can run the command manually and, with luck, still recover the dump from the swap partition. Naturally, if you leave it too long and the OS has started to use the swap device, you run the risk of damaging part or all of the crash dump image, so it's a good idea to get it out as soon as possible. Once we have the crash dump, we can start to decipher that "BAD TRAP" message we saw when the machine went down.
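For example, if the machine has already panicked and rebooted without savecore enabled, retrieving the dump by hand looks something like this (a sketch; /var/crash/`uname -n` is just the conventional place to keep dumps, so substitute any filesystem with enough free space):

# mkdir -p /var/crash/`uname -n`
# savecore -v /var/crash/`uname -n`

savecore writes a pair of files, unix.N and vmcore.N, into that directory; these are the kernel namelist and core image that we will feed to adb shortly.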
4.1.1 Deciphering a BAD TRAP panic message
We can get a lot of information from this output. Don't worry if you don't have a copy of it from the console or /var/adm/messages because you can still see it in your crash dump. Run the "$<msgbuf" macro under adb to get it out. Let's see an example which I can talk you through:
BAD TRAP: type=1 rp=f0586c1c addr=25732564 mmu_fsr=164 rw=3
mv: Text fault
kernel read fault at addr=0x25732564, pme=0x0
MMU sfsr=164: Invalid Address on supv instr fetch at level 1
pid=14825, pc=0x25732564, sp=0xf0586c68, psr=0x40000c6, context=0
g1-g7: 10bf8484, 8000000, ffffffff, 0, 0, 1, fc44de00
Begin traceback... sp = f0586c68
Called from fc3616bc, fp=f0586cd0, args=fc42ac50 e226f000 e226f0ff 1 ffffffff 3e8
Called from fc361588, fp=f0586d98, args=fc4db30c 0 fc42ac08 fc42ac00 fc42ac50 fc4db350
Called from f00e7e24, fp=f0586e00, args=fc4db30c fc087415 fc42ac08 fc378c14 fc4ef480 200
Called from f0072054, fp=f0586ed8, args=0 effff5a4 21c0001 fc42ac08 20 effffaeb
Called from f0041aa0, fp=f0586f58, args=f0160fe8 fc6d0cd4 0 fc6d0cb0 fffffffc ffffffff
Called from 11700, fp=efffd050, args=effffaeb effff5a4 23150 1 0 0
End traceback...

This is an example of a "Bad Trap" from a sun4m running Solaris 2.3. The first few lines give us information about the type of trap that caused the panic. The fields shown in the first line have the following meanings:

    type     - the hardware trap type (type 1 is an instruction access
               exception, reported on the next line as a text fault)
    rp       - the address of the registers saved at the time of the trap
    addr     - the virtual address that caused the fault
    mmu_fsr  - the raw value of the MMU fault status register, which is
               decoded for us a couple of lines further down
    rw       - an indication of the kind of access that was being attempted
Fortunately, we don't need to remember the various architecture-specific trivia to decode this, as the kernel does it for us. As we can see on the following lines, it tells us that this was a kernel read fault and at what address it occurred. On the next line, it expands the MMU status register: in this case it tells us that we took an invalid address fault on a supv (supervisor) instruction fetch at level 1 (of the MMU's page table hierarchy). Still, at least from the above description you can see how these fields are decoded.
The next line under the MMU decoding gives us the process id of the process that was running on that CPU when the trap occurred. It also gives us the value of the program counter (PC), which holds the address of the instruction that caused the trap. In this case, we can see that it is the same address that the MMU information told us was bad. We also know from the MMU information that the problem was a fetch of an instruction from an illegal address, so clearly our PC contained a bogus address and that is what caused the trap.
The other fields of interest on this line are sp and psr, which give us the stack pointer and the processor status register value respectively. We can use the stack pointer to unwind the sequence of function calls to help diagnose the problem, although the kernel does that for us just underneath.
Next we get the global variables, from %g1 onwards. There's no point in dumping %g0 because it is hardwired to always be zero.
Finally, the stack traceback. When we call a function through the CALL instruction, the address of that CALL is written into register %o7. If the function we're calling does a SAVE to change register windows and reserve some stack space (see section 3.2 on Register Windows), then the caller's register frame gets saved on the stack and we can back-track through the stack to see the sequence of calls by looking at the addresses where the contents of the registers are stored. At this point, we might as well look at how to manually walk through a stack to get the call trace out.
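Before we do, here is a minimal sketch of the call and entry sequence the compiler generates. This is hand-written for illustration rather than lifted from the kernel, and foo is just a made-up routine name:

        call    foo             ! the address of this CALL is written to %o7
        nop                     ! delay slot
        ...
foo:
        save    %sp, -96, %sp   ! slide to a new register window and reserve
                                ! the 96-byte minimum stack frame; the caller's
                                ! %o registers are now visible as our %i
                                ! registers, so the return address is in %i7
        ...
        ret                     ! return to %i7 + 8
        restore                 ! pop back to the caller's window (delay slot)

It is the SAVE that matters for stack tracing: if that window later has to be spilled, its local and in registers (including the %i7 return address) are written to the 16 words at its %sp, and that is exactly what we pick apart next.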
4.1.2 Manually Extracting the Stack Trace
Okay, so we have our crash dump and we want to find out a bit more about what went wrong. We can often get a picture of what happened from the stack trace of function calls. For example, taking the crash dump that accompanies the panic message shown in the previous section, we can look at it with adb and try to see what the stack trace was:
# adb -k unix.0 vmcore.0
physmem 1e68
$c
complete_panic(0xf00494d8,0xf0586b04,0xf0586990,0xfc255600,0x0,0x1) + 10c
do_panic(?) + 1c
vcmn_err(0xf015f7e8,0xf0586b04,0xf0586b04,0x80,0xffeec000,0x3)
cmn_err(0x3,0xf015f7e8,0xf015fc00,0x35,0x35,0xf0152400) + 1c
die(0x1,0xf0586c1c,0x25732564,0x164,0x3,0xf015f7e8) + 78
trap(0x1,0xf0586c1c,0xf0182520,0x164,0x3,0xfc34b9c0) + 454
fault(0xfc42ac50,0xfc,0x25732564,0xf015379c,0x200,0x2) + 84

Here I've used adb's call trace command ($c) in an attempt to find out the sequence of calls leading up to the panic. Unfortunately, we don't see anything useful because all we have on the active stack is the sequence of calls from the "fault" trap handler upwards. What we need is the stack pointer at the time of the trap itself and, fortunately, that is one of the registers saved when a panic occurs. In the previous section we saw that the 'rp' field of the panic message gives us the address of the saved registers. Don't worry if you don't happen to have the messages to hand, because it just so happens that the pointer to the saved registers is passed to the trap() and die() routines as the second argument. The arguments to functions (the values of %i0 through %i5) are shown in brackets, so we can see that the pointer to the saved registers is, in this case, 0xf0586c1c. So we take that value and run the "regs" adb macro against it:
0xf0586c1c$<regs
0xf0586c1c:     psr             pc              npc
                40000c6         25732564        25732568
0xf0586c28:     y               g1              g2              g3
                548dd520        10bf8484        8000000         ffffffff
0xf0586c38:     g4              g5              g6              g7
                0               0               1               fc44de00
0xf0586c48:     o0              o1              o2              o3
                fc42ac50        fc              25732564        f015379c
0xf0586c58:     o4              o5              o6              o7
                200             2               f0586c68        f006ea88

Here we can see the values of the registers at the time the trap occurred. The address of the caller is in %o7 and our frame pointer is in %o6. Let's take a look at who called us by looking at the instruction at the address in %o7:
f006ea88?ia
mutex_adaptive_enter+0x2c:      jmpl    %o2, %o7

Okay, so we can see that the mutex_adaptive_enter routine tried to jump to the address in %o2 and, if we look back at the saved registers above, we can see that the address in %o2 is the same as the one in %pc... the address we panicked on! In fact, if we have the source, we can see that in this case someone has corrupted the function pointer used by the mutex code to point to the lock enter routine.
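Even without the source, you can often work out where a register got its value by disassembling a little more of the routine. For example, asking adb for the first 0x10 (16) instructions of the routine shows the code leading up to the offending jump (the output is omitted here, since it depends on your kernel build):

mutex_adaptive_enter,10?ia

Looking for the load that set up %o2 before the jmpl will usually point you at the structure the bogus pointer came from.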
If necessary, we could trace back further by walking on through the stack because, when a window overflow occurs, the window contents are saved on the stack. This means that we can trace our calls through as many levels as there are saved windows on the stack. The format of a stack frame in SunOS looks like this:
  %fp->|                               |
       |-------------------------------|
       |  Locals, temps, saved floats  |
       |-------------------------------|
       |  outgoing parameters past 6   |
       |-------------------------------|-\
       |  6 words for callee to dump   | |
       |  register arguments           | |
       |-------------------------------|  > minimum stack frame
       |  One word struct-ret address  | |
       |-------------------------------| |
       |  16 words to save IN and      | |
  %sp->|  LOCAL registers on overflow  | |
       |-------------------------------|-/

This means that by dumping the first 16 words on the stack, we can extract the saved window's %o6 and %o7 to see the previous stack pointer and the address of the calling instruction respectively. Here's an example of this manual walk-through taking the above %o6 from the saved registers as a starting point:
f0586c68,10/X
0xf0586c68:     0               f0586ed0        f016574c        f0586eb8
                0               f00f4730        0               1
                fc4db30c        fc087415        fc42ac08        fc378c14
                fc4ef480        200             f0586e00        f00e7e24

Here we have the saved local and in registers from the previous window. The first two rows contain registers %l0 through %l7 and the last two rows contain %o0 through %o7. Let's take a peek at the instruction that called us by using the address in the saved %o7:
f00e7e24?ia
vn_rename+0x14c:        jmpl    %g1, %o7
vn_rename+0x150:

We can also take the value in the saved %o6, shown immediately to the left of the saved %o7 above, and repeat the procedure to look at the previous caller:
f0586e00,10/X
0xf0586e00:     0               9               fc68490c        20e
                fc36004c        14f             fc06ce7c        21c0001
                0               effff5a4        21c0001         fc42ac08
                20              effffaeb        f0586ed8        f0072054
f0072054/ia
syscall+0x3ec:          jmpl    %g1, %o7
syscall+0x3f0:
f0586ed8,10/X
0xf0586ed8:     1               0               f0156d28        0
                fc28b000        86              f0586fb4        fc6d0800
                f0160fe8        fc6d0cd4        0               fc6d0cb0
                fffffffc        ffffffff        f0586f58        f0041aa0
f0041aa0?ia
_sys_rtt+0x4d8:         call    syscall

This shows you how you can back-track through the stack to rebuild a call trace. Sometimes the memory at the address in %o6 cannot be dumped, or doesn't contain valid saved registers; this happens when the panic has occurred in such a way that the register windows could not be saved to the stack. Normally when we panic, we flush all the windows to the stack to avoid this problem.
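To summarise, the manual walk boils down to a short loop of adb commands. Here it is as a sketch with placeholder names rather than literal input; substitute the values from your own dump, where rp comes from the BAD TRAP line and each frame's saved %o6 and %o7 are the last two words of the previous dump:

rp$<regs        decode the registers saved at the time of the trap
o7?ia           disassemble the instruction that made the call
o6,10/X         dump the 16-word window save area for that frame

Repeat the last two steps until you run out of saved windows on the stack.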
The fact that register window frames get saved on the stack, allowing us to peek back at the local and input registers as well as the call trace, helps us to build up a picture of the logic flow. The hardest part is correlating the assembler dumps of the affected functions with the source code. This is often quite tricky because the kernel is compiled with optimisation enabled, so the sequence of events in the 'C' source code doesn't seem to relate to what you see when you look at the assembler for the function, especially when loop unfolding has done its evil work! To make matters worse, there are a great many macros in the source which further convolute matters. However, don't be put off. Have a go at figuring out the sequence of events, even if you don't have the source code, because the technique can be very useful for debugging your own device drivers, for which you will have the source.
More information on crash dump analysis can be found in a book called PANIC: UNIX System Crash Dump Analysis by my friend and colleague, Kimberley Brown.