Thursday, November 20, 2008

Kernel Panic - Oh No!

Well, it appears that the newly upgraded V240 that I was so impressed with crashed last night. It came right back up and hasn't had any issues since, but the fact that it happened at all is disturbing. Only one user was connected and one job was running at the time, along with the nightly backups. If anyone out there is proficient at picking through crash dumps, here's some mdb output for you to enjoy:

# dumpadm
Dump content: kernel pages
Dump device: /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash/sunfire
Savecore enabled: yes
# cd /var/crash/sunfire/
# ls
bounds unix.0 vmcore.0
# mdb 0
Loading modules: [ unix genunix specfs dtrace zfs sd pcisch ip hook neti sctp arp usba fcp fctl qlc nca lofs mpt md cpc random crypto wrsmd fcip logindmux ptm ufs sppp nfs ]
> ::status
debugging crash dump vmcore.0 (64-bit) from sunfire
operating system: 5.10 Generic_137137-09 (sun4u)
panic message: BAD TRAP: type=31 rp=2a1009768e0 addr=0 mmu_fsr=0 occurred in module "unix" due to a NULL pointer dereference
dump content: kernel pages only
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     834712              6521   81%
Anon                        97092               758    9%
Exec and libs                3492                27    0%
Page cache                   3202                25    0%
Free (cachelist)             1543                12    0%
Free (freelist)             88943               694    9%

Total                     1028984              8038
Physical                  1025981              8015
> ::cpuinfo
 ID ADDR        FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD      PROC
  0 0000183bb88  1b    1    0 105   no    no    t-0 2a100977ca0 sched
  1 0000180c000  1d    1    0  41  yes    no    t-0 30014117080 sas.e9bb95
> panic_thread/J
panic_thread:
panic_thread: 2a100977ca0
> 2a100977ca0::findstack
stack pointer for thread 2a100977ca0: 2a100975d51
000002a100975e01 die+0x78()
000002a100975ee1 trap+0x9e0()
000002a100976031 ktl0+0x48()
000002a100976181 ip_wput_ioctl+0xc4()
000002a100976231 tcp_xmit_early_reset+0x6b8()
000002a100976341 tcp_xmit_listeners_reset+0x1f4()
000002a100976411 ip_tcp_input+0xaf8()
000002a1009764f1 ip_input+0xa70()
000002a100976661 putnext+0x218()
000002a100976711 ce_intr+0x764c()
000002a1009771e1 pci_intr_wrapper+0xb8()
000002a100977291 intr_thread+0x168()
> $r
%g0 = 0x0000000000000000 %l0 = 0x0000060016e5eef0
%g1 = 0x00000000000001c0 %l1 = 0x000000007be78638 ip_ire_delete
%g2 = 0x0000000000005316 %l2 = 0x000000007001ac00 ip_areq_template+0x24
%g3 = 0x000006001f298254 %l3 = 0x0000000000005000
%g4 = 0x000006001f2981f0 %l4 = 0x0000000000000006
%g5 = 0x000006001f2981f0 %l5 = 0x000000007be783a8 ip_ire_advise
%g6 = 0x0000000000000010 %l6 = 0x000000007001aca8 ip_ioctl_ftbl+0x30
%g7 = 0x000002a100977ca0 %l7 = 0x000006001f2981f0

%o0 = 0x0000000000000000 %i0 = 0x00000600421dfb00
%o1 = 0x000002a100977ca0 %i1 = 0x0000060016e01380
%o2 = 0x0000000000000001 %i2 = 0x0000060016dcd0c0
%o3 = 0x0000000000005316 %i3 = 0x0000000000000000
%o4 = 0x0000000000000000 %i4 = 0x0000060011003e48
%o5 = 0x0000000000000064 %i5 = 0x0000000000000000
%o6 = 0x000002a100976181 %i6 = 0x000002a100976231
%o7 = 0x000000007be6a120 ip_wput_ioctl+0xc4 %i7 = 0x000000007bed8b94 tcp_xmit_early_reset+0x6b8

%ccr = 0x44 xcc=nZvc icc=nZvc
%fprs = 0x00 fef=0 du=0 dl=0
%asi = 0x80
%y = 0x0000000000000000
%pc = 0x0000000001047824 mutex_enter+4
%npc = 0x0000000001047828 mutex_enter+8
%sp = 0x000002a100976181 unbiased=0x000002a100976980
%fp = 0x000002a100976231

%tick = 0x0000000000000000
%tba = 0x0000000000000000
%tt = 0x31
%tl = 0x0
%pil = 0x6
%pstate = 0x016 cle=0 tle=0 mm=TSO red=0 pef=1 am=0 priv=1 ie=1 ag=0

%cwp = 0x04 %cansave = 0x00
%canrestore = 0x00 %otherwin = 0x00
%wstate = 0x00 %cleanwin = 0x00
> 2a100977ca0::thread -p
ADDR PROC LWP CRED
000002a100977ca0 1839750 60015eee058 60011003e48
> 1839750::ptree
0000000001839750 sched
0000060013401848 fsflush
0000060013402468 pageout
0000060013403088 init
000006001bd804b8 bpbkar
00000600183879b0 bpbkar
0000030015371ab8 bpbkar
00000300228a5238 bpbkar
0000030035cae210 bpbkar
0000030035ee12a8 bpbkar
0000030034cf8668 bpbkar
000006002dfce180 bpbkar
000003001e6b4e58 bpbkar
000006001b98e4a8 java
000006001bd7e058 dtlogin
000006001b8910c0 fmd
0000060019aeec48 snmpXdmid
000006001b98d888 dmispd
00000600145bf850 vold
000006001b7d0038 snmpdx
000006001b98c048 sendmail
000006001b9f50d0 snmpd
000006001b98f0c8 sendmail
00000600147f3098 syslogd
000006001b8904a0 sshd
000006001b7d1878 automountd
000006001aa9c030 automountd
000006001993b860 smcboot
000006001993a020 smcboot
000006001aa9e490 smcboot
000006001b7d30b8 utmpd
0000060019aee028 inetd
0000030031aa5270 in.telnetd
0000030034f212b0 ksh
0000030026e2bab0 sas.e9bb95
000006001667cda8 elssrv
000006001b9f44b0 in.telnetd
0000030027c7a6c8 ksh
00000600291e0220 sas.e9bb95
000006001b5460e8 elssrv
000006001b98cc68 in.telnetd
000006001b7d2498 ksh
0000060016667990 sas.e9bb95
0000060015f00db0 elssrv
000003002b6a46a8 in.telnetd
> $c
mutex_enter+4(600421dfb00, 60016e01380, 60016dcd0c0, 0, 60011003e48, 0)
tcp_xmit_early_reset+0x6b8(7be25368, 0, 6001f2981f0, 10, 0, 0)
tcp_xmit_listeners_reset+0x1f4(6001c73da80, 14, 0, 60013130000, 60033df1d40, b88c608d)
ip_tcp_input+0xaf8(18, 60015f1ee10, 30000d98068, 60033df1d40, 0, 30000d98068)
ip_input+0xa70(60015f1ee10, 0, 0, 30000d98068, 0, 0)
putnext+0x218(600143b6ed0, 600143b6ce0, 6001c73da80, 100, 600143b6a50, 0)
ce_intr+0x764c(1069128, 0, 6001c73da80, 11999b8, 600143b6a50, 600141eb700)
pci_intr_wrapper+0xb8(60014b12420, 300000b8148, 0, 0, 60014bd9548, 0)
intr_thread+0x168(ffffffff75702bdc, ffffffff7a9263a4, 4, 0, 0, 3)
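
For anyone reading along, here is a rough Python sketch that cross-checks two sets of figures from the mdb session above: the ::memstat page counts against their MB column, and the biased %sp against the "unbiased" value printed by $r. It rests on two assumptions that the output itself does not state: an 8 KiB base page size (usual on sun4u) and the 64-bit SPARC V9 stack bias of 2047 (0x7ff). The function names are made up for illustration and are not part of mdb.

```python
# Cross-check figures copied from the mdb session above.
# ASSUMPTIONS (not stated in the dump itself):
#   - base page size is 8 KiB, as is usual on sun4u
#   - the 64-bit SPARC V9 stack bias is 2047 (0x7ff)
PAGE_SIZE = 8192
STACK_BIAS = 0x7FF

# (pages, MB) pairs exactly as printed by ::memstat
MEMSTAT = {
    "Kernel": (834712, 6521),
    "Anon": (97092, 758),
    "Exec and libs": (3492, 27),
    "Page cache": (3202, 25),
    "Free (cachelist)": (1543, 12),
    "Free (freelist)": (88943, 694),
}
TOTAL_PAGES = 1028984


def pages_to_mb(pages, page_size=PAGE_SIZE):
    """Convert a page count to whole megabytes, truncating the remainder."""
    return pages * page_size // (1024 * 1024)


def memstat_mismatches(rows=MEMSTAT):
    """Return the categories whose MB column disagrees with pages * page size."""
    return [name for name, (pages, mb) in rows.items() if pages_to_mb(pages) != mb]


def unbias_sp(sp, bias=STACK_BIAS):
    """Add the stack bias to a biased 64-bit SPARC stack pointer."""
    return sp + bias


if __name__ == "__main__":
    print("MB mismatches:", memstat_mismatches())
    print("category pages sum to Total:",
          sum(p for p, _ in MEMSTAT.values()) == TOTAL_PAGES)
    print("unbiased %sp:", hex(unbias_sp(0x2A100976181)))
```

If both assumptions hold, every MB figure is just pages divided by 128, and the unbiased stack pointer matches what $r printed. The check only confirms that the dump is internally consistent; it says nothing about why the kernel takes up 81% of memory or what caused the NULL pointer dereference in mutex_enter.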
