can someone please let me know what this crash means?


Jason Welsh
Just had my server drop into the debugger this morning after running for 78 days,
and here's what the trace said:

OPTIONS: History Buffer, F-key Macros, Search History Buffer, I18n
Compiled on Dec  4 2004, 19:14:15.

Press CTRL-A Z for help on special keys
                                                                     
                                                                     
db> AT S7=45 S0=0 L1 V1 X4 &c1 E1 Q0                                  
No such command                                                      
db> trace                                                            
Tracing pid 608 tid 100103 td 0xc2688480
kdb_enter(c07ec00d) at kdb_enter+0x2b
panic(c07e8435,c07eb5f7,c07eb390,366,c299b434) at panic+0x127
mtx_destroy(c812457c,0) at mtx_destroy+0x5c
in_pcbdetach(c81244ec,c616e6f0,c616e6f0,14,e830db44) at in_pcbdetach+0x1b4
tcp_close(c616e6f0,1,0,14,c616e6f0) at tcp_close+0x94
tcp_input(c3e3b200,14,6da8d318,0,0) at tcp_input+0x16df
ip_input(c3e3b200) at ip_input+0x50d
div_output(c2728288,c3e3b200,c263b620,0,e830dc0c) at div_output+0x1f7
div_send(c2728288,0,c3e3b200,c263b620,0) at div_send+0x3f
sosend(c2728288,c263b620,e830dc40,c3e3b200,0) at sosend+0x5e7
kern_sendit(c2688480,3,e830dcbc,0,0) at kern_sendit+0x104
sendit(c2688480,3,e830dcbc,0,bfbeed98) at sendit+0x161
sendto(c2688480,e830dd04,6,6888be,292) at sendto+0x4d
syscall(2f,bfbf002f,e830002f,1,28) at syscall+0x227
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (133, FreeBSD ELF32, sendto), eip = 0x280d31cf, esp = 0xbfbeecdc, ebp = 0xbfbfed88 ---
db>

monsterjam jason $ uname -a
FreeBSD monsterjam.org 5.4-STABLE FreeBSD 5.4-STABLE #2: Fri Aug 26 15:15:59 EDT 2005 monsterjam.org:/usr/src/sys/i386/compile/FREEBIE  i386


thanks/regards,
Jason

_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[hidden email]"

Re: can someone please let me know what this crash means?

Robert N. M. Watson-2

On Sun, 13 Nov 2005, Jason wrote:

> Just had my server drop into debugger this morning after running for 78 days..
> and heres what the trace said..
>
> OPTIONS: History Buffer, F-key Macros, Search History Buffer, I18n
> Compiled on Dec  4 2004, 19:14:15.
>
> Press CTRL-A Z for help on special keys

If you have a scroll-back it would be helpful to see the output
immediately above this -- specifically, the panic message.

Also, is the date of the kernel (Aug 26) approximately synchronized with
the source code it's built from, or is it earlier source?

If not, could you tell me the revisions of netinet/{tcp*,in_pcb}.c?

Thanks,

Robert N M Watson


Re: can someone please let me know what this crash means?

Jason Welsh
> If you have a scroll-back it would be helpful to see the output
> immediately above this -- specifically, the panic message.

I did not have the terminal connected at the time of the panic, so unfortunately I don't have it.

>
> Also, is the date of the kernel (aug 26) approximately synchronized with
> the source code it's built from, or is it earlier source?
Yes, I believe that's the source code my kernel was built from.

>
> If not, could you tell me the revisions of netinet/{tcp*,in_pcb}.c?

jason@monsterjam netinet $ pwd
/usr/src/sys/netinet
jason@monsterjam netinet $ grep FreeBSD tcp*.c
tcp_debug.c: * $FreeBSD: src/sys/netinet/tcp_debug.c,v 1.25.2.1 2005/01/31 23:26:36 imp Exp $
tcp_hostcache.c: * $FreeBSD: src/sys/netinet/tcp_hostcache.c,v 1.7.2.2 2005/02/13 18:18:33 rwatson Exp $
tcp_input.c: * $FreeBSD: src/sys/netinet/tcp_input.c,v 1.252.2.21 2005/07/05 19:25:42 ps Exp $
tcp_output.c: * $FreeBSD: src/sys/netinet/tcp_output.c,v 1.100.2.7 2005/05/04 13:59:26 andre Exp $
tcp_sack.c: * $FreeBSD: src/sys/netinet/tcp_sack.c,v 1.3.2.9 2005/04/19 18:37:26 ps Exp $
tcp_subr.c: * $FreeBSD: src/sys/netinet/tcp_subr.c,v 1.201.2.21 2005/06/14 11:59:46 rwatson Exp $
tcp_syncache.c: * This software was developed for the FreeBSD Project by Jonathan Lemon
tcp_syncache.c: * $FreeBSD: src/sys/netinet/tcp_syncache.c,v 1.66.2.2 2005/02/12 16:02:59 rwatson Exp $
tcp_timer.c: * $FreeBSD: src/sys/netinet/tcp_timer.c,v 1.66.2.6 2005/01/31 23:26:37 imp Exp $
tcp_usrreq.c: * $FreeBSD: src/sys/netinet/tcp_usrreq.c,v 1.107.2.6 2005/06/14 12:01:03 rwatson Exp $
jason@monsterjam netinet $ grep FreeBSD in_pcb.c
 * $FreeBSD: src/sys/netinet/in_pcb.c,v 1.153.2.10 2005/06/14 11:57:06 rwatson Exp $
jason@monsterjam netinet $

thank you.

Jason

--
================================================
|    Jason Welsh   [hidden email]        |
| http://monsterjam.org    DSS PGP: 0x5E30CC98 |
|    gpg key: http://monsterjam.org/gpg/       |
================================================

Re: can someone please let me know what this crash means?

Robert N. M. Watson-2

On Sun, 13 Nov 2005, Jason wrote:

>> If not, could you tell me the revisions of netinet/{tcp*,in_pcb}.c?
>
> jason@monsterjam netinet $ pwd
> /usr/src/sys/netinet
> jason@monsterjam netinet $ grep FreeBSD tcp*.c
> tcp_debug.c: * $FreeBSD: src/sys/netinet/tcp_debug.c,v 1.25.2.1 2005/01/31 23:26:36 imp Exp $
> tcp_hostcache.c: * $FreeBSD: src/sys/netinet/tcp_hostcache.c,v 1.7.2.2 2005/02/13 18:18:33 rwatson Exp $
> tcp_input.c: * $FreeBSD: src/sys/netinet/tcp_input.c,v 1.252.2.21 2005/07/05 19:25:42 ps Exp $
> tcp_output.c: * $FreeBSD: src/sys/netinet/tcp_output.c,v 1.100.2.7 2005/05/04 13:59:26 andre Exp $
> tcp_sack.c: * $FreeBSD: src/sys/netinet/tcp_sack.c,v 1.3.2.9 2005/04/19 18:37:26 ps Exp $
> tcp_subr.c: * $FreeBSD: src/sys/netinet/tcp_subr.c,v 1.201.2.21 2005/06/14 11:59:46 rwatson Exp $
> tcp_syncache.c: * This software was developed for the FreeBSD Project by Jonathan Lemon
> tcp_syncache.c: * $FreeBSD: src/sys/netinet/tcp_syncache.c,v 1.66.2.2 2005/02/12 16:02:59 rwatson Exp $
> tcp_timer.c: * $FreeBSD: src/sys/netinet/tcp_timer.c,v 1.66.2.6 2005/01/31 23:26:37 imp Exp $
> tcp_usrreq.c: * $FreeBSD: src/sys/netinet/tcp_usrreq.c,v 1.107.2.6 2005/06/14 12:01:03 rwatson Exp $
> jason@monsterjam netinet $ grep FreeBSD in_pcb.c
> * $FreeBSD: src/sys/netinet/in_pcb.c,v 1.153.2.10 2005/06/14 11:57:06 rwatson Exp $
> jason@monsterjam netinet $

Do you use IPv6 on this box, and in particular, TCP over IPv6?  Do you use
tcpdrop(8) to kill TCP connections?

There appears to be one bug fix since the revision you're using relating
to tcpdrop(8) on TCP connections in the TIMEWAIT state that might result
in a panic like the one you're seeing.  The panic and trace look familiar,
but without the panic message it's hard to confirm.  If the machine has
not already been reset, you might try showing the message buffer in DDB,
which would have the effect of printing out the ring buffer that likely
contains the panic message.

There have also been a number of fixes relating to raw sockets which may
also apply to ipdivert-related configurations, but I'm not sure they could
lead to this particular panic easily.  This strikes me as a pcb/tcp race of
some sort.

Thanks,

Robert N M Watson


Re: can someone please let me know what this crash means?

Jason Welsh
> Do you use IPv6 on this box, and in particular, TCP over IPv6?  Do you use
> tcpdrop(8) to kill TCP connections?
No IPv6 at all, and no, I haven't used tcpdrop that I know of.

>
> There appears to be one bug fix since the revision you're using relating
> to tcpdrop(8) on TCP connections in the TIMEWAIT state that might result
> in a panic like the one you're seeing.  The panic and trace look familiar,
> but without the panic message it's hard to confirm.  If the machine has
> not already been reset, you might try showing the message buffer in DDB,
> which would have the effect of printing out the ring buffer that likely
> contains the panic message.
Sorry, I had to reboot already. ;)
>
> There have also been a number of fixes relating to raw sockets which may
> also apply to ipdivert related configurations, but I'm not sure they could
> lead to this particular panic easily.  This strikes me as a pcb/tcp race of
> some sort.

I am running ipfw on this box and do have
$fwcmd add divert natd all from any to any via fxp0


Hmm, I guess it's time to upgrade to 6.0?

Jason


Re: can someone please let me know what this crash means?

Robert N. M. Watson-2

On Sun, 13 Nov 2005, Jason wrote:

>> There have also been a number of fixes relating to raw sockets which
>> may also apply to ipdivert related configurations, but I'm not sure
>> they could lead to this particular panic easily.  This strikes me as a
>> pcb/tcp race of some sort.
>
> I am running ipfw on this box and do have $fwcmd add divert natd all
> from any to any via fxp0
>
> hmm, I guess its time to upgrade to 6.0?

While it looks like a familiar stack trace and I've fixed bugs that sound
a lot like this, I'm not entirely convinced that this specific bug has been
fixed.  Unfortunately, I'm not sure how easily we can debug it without
more information.  I spent a bit of time this evening reviewing all the
diffs between the revisions you're running and current revisions, and
other than IPv6-related and tcpdrop-related changes, I don't see anything
obvious.  I'll spend some more time looking at the stack trace tonight.
Updating to 6.x probably is a good idea, as there are some bugs fixed in
6.x that cannot easily be fixed in 5.x, but I don't promise it will fix
this particular problem.  On the other hand, it apparently took months to
trigger and has not been seen by anyone else, so the chances are low it
will recur before we do find and fix it :-).  I'll do some more reading
over the next few days and see if I see anything.

What's interesting about the ipdivert input path is that it generates
parallelism in the IP input code, which is actually somewhat unusual
unless running with net.isr.direct=1, so if a bug is hiding somewhere
here, that's probably why it's not been triggered by anyone else.

Thanks for the report -- it might not hurt to file a PR with all the
details you have (including the file revisions) and drop me the PR number
so I can grab it and make sure it doesn't fall off my todo list.

Thanks again!

Robert N M Watson

Re: can someone please let me know what this crash means?

Robert N. M. Watson-2

On Sun, 13 Nov 2005, Jason wrote:

> I am running ipfw on this box and do have $fwcmd add divert natd all
> from any to any via fxp0
>
> hmm, I guess its time to upgrade to 6.0?

The attached untested patch will most likely prevent the bug from
recurring by eliminating parallelism between the ip_input() call from the
divert socket and other ip_input() processing in the netisr, as it defers
that processing to the netisr.  However, it won't fix the underlying bug,
which I'll keep looking for, and needs to be fixed in order to support
net.isr.direct and various other future plans for network stack behavior.
I'll see if I can dig someone up to test ipdivert changes, since I'm not
set up to test them here easily currently.

Thanks,

Robert N M Watson

Index: ip_divert.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_divert.c,v
retrieving revision 1.113
diff -u -r1.113 ip_divert.c
--- ip_divert.c 13 May 2005 11:44:37 -0000 1.113
+++ ip_divert.c 13 Nov 2005 19:27:32 -0000
@@ -61,6 +61,7 @@
  #include <vm/uma.h>

  #include <net/if.h>
+#include <net/netisr.h>
  #include <net/route.h>

  #include <netinet/in.h>
@@ -378,7 +379,7 @@
  SOCK_UNLOCK(so);
  #endif
  /* Send packet to input processing */
- ip_input(m);
+ netisr_queue(NETISR_IP, m);
  }

  return error;
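
To illustrate the idea behind the patch above, here is a minimal user-space
sketch in C.  It is not kernel code: toy_ip_input, toy_divert_send_direct,
toy_divert_send_queued, toy_queue_drain, and the fixed-size struct
input_queue are all hypothetical stand-ins invented for this example, and
the real netisr machinery does considerably more.  The sketch only shows
the difference between running input processing on the sender's own stack
(as the direct ip_input(m) call does) and handing the packet to a queue
that a single drain loop services later (the role that
netisr_queue(NETISR_IP, m) plays in the patch).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; none of these are kernel APIs. */
#define QUEUE_CAP 16

struct packet {
    int id;
};

/* Fixed-size ring buffer playing the role of the deferred input queue. */
struct input_queue {
    struct packet *slots[QUEUE_CAP];
    size_t head;   /* next slot to dequeue */
    size_t tail;   /* next slot to fill */
    size_t len;    /* packets currently queued */
};

static int in_sender_context;     /* nonzero while a "send" call is on the stack */
static int processed_in_sender;   /* packets handled on the sender's stack */
static int processed_in_drain;    /* packets handled by the drain loop */

/* Hypothetical input handler: records which context it ran in. */
static void toy_ip_input(struct packet *p)
{
    assert(p != NULL);
    if (in_sender_context)
        processed_in_sender++;
    else
        processed_in_drain++;
}

/* Direct dispatch: input processing runs on the caller's stack. */
static void toy_divert_send_direct(struct packet *p)
{
    in_sender_context = 1;
    toy_ip_input(p);
    in_sender_context = 0;
}

/* Deferred dispatch: the caller only enqueues.  Returns 0, or -1 if full. */
static int toy_divert_send_queued(struct input_queue *q, struct packet *p)
{
    int rc = -1;

    in_sender_context = 1;
    if (q->len < QUEUE_CAP) {
        q->slots[q->tail] = p;
        q->tail = (q->tail + 1) % QUEUE_CAP;
        q->len++;
        rc = 0;
    }
    in_sender_context = 0;
    return rc;
}

/* Single drain loop: the only place queued packets reach the handler. */
static int toy_queue_drain(struct input_queue *q)
{
    int handled = 0;

    while (q->len > 0) {
        struct packet *p = q->slots[q->head];

        q->head = (q->head + 1) % QUEUE_CAP;
        q->len--;
        toy_ip_input(p);
        handled++;
    }
    return handled;
}
```

With the direct variant, every caller runs the input handler itself, so in
a threaded setting several callers could be inside it at once.  With the
queued variant, the sender only enqueues, and all input processing happens
in the one place that drains the queue.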