Why Does an Abnormal .NET Thread Exit Crash the Program, and What Is the Underlying Cause?
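I: Background
1. The story
The day before yesterday I received a dump from a crashed .NET program. After some analysis, the root cause turned out to be a managed .NET thread (DBG=XXXX) that had exited abnormally, as shown below: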
0:011> !t
ThreadCount: 17
UnstartedThread: 0
BackgroundThread: 16
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
                                                                                                            Lock
DBG   ID  OSID  ThreadOBJ         State    GC Mode     GC Alloc Context                   Domain            Count   Apt  Exception
0     1   84d8  000001C0801EAC20  26020    Preemptive  0000000000000000:0000000000000000  000001c080266300  -00001  STA
3     2   9d78  000001C0801F8210  2b220    Preemptive  0000000000000000:0000000000000000  000001c080266300  -00001  MTA  (Finalizer)
4     4   8760  000001C08466C800  102b220  Preemptive  0000000000000000:0000000000000000  000001c080266300  -00001  MTA  (Threadpool Worker)
...
44    16  b2fc  000001C08F949450  102b220  Preemptive  0000000000000000:0000000000000000  000001c080266300  -00001  MTA  (GC) (Threadpool Worker)
46    15  9904  000001C08F9487B0  102b220  Preemptive  0000000000000000:0000000000000000  000001c080266300  -00001  MTA  (Threadpool Worker)
XXXX  3   a23c  000001C08F948E00  102b220  Preemptive  0000000000000000:0000000000000000  000001c080266300  -00001  Ukn  (Threadpool Worker)
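When a dump has dozens of threads, the XXXX row is easy to miss. As a rough aid, here is a minimal Python sketch that scans saved `!t` output and lists the rows whose DBG column is XXXX, which is how SOS marks a managed Thread object that it cannot match to a live debugger thread. The function name `find_dead_threads` and the `SAMPLE` text are hypothetical, made up for illustration; the sketch assumes whitespace-separated columns in the order shown above and is not part of any official tooling.

```python
# Hypothetical helper (not part of SOS or WinDbg): flag rows of saved `!t`
# output whose DBG column is "XXXX".
# Assumption: columns are whitespace-separated in the order
# DBG, ID, OSID, ThreadOBJ, State, ...

SAMPLE = """\
DBG ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 84d8 000001C0801EAC20 26020 Preemptive 0000000000000000:0000000000000000 000001c080266300 -00001 STA
XXXX 3 a23c 000001C08F948E00 102b220 Preemptive 0000000000000000:0000000000000000 000001c080266300 -00001 Ukn (Threadpool Worker)
"""


def find_dead_threads(threads_output):
    """Return one dict per row whose DBG column is 'XXXX'."""
    dead = []
    for line in threads_output.splitlines():
        fields = line.split()
        # A data row needs at least DBG, ID, OSID, ThreadOBJ and State.
        if len(fields) < 5 or fields[0] != "XXXX":
            continue
        dead.append({
            "managed_id": int(fields[1]),  # managed thread ID (decimal)
            "osid": fields[2],             # OS thread ID (hex string)
            "thread_obj": fields[3],       # address of the CLR Thread object
            "state": fields[4],            # thread state flags (hex string)
        })
    return dead


if __name__ == "__main__":
    for row in find_dead_threads(SAMPLE):
        print(row)
```

The `thread_obj` value is the one to carry into the rest of the analysis: in this dump it is 000001C08F948E00, the Thread object of the XXXX row.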
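Because the thread exited abnormally, the CLR knew nothing about it. When a GC was triggered, it still tried to scan this XXXX thread for reference roots. The thread no longer existed, so touching its memory naturally raised an access violation. This is clearly visible in the ScanStackRoots call stack, as shown below: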
0:011> .ecxr
rax=00007ffdbefcc8a0 rbx=000000a42007f5f0 rcx=000000a42187f688
rdx=0000000000000000 rsi=000000a42007ee60 rdi=000000a42007f100
rip=00007ffdbec36cbb rsp=000000a42007f828 rbp=000001c08f948e00
r8=000000a42007f910 r9=000001c08f948e00 r10=00000fffb7da5860
r11=0555501544555545 r12=ffffffffffffffff r13=0000000000000000
r14=0000000000000000 r15=00007ffdbec14fb0
iopl=0 nv up ei pl nz ac pe cy
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010211
coreclr!InlinedCallFrame::FrameHasActiveCall+0x13:
00007ffd`bec36cbb 483b01 cmp rax,qword ptr
