
A2115/2020 / Processors from a 3.1GHz 6-core i5 up to a 3.8GHz 8-core i7. Released August 4, 2020.

Kernel panic when CPU is hot for long time

Hi everyone!

TLDR: T2 / PCH related kernel panic if the CPU is hot for a long time; it boots only after I let it cool down. No kernel panic during normal/light use. Possibly a component on the motherboard with a bad connection? Is there a possible fix if I open it up?

EDIT: More tests in the comments

Long version:

I obtained a faulty 2020 iMac 5K with an i7-10700K and 5500XT to use as the base for a DIY 5K project. The ad said that the GPU is faulty: it randomly restarts, but there is no problem during normal use.

The screen is the most important part for me (only a slight pink hue around the edges, no problem), so I did not really care about the fault, but what I discovered made me more interested in fixing it.

I benchmarked the GPU using Heaven Benchmark for 1-2 hours at max fan speed; the GPU was at 80-90 degrees and it did not restart.

Then I benchmarked the CPU using Cinebench. It survived the 10-minute single-core test but crashed 2-3 seconds after starting multi-core. Later, when it had cooled down, I ran the multi-core test again and it lasted a lot longer, but not the full 10 minutes.

When it restarts, it sometimes crashes on boot, but mostly it gets to the login screen and can stay there for hours. After I enter my password and it starts loading everything, it crashes, and it keeps doing so until I let it cool down so that it has some thermal headroom or something. Macs Fan Control turns up the fan speed immediately after login, but still not early enough. I also turned off Intel Turbo Boost to reduce heat generation.

The kernel panic logs (when present) show T2 / PCH / SEP related crashes (BAD MAGIC, x86 global reset detected - CORE 0 is the one that panicked / void AppleEmbeddedPCIeUpLinkMgmt::_linkInterruptAction(IOInterruptEventSource *, int): A link timeout has been seen after 650000 microseconds and 49999 iterations - CORE 0 is the one that panicked).
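To see which of these messages shows up most often across several crashes, the saved panic texts could be tallied with something like the rough sketch below. The signature strings are copied from the panic text quoted above; the function name `count_signatures` and the idea of passing each report in as a plain string are assumptions made for illustration, not part of any Apple tool.

```python
from collections import Counter

# Substrings taken from the panic messages quoted in the post.
SIGNATURES = [
    "BAD MAGIC",
    "x86 global reset detected",
    "AppleEmbeddedPCIeUpLinkMgmt",
    "A link timeout has been seen",
]


def count_signatures(reports):
    """Count how many reports contain each signature (case-insensitive).

    `reports` is an iterable of strings, one per saved panic report.
    A report is counted at most once per signature.
    """
    counts = Counter()
    for text in reports:
        lowered = text.lower()
        for signature in SIGNATURES:
            if signature.lower() in lowered:
                counts[signature] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample input built from the strings quoted above.
    sample = [
        "BAD MAGIC ... x86 global reset detected",
        "A link timeout has been seen after 650000 microseconds",
    ]
    for signature, hits in count_signatures(sample).items():
        print(f"{hits}x  {signature}")
```

If one signature dominates the hot crashes and never appears otherwise, that would at least narrow down which link is timing out.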

But the weird thing is that I have been using this machine every day for basic tasks: logging in, sleeping, password auth, and everything seems to work as usual. I guess normal tasks use the T2 as well, but maybe it does not heat up that much?

What I'm planning to do in the coming weeks is open it up, check the motherboard for visible defects, and get an LGA1200 PC motherboard to test whether the CPU is okay.

This whole issue seems to happen only when the CPU is over 75-80 degrees for a longer period of time, when the nearby components have also heated up. I suspect a faulty connection somewhere that stops making proper contact when hot. Maybe the T2 chip's connection is bad or something?

What do you think would be the best steps to troubleshoot this issue? Is there a tool that stress tests only the T2 chip and not the CPU? Maybe a feature in macOS that really stresses it?
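The "hot for a longer period" pattern could be checked against recorded temperatures with a small script. The sketch below is only an illustration: it assumes the readings are already available as `(seconds, temp_c)` pairs sorted by time, and both the 75-degree threshold and the 5-minute minimum are guesses based on the description above, not measured limits.

```python
def hot_stretches(samples, threshold_c=75.0, min_seconds=300):
    """Find periods where the temperature stayed at or above threshold_c.

    `samples` is a list of (seconds, temp_c) tuples sorted by time.
    Returns a list of (start_seconds, end_seconds) for every stretch
    that lasted at least `min_seconds`.
    """
    stretches = []
    start = None  # time of the first hot sample in the current stretch
    last = None   # time of the most recent hot sample
    for seconds, temp_c in samples:
        if temp_c >= threshold_c:
            if start is None:
                start = seconds
            last = seconds
        else:
            if start is not None and last - start >= min_seconds:
                stretches.append((start, last))
            start = None
    # Close a stretch that runs to the end of the data.
    if start is not None and last - start >= min_seconds:
        stretches.append((start, last))
    return stretches
```

Comparing the end of each stretch with the time of a crash would show whether the panics really follow sustained heat or also happen after short spikes.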

Thank you in advance!


Score: 0

2 Answers

Most Helpful Answer

I would try replacing the thermal pads or paste... seems like thermal throttling.


Score: 1

19 comments:

Will definitely try, but the previous owner said it has already been replaced on the CPU/GPU. During a thermal throttle, the CPU/GPU would reduce performance, not kernel panic, no?

Maybe the T2 has a thermal pad/paste as well? I suspect that the T2 overheats or something and that is the reason for the crash. That is why I want to try stressing only the T2, not the CPU, to validate this theory.

Could also be a GPU failure

You can also try Apple Diagnostics. Just turn off your Mac, turn it back on and immediately press and hold the D key until you see a language selection or progress bar.

I tested the GPU under full load for 1-2 hours with GPU temps above 80 degrees and it did not restart. I once tried Apple Diagnostics after many reboots and it also crashed during the test, so I did not get any result code. I will try taking the computer outside and running the test in 10-degree ambient temps.

Just ran a diagnostic test outside, no issues were found


Have you installed a good thermal monitoring app which also allows you to boost the fan's RPM? I personally like TG-Pro: it will allow you to see what's getting too hot, and you can boost the fan's RPM so you don't cook things. I also like it because it can create a log (CSV file) tracking the temps, so you can see what was happening when the error pops up.

I would also make sure the fan blades and the heatsink fin area are fully clean of dust and debris.
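Once a temperature log exists as a CSV file, the readings from the last few minutes before a crash can be pulled out with a short script. This is a minimal sketch under stated assumptions: the column name `timestamp`, the ISO-style time format, and one column per sensor are hypothetical, since the actual layout of the exported log is not described here and would need to be adjusted to match.

```python
import csv
import io
from datetime import datetime, timedelta


def readings_before(csv_text, crash_time, window_minutes=5,
                    time_column="timestamp"):
    """Return the log rows recorded in the window leading up to crash_time.

    Assumes (hypothetically) that `time_column` holds ISO-style
    timestamps such as "2024-01-01 20:05:00".
    """
    window_start = crash_time - timedelta(minutes=window_minutes)
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        stamp = datetime.fromisoformat(row[time_column])
        if window_start <= stamp <= crash_time:
            rows.append(row)
    return rows


def hottest_sensors(rows, time_column="timestamp"):
    """Return (sensor, peak_temp) pairs, hottest first.

    Columns whose values are not numeric are skipped.
    """
    peaks = {}
    for row in rows:
        for name, value in row.items():
            if name == time_column:
                continue
            try:
                temp = float(value)
            except (TypeError, ValueError):
                continue
            peaks[name] = max(temp, peaks.get(name, temp))
    return sorted(peaks.items(), key=lambda item: item[1], reverse=True)
```

Running `hottest_sensors(readings_before(text, crash_time))` for several crashes and comparing the results could show whether the same sensor is always near its peak right before the panic, even when the CPU and GPU readings look normal.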


Score: 1

14 comments:

I performed many tests this evening; the results are documented as comments under Amazing FiXeR’s answer. I use Macs Fan Control to check the temps and set the fan speed.

The previous owner said that it has been cleaned in a tech shop, but when I have time I will open it up, perform a visual inspection, and maybe replace the thermal paste. According to my tests, though, the issue is not with the CPU, GPU, or RAM.

The temps are normal, or at least the ones visible in the app are. Under heavy load the CPU can reach 100 degrees, but it throttles down to 90-95 as usual, and the test keeps going for 20 or more minutes (outside, with ambient temps below 10 Celsius) if the GPU test is not running. The GPU test can go for hours, even inside.

During normal use (Safari, code editing, document editing, chatting) it does not heat up. I also set the fan to speed up when the CPU temp reaches 55 degrees, but I suspect there is another component, either on the motherboard or the PSU itself, that heats up and causes the crash.

@scania471 - yes, I think you're right, this is a deeper logic board fault. There are six VRM modules, if I remember, which regulate the power to the CPU in this series and can overheat, as they sit quite close to the CPU. It could be as simple as a cold solder joint on one of these and their support components.

@danj I just opened the iFixit teardown of this iMac this afternoon and saw a comment about these VRM modules. I have been thinking about them for hours, and about the fact that the i5 version has fewer of these modules than the i7/i9 versions, but this machine came with an i7 from the factory, so no problem there.

If one of these modules is bad, that could explain the crashes in a warm environment and the machine making it past 20 minutes in cold weather. But I can't find an explanation for it crashing within 2-3 minutes when I start both the CPU and GPU tests, even when not fully loaded (like 70 degrees max). I will definitely check those modules visually when I get to open this thing up.

@danj I took it apart, checked the VRM modules and this is the only weird thing I could find: https://imgur.com/a/HrKm5N0

It does not seem to be cracked, just a scratch or similar. I tried to poke every component on the motherboard that is this size but none of them moved. If one side is not connecting perfectly, it should move at least a little bit, right?

@scania471 - Sorry, I don't see the crack. Are you speaking about the darker mold seam line on the inductor? They look OK from what I can see. The view of the VRM chips is blocked; can you take a picture straight down, nice and tight, like this one?

As far as a cold solder joint goes, that doesn't mean it's physically loose. I was thinking of the VRM chip itself, or maybe the capacitors or resistors around them.
