Linux on a Lenovo ThinkCentre M920x Tiny

Blinky90

New Member
Feb 19, 2021
10
1
3
Hello there.

I have been trying to install Linux on my ThinkCentre M920x Tiny for a few days now.
During my Google research I came across a post on this forum and therefore decided to register here.

So what's the problem:
When I try to install Linux on the system (I tried Ubuntu 20.04, Ubuntu 21.10, Debian, Debian with non-free firmware and Arch Linux), I encounter a kernel error.

The error looks like this:
mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 0:
and it occurs every time, at random points during the setup, usually before the installation starts.

It looks like the kernel is somehow incompatible with the hardware of the M920x Tiny.

During my research I saw a few posts suggesting different causes. A memory error seems unlikely: a 24-hour memtest showed no errors.
A temperature issue also seems unlikely, since the CPU hardly goes over 60 °C.

I already did the usual steps, like booting in Legacy mode instead of UEFI and deactivating Secure Boot.
Sadly I have run out of options, but it seems the author of the post mentioned above was also able to set up Linux on that damn thing.

Does anyone have some advice for me?

Attached is a "screenshot" of the error message under Ubuntu.
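
For anyone collecting the same evidence, here is a minimal sketch of pulling the CPU and bank numbers out of such log lines. The helper name mce_summary is made up for illustration, and it assumes the lines follow the "mce: [Hardware Error]: CPU n: Machine Check: n Bank n:" shape quoted above.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints one
# "cpu=<n> bank=<n>" line for every machine-check record it finds.
# Lines that do not match the pattern are dropped.
mce_summary() {
    sed -n 's/.*mce: \[Hardware Error\]: CPU \([0-9][0-9]*\): Machine Check[^:]*: [0-9][0-9]* Bank \([0-9][0-9]*\).*/cpu=\1 bank=\2/p'
}
```

Something like `sudo dmesg | mce_summary | sort | uniq -c` would then show whether the errors always hit the same CPU and bank or move around.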
 

gb00s

Active Member
Jul 25, 2018
411
140
43
Malta
I have Debian 10 Testing running perfectly fine on an M920 (Tiny).

Check the CPU. We have these at work as 'desktop horses' for our people. They are OK with Win10, but ... 2 of 20 died: CPU done. Both gave the same error. In a 3rd it was a memory stick, also without any indication from memory testing; we changed the RAM module and all was fine. A 4th, which showed the same error, has kept working since we shut off Turbo in the BIOS. A 5th still runs with just 1x 8GB module as it doesn't recognize the 2nd module.

We had a 25% failure rate ... well ... :rolleyes:
 

Blinky90

Thanks for that reply!
Sadly the machine ran fine with the preinstalled Win10.

But I would like to check the hardware anyway. How do you check the CPU? To make a warranty claim I need something reproducible :)
 

gb00s

Oh, on Win10 it ran fine?

Have you considered playing with various BIOS settings, like Turbo and C-states, or some of the Intel 'power' optimizations? These tiny machines are heavily optimized for power consumption under Win10. Maybe some Intel power-management feature tuned for Win10 just fails on Linux.
"How do you check the CPU?" ...
I always run Prime95 ... A broken CPU should fail it. Just set the BIOS to Max Thermal Performance.
 

cageek

Member
Jun 22, 2018
33
26
18
If you want to try a bleeding-edge kernel, you might try something like the Fedora nightly compose finder.
I only know the Fedora one, but there's probably an equivalent for other distros.

I had a similar problem with a box that wasn't compatible with kernels from about 5.6 to 5.8. The workaround was to install a much older version (Fedora 30) and then upgrade to the current version (Fedora 32). This process downloaded the most recent version of the kernel and skipped over the problem kernels in the baseline distributions.

That is, if it's just a kernel problem and not a machine problem.
 

Blinky90

@gb00s Yeah, I didn't run Windows for long, but while it ran, it ran fine.
I am going to play around with the BIOS settings tomorrow and see if I can find a good combination.

I already disabled the "fancy" C-states. I found a bug in the kernel bug tracker saying that high C-states are a problem on some CPUs.
But it should be fixed as of kernel 5.8.

@cageek The Arch I used already had kernel 5.10 built in (the Valentine's build by Linus :p) and it also failed.
What I can try is to download an old version of Ubuntu, maybe 16.04, and try that one.

If all that fails I am going to set up Windows and let Prime95 run for a while.
Thanks for all the hints.
 

Blinky90

Sorry for double posting, I just want to update this with my latest findings.
So I fiddled around with my BIOS settings as suggested by @gb00s.
To test I used an Arch Linux live system with Linux kernel 5.10.

I turned several settings on and off in various combinations and found out that the error no longer occurs when I disable Multi-Core-CPU in the BIOS.

With it disabled I stressed the CPU in my live system and it ran stable.
The Multi-Core-CPU setting (not to be confused with hyperthreading) is the one and only setting that causes those "mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 0" errors. Even stressing the CPU over a long period of time didn't affect the stability of the system, so I guess it's not a thermal issue.

So finally I could narrow it down. The question now is how I can get rid of it.
My Linux now only runs on one of the CPU cores, which is pretty "slow" ^^
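
To confirm how many CPUs the kernel actually brought up after toggling the BIOS setting, the list in /sys/devices/system/cpu/online can be counted. A rough sketch follows; count_cpus is a made-up helper name, and it only assumes the usual sysfs list format of comma-separated numbers and ranges.

```shell
# Hypothetical helper: takes a sysfs CPU list such as "0-3,8-11" as its
# first argument and prints how many CPUs that list names.
count_cpus() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,
    for part in $list; do
        case $part in
            # A range like 0-15 contributes last - first + 1 CPUs.
            *-*) first=${part%-*}; last=${part#*-}
                 total=$((total + last - first + 1)) ;;
            # A single number contributes one CPU.
            *)   total=$((total + 1)) ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}
```

Usage would be `count_cpus "$(cat /sys/devices/system/cpu/online)"`; plain `nproc` gives a similar number. Note that with hyperthreading left on, a single physical core can still show up as two logical CPUs.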
 

gb00s

So under Linux, your CPU is only able to run on one core. I'm just wondering why this is not the case under Win10.

Any firmware/BIOS updates done recently? If any ...

Edit: What are your system specs? CPU, RAM ...
Edit: It would be beneficial if you stayed on one distro, installed it with one core only, then changed the BIOS setting, ran it and collected logs. These machine check events are nasty ... just distro-hopping won't get us to the root cause.
Edit: Check /var/log/messages
 

Blinky90

Yeah, you're right.

No, but the latest BIOS version according to Lenovo's support site is already installed.

I've got this one:
Processor
Number of processors: 1
Processor family: Core i9 9th Gen
Processor type: Intel Core i9-9900T
Processor clock speed: 2.10 GHz
Max. turbo clock speed: 4.40 GHz
Max. TDP: 35 W
Number of processor cores: 8
Number of threads: 16
Cache: 16 MB
Processor socket: LGA 1151


Memory
Total memory: 16 GB
Installed memory modules: 16 GB
Max. supported memory: 32 GB
Memory type: DDR4 RAM
RAM speed: 2666 MHz
Memory form factor: SO-DIMM 260 pin
Number of memory slots: 2
Number of free memory slots: 1

I installed Ubuntu using one core. Everything went fine and the system is stable when using it with one core only.
When I reboot and switch to multi-core, the desktop crashes shortly after the GUI loads.

There is no /var/log/messages.
But I checked /var/log/syslog, which also contains kernel-related logs, and found nothing interesting that might have to do with the crash.

Very strange :/
 

gb00s

Just out of curiosity: did you buy this workstation new or used? If used, what power supply came with it?

Add: Can you provide the output of 'dmesg | grep microcode' from a 1-core run, and install/update intel-microcode?
 

Blinky90

No, it's a brand new product with only original parts.
The latest intel-microcode was already installed.


sudo dmesg | grep microcode
[ 0.906011] microcode: sig=0x906ed, pf=0x2, revision=0xde
[ 0.906066] microcode: Microcode Update Driver: v2.2.
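
If the revision needs to be compared before and after a microcode or BIOS update, the relevant fields can be extracted from that output. This is only a sketch: microcode_revision is a made-up helper name, and it assumes the "sig=..., pf=..., revision=..." line format shown above.

```shell
# Hypothetical helper: reads dmesg-style text on stdin and prints the CPU
# signature and microcode revision from the first matching line.
microcode_revision() {
    sed -n 's/.*microcode: sig=\(0x[0-9a-fA-F]*\), pf=0x[0-9a-fA-F]*, revision=\(0x[0-9a-fA-F]*\).*/sig=\1 revision=\2/p' | head -n 1
}
```

For example, `sudo dmesg | microcode_revision` on the machine above should report the signature 0x906ed with revision 0xde.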
 

gb00s

Sorry if I keep asking but I don't have the machine in front of me.

1. Did you set the Turbo to off while Multi-Core is on?
2. Do you have the 65W PSU? Any chance you have a 90W or 130W PSU around? From a ThinkPad maybe?
3. Can you set Ubuntu to start into multi-user.target, see if it boots, log in and just run htop, for instance?

ADD: I have the feeling that under Linux, with Multi-Core and e.g. Turbo enabled, this 'beast' draws much more power than it should. The PSU is only rated for 65W. Forget the 35W TDP of these CPUs if you go Multi-Core and Turbo ... Let it draw 50W, and with the other components using 15+W you are done ... Just an idea.
 

Blinky90

Absolutely no problem @gb00s.
I am very grateful for the effort you put into helping me.

1. Did you set the Turbo to off while Multi-Core is on?
-> Yes, here is a photo of my current CPU settings:
[Attachment: photo_2021-02-21_22-32-26.jpg]

2. Do you have the 65W PSU? Any chance you have a 90W or 130W PSU around? From a ThinkPad maybe?
-> No, it was delivered with this 135W PSU.
[Attachment: photo_2021-02-21_22-32-26 (2).jpg]
I also have a smaller one from my Lenovo ThinkPad, but I guess that one doesn't deliver more than 135W.
[Attachment: photo_2021-02-21_22-32-25.jpg]


3. Can you set Ubuntu to start into multi-user.target, see if it boots, log in and just run htop, for instance?
-> All right, I changed the default target to multi-user. Then I switched on "Core Multi-Processing" and booted the system.
The first few attempts led to this:
[Attachment: photo_2021-02-21_22-32-25 (2).jpg]

After a few tries I managed to log in and start htop. But after a few seconds the crash occurred:
[Attachment: photo_2021-02-21_22-32-25 (3).jpg]
 

gb00s

Oh, I'm sorry. I somehow thought you only had a 65W PSU. I don't know why.

1. Here EIST is on by default.
2. Can you set Hyperthreading to off and Multi-Core to on, but Turbo off? Enable C-states and set them to C1C3C6C7C8.
3. Disable Virtualization and VT-d.

Just trial and error.

ADD: Can you try the following kernel options and update GRUB?
noapic pci=assign-busses apicmaintimer idle=poll reboot=cold,hard
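
A minimal sketch of how the options above could be added persistently on Ubuntu or Debian. The helper name add_kernel_params is made up for illustration; it assumes the file has a GRUB_CMDLINE_LINUX_DEFAULT="..." line, as the stock /etc/default/grub on Ubuntu does, and that the parameters contain no | or & characters.

```shell
# Hypothetical helper: appends the given kernel parameters to the
# GRUB_CMDLINE_LINUX_DEFAULT line of a GRUB defaults file.
# The original file is kept next to it with a .bak suffix.
add_kernel_params() {
    file=$1
    shift
    params=$*
    cp "$file" "$file.bak" || return 1
    # Insert the parameters just before the closing quote of the value.
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $params\"|" "$file.bak" > "$file"
}
```

It is safer to run this against a copy first, for example `cp /etc/default/grub /tmp/grub && add_kernel_params /tmp/grub noapic idle=poll`, check the result, and only then apply the same change to /etc/default/grub as root and run `sudo update-grub`. For a one-off test, the parameters can also be appended to the `linux` line after pressing `e` in the GRUB menu, which changes nothing on disk.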
 

Jeggs101

Well-Known Member
Dec 29, 2010
1,497
231
63
I think @gb00s is on the right track. I tried this on a Dell with Ubuntu 20.04 LTS and saw something similar. I can't remember exactly what, but it was a C-state BIOS setting. I think I've also seen Linux errors on these with security settings. These systems sometimes have BIOS security settings that assume Windows, and when you run Linux it isn't happy.

There's hope for you. I've got a cluster of 12 of these at work, and they're all running Ubuntu Server. The actual hardware in these is high-volume desktop hardware.
 

Blinky90

@gb00s I tried various combinations of the BIOS settings you recommended.
It really seems like the only trigger for that error is Multi-Core.

Passing the kernel options via GRUB doesn't change anything either. That's really weird; I am about to give up on this.

@Jeggs101 The only security settings that seem to matter are UEFI Boot and Secure Boot; I disabled both without any change.
Could you tell me which settings you used to get those damn things to run? :)
 

gb00s

I would not give up so quickly.
  1. Can you deactivate any peripherals, like Bluetooth, Wi-Fi etc.?
  2. Can you deactivate all Intel Manageability options?
  3. Can you set the SECURITY CHIP option to inactive under TCG? Maybe clear the TPM if possible (not sure what that does to the warranty if your device was bought new :rolleyes:).
  4. Can you start in LEGACY MODE with CSM activated and SECURE BOOT off?
  5. Can you enable CRID support for the CPU?
 

Blinky90

@gb00s
Hehe, I'll keep trying.

Don't worry about the warranty. Since I'm dumb, I threw away the original packaging, so they refuse to take it back anyway.

-> Yeah, I disabled Bluetooth, Wi-Fi and Thunderbolt.
What do you mean by Security Chip under TCG and TPM? I didn't find those options.
Also, enabling CSM and booting in Legacy-only mode without Secure Boot did not help.

I didn't find the CRID support either. Are those BIOS settings too?

In the meantime I installed kernel 5.11 on my Ubuntu, but the error remains.
 

Blinky90

Sorry for double posting, but I finally have some good news.
I figured out that Ubuntu 19.10 seems to run.
Ubuntu 19.10 ships with kernel 5.3.

When I update to kernel 5.4 the problem occurs again. So it seems the hardware needs kernel 5.3 to run.

However, Ubuntu 19.10 is end of life and Ubuntu 20.04 ships with kernel 5.8.
Does any of you know whether it is possible to downgrade the kernel of Ubuntu 20.04 to version 5.3?
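
While switching between kernels it helps to check quickly which side of the 5.3/5.4 boundary the running one is on. A small sketch follows; kernel_at_least is a made-up helper name, it relies on the -V (version sort) option of GNU sort, and the 5.4 threshold comes only from the observations in this thread.

```shell
# Hypothetical helper: succeeds if kernel release $1 is at least version $2.
# Example releases: 5.3.0-18-generic, 5.10.16-arch1-1.
kernel_at_least() {
    # Strip the distro suffix, e.g. 5.3.0-18-generic -> 5.3.0
    release=${1%%-*}
    # Version-sort both strings; the smaller one comes out first.
    lowest=$(printf '%s\n%s\n' "$release" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

For example, `kernel_at_least "$(uname -r)" 5.4 && echo "5.4 or newer, the range that crashed here"` prints the message only on the kernels that showed the problem on this machine.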
 

gb00s

Then try 18.04 LTS, supported till the end of 2023 I believe.

That's progress, but we then need to figure out what changed in 5.4. Wasn't 5.4 the kernel with the big Spectre changes? If so, you could avoid these patches with a kernel parameter in GRUB. Or go for the only reasonable option: Gentoo. Mask everything >=5.4 kernel-wise and keep everything else up to date.

Regarding the proposed BIOS options, I wasn't aware the BIOS differences between the M900 and M920x are so big.

Ubuntu mainline kernel 5.3 is available here: Index of /~kernel-ppa/mainline/v5.3. Download the packages into a folder, run dpkg -i *.deb in that folder, then update-grub. Reboot into the new kernel, remove the newer kernel and run update-grub again. But I'm not an Ubuntu freak, so ... take my guide with caution please. Upgrading to 18.04 LTS with 5.3 should be less troublesome than downgrading 20.04 LTS to 5.3 ... maybe.

Omg ... just found this: Ubuntu 18.04.4 Released with Kernel 5.3 [How-to Install]
 