SabreLite fails to boot with "Failed to start udev Coldplug all Devices"
Affected image versions
apertis_v2020-target-armhf-uboot_20200225.0.img.gz
Unaffected image versions
Unknown
Steps to reproduce
- Flash image to SD Card
- Insert SD Card and boot SabreLite
- Ensure serial console of SabreLite is connected up to host PC
- Run the following script on the host PC (note: the serial device in the script must match the one the SabreLite is attached to):
lang=python
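import os
import sys
import time

import pexpect.fdpexpect

# Open the serial console the SabreLite is attached to (adjust the device as needed).
fd = os.open("/dev/ttyUSB0", os.O_RDWR | os.O_NONBLOCK | os.O_NOCTTY)
target = pexpect.fdpexpect.fdspawn(fd, encoding='utf-8', logfile=sys.stdout)
count = 0
# Send a newline to provoke a fresh login prompt.
target.send("\n")
try:
    while True:
        # Wait for the board to boot to a login prompt, then log in as "user".
        target.expect_exact(["login:"], timeout=120)
        target.send("user\n")
        target.expect_exact(["word:"])
        target.send("user\n\n")
        target.expect_exact(["$"])
        # Let the system settle before rebooting.
        time.sleep(30)
        target.send("sudo reboot\n")
        count += 1
except (Exception, KeyboardInterrupt):
    # A timeout, EOF, serial error or Ctrl-C ends the run.
    print("Looped %d times before failure" % count)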
- Wait for the failure, which may take quite a long time (the failure rate is estimated under "Impact of bug" as roughly once every 150 boots).
Expected result
- Script is able to run continuously without failure.
Actual result
- Barring power loss or other new and interesting bugs, the board eventually fails with:
[FAILED] Failed to start udev Coldplug all Devices.
See 'systemctl status systemd-udev-trigger.service' for details.
The board then attempts to start the emergency console, which fails because the root account is locked by default.
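The reproduction script only notices the failure when the login prompt times out. A captured console log can also be scanned for the failure line directly. The following is a minimal sketch, not part of the original script: the function name `find_coldplug_failure` is hypothetical, and the ANSI escape stripping assumes the console may wrap the `FAILED` status in colour codes.

```python
import re

# Colour escape sequences the console may wrap around the status word (assumption).
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")

# Failure line as quoted in this report.
COLDPLUG_FAILURE = re.compile(
    r"\[FAILED\]\s+Failed to start udev Coldplug all Devices\."
)


def find_coldplug_failure(console_log):
    """Return the 1-based line number of the first coldplug failure, or None."""
    for number, line in enumerate(console_log.splitlines(), start=1):
        # Strip colour codes before matching so "[FAILED]" is contiguous.
        if COLDPLUG_FAILURE.search(ANSI_ESCAPE.sub("", line)):
            return number
    return None
```

Applied to the attached log, this would give the line at which the boot went wrong, which makes it easier to compare the preceding kernel messages between good and bad boots.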
Impact of bug
The board fails to boot, but the issue seems relatively infrequent, occurring approximately once every 150 boots. A hard reset results in the board booting again.
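Because the failure is rare, a short clean run of the reboot loop says little about whether a candidate fix works. As a rough sketch, assuming each boot fails independently with the estimated probability of 1 in 150 (an assumption, not something measured here), the number of clean boots needed before trusting a fix can be estimated as follows. The function names are illustrative only.

```python
import math


def chance_of_failure(failure_rate, boots):
    """Probability of seeing at least one failure in `boots` independent boots."""
    return 1.0 - (1.0 - failure_rate) ** boots


def boots_needed(failure_rate, confidence):
    """Smallest number of boots for which at least one failure would be
    expected with the given confidence, if the bug were still present."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    rate = 1 / 150  # estimate taken from this report
    print(boots_needed(rate, 0.95))
    print(chance_of_failure(rate, 150))
```

Under that independence assumption, a run of only 150 clean boots would still miss the bug more than a third of the time, so a verification run needs to be several times longer than the average interval between failures.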
Attachments
Log from previous run: {F345549}
Root cause
Root cause is understood to be a race condition, caused by the power domain not deferring properly.
Outcomes
TBD
Management data
This section is for management only; it should be the last one in the description.
Phabricator link: https://phabricator.apertis.org/T6795